Good air quality is a critical determinant of public health, influencing life expectancy, respiratory health, work productivity, and the prevention of chronic diseases. This study presents an approach to classifying the Air Quality Index (AQI) from digital images using convolutional neural networks (CNNs). We collected and curated a dataset of 11,000 digital images from three regions in Indonesia (Jakarta, Malang, and Semarang), using standardized acquisition settings to keep the images uniform. The images were labeled with four air quality classes: good, moderate, unhealthy for sensitive groups, and unhealthy. We designed and implemented a CNN architecture for AQI classification, which achieved an accuracy of 99.81% under K-fold cross-validation. We also examined the model's interpretability using techniques such as Grad-CAM, which indicate the image regions the CNN relies on when classifying air quality conditions. These findings indicate that CNNs are effective for image-based AQI classification. Future work could incorporate more diverse images captured from varied perspectives to increase dataset complexity and improve model robustness. The dataset is publicly accessible at https://doi.org/10.5281/zenodo.15727522.
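
The following illustrates the evaluation protocol only. The abstract reports accuracy under K-fold cross-validation but does not state the number of folds or whether the splits were stratified by class. A minimal Python sketch, assuming k = 5 and class-stratified folds, with `train_and_predict` as a hypothetical callback standing in for CNN training and inference, might look like this:

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs with each class spread evenly over folds.

    k=5 and stratification are assumptions; the abstract specifies neither.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's images round-robin so every fold gets a similar share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        val = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train, val


def cross_validated_accuracy(labels, train_and_predict, k=5, seed=0):
    """Return the mean validation accuracy over k folds.

    train_and_predict(train_idx, val_idx) is a hypothetical callback that
    fits a model on train_idx and returns predicted labels for val_idx.
    """
    scores = []
    for train, val in stratified_kfold(labels, k, seed):
        preds = train_and_predict(train, val)
        correct = sum(pred == labels[i] for pred, i in zip(preds, val))
        scores.append(correct / len(val))
    return sum(scores) / len(scores)
```

Here `labels` would hold one of the four class names per image. Averaging per-fold accuracy is one common way to report a single cross-validated figure; the exact aggregation behind the reported 99.81% is not given in the abstract.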