Rapid urbanization poses several challenges, especially in the absence of a controlled urban development plan, and often leads to the anarchic occupation and expansion of cities, a phenomenon known as urban sprawl (US). To support sustainable decision-making in urban planning and policy development, more effective approaches to simulating and predicting US are essential. Although work has been published on the use of deep learning (DL) methods to simulate US indicators, almost none has assessed what has already been done, the potential, the issues, and the challenges ahead. By synthesising existing research, we assess the current landscape of the use of DL in modelling US. The article begins by demystifying US, highlighting its multifaceted challenges and implications. Through an examination of DL methodologies, we highlight their effectiveness in capturing the complex spatial patterns and relationships associated with US. The article also examines the synergy between DL and conventional methods, weighing their respective advantages and disadvantages. It emerges that the use of DL in the simulation and forecasting of US indicators is increasing, and that its potential for guiding strategic decisions to control and mitigate this phenomenon is very promising, although major challenges remain, both in terms of data and models and in terms of strategic city-planning policies.
Wind speed at the sea surface is a key quantity for a variety of scientific applications and human activities. Given its importance, many observation techniques exist, ranging from in situ to satellite observations; however, no single technique captures the full spatiotemporal variability of the phenomenon. Reanalysis products, obtained from data assimilation methods, represent the state of the art for sea-surface wind speed monitoring, but they may be biased by model errors and their spatial resolution is not competitive with satellite products. In this work, we propose a scheme based on both data assimilation and deep learning concepts that processes spatiotemporally heterogeneous input sources to reconstruct high-resolution time series of spatial wind speed fields. This method allows us to make the most of the complementary information conveyed by the different sea-surface observations typically available in operational settings. We use synthetic wind speed data to emulate satellite images, in situ time series, and reanalyzed wind fields. Starting from these pseudo-observations, we run extensive numerical simulations to assess the impact of each input source on the model's reconstruction performance. We show that the proposed framework outperforms a deep learning-based inversion scheme and successfully exploits the spatiotemporally complementary information of the different input sources. We also show that the model can learn the bias present in reanalysis products and attenuate it in the output reconstructions.
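The architecture itself is not detailed in the abstract above; purely as an illustration of the multi-source fusion idea it describes, a toy PyTorch sketch might encode a coarse reanalysis-like field and an in situ time series separately and merge them into a higher-resolution reconstruction (all layer sizes and names below are invented):

```python
import torch
import torch.nn as nn

class FusionReconstructor(nn.Module):
    """Toy multi-source fusion: coarse gridded field + in situ series -> high-res field."""
    def __init__(self, series_len=24, out_hw=64):
        super().__init__()
        # Convolutional encoder for the coarse (reanalysis-like) field
        self.grid_enc = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        # MLP encoder for the in situ time series
        self.series_enc = nn.Sequential(
            nn.Linear(series_len, 64), nn.ReLU(), nn.Linear(64, 32),
        )
        # Decoder upsamples the fused representation to the target resolution
        self.decoder = nn.Sequential(
            nn.Conv2d(32 + 32, 32, 3, padding=1), nn.ReLU(),
            nn.Upsample(size=(out_hw, out_hw), mode="bilinear", align_corners=False),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, coarse_field, in_situ):
        g = self.grid_enc(coarse_field)                       # (B, 32, H, W)
        s = self.series_enc(in_situ)                          # (B, 32)
        s = s[:, :, None, None].expand(-1, -1, *g.shape[2:])  # broadcast over the grid
        return self.decoder(torch.cat([g, s], dim=1))         # (B, 1, out_hw, out_hw)

model = FusionReconstructor()
y = model(torch.randn(2, 1, 16, 16), torch.randn(2, 24))  # -> torch.Size([2, 1, 64, 64])
```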
The growing demand for global wind power production, driven by the critical need for sustainable energy sources, requires reliable estimation of wind speed vertical profiles for accurate wind power prediction and comprehensive wind turbine performance assessment. Traditional methods relying on empirical equations or similarity theory face challenges due to their restricted applicability beyond the surface layer. Although recent studies have utilized various machine learning techniques to vertically extrapolate wind speeds, they often focus on single levels and lack a holistic approach to predicting entire wind profiles. As an alternative, this study introduces a proof-of-concept methodology utilizing TabNet, an attention-based sequential deep learning model, to estimate wind speed vertical profiles from coarse-resolution meteorological features extracted from a reanalysis dataset. To ensure that the methodology is applicable across diverse datasets, Chebyshev polynomial approximation is employed to model the wind profiles. Trained on the meteorological features as inputs and the Chebyshev coefficients as targets, the TabNet model predicts unseen wind profiles with reasonable accuracy across different wind conditions, such as high shear, low shear/well-mixed, low-level jet, and high wind. Additionally, this methodology quantifies the correlation of wind profiles with prevailing atmospheric conditions through a systematic feature importance assessment.
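The profile-compression step described above is easy to make concrete. The following NumPy sketch (heights, speeds, and polynomial degree are arbitrary choices, not the study's settings) reduces one wind profile to the Chebyshev coefficients that a tabular model such as TabNet would then be trained to predict:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Hypothetical measurement heights (m) and wind speeds (m/s) for one profile
z = np.array([10, 40, 80, 120, 160, 200], dtype=float)
u = np.array([5.1, 6.4, 7.2, 7.8, 8.1, 8.3])

# Map heights to [-1, 1], the natural domain of Chebyshev polynomials
z_scaled = 2 * (z - z.min()) / (z.max() - z.min()) - 1

# Fit a low-degree Chebyshev series; the coefficients become regression targets
deg = 3
coeffs = C.chebfit(z_scaled, u, deg)

# Reconstruct the profile from the coefficients to check the approximation
u_hat = C.chebval(z_scaled, coeffs)
print(coeffs)                      # 4 targets for a degree-3 fit
print(np.max(np.abs(u - u_hat)))   # approximation error at the sample heights
```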
For the pulse shaping system of the SG-II-up facility, we propose a U-shaped convolutional neural network that integrates multi-scale feature extraction, an attention mechanism, and long short-term memory units, enabling effective real-time denoising of diverse shaping pulses. We train the model on simulated datasets and evaluate it on both simulated and experimental temporal waveforms. On simulated waveforms, we achieve high-precision denoising, performing well even on temporal waveforms with frequency modulation-to-amplitude modulation (FM-to-AM) conversion exceeding 50%, contrast of over 300:1, and multi-step structures. The errors are less than 1% for both root mean square error and contrast, and the signal-to-noise ratio improves by over 50%. On experimental waveforms, the model obtains denoised waveforms with contrast greater than 200:1. The stability of the model is verified using temporal waveforms with identical pulse widths and contrast, ensuring that smooth temporal profiles are achieved while the intricate details of the signals are preserved. The results demonstrate that the denoising model, trained on the simulated dataset, can efficiently process complex temporal waveforms in real time for experiments and mitigate the influence of electronic noise and FM-to-AM conversion on the time–power curve.
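The abstract combines three standard ingredients: a U-shaped convolutional backbone, attention, and LSTM units. As a loose, much-simplified illustration of how such pieces fit together for 1-D waveform denoising (not the authors' network; all sizes are invented), consider:

```python
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Simplified U-shaped 1-D denoiser with an LSTM bottleneck and channel attention."""
    def __init__(self, ch=32):
        super().__init__()
        self.down = nn.Sequential(nn.Conv1d(1, ch, 5, stride=2, padding=2), nn.ReLU())
        self.lstm = nn.LSTM(ch, ch, batch_first=True)
        # Squeeze-and-excitation-style channel attention
        self.att = nn.Sequential(nn.AdaptiveAvgPool1d(1), nn.Conv1d(ch, ch, 1), nn.Sigmoid())
        self.up = nn.Sequential(
            nn.ConvTranspose1d(ch, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv1d(ch, 1, 5, padding=2),
        )

    def forward(self, x):                        # x: (B, 1, T), noisy waveform
        h = self.down(x)                         # (B, ch, T/2)
        h_seq, _ = self.lstm(h.transpose(1, 2))  # model temporal structure
        h = h_seq.transpose(1, 2)
        h = h * self.att(h)                      # reweight channels
        return self.up(h)                        # (B, 1, T), denoised waveform

y = TinyDenoiser()(torch.randn(4, 1, 256))       # -> torch.Size([4, 1, 256])
```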
Sea level anomaly (SLA) is a signature of the mesoscale dynamics of the upper ocean. Sea surface temperature (SST) is driven by these dynamics and can be used to improve the spatial interpolation of SLA fields. In this study, we focused on the temporal evolution of SLA fields and explored the capacity of deep learning (DL) methods to predict short-term SLA fields from SST fields. We used simulated daily SLA and SST data from the Mercator Global Analysis and Forecasting System, with a resolution of (1/12)° in the North Atlantic Ocean (26.5–44.42°N, 64.25–41.83°W), covering the period from 1993 to 2019. Using a slightly modified image-to-image convolutional DL architecture, we demonstrated that SST is a relevant variable for controlling the SLA prediction. With a learning process inspired by the teacher-forcing method, we managed to improve the SLA forecast at 5 days by using the SST fields as additional information. We obtained prediction errors of 12 cm (20 cm) for the SLA evolution at scales smaller than the mesoscale and at time scales of 5 days (20 days), respectively. Moreover, the information provided by the SST allows us to limit the SLA error to 16 cm at 20 days when learning the trajectory.
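For readers unfamiliar with the training trick mentioned above, here is a minimal, runnable teacher-forcing sketch with a toy stand-in model and synthetic tensors (the model and all shapes are illustrative assumptions): at each step, the ground-truth SLA from the previous day, not the model's own output, is fed back in together with the SST.

```python
import torch
import torch.nn as nn

# Tiny stand-in forecaster: maps (previous SLA, current SST) -> next SLA field
model = nn.Sequential(nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 1, 3, padding=1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

sla_seq = torch.randn(8, 6, 1, 32, 32)   # synthetic (B, T, 1, H, W) SLA sequence
sst_seq = torch.randn(8, 6, 1, 32, 32)   # synthetic matching SST sequence

loss = 0.0
for t in range(1, sla_seq.shape[1]):
    # Teacher forcing: feed the ground-truth SLA from step t-1, not the model's output
    inp = torch.cat([sla_seq[:, t - 1], sst_seq[:, t]], dim=1)
    loss = loss + criterion(model(inp), sla_seq[:, t])
optimizer.zero_grad(); loss.backward(); optimizer.step()
# At inference, the model is instead rolled out on its own predictions.
```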
This article addresses the challenges of assessing pedestrian-level wind conditions in urban environments using a deep learning approach. The influence of large buildings on urban wind patterns has significant implications for thermal comfort, pollutant transport, pedestrian safety, and energy usage. Traditional methods, such as wind tunnel testing, are time-consuming and costly, leading to growing interest in computational methods like computational fluid dynamics (CFD) simulations. However, CFD still requires a significant time investment for such studies, limiting the time available for design modification prior to design lockdown. This study proposes a deep learning surrogate model based on an MLP-Mixer architecture to predict mean flow conditions for complex arrays of buildings. The model is trained on a diverse dataset of synthetic geometries and corresponding CFD simulations, demonstrating its effectiveness in capturing intricate wind dynamics. The article discusses the model architecture and data preparation and evaluates its performance qualitatively and quantitatively. Results show promising capabilities in replicating key wind features, with a mean error of 0.3 m/s and errors rarely exceeding 0.75 m/s, making the proposed model a valuable tool for early-stage urban wind modelling.
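For context, the core of an MLP-Mixer alternates a token-mixing MLP (across patches) with a channel-mixing MLP. A minimal PyTorch block, with arbitrary dimensions and no claim to match the surrogate above, looks like this:

```python
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    """One MLP-Mixer block: token mixing across patches, then channel mixing."""
    def __init__(self, n_tokens, dim, hidden=256):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mlp = nn.Sequential(nn.Linear(n_tokens, hidden), nn.GELU(),
                                       nn.Linear(hidden, n_tokens))
        self.norm2 = nn.LayerNorm(dim)
        self.channel_mlp = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(),
                                         nn.Linear(hidden, dim))

    def forward(self, x):                        # x: (B, n_tokens, dim)
        y = self.norm1(x).transpose(1, 2)        # mix information across tokens
        x = x + self.token_mlp(y).transpose(1, 2)
        x = x + self.channel_mlp(self.norm2(x))  # mix information across channels
        return x

out = MixerBlock(n_tokens=64, dim=128)(torch.randn(2, 64, 128))  # (2, 64, 128)
```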
Comprehensive housing stock information is crucial for informing the development of climate resilience strategies that aim to reduce the adverse impacts of extreme climate hazards in high-risk regions like the Caribbean. In this study, we propose an end-to-end workflow for rapidly generating critical baseline exposure data using very high-resolution drone imagery and deep learning techniques. Specifically, our work leverages the Segment Anything Model (SAM) and convolutional neural networks (CNNs) to automate the generation of building footprints and roof classification maps. We evaluate the cross-country generalizability of the CNN models to determine how well models trained in one geographical context can be adapted to another. Finally, we discuss our initiatives for training and upskilling government staff, community mappers, and disaster responders in the use of geospatial technologies. Our work emphasizes the importance of local capacity building in the adoption of AI and Earth Observation for climate resilience in the Caribbean.
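As a hedged sketch of how the footprint step might be wired up with the publicly released segment-anything package (the checkpoint path, tile name, and area thresholds below are assumptions, not the authors' pipeline):

```python
import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Load a SAM checkpoint (variant and path assumed) and build the mask generator
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

image = np.array(Image.open("drone_tile.png").convert("RGB"))  # hypothetical tile
masks = mask_generator.generate(image)  # list of dicts with 'segmentation', 'area', ...

# Keep plausible building-sized regions; a real pipeline would filter more carefully
building_masks = [m["segmentation"] for m in masks if 50 < m["area"] < 50_000]
# Each retained mask can then be cropped and passed to a CNN roof-type classifier.
```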
Stochastic generators are useful for estimating climate impacts on various sectors. Projecting climate risk in sectors such as energy systems requires generators that are accurate (statistically resembling the ground truth), reliable (not producing erroneous examples), and efficient. Leveraging data from the North American Land Data Assimilation System, we introduce TemperatureGAN, a generative adversarial network conditioned on months, regions, and time periods, to generate 2-m above-ground atmospheric temperatures at an hourly resolution. We propose evaluation methods and metrics to measure the quality of generated samples. We show that TemperatureGAN produces high-fidelity examples with good spatial representation and temporal dynamics consistent with known diurnal cycles.
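The conditioning mechanism can be illustrated generically: a generator consumes a noise vector together with learned embeddings of the month and region labels. The sketch below is a toy stand-in, not the TemperatureGAN architecture (all dimensions are invented):

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Toy conditional generator: noise + month/region embeddings -> 24 hourly values."""
    def __init__(self, z_dim=64, n_months=12, n_regions=10, emb=8, out_len=24):
        super().__init__()
        self.month_emb = nn.Embedding(n_months, emb)
        self.region_emb = nn.Embedding(n_regions, emb)
        self.net = nn.Sequential(
            nn.Linear(z_dim + 2 * emb, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, out_len),   # e.g., one diurnal cycle of temperatures
        )

    def forward(self, z, month, region):
        cond = torch.cat([self.month_emb(month), self.region_emb(region)], dim=1)
        return self.net(torch.cat([z, cond], dim=1))

g = ConditionalGenerator()
fake = g(torch.randn(16, 64), torch.randint(0, 12, (16,)), torch.randint(0, 10, (16,)))
```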
Bias correction is a critical aspect of data-centric climate studies, as it aims to improve the consistency between observational data and simulations by climate models or estimates by remote sensing. Satellite-based estimates of climatic variables like precipitation often exhibit systematic bias when compared to ground observations. To address this issue, the application of bias correction techniques becomes necessary. This work examines the use of deep learning to reduce the systematic bias of satellite estimates at each grid location while maintaining the spatial dependency across grid points. More specifically, we calibrate daily precipitation values of the Tropical Rainfall Measuring Mission (TRMM) 3B42 product (TRMM_3B42_Daily) over the Indian landmass against ground observations recorded by the India Meteorological Department (IMD). We focus on precipitation estimates for the Indian Summer Monsoon Rainfall (ISMR) period (June–September), since India receives more than 75% of its annual rainfall in this period. We benchmark the deep learning methods against standard statistical methods such as quantile mapping and quantile delta mapping on these datasets. The comparative analysis shows the effectiveness of the deep learning architecture in bias correction.
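For comparison, the simplest statistical baseline mentioned above, empirical quantile mapping, fits in a few lines of NumPy. This sketch uses synthetic gamma-distributed data as stand-ins for the satellite and gauge records:

```python
import numpy as np

rng = np.random.default_rng(0)
obs = rng.gamma(2.0, 5.0, size=5000)   # stand-in for IMD gauge rainfall
sat = rng.gamma(2.0, 6.5, size=5000)   # stand-in for biased satellite rainfall

def quantile_map(x, ref_biased, ref_obs, n_q=100):
    """Map values of x through the biased CDF onto the observed distribution."""
    q = np.linspace(0, 100, n_q)
    src_q = np.percentile(ref_biased, q)   # quantiles of the biased product
    dst_q = np.percentile(ref_obs, q)      # matching observed quantiles
    return np.interp(x, src_q, dst_q)      # piecewise-linear transfer function

corrected = quantile_map(sat, sat, obs)
print(sat.mean(), corrected.mean(), obs.mean())  # corrected mean moves toward obs
```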
Algorithmic automatic item generation can be used to obtain large quantities of cognitive items in the domains of knowledge and aptitude testing. However, conventional item models used by template-based automatic item generation techniques are not ideal for the creation of items for non-cognitive constructs. Progress in this area has been made recently by employing long short-term memory recurrent neural networks to produce word sequences that syntactically resemble items typically found in personality questionnaires. To date, such items have been produced unconditionally, without the possibility of selectively targeting personality domains. In this article, we offer a brief synopsis of past developments in natural language processing and explain why the automatic generation of construct-specific items has become attainable only due to recent technological progress. We propose that pre-trained causal transformer models can be fine-tuned to achieve this task using implicit parameterization in conjunction with conditional generation. We demonstrate this method in a tutorial-like fashion and finally compare aspects of validity in human- and machine-authored items using empirical data. Our study finds that approximately two-thirds of the automatically generated items show good psychometric properties (factor loadings above .40) and that one-third even have properties equivalent to established and highly curated human-authored items. Our work thus demonstrates the practical use of deep neural networks for non-cognitive automatic item generation.
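The conditional-generation recipe can be sketched with an off-the-shelf causal language model: add one control token per personality domain, fine-tune on token-prefixed items, then prompt with the token. The model choice and token names below are illustrative, not the study's exact setup:

```python
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical control tokens, one per Big Five domain, added before fine-tuning
control_tokens = ["<extraversion>", "<agreeableness>", "<conscientiousness>",
                  "<neuroticism>", "<openness>"]
tokenizer.add_special_tokens({"additional_special_tokens": control_tokens})
model.resize_token_embeddings(len(tokenizer))

# After fine-tuning on "<trait> item text" pairs, items are generated per domain:
prompt = tokenizer("<extraversion>", return_tensors="pt")
out = model.generate(**prompt, max_new_tokens=20, do_sample=True, top_p=0.95,
                     pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```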
Marginal maximum likelihood (MML) estimation is the preferred approach to fitting item response theory models in psychometrics due to the MML estimator’s consistency, normality, and efficiency as the sample size tends to infinity. However, state-of-the-art MML estimation procedures such as the Metropolis–Hastings Robbins–Monro (MH-RM) algorithm as well as approximate MML estimation procedures such as variational inference (VI) are computationally time-consuming when the sample size and the number of latent factors are very large. In this work, we investigate a deep learning-based VI algorithm for exploratory item factor analysis (IFA) that is computationally fast even in large data sets with many latent factors. The proposed approach applies a deep artificial neural network model called an importance-weighted autoencoder (IWAE) to exploratory IFA. The IWAE approximates the MML estimator using an importance sampling technique wherein increasing the number of importance-weighted (IW) samples drawn during fitting improves the approximation, typically at the cost of decreased computational efficiency. We provide a real data application that recovers results aligning with psychological theory across random starts. Via simulation studies, we show that the IWAE yields more accurate estimates as either the sample size or the number of IW samples increases (although factor correlation and intercept estimates exhibit some bias) and obtains similar results to MH-RM in less time. Our simulations also suggest that the proposed approach performs similarly to, and is potentially faster than, constrained joint maximum likelihood estimation, a fast procedure that is consistent when the sample size and the number of items simultaneously tend to infinity.
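For reference, the importance-weighted bound that an IWAE maximizes, with $K$ IW samples and amortized posterior $q_\phi$, is the standard one from the deep-learning literature:

$$\mathcal{L}_K(\theta,\phi) = \mathbb{E}_{z_1,\dots,z_K \sim q_\phi(z\mid x)}\left[\log \frac{1}{K}\sum_{k=1}^{K} \frac{p_\theta(x, z_k)}{q_\phi(z_k \mid x)}\right] \le \log p_\theta(x).$$

The bound tightens monotonically as $K$ increases, which matches the abstract's observation that drawing more IW samples improves the approximation at extra computational cost.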
Utilizing technology for automated item generation is not a new idea. However, test items used in commercial testing programs or in research are still predominantly written by humans, in most cases by content experts or professional item writers. Human experts are a limited resource, and testing agencies incur high costs in the process of continuously renewing item banks to sustain testing programs. Using algorithms instead holds the promise of providing unlimited resources for this crucial part of assessment development. The approach presented here deviates in several ways from previous attempts to solve this problem. In the past, automatic item generation relied either on generating clones of narrowly defined item types, such as those found in language-free intelligence tests (e.g., Raven’s progressive matrices), or on an extensive analysis of task components and derivation of schemata to produce items with pre-specified variability that are hoped to have predictable levels of difficulty. It is somewhat unlikely that researchers utilizing these previous approaches would look at the proposed approach with favor; however, recent applications of machine learning show success in solving tasks that seemed impossible for machines not too long ago. The proposed approach uses deep learning to implement probabilistic language models, not unlike what Google Brain and Amazon Alexa use for language processing and generation.
Cognitive diagnostic models (CDMs) are discrete latent variable models popular in educational and psychological measurement. In this work, motivated by the advantages of deep generative modeling and by identifiability considerations, we propose a new family of DeepCDMs, to hunt for deep discrete diagnostic information. The new class of models enjoys nice properties of identifiability, parsimony, and interpretability. Mathematically, DeepCDMs are entirely identifiable, including in fully exploratory settings, allowing one to uniquely identify the parameters and discrete loading structures (the “$\textbf{Q}$-matrices”) at all depths of the generative model. Statistically, DeepCDMs are parsimonious, because thanks to their depth they can expressively model data with a relatively small number of parameters. Practically, DeepCDMs are interpretable, because the shrinking-ladder-shaped deep architecture can capture cognitive concepts and provide multi-granularity skill diagnoses, from coarse to fine grained and from high level to detailed. For identifiability, we establish transparent identifiability conditions for various DeepCDMs. Our conditions impose intuitive constraints on the structures of the multiple $\textbf{Q}$-matrices and inspire a generative graph with increasingly smaller latent layers when going deeper. For estimation and computation, we focus on the confirmatory setting with known $\textbf{Q}$-matrices and develop Bayesian formulations and efficient Gibbs sampling algorithms. Simulation studies and an application to the TIMSS 2019 math assessment data demonstrate the usefulness of the proposed methodology.
With the fast development of modern microscopes and bioimaging techniques, an unprecedentedly large amount of imaging data is being generated, stored, analyzed, and shared through networks. The size of these data poses great challenges for current data infrastructure. One common way to reduce the data size is image compression. This study analyzes multiple classic and deep-learning-based image compression methods and presents an empirical study of their impact on downstream deep-learning-based image processing models. We used deep-learning-based label-free prediction models (i.e., predicting fluorescent images from bright-field images) as an example downstream task for the comparison and analysis of the impact of image compression. Different compression techniques are compared in compression ratio, image similarity, and, most importantly, the prediction accuracy of label-free models on original and compressed images. We found that artificial intelligence (AI)-based compression techniques largely outperform the classic ones with minimal influence on the downstream 2D label-free tasks. We hope this study sheds light on the potential of deep-learning-based image compression and raises awareness of the potential impacts of image compression on downstream deep-learning models for analysis.
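A minimal version of such an evaluation, using classic JPEG as the compressor and PSNR as the image-similarity metric (the label-free prediction model itself is omitted here), might look like:

```python
import io
import numpy as np
from PIL import Image

def jpeg_roundtrip(img: Image.Image, quality: int):
    """Compress to JPEG in memory; return (decoded image, compression ratio)."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    ratio = (img.width * img.height) / buf.tell()  # raw 8-bit bytes / compressed bytes
    return Image.open(io.BytesIO(buf.getvalue())), ratio

def psnr(a: np.ndarray, b: np.ndarray, peak=255.0):
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return 10 * np.log10(peak**2 / mse)

img = Image.fromarray(np.random.randint(0, 256, (256, 256), dtype=np.uint8), "L")
for q in (90, 50, 10):
    dec, ratio = jpeg_roundtrip(img, q)
    print(q, round(ratio, 1), round(psnr(np.array(img), np.array(dec)), 2))
# A full study would additionally compare the downstream prediction accuracy
# of the label-free model on original vs. decoded images.
```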
This Element covers the interaction of two research areas: linguistic semantics and deep learning. It focuses on three phenomena central to natural language interpretation: reasoning and inference; compositionality; extralinguistic grounding. Representation of these phenomena in recent neural models is discussed, along with the quality of these representations and ways to evaluate them (datasets, tests, measures). The Element closes with suggestions on possible deeper interactions between theoretical semantics and language technology based on deep learning models.
This paper explores alternative representations of physical architecture derived from real-world sensory data through artificial neural networks (ANNs). In the project developed for this research, a detailed 3-D point cloud model is produced by scanning a physical structure with LiDAR. Point cloud data and mesh models are then divided into parts according to architectural references and part-whole relationships using various techniques to create datasets. A deep learning model is trained on these datasets, and new 3-D models produced by deep generative models are examined. These new 3-D models, embodied in different representations such as point clouds, mesh models, and bounding boxes, are used as a design vocabulary from which combinatorial formations are generated.
Environmental enrichment programmes are widely used to improve the welfare of captive and laboratory animals, especially non-human primates. Monitoring enrichment use over time is crucial, as animals may habituate and reduce their interaction with it. In this study, we aimed to monitor the interaction with enrichment items in groups of rhesus macaques (Macaca mulatta), each consisting of an average of ten individuals, living in a breeding colony. To streamline the time-intensive task of assessing enrichment programmes, we automated the evaluation process using machine learning technologies. We built two computer vision-based pipelines to evaluate monkeys’ interactions with different enrichment items: a white drum containing raisins and a non-food-based puzzle. The first pipeline analyses the usage of the drum enrichment in nine groups, both when it contains food and when it is empty. The second pipeline counts the number of monkeys interacting with a puzzle across twelve groups. The data derived from the two pipelines reveal that the macaques consistently express interest in the food-based white drum enrichment, even several months after its introduction. The puzzle enrichment was monitored for one month, showing a gradual decline in interaction over time. These pipelines are valuable for assessing enrichment by minimising the time spent on animal observation and data analysis; this study demonstrates that automated methods can consistently monitor macaque engagement with enrichments, systematically tracking habituation responses and long-term effectiveness. Such advancements have significant implications for enhancing animal welfare, enabling the discontinuation of ineffective enrichments and the adaptation of enrichment plans to meet the animals’ needs.
Optical microrobots are actuated by a laser in a liquid medium using optical tweezers. To create visual control loops for robotic automation, this work describes a deep learning-based method for orientation estimation of optical microrobots, focusing on detecting 3-D rotational movements and localizing microrobots and trapping points (TPs). We integrated and fine-tuned You Only Look Once (YOLOv7) and Deep Simple Online Real-time Tracking (DeepSORT) algorithms, improving microrobot and TP detection accuracy by $\sim 3$% and $\sim 11$%, respectively, at the 0.95 Intersection over Union (IoU) threshold on our test set. Additionally, this increased mean average precision (mAP) by 3% at the 0.5:0.95 IoU threshold during training. Our results showed a 99% success rate in trapping events with no false-positive detections. We introduced a model that employs EfficientNet as a feature extractor combined with custom convolutional neural networks (CNNs) and feature fusion layers. To demonstrate its generalization ability, we evaluated the model on an independent in-house dataset comprising 4,757 image frames in which microrobots executed simultaneous rotations across all three axes. Our method yielded mean rotation angle errors of $1.871^\circ$, $2.308^\circ$, and $2.808^\circ$ for the X (yaw), Y (roll), and Z (pitch) axes, respectively. Compared to pre-trained models, our model gave the lowest error in the Y and Z axes while offering competitive results for the X-axis. Finally, we demonstrated the explainability and transparency of the model’s decision-making process. Our work contributes to the field of microrobotics by providing an efficient 3-axis orientation estimation pipeline, with a clear focus on automation.
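The regression part of such a pipeline can be sketched with torchvision's EfficientNet as a feature extractor feeding a small custom head that outputs three angles. This is a schematic stand-in assuming a recent torchvision, not the authors' fusion architecture:

```python
import torch
import torch.nn as nn
from torchvision import models

class OrientationRegressor(nn.Module):
    """EfficientNet backbone + small head regressing (yaw, roll, pitch) in degrees."""
    def __init__(self):
        super().__init__()
        backbone = models.efficientnet_b0(weights=models.EfficientNet_B0_Weights.DEFAULT)
        self.features = backbone.features    # convolutional feature extractor
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(1280, 256), nn.ReLU(),
                                  nn.Linear(256, 3))  # three rotation angles

    def forward(self, x):                    # x: (B, 3, 224, 224) image batch
        return self.head(self.pool(self.features(x)))

model = OrientationRegressor()
angles = model(torch.randn(2, 3, 224, 224))  # -> (2, 3)
```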
Precise and efficient grasping detection is vital for robotic arms executing stable grasping tasks in industrial and household applications. However, existing methods fail to consider refining features at different scales and detecting critical regions, resulting in coarse grasping rectangles. To address these issues, we propose a real-time coarse and fine granularity residual attention (CFRA) grasping detection network. First, to enable the network to detect objects of different sizes, we extract and fuse coarse- and fine-granularity features. Then, we refine these fused features with a feature refinement module, which enables the network to distinguish between object and background features effectively. Finally, we introduce a residual attention module that adaptively handles objects of different shapes, achieving refined grasping detection. Training and testing on both the Cornell and Jacquard datasets yield detection accuracies of 98.7% and 94.2%, respectively. Moreover, the grasping success rate on a real-world UR3e robot reaches 98%. These results demonstrate the effectiveness and superiority of CFRA.
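A residual attention module of the general kind named above, in which a learned attention map gates the features and the result is added back residually, can be sketched as follows (purely illustrative; the CFRA internals are not given in the abstract):

```python
import torch
import torch.nn as nn

class ResidualAttention(nn.Module):
    """Gate features with a learned spatial attention map, added back residually."""
    def __init__(self, ch):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(ch, ch // 4, 1), nn.ReLU(),
            nn.Conv2d(ch // 4, 1, 1), nn.Sigmoid(),  # per-pixel weight in [0, 1]
        )

    def forward(self, x):
        return x + x * self.attn(x)   # residual connection keeps original features

out = ResidualAttention(64)(torch.randn(1, 64, 56, 56))  # same shape as input
```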
Since its outbreak, the COVID-19 epidemic has posed a great crisis to global health and the world economy. The objective of this work is to provide a simple deep-learning approach for predicting, modelling, and evaluating the temporal evolution of the COVID-19 epidemic. The Dove Swarm Search (DSS) algorithm is integrated with the echo state network (ESN) to optimize the network weights. The ESN-DSS model is constructed to predict the evolution of the COVID-19 time series. Specifically, the self-driven ESN-DSS forms a closed feedback loop by replacing the input with the output. The prediction results, covering COVID-19 temporal evolutions of multiple countries worldwide, indicate excellent prediction performance compared with several artificial intelligence prediction methods from the literature (e.g., recurrent neural networks, long short-term memory, gated recurrent units, variational autoencoders) at the same time scale. Moreover, the model parameters of the self-driven ESN-DSS, which have a significant impact on prediction performance, are determined, and the network parameters are adjusted accordingly to improve prediction accuracy. The prediction results can serve as proposals to help governments and medical institutions formulate precautionary measures to prevent further spread. In addition, this study is not limited to COVID-19 time series forecasting; it is also applicable to other nonlinear time series prediction problems.
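A self-driven echo state network is compact enough to sketch directly in NumPy. Below, the readout is fitted with ordinary ridge regression, which is where the ESN-DSS would instead apply Dove Swarm Search to optimize the weights; the reservoir size, hyperparameters, and the sine-wave stand-in for epidemic counts are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
n_res, leak, rho = 200, 0.5, 0.9
W_in = rng.uniform(-0.5, 0.5, (n_res, 1))
W = rng.normal(size=(n_res, n_res))
W *= rho / max(abs(np.linalg.eigvals(W)))   # scale to spectral radius rho

series = np.sin(np.linspace(0, 40, 400))    # stand-in for an epidemic time series

# Drive the reservoir with the training series and collect its states
x, states = np.zeros((n_res, 1)), []
for u in series[:-1]:
    x = (1 - leak) * x + leak * np.tanh(W_in * u + W @ x)
    states.append(x.ravel())
S, y = np.array(states), series[1:]         # one-step-ahead targets

# Ridge-regression readout (ESN-DSS would tune weights with Dove Swarm Search here)
W_out = np.linalg.solve(S.T @ S + 1e-6 * np.eye(n_res), S.T @ y)

# Self-driven (closed-loop) forecasting: the output is fed back as the next input
u, preds = series[-1], []
for _ in range(50):
    x = (1 - leak) * x + leak * np.tanh(W_in * u + W @ x)
    u = float(x.ravel() @ W_out)            # scalar prediction becomes next input
    preds.append(u)
```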