Underwater robots conducting inspections require autonomous obstacle avoidance capabilities to ensure safe operations. Training methods based on reinforcement learning (RL) can effectively develop autonomous obstacle avoidance strategies for underwater robots; however, training in real environments carries significant risks and can easily result in robot damage. This paper proposes a Sim-to-Real pipeline for RL-based training of autonomous obstacle avoidance in underwater robots, addressing the challenges associated with training and deploying RL methods for obstacle avoidance in this context. We establish a simulation model and environment for underwater robot training based on the mathematical model of the robot, comprehensively reducing the gap between simulation and reality in terms of system inputs, modeling, and outputs. Experimental results demonstrate that our high-fidelity simulation system effectively facilitates the training of autonomous obstacle avoidance algorithms, achieving a 94% success rate in obstacle avoidance and collision-free operation exceeding 5000 steps in virtual environments. The trained strategy was transferred directly to a real robot, which successfully performed obstacle avoidance experiments in a pool, validating the effectiveness of our method for autonomous strategy training and sim-to-real transfer in underwater robots.
In recent years, passive motion paradigms (PMPs), derived from the equilibrium point hypothesis and impedance control, have been utilised as manipulation methods for humanoid robots and robotic manipulators. These paradigms are typically achieved by creating a kinematic chain that enables the manipulator to perform goal-directed actions without explicitly solving the inverse kinematics. This approach leverages a kinematic model constructed through the training of artificial neural networks, aligning well with principles of cybernetics and cognitive computation by enabling adaptive and flexible control. Specifically, these networks model the relationship between joint angles and end-effector positions, facilitating the computation of the Jacobian matrix. Although this method does not require an accurate robot model, traditional neural networks often suffer from drawbacks such as overfitting and inefficient training, which can compromise the accuracy of the final PMP model. In this paper, we implement the method using a deep neural network and investigate the impact of activation functions and network depth on the performance of the kinematic model. Additionally, we propose a transfer learning approach to fine-tune the pre-trained model, enabling it to be transferred to other manipulator arms with different kinematic properties. Finally, we implement and evaluate the deep neural network-based PMP on the Universal Robots, comparing it with traditional kinematic controllers and assessing its physical interaction capabilities and accuracy.
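The idea of reaching a goal without explicitly solving the inverse kinematics can be illustrated with a minimal sketch. It assumes a hypothetical planar two-link arm with unit link lengths, an analytic forward model standing in for the trained neural network, a finite-difference Jacobian, and illustrative `stiffness` and `admittance` gains; none of these come from the paper. A virtual force pulls the end-effector toward the goal and is mapped to joint space through the transposed Jacobian.

```python
import math

L1, L2 = 1.0, 1.0  # hypothetical link lengths of a planar two-link arm


def forward(q):
    """Forward model: joint angles -> end-effector position.

    In the abstract this role is played by a trained neural network;
    an analytic model is used here only to keep the sketch self-contained.
    """
    q1, q2 = q
    return (L1 * math.cos(q1) + L2 * math.cos(q1 + q2),
            L1 * math.sin(q1) + L2 * math.sin(q1 + q2))


def jacobian(f, q, eps=1e-6):
    """Central-difference Jacobian, J[i][j] = d x_i / d q_j."""
    cols = []
    for j in range(len(q)):
        hi, lo = list(q), list(q)
        hi[j] += eps
        lo[j] -= eps
        cols.append([(a - b) / (2 * eps) for a, b in zip(f(hi), f(lo))])
    return [[cols[j][i] for j in range(len(q))] for i in range(len(cols[0]))]


def pmp_reach(q0, goal, stiffness=1.0, admittance=0.1, steps=2000):
    """Relax the joint angles toward the goal using only J^T (no inverse)."""
    q = list(q0)
    for _ in range(steps):
        x = forward(q)
        # Virtual elastic force pulling the end-effector toward the goal.
        force = [stiffness * (g - xi) for g, xi in zip(goal, x)]
        J = jacobian(forward, q)
        # Map the force to joint torques with the transposed Jacobian.
        torque = [sum(J[i][j] * force[i] for i in range(len(force)))
                  for j in range(len(q))]
        q = [qj + admittance * tj for qj, tj in zip(q, torque)]
    return q


def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])
```

Calling `pmp_reach((0.3, 0.5), (1.2, 0.8))` relaxes the arm toward a reachable target; swapping `forward` for a learned model leaves the loop unchanged, which is the property the abstract relies on.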
Processing and extracting actionable information, such as fault or anomaly indicators originating from vibration telemetry, is both challenging and critical for an accurate assessment of mechanical system health and subsequent predictive maintenance. In the setting of predictive maintenance for populations of similar assets, the knowledge gained from any single asset should be leveraged to provide improved predictions across the entire population. In this paper, a novel approach to population-level health monitoring based on transfer learning is presented. The new methodology is applied to monitor multiple rotating plant assets in a power generation scenario. The focus is on the detection of statistical anomalies as a means of identifying deviations from the typical operating regime from a time series of telemetry data. This is a challenging task because the machine is observed under different operating regimes. The proposed methodology can effectively transfer information across different assets, automatically identifying segments with common statistical characteristics and using them to enrich the training of the local supervised learning models. The proposed solution leads to a substantial reduction in mean square error relative to a baseline model.
Prediction of dynamic environmental variables in unmonitored sites remains a long-standing challenge for water resources science. The majority of the world’s freshwater resources have inadequate monitoring of critical environmental variables needed for management. Yet, the need to have widespread predictions of hydrological variables such as river flow and water quality has become increasingly urgent due to climate and land use change over the past decades, and their associated impacts on water resources. Modern machine learning methods increasingly outperform their process-based and empirical model counterparts for hydrologic time series prediction with their ability to extract information from large, diverse data sets. We review relevant state-of-the-art applications of machine learning for streamflow, water quality, and other water resources prediction and discuss opportunities to improve the use of machine learning with emerging methods for incorporating watershed characteristics and process knowledge into classical, deep learning, and transfer learning methodologies. The analysis here suggests most prior efforts have been focused on deep learning frameworks built on many sites for predictions at daily time scales in the United States, but that comparisons between different classes of machine learning methods are few and inadequate. We identify several open questions for time series predictions in unmonitored sites that include incorporating dynamic inputs and site characteristics, mechanistic understanding and spatial context, and explainable AI techniques in modern machine learning frameworks.
Surrogate models of turbulent diffusive flames could play a strategic role in the design of liquid rocket engine combustion chambers. The present article introduces a method to obtain data-driven surrogate models for coaxial injectors, by leveraging an inductive transfer learning strategy over a U-Net with available multifidelity Large Eddy Simulations (LES) data. The resulting models preserve reasonable accuracy while reducing the offline computational cost of data generation. First, a database of about 100 low-fidelity LES simulations of shear-coaxial injectors, operating with gaseous oxygen and gaseous methane as propellants, has been created. The design of experiments explores three variables: the chamber radius, the recess length of the oxidizer post, and the mixture ratio. Subsequently, U-Nets were trained upon this dataset to provide reasonable approximations of the temporal-averaged two-dimensional flow field. Although neural networks are efficient non-linear data emulators, in purely data-driven approaches their quality is directly impacted by the precision of the data they are trained upon. Thus, a high-fidelity (HF) dataset of about 10 simulations has been created, at a much greater cost per sample. The amalgamation of low-fidelity and HF data during the transfer-learning process enables the improvement of the surrogate model’s fidelity without excessive additional cost.
In practice, nondestructive testing (NDT) procedures tend to consider experiments (and their respective models) as distinct, conducted in isolation, and associated with independent data. In contrast, this work looks to capture the interdependencies between acoustic emission (AE) experiments (as meta-models) and then use the resulting functions to predict the model hyperparameters for previously unobserved systems. We utilize a Bayesian multilevel approach (similar to deep Gaussian Processes) where a higher-level meta-model captures the inter-task relationships. Our key contribution is how knowledge of the experimental campaign can be encoded between tasks as well as within tasks. We present an example of AE time-of-arrival mapping for source localization, to illustrate how multilevel models naturally lend themselves to representing aggregate systems in engineering. We constrain the meta-model based on domain knowledge, then use the inter-task functions for transfer learning, predicting hyperparameters for models of previously unobserved experiments (for a specific design).
Transfer learning has been highlighted as a promising framework to increase the accuracy of the data-driven model in the case of data sparsity, specifically by leveraging pretrained knowledge to the training of the target model. The objective of this study is to evaluate whether the number of requisite training samples can be reduced with the use of various transfer learning models for predicting, for example, the chemical source terms of the data-driven reduced-order modeling (ROM) that represents the homogeneous ignition of a hydrogen/air mixture. Principal component analysis is applied to reduce the dimensionality of the hydrogen/air mixture in composition space. Artificial neural networks (ANNs) are used to regress the reaction rates of principal components, and subsequently, a system of ordinary differential equations is solved. As the number of training samples decreases in the target task, the ROM fails to predict the ignition evolution of a hydrogen/air mixture. Three transfer learning strategies are then applied to the training of the ANN model with a sparse dataset. The performance of the ROM with a sparse dataset is remarkably enhanced if the training of the ANN model is restricted by a regularization term that controls the degree of knowledge transfer from source to target tasks. To this end, a novel transfer learning method is introduced, Parameter control via Partial Initialization and Regularization (PaPIR), whereby the amount of knowledge transferred is systemically adjusted in terms of the initialization and regularization schemes of the ANN model in the target task.
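A regularization term that controls how much knowledge is carried from source to target can be sketched on the smallest possible model. The example below is not PaPIR and is not taken from the paper: it assumes a one-parameter linear model `y = w * x`, a quadratic penalty that pulls the target weight toward a pretrained source weight, and the hypothetical names `fit_slope`, `w_source`, and `lam`. It only shows the general mechanism, in which the penalty strength sets the balance between sparse target data and source knowledge.

```python
def fit_slope(xs, ys, w_source, lam):
    """Minimise sum((y - w*x)^2) + lam * (w - w_source)^2 in closed form.

    lam = 0 ignores the source model entirely; a very large lam keeps the
    target weight pinned to the source weight.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return (sxy + lam * w_source) / (sxx + lam)


# Sparse target data (two samples) and a weight learned on a source task.
target_x = [1.0, 2.0]
target_y = [2.2, 3.8]
w_src = 1.5

w_no_transfer = fit_slope(target_x, target_y, w_src, lam=0.0)
w_transfer = fit_slope(target_x, target_y, w_src, lam=5.0)
```

With these numbers, `w_no_transfer` is the plain least-squares slope and `w_transfer` lies between it and `w_src`. In a neural network the same penalty would be applied to the weight vectors and minimised by gradient descent rather than in closed form.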
With global wind energy capacity ramping up, accurately predicting damage equivalent loads (DELs) and fatigue across wind turbine populations is critical, not only for ensuring the longevity of existing wind farms but also for the design of new farms. However, the estimation of such quantities of interest is hampered by the inherent complexity in modeling critical underlying processes, such as the aerodynamic wake interactions between turbines that increase mechanical stress and reduce useful lifetime. While high-fidelity computational fluid dynamics and aeroelastic models can capture these effects, their computational requirements limit real-world usage. Recently, fast machine learning-based surrogates which emulate more complex simulations have emerged as a promising solution. Yet, most surrogates are task-specific and lack flexibility for varying turbine layouts and types. This study explores the use of graph neural networks (GNNs) to create a robust, generalizable flow and DEL prediction platform. By conceptualizing wind turbine populations as graphs, GNNs effectively capture farm layout-dependent relational data, allowing extrapolation to novel configurations. We train a GNN surrogate on a large database of PyWake simulations of random wind farm layouts to learn basic wake physics, then fine-tune the model on limited data for a specific unseen layout simulated in HAWC2Farm for accurate adapted predictions. This transfer learning approach circumvents data scarcity limitations and leverages fundamental physics knowledge from the source low-resolution data. The proposed platform aims to match simulator accuracy, while enabling efficient adaptation to new higher-fidelity domains, providing a flexible blueprint for wake load forecasting across varying farm configurations.
The disassembly of power batteries poses significant challenges due to their complex sources, diverse types, variations in design and manufacturing processes, and diverse service conditions. Human memory capacity and robot cognitive and understanding capabilities are limited when faced with different dismantling tasks for end-of-life power batteries. Insufficient human-computer interaction capabilities greatly hinder the efficiency of human-robot collaboration (HRC) operations. The existing HRC relies heavily on the experience of operators, while the existing disassembly system fails to update new disassembly strategies in real time when facing new battery varieties. Therefore, this paper proposes an augmented reality-assisted human-robot collaboration (AR-HRC) power battery dismantling system based on transfer learning. It consists of three modules: AR-HRC knowledge modeling, dismantling subgraph similarity assessment, and strategy transfer update. The AR-HRC knowledge modeling module aims to establish an intelligent mapping from tasks to collaborative strategies based on part features. Based on the evaluation of task similarity, the mobility assessment model divides subtasks into similar and dissimilar classes. For similar subtasks, the original dismantling strategy can be applied to the current task. However, for different subtasks, operators can issue instructions to the AR-HRC system through the human-computer interaction function of AR and develop new collaborative strategies based on actual conditions. Finally, a case study of power battery dismantling is conducted, and the results show that compared to traditional pre-programmed assembly, this system can improve dismantling efficiency and reduce cognitive burden.
This manuscript introduces deep learning models that simultaneously describe the dynamics of several yield curves. We aim to learn the dependence structure among the different yield curves induced by the globalization of financial markets and exploit it to produce more accurate forecasts. By combining the self-attention mechanism and nonparametric quantile regression, our model generates both point and interval forecasts of future yields. The architecture is designed to avoid quantile crossing issues affecting multiple quantile regression models. Numerical experiments conducted on two different datasets confirm the effectiveness of our approach. Finally, we explore potential extensions and enhancements by incorporating deep ensemble methods and transfer learning mechanisms.
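The abstract does not say how quantile crossing is avoided, so the following is only one common construction, shown under stated assumptions. The lowest quantile is predicted directly and every higher quantile adds a strictly positive increment (a softplus of an unconstrained output), so the forecasts are ordered by design. Each quantile is scored with the standard pinball loss. The function names are hypothetical.

```python
import math


def pinball_loss(y, q, tau):
    """Pinball (quantile) loss of prediction q for observation y at level tau."""
    diff = y - q
    return max(tau * diff, (tau - 1.0) * diff)


def softplus(x):
    """Numerically stable log(1 + exp(x)); always strictly positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def monotone_quantiles(base, raw_increments):
    """Turn unconstrained outputs into an increasing list of quantiles.

    base            -- prediction for the lowest quantile level
    raw_increments  -- unconstrained outputs, one per remaining level
    """
    quantiles = [base]
    for raw in raw_increments:
        quantiles.append(quantiles[-1] + softplus(raw))
    return quantiles


# Example: four ordered yield quantiles built from arbitrary raw outputs.
levels = [0.05, 0.25, 0.75, 0.95]
preds = monotone_quantiles(0.5, [-3.0, 0.0, 2.0])
total = sum(pinball_loss(1.0, q, tau) for q, tau in zip(preds, levels))
```

Whatever values the raw outputs take, `preds` is increasing, so an interval forecast built from the outer quantiles always has its lower bound below its upper bound.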
Traditionally, electricity distribution networks were designed for unidirectional power flow without the need to accommodate generation installed at the point of use. However, with the increase in Distributed Energy Resources and other Low Carbon Technologies, the role of distribution networks is changing. This shift brings challenges, including the need for intensive metering and more frequent reconfiguration to identify threats from voltage and thermal violations. Mitigating action through reconfiguration is informed by State Estimation, which is especially challenging for low voltage distribution networks where the constraints of low observability, non-linear load relationships, and highly unbalanced systems all contribute to the difficulty of producing accurate state estimates. To counter low observability, this paper proposes the application of a novel transfer learning methodology, based upon the concept of conditional online Bayesian transfer, to make forward predictions of bus pseudo-measurements. Day ahead load forecasts at a fully observed point on the network are adjusted using the intraday residuals at other points in the network to provide them with load forecasts without the need for a complete set of forecast models at all substations. These form pseudo-measurements that then inform the state estimates at future time points. This methodology is demonstrated on both a representative IEEE Test network and on an actual GB 11 kV feeder network.
Developing an artificial design agent that mimics human design behaviors through the integration of heuristics is pivotal for various purposes, including advancing design automation, fostering human-AI collaboration, and enhancing design education. However, this endeavor necessitates abundant behavioral data from human designers, posing a challenge due to data scarcity for many design problems. One potential solution lies in transferring learned design knowledge from one problem domain to another. This article aims to gather empirical evidence and computationally evaluate the transferability of design knowledge represented at a high level of abstraction across different design problems. Initially, a design agent grounded in reinforcement learning (RL) is developed to emulate human design behaviors. A data-driven reward mechanism, informed by the Markov chain model, is introduced to reinforce prominent sequential design patterns. Subsequently, the design agent transfers the acquired knowledge from a source task to a target task using a problem-agnostic high-level representation. Through a case study involving two solar system designs, one dataset trains the design agent to mimic human behaviors, while another evaluates the transferability of these learned behaviors to a distinct problem. Results demonstrate that the RL-based agent outperforms a baseline model utilizing the first-order Markov chain model in both the source task without knowledge transfer and the target task with knowledge transfer. However, the model’s performance is comparatively lower in predicting the decisions of low-performing designers, suggesting caution in its application, as it may yield unsatisfactory results when mimicking such behaviors.
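The first-order Markov chain used as the baseline, and as the basis of the data-driven reward, can be estimated by counting transitions between consecutive design actions. The sketch below is an illustration only: the action labels and the function name `transition_probabilities` are hypothetical, and the paper's actual reward definition is not given in the abstract.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate P(next action | current action) from action sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    probs = {}
    for current, nexts in counts.items():
        total = sum(nexts.values())
        probs[current] = {a: c / total for a, c in nexts.items()}
    return probs


# Hypothetical log of one designer's actions.
log = [["add", "edit", "add", "edit", "analyze"]]
P = transition_probabilities(log)

# One way to use the chain as a reward signal (an assumption, not the
# paper's definition): reward a step by how typical the transition is.
reward = P["edit"].get("analyze", 0.0)
```

Frequent transitions receive high probability, so a reward of this kind would reinforce the prominent sequential patterns found in the human data.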
Domain adaptation is important in agriculture because agricultural systems have their own individual characteristics. Applying the same treatment practices (e.g., fertilization) to different systems may not have the desired effect due to those characteristics. Domain adaptation is also an inherent aspect of digital twins. In this work, we examine the potential of transfer learning for domain adaptation in pasture digital twins. We use a synthetic dataset of grassland pasture simulations to pretrain and fine-tune machine learning metamodels for nitrogen response rate prediction. We investigate the outcome in locations with diverse climates, and examine the effect on the results of including more weather and agricultural management practices data during the pretraining phase. We find that transfer learning seems promising for adapting the models to new conditions. Moreover, our experiments show that adding more weather data in the pretraining phase has a small effect on fine-tuned model performance compared to adding more management practices. This is an interesting finding that is worth further investigation in future studies.
Based on Chapter 6, in this chapter we expand the discussion of neural networks to include networks that have more than one hidden layer. Common structures such as the convolutional neural network (CNN) or the Long Short-Term Memory network (LSTM) are explained and used along with Matlab’s Deep Network Designer App as well as Matlab script to implement and train such networks. Issues such as the vanishing or exploding gradient, normalization, and training strategies are discussed. Concepts that address overfitting and the vanishing or exploding gradient are introduced, including dropout and regularization. Transfer learning is discussed and showcased using Matlab’s DND App.
Supervised machine learning is an increasingly popular tool for analyzing large political text corpora. The main disadvantage of supervised machine learning is the need for thousands of manually annotated training data points. This issue is particularly important in the social sciences where most new research questions require new training data for a new task tailored to the specific research question. This paper analyses how deep transfer learning can help address this challenge by accumulating “prior knowledge” in language models. Models like BERT can learn statistical language patterns through pre-training (“language knowledge”), and reliance on task-specific data can be reduced by training on universal tasks like natural language inference (NLI; “task knowledge”). We demonstrate the benefits of transfer learning on a wide range of eight tasks. Across these eight tasks, our BERT-NLI model fine-tuned on 100 to 2,500 texts performs on average 10.7 to 18.3 percentage points better than classical models without transfer learning. Our study indicates that BERT-NLI fine-tuned on 500 texts achieves performance similar to that of classical models trained on around 5,000 texts. Moreover, we show that transfer learning works particularly well on imbalanced data. We conclude by discussing limitations of transfer learning and by outlining new opportunities for political science research.
NN models with more hidden layers than the traditional NN are referred to as deep neural network (DNN) or deep learning (DL) models, which are now widely used in environmental science. For image data, the convolutional neural network (CNN) has been developed, in which a neuron in a convolutional layer is connected only to a small patch of neurons in the preceding layer, thereby greatly reducing the number of model weights. Popular DNN architectures include the encoder-decoder and U-net models. For time series modelling, the long short-term memory (LSTM) network and temporal convolutional network have been developed. The generative adversarial network (GAN) produces highly realistic fake data.
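The reduction in model weights from local connectivity can be made concrete with a small count. The sketch below compares a fully connected layer with a convolutional layer for an assumed 32 x 32 single-channel image and a 3 x 3 kernel; the sizes and function names are illustrative, and bias terms are ignored.

```python
def dense_weights(n_in, n_out):
    """Weights in a fully connected layer: every input feeds every output."""
    return n_in * n_out


def conv_weights(kernel_size, channels_in, channels_out):
    """Weights in a 2D convolutional layer: one small kernel per channel
    pair, shared across all spatial positions."""
    return kernel_size * kernel_size * channels_in * channels_out


pixels = 32 * 32
dense = dense_weights(pixels, pixels)   # 32x32 image -> 32x32 feature map
conv = conv_weights(3, 1, 1)            # same mapping with one 3x3 kernel
```

Here the dense layer needs 1,048,576 weights while the convolutional layer needs 9, because each output neuron sees only a 3 x 3 patch and all positions share the same kernel.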
Drug discovery uses high throughput screening to identify compounds that interact with a molecular target or that alter a phenotype favorably. The careful selection of molecules used for such a screening is instrumental and is tightly related to the hit rate. In this work, we asked whether cell painting, a general-purpose image-based assay, could be used as an efficient proxy for compound selection, thus increasing the success rate of a specific assay. To this end, we considered cell painting images of 30,000 molecule treatments, and used the Fréchet Inception Distance to select compounds that produced a visual effect close to the positive control of an assay. We then compared the hit rates of such a preselection with what was actually obtained in real screening campaigns. As a result, cell painting would have permitted a significant increase in the success rate and, for one of the assays, would even have made it possible to reach 80% of the hits with 10 times fewer compounds to test. We conclude that images of a cell painting assay can be directly used for compound selection prior to screening, and we provide a simple quantitative approach in order to do so.
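The Fréchet Inception Distance compares two sets of images by fitting a Gaussian to each set's deep features and measuring the Fréchet distance between the two Gaussians. For means m1, m2 and covariances C1, C2, the squared distance is |m1 - m2|^2 + Tr(C1 + C2 - 2 (C1 C2)^(1/2)). The sketch below is a simplification that assumes diagonal covariances, in which case the trace term reduces to a sum of squared differences of standard deviations. Feature extraction with an Inception network is not shown, and the function name is hypothetical.

```python
import math


def frechet_distance_diag(mean1, var1, mean2, var2):
    """Squared Fréchet distance between two Gaussians with diagonal
    covariances, given per-feature means and variances."""
    mean_term = sum((a - b) ** 2 for a, b in zip(mean1, mean2))
    # With diagonal covariances, Tr(C1 + C2 - 2 sqrt(C1 C2)) becomes
    # sum((sigma1 - sigma2)^2) over features.
    cov_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2
                   for a, b in zip(var1, var2))
    return mean_term + cov_term


# Toy feature statistics for a compound and a positive control.
compound = ([0.0, 0.0], [1.0, 1.0])
control = ([3.0, 4.0], [1.0, 1.0])
d = frechet_distance_diag(*compound, *control)
```

A smaller value means the two feature distributions are closer, so compounds could be ranked by their distance to the positive control, as the abstract describes.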
Milk production is strongly influenced by dairy cow welfare, which depends on good nutrition, and analysing the digestibility of feeds allows the health status of the animals to be evaluated. Visual examination of faeces makes it possible to estimate the quality of the diet fed, in particular whether it lacks fibre or is too high in non-structural carbohydrates. The study was carried out in 2021 on four dairy farms in central Italy. The purpose of this work is the classification and evaluation of dairy cow faeces using RGB image analysis with an artificial intelligence (AI) algorithm, a convolutional neural network (CNN). The main features to analyse are pH, colour and consistency. For the latter two, RGB imaging was combined with deep learning and AI to achieve objectivity in sample evaluation. The images were captured with several smartphones and cameras, under various light conditions, yielding a data set of 441 images. Images acquired by RGB cameras are then analysed by a CNN that extracts features and data previously standardized by a faecal score index assigned after a visual analysis and based on five classes. The results achieved with different training strategies show a training accuracy of 90% and a validation accuracy of 78%, which allows problems in bovine digestion to be identified and feed to be adjusted promptly. The method used in this study eliminates subjectivity in field analysis and can be improved in future by enlarging the data set to strengthen the model.
Recent years have seen a surge in interest in building deep learning-based fully data-driven models for weather prediction. Such deep learning models, if trained on observations, can mitigate certain biases in current state-of-the-art weather models, some of which stem from inaccurate representation of subgrid-scale processes. However, these data-driven models, being over-parameterized, require a lot of training data, which may not be available from reanalysis (observational data) products. Moreover, an accurate, noise-free initial condition to start forecasting with a data-driven weather model is not available in realistic scenarios. Finally, deterministic data-driven forecasting models suffer from issues with long-term stability and unphysical climate drift, which makes these data-driven models unsuitable for computing climate statistics. Given these challenges, previous studies have tried to pre-train deep learning-based weather forecasting models on a large amount of imperfect long-term climate model simulations and then re-train them on available observational data. In this article, we propose a convolutional variational autoencoder (VAE)-based stochastic data-driven model that is pre-trained on an imperfect climate model simulation from a two-layer quasi-geostrophic flow and re-trained, using transfer learning, on a small number of noisy observations from a perfect simulation. This re-trained model then performs stochastic forecasting with a noisy initial condition sampled from the perfect simulation. We show that our ensemble-based stochastic data-driven model outperforms a baseline deterministic encoder–decoder-based convolutional model in terms of short-term skills, while remaining stable for long-term climate simulations yielding accurate climatology.
Artificial intelligence has provided many breakthroughs in the field of computer vision. The fully convolutional U-Net networks, in particular, have provided very promising results in the problem of retrieving rain rates from space-borne observations, a challenge that has persisted over the past few decades. The rain intensity is estimated from the measurement of the brightness temperatures on different microwave channels. However, these channels are slightly different depending on the satellite. Where a retrieval model has been developed from a single satellite, it may be advantageous to use domain adaptation methods to make this model compatible with all the satellites of the constellation. In this feasibility study, a Cycle Generative Adversarial Nets model is used to adapt one set of brightness temperature channels to another set. Results of a toy experiment show that this method is able to provide a qualitatively good precipitation structure but could still be improved in terms of precision.