We prove large and moderate deviation principles for the output of Gaussian fully connected neural networks. The main achievements concern deep neural networks (i.e. when the model has more than one hidden layer) and hold for bounded and continuous pre-activation functions. However, for deep neural networks fed by a single input, we obtain results even when the pre-activation is ReLU. When the network is shallow (i.e. there is exactly one hidden layer), the large and moderate deviation principles hold for quite general pre-activation functions.
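For orientation (the notation here is generic, not taken from the paper): a family of outputs $(X_n)$ satisfies a large deviation principle with speed $s_n$ and rate function $I$ when, informally, $\mathbb{P}(X_n \in A) \approx e^{-s_n \inf_{x \in A} I(x)}$ as $n \to \infty$; the specific speed and rate function attached to the network width depend on the architecture.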
In recent years, passive motion paradigms (PMPs), derived from the equilibrium point hypothesis and impedance control, have been utilised as manipulation methods for humanoid robots and robotic manipulators. These paradigms are typically achieved by creating a kinematic chain that enables the manipulator to perform goal-directed actions without explicitly solving the inverse kinematics. This approach leverages a kinematic model constructed through the training of artificial neural networks, aligning well with principles of cybernetics and cognitive computation by enabling adaptive and flexible control. Specifically, these networks model the relationship between joint angles and end-effector positions, facilitating the computation of the Jacobian matrix. Although this method does not require an accurate robot model, traditional neural networks often suffer from drawbacks such as overfitting and inefficient training, which can compromise the accuracy of the final PMP model. In this paper, we implement the method using a deep neural network and investigate the impact of activation functions and network depth on the performance of the kinematic model. Additionally, we propose a transfer learning approach to fine-tune the pre-trained model, enabling it to be transferred to other manipulator arms with different kinematic properties. Finally, we implement and evaluate the deep neural network-based PMP on a Universal Robots manipulator, comparing it with traditional kinematic controllers and assessing its physical interaction capabilities and accuracy.
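As a rough illustration of the Jacobian step described above, the sketch below differentiates a learned forward-kinematics network with automatic differentiation; the network architecture, joint dimension, and spring-like update are assumptions for the example, not details from the paper.

```python
# Hypothetical sketch: differentiating a learned forward-kinematics model to
# obtain the Jacobian used by a PMP-style controller. fk_net stands in for
# the trained network; all sizes here are illustrative assumptions.
import torch

n_joints, n_task = 6, 3  # e.g. a 6-DOF arm, 3-D end-effector position

fk_net = torch.nn.Sequential(
    torch.nn.Linear(n_joints, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, n_task),
)

q = torch.zeros(n_joints, requires_grad=True)      # current joint angles
J = torch.autograd.functional.jacobian(fk_net, q)  # shape (n_task, n_joints)

# PMP-style update: an attractive force toward the goal is mapped to joint
# space through J^T, so no explicit inverse kinematics is ever solved.
goal = torch.tensor([0.3, 0.1, 0.5])
force = goal - fk_net(q)                 # virtual spring in task space
q_dot = J.T @ force                      # joint velocity command
```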
Deep neural networks have become an important tool for use in actuarial tasks, due not only to the significant gains in accuracy provided by these techniques compared to traditional methods, but also to the close connection of these models to the generalized linear models (GLMs) currently used in industry. Although constraining GLM parameters relating to insurance risk factors to be smooth or to exhibit monotonicity is trivial, methods to incorporate such constraints into deep neural networks have not yet been developed. This is a barrier to the adoption of neural networks in insurance practice, since actuaries often impose these constraints for commercial or statistical reasons. In this work, we present a novel method for enforcing constraints within deep neural network models, and we show how these models can be trained. Moreover, we provide example applications using real-world datasets. We call our proposed method ICEnet to emphasize the close link of our proposal to the individual conditional expectation model interpretability technique.
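One generic way to softly enforce such a constraint is a penalty term added to the training loss, sketched below; this is an illustrative formulation, not the exact ICEnet construction from the paper, and the feature index and step size are assumptions.

```python
# Minimal sketch of a soft monotonicity constraint of the kind actuaries may
# require (e.g. claim frequency non-decreasing in a risk factor).
import torch

def monotonicity_penalty(model, x, feature_idx, delta=0.1):
    """Penalise decreases in the prediction when one feature is increased."""
    x_up = x.clone()
    x_up[:, feature_idx] = x_up[:, feature_idx] + delta
    violation = model(x) - model(x_up)      # positive where monotonicity fails
    return torch.relu(violation).mean()

# Example: total loss = task loss + lambda * penalty on feature 0.
net = torch.nn.Linear(4, 1)
x = torch.randn(32, 4)
penalty = monotonicity_penalty(net, x, feature_idx=0)
```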
Deep neural networks are said to be opaque, impeding the development of safe and trustworthy artificial intelligence, but where this opacity stems from is less clear. What are the sufficient properties for neural network opacity? Here, I discuss five common properties of deep neural networks and two different kinds of opacity. Which of these properties are sufficient for what type of opacity? I show how each kind of opacity stems from only one of these five properties, and then discuss to what extent the two kinds of opacity can be mitigated by explainability methods.
Edited by
Jong Chul Ye, Korea Advanced Institute of Science and Technology (KAIST); Yonina C. Eldar, Weizmann Institute of Science, Israel; Michael Unser, École Polytechnique Fédérale de Lausanne
We provide a short, self-contained introduction to deep neural networks that is aimed at mathematically inclined readers. We promote the use of a vector–matrix formalism that is well suited to the compositional structure of these networks and that facilitates the derivation and description of the backpropagation algorithm. We present a detailed analysis of supervised learning for the two most common scenarios, (i) multivariate regression and (ii) classification, which rely on the minimization of least squares and cross-entropy criteria, respectively.
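In such a formalism (the notation below is generic, not the chapter's own), a layer maps $x_{\ell} = \sigma(W_{\ell} x_{\ell-1} + b_{\ell})$ with pre-activation $z_{\ell} = W_{\ell} x_{\ell-1} + b_{\ell}$, and backpropagation is the recursion $\delta_{\ell-1} = W_{\ell}^{\top}\bigl(\sigma'(z_{\ell}) \odot \delta_{\ell}\bigr)$ with weight gradient $\partial \mathcal{L} / \partial W_{\ell} = \bigl(\sigma'(z_{\ell}) \odot \delta_{\ell}\bigr)\, x_{\ell-1}^{\top}$, where $\delta_{\ell}$ denotes the gradient of the loss $\mathcal{L}$ with respect to $x_{\ell}$.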
Understanding seasonal climatic conditions is critical for better management of resources such as water, energy, and agriculture. Recently, there has been great interest in utilizing the power of Artificial Intelligence (AI) methods in climate studies. This paper presents cutting-edge deep-learning models (UNet++, ResNet, PSPNet, and DeepLabv3) trained on output from state-of-the-art global CMIP6 models to forecast global temperatures a month ahead using the ERA5 reanalysis dataset. The ERA5 dataset was also used for fine-tuning, as well as for performance analysis on the validation dataset. Ten different setups (with CMIP6 and CMIP6 + ERA5 fine-tuning) including six meteorological parameters (i.e., 2 m temperature, 10 m eastward component of wind, 10 m northward component of wind, geopotential height at 500 hPa, mean sea-level pressure, and precipitation flux) and elevation were used with each of the four algorithms. For each model, 14 different sequential and non-sequential temporal settings were used. The mean absolute error (MAE) analysis revealed that UNet++ trained on CMIP6 with 2 m temperature + elevation and ERA5 fine-tuning, with the “Year 3 Month 2” temporal case, provided the best outcome, with an MAE of 0.7. Regression analysis over the validation dataset between the ERA5 data values and the corresponding AI model predictions revealed slope and $ {R}^2 $ values close to 1, suggesting very good agreement. The AI model predicts significantly better than the mean CMIP6 ensemble between 2016 and 2021. Both models predict the summer months more accurately than the winter months.
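For reference, the reported metric is $\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \lvert \hat{T}_i - T_i \rvert$, where $\hat{T}_i$ and $T_i$ denote the predicted and ERA5 temperatures over the $N$ validation grid points; the symbols here are assumed for illustration, not taken from the paper.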
Vision is one of the most complex proficiencies we possess, but its underpinnings are still shrouded in mystery. Many great scientific minds have been engaged in the enterprise of modeling vision. This chapter takes a look at some of the history of this effort, stretching from the times of the ancient Greeks to recent developments in neural networks, and discusses how current techniques may play a role in furthering our understanding of vision.
In this chapter, we review computer models of cognition that have focused on the use of neural networks. These architectures were inspired by research into how computation works in the brain. The approach is called connectionism because it proposes that processing is characterized by patterns of activation across simple processing units connected together into complex networks, with knowledge stored in the strength of the connections between units. We place connectionism in its historical context, describing the “three ages” of artificial neural network research: from the genesis of the first formal theories of computation in the 1930s and 1940s, to the parallel distributed processing (PDP) models of cognition of the 1980s and 1990s, and the advances in “deep” neural networks emerging in the mid-2000s. Transition between the ages has been triggered by new insights into how to create and train more powerful artificial neural networks. We discuss important foundational cognitive models that illustrate some of the key properties of connectionist systems, and indicate how the novel theoretical contributions of these models arose from their key computational properties. We consider how connectionist modeling has influenced wider theories of cognition, and how in the future, connectionist modeling of cognition may progress by integrating further constraints from neuroscience and neuroanatomy.
Deep neural networks (DNNs) have had extraordinary successes in classifying photographic images of objects and are often described as the best models of biological vision. This conclusion is largely based on three sets of findings: (1) DNNs are more accurate than any other model in classifying images taken from various datasets, (2) DNNs do the best job in predicting the pattern of human errors in classifying objects taken from various behavioral datasets, and (3) DNNs do the best job in predicting brain signals in response to images taken from various brain datasets (e.g., single cell responses or fMRI data). However, these behavioral and brain datasets do not test hypotheses regarding what features are contributing to good predictions and we show that the predictions may be mediated by DNNs that share little overlap with biological vision. More problematically, we show that DNNs account for almost no results from psychological research. This contradicts the common claim that DNNs are good, let alone the best, models of human object recognition. We argue that theorists interested in developing biologically plausible models of human vision need to direct their attention to explaining psychological findings. More generally, theorists need to build models that explain the results of experiments that manipulate independent variables designed to test hypotheses rather than compete on making the best predictions. We conclude by briefly summarizing various promising modeling approaches that focus on psychological data.
Building machines to converse with human beings through automatic speech recognition (ASR) and understanding (ASU) has long been a topic of great interest for scientists and engineers, and we have recently witnessed rapid technological advances in this area. Here, we first cast the ASR problem as a pattern-matching and channel-decoding paradigm. We then follow this with a discussion of the Hidden Markov Model (HMM), which is the most successful technique for modelling fundamental speech units, such as phones and words, in order to solve ASR as a search through a top-down decoding network. Recent advances using deep neural networks as parts of an ASR system are also highlighted. We then compare the conventional top-down decoding approach with the recently proposed automatic speech attribute transcription (ASAT) paradigm, which can better leverage knowledge sources in speech production, auditory perception and language theory through bottom-up integration. Finally we discuss how the processing-based speech engineering and knowledge-based speech science communities can work collaboratively to improve our understanding of speech and enhance ASR capabilities.
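The top-down decoding search described above is classically carried out with the Viterbi algorithm; the sketch below is a generic illustration of that dynamic program, with toy state and vocabulary sizes assumed for the example rather than taken from the article.

```python
# Illustrative Viterbi decoding over an HMM: find the most likely hidden
# state sequence (e.g. phone states) given a sequence of observation ids.
import numpy as np

def viterbi(log_A, log_B, log_pi, obs):
    """log_A: (S,S) transitions, log_B: (S,V) emissions, log_pi: (S,) priors."""
    S, T = log_A.shape[0], len(obs)
    dp = np.full((T, S), -np.inf)            # best log-probability per state
    back = np.zeros((T, S), dtype=int)       # backpointers for traceback
    dp[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = dp[t - 1][:, None] + log_A  # (prev, cur) transition scores
        back[t] = scores.argmax(axis=0)
        dp[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(dp[-1].argmax())]            # trace back the best sequence
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```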
Machine learning components such as deep neural networks are used extensively in cyber-physical systems (CPS). However, such components may introduce new types of hazards that can have disastrous consequences and need to be addressed for engineering trustworthy systems. Although deep neural networks offer advanced capabilities, they must be complemented by engineering methods and practices that allow effective integration in CPS. In this paper, we propose an approach for assurance monitoring of learning-enabled CPS based on the conformal prediction framework. In order to allow real-time assurance monitoring, the approach employs distance learning to transform high-dimensional inputs into lower-dimensional embedding representations. By leveraging conformal prediction, the approach provides well-calibrated confidence and ensures a bounded, small error rate while limiting the number of inputs for which an accurate prediction cannot be made. We demonstrate the approach using three datasets: a mobile robot following a wall, speaker recognition, and traffic sign recognition. The experimental results demonstrate that the error rates are well-calibrated and that the number of alarms is very small. Furthermore, the method is computationally efficient and allows real-time assurance monitoring of CPS.
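The sketch below shows the split-conformal machinery in its simplest form; the distance-to-centroid nonconformity score and all numbers are assumptions of this illustration, not the paper's exact construction.

```python
# Split conformal prediction: compare a new input's nonconformity score with
# scores from a held-out calibration set to form a prediction set.
import numpy as np

def p_value(cal_scores, score_new):
    """Conformal p-value: fraction of calibration scores at least as large."""
    return (np.sum(cal_scores >= score_new) + 1) / (len(cal_scores) + 1)

# Illustrative use: nonconformity = distance to the predicted class centroid
# in the learned embedding space (an assumption of this sketch).
cal_scores = np.random.default_rng(0).exponential(size=200)
candidate_classes = {0: 0.4, 1: 2.9}        # class -> nonconformity score
epsilon = 0.05                              # target error rate
pred_set = [c for c, s in candidate_classes.items()
            if p_value(cal_scores, s) > epsilon]
# An empty or multi-valued pred_set triggers an assurance alarm.
```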
In this study, prediction of aircraft Estimated Time of Arrival (ETA) is proposed using machine learning algorithms. Accurate prediction of ETA is important for the management of delay and air traffic flow, runway assignment, gate assignment, collaborative decision making (CDM), coordination of ground personnel and equipment, and optimisation of the arrival sequence. Machine learning is able to learn from experience and make predictions with weak assumptions or no assumptions at all. In the proposed approach, general flight information, trajectory data and weather data were obtained from different sources in various formats. Raw data were converted to tidy data and inserted into a relational database. To obtain the features for training the machine learning models, the data were explored, cleaned and transformed into convenient features. New features were also derived from the available data. Random forests and deep neural networks were used to train the machine learning models. Both models can predict the ETA with a mean absolute error (MAE) of less than 6 min after departure, and less than 3 min after terminal manoeuvring area (TMA) entrance. Additionally, a web application was developed to dynamically predict the ETA using the proposed models.
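A toy version of the random-forest branch is sketched below; the feature set and target are random placeholders standing in for the study's flight, trajectory, and weather features.

```python
# Toy sketch: train a random-forest regressor on placeholder features and
# report MAE, mirroring the evaluation metric used in the study.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))     # e.g. speed, altitude, distance-to-go, ...
y = rng.normal(size=1000)          # minutes to arrival (placeholder target)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:800], y[:800])
print("MAE:", mean_absolute_error(y[800:], model.predict(X[800:])))
```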
In this paper, we develop a methodology to automatically classify claims using the information contained in text reports (written when the claims are opened). From this automatic analysis, the aim is to predict whether a claim is expected to be particularly severe or not. The difficulty is the rarity of such extreme claims in the database, and hence the difficulty for classical prediction techniques, such as logistic regression, of accurately predicting the outcome. Since the data are unbalanced (too few observations are associated with a positive label), we propose different rebalancing algorithms to deal with this issue. We discuss the use of different embedding methodologies used to process text data, and the role of the architectures of the networks.
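As one simple member of the family of rebalancing algorithms, random oversampling of the minority class is sketched below; the data are placeholders, and the paper compares several such schemes rather than this one specifically.

```python
# Random oversampling: duplicate rare severe-claim rows until both classes
# have equal counts, so the classifier sees a balanced training set.
import numpy as np

def oversample(X, y, seed=0):
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)       # rare severe claims
    majority = np.flatnonzero(y == 0)
    extra = rng.choice(minority, size=len(majority) - len(minority),
                       replace=True)
    idx = rng.permutation(np.concatenate([majority, minority, extra]))
    return X[idx], y[idx]

X = np.random.default_rng(1).normal(size=(1000, 8))   # e.g. text embeddings
y = (np.random.default_rng(2).random(1000) < 0.02).astype(int)
X_bal, y_bal = oversample(X, y)
```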
This work presents a methodology for the generation of novel 3D objects resembling wireframes of building types. These result from the reconstruction of interpolated locations within the learnt distribution of variational autoencoders (VAEs), a deep generative machine learning model based on neural networks. The data set used features a scheme for geometry representation based on a ‘connectivity map’ that is especially suited to express the wireframe objects that compose it. Additionally, the input samples are generated through ‘parametric augmentation’, a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features of a given building type. In the experiments described in this paper, more than 150,000 input samples belonging to two building types were processed during the training of a VAE model. The main contribution of this paper is to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that the generation of interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.
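The latent-space interpolation at the heart of the method can be sketched as follows; the encoder, decoder, and input sizes are untrained placeholders assumed for the example, not the paper's architecture.

```python
# Sketch of latent interpolation in a VAE: encode two inputs, blend their
# latent codes, and decode the blend into a hybrid geometry.
import torch

latent_dim = 32
encoder = torch.nn.Linear(1024, 2 * latent_dim)   # outputs (mu, log_var)
decoder = torch.nn.Linear(latent_dim, 1024)       # back to a connectivity map

def encode(x):
    mu, log_var = encoder(x).chunk(2, dim=-1)     # reparameterisation trick
    return mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)

x_a, x_b = torch.randn(1024), torch.randn(1024)   # two flattened samples
z_a, z_b = encode(x_a), encode(x_b)
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    hybrid = decoder((1 - alpha) * z_a + alpha * z_b)  # interpolated geometry
```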
Dynamic movement primitives (DMPs) are motion building blocks suitable for real-world tasks. We suggest a methodology for learning the manifold of task and DMP parameters, which facilitates runtime adaptation to changes in task requirements while ensuring predictable and robust performance. For efficient learning, the parameter space is analyzed using principal component analysis and locally linear embedding. Two manifold learning methods, kernel estimation and deep neural networks, are investigated for a ball throwing task in simulation and in a physical environment. Low runtime estimation errors are obtained for both learning methods, with an advantage to kernel estimation when data sets are small.
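The parameter-space analysis step might look like the sketch below, using off-the-shelf PCA and locally linear embedding; the parameter vectors are random placeholders, not the ball-throwing data.

```python
# Analyse a task/DMP parameter space with PCA and locally linear embedding.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import LocallyLinearEmbedding

params = np.random.default_rng(0).normal(size=(500, 12))  # task + DMP params

explained = PCA(n_components=3).fit(params).explained_variance_ratio_
embedded = LocallyLinearEmbedding(n_components=2,
                                  n_neighbors=10).fit_transform(params)
```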
We develop a deep autoencoder architecture that can be used to find a coordinate transformation which turns a non-linear partial differential equation (PDE) into a linear PDE. Our architecture is motivated by the linearising transformations provided by the Cole–Hopf transform for Burgers’ equation and the inverse scattering transform for completely integrable PDEs. By leveraging a residual network architecture, a near-identity transformation can be exploited to encode intrinsic coordinates in which the dynamics are linear. The resulting dynamics are given by a Koopman operator matrix K. The decoder allows us to transform back to the original coordinates as well. Multiple time step prediction can be performed by repeated multiplication by the matrix K in the intrinsic coordinates. We demonstrate our method on a number of examples, including the heat equation and Burgers’ equation, as well as the substantially more challenging Kuramoto–Sivashinsky equation, showing that our method provides a robust architecture for discovering linearising transforms for non-linear PDEs.
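The prediction loop described above can be sketched as follows; all modules are untrained placeholders with assumed sizes, standing in for the trained autoencoder and learned Koopman matrix.

```python
# Koopman-autoencoder prediction: encode to intrinsic coordinates, advance
# linearly by repeated multiplication with K, and decode back.
import torch

n_state, n_latent = 128, 16
encoder = torch.nn.Linear(n_state, n_latent)
decoder = torch.nn.Linear(n_latent, n_state)
K = torch.eye(n_latent)                      # learned Koopman operator matrix

u0 = torch.randn(n_state)                    # discretised PDE state
z = encoder(u0)
trajectory = []
for _ in range(50):                          # multiple time-step prediction
    z = K @ z                                # linear dynamics in latent space
    trajectory.append(decoder(z))            # back to physical coordinates
```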
Optimal grasping points for a robotic gripper were derived, based on object and hand geometry, using deep neural networks (DNNs). The optimal grasping cost functions were derived using normal-distribution probability density functions for each local cost function. Using the DNN, the optimum height and width were set for the robot hand to grasp objects, whose geometric and mass centre points were also considered in obtaining the optimum grasping positions for the robot fingers and the object. The proposed algorithm was tested on 10 differently shaped objects and showed improved grip performance compared to conventional methods.
Deep neural networks (DNNs) have the same structure as the neocognitron proposed in 1979 but achieve much better performance, because DNNs include many heuristic techniques such as pre-training, dropout, skip connections, batch normalization (BN), and stochastic depth. However, why these techniques improve performance is not fully understood. Recently, two tools for theoretical analyses have been proposed. One is to evaluate the generalization gap, defined as the difference between the expected loss and the empirical loss, by calculating the algorithmic stability; the other is to evaluate the convergence rate by calculating the eigenvalues of the Fisher information matrix of DNNs. This overview paper briefly introduces these tools and demonstrates their usefulness by showing why skip connections and BN improve performance.
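In the generic notation assumed here (not the paper's own), the generalization gap of a trained network $f$ is $\mathbb{E}_{(x,y)\sim\mathcal{D}}[\ell(f(x),y)] - \frac{1}{n}\sum_{i=1}^{n} \ell(f(x_i),y_i)$, i.e. the expected loss over the data distribution $\mathcal{D}$ minus the empirical loss over the $n$ training examples; algorithmic stability bounds this quantity by how much the learned $f$ changes when a single training example is replaced.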
Machine learning techniques have proven to be increasingly useful in astronomical applications over the last few years, for example in object classification, redshift estimation and data mining. One example of object classification is classifying galaxy morphology. This is a tedious task to do manually, especially as the datasets become larger with surveys that have a broader and deeper search-space. The Kaggle Galaxy Zoo competition presented the challenge of writing an algorithm to find the probability that a galaxy belongs in a particular class, based on SDSS optical spectroscopy data. The use of convolutional neural networks (convnets) proved to be a popular solution to the problem, as they have also produced unprecedented classification accuracies in other image databases, such as the MNIST database of handwritten digits and the CIFAR database of images. We experiment with the convnets that comprised the winning solution, but using broad classifications. The effect of changing the number of layers is explored, as well as using a different activation function, to help in developing an intuition of how the networks function and to see how they can be applied to radio galaxy images.
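The two knobs varied in these experiments, depth and activation function, can be exposed in a small builder like the sketch below; the channel counts, image assumptions, and class count are placeholders, not the Galaxy Zoo configuration.

```python
# Toy convnet builder parameterised by the number of convolutional layers
# and the activation function, the two factors explored above.
import torch.nn as nn

def make_convnet(n_layers=3, activation=nn.ReLU, n_classes=3):
    layers, channels = [], 1                 # assume single-channel input
    for _ in range(n_layers):
        layers += [nn.Conv2d(channels, 16, kernel_size=3, padding=1),
                   activation(), nn.MaxPool2d(2)]
        channels = 16
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(),
               nn.Linear(16, n_classes)]
    return nn.Sequential(*layers)

model = make_convnet(n_layers=4, activation=nn.LeakyReLU)
```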