We prove large and moderate deviation principles for the output of Gaussian fully connected neural networks. The main achievements concern deep neural networks (i.e. when the model has more than one hidden layer) and hold for bounded and continuous pre-activation functions. However, for deep neural networks fed by a single input, we obtain results even when the pre-activation is ReLU. When the network is shallow (i.e. there is exactly one hidden layer), the large and moderate deviation principles hold for quite general pre-activation functions.
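For orientation (the notation here is generic, not taken from the paper): a family of outputs $(X_n)$ satisfies a large deviation principle with speed $s_n$ and rate function $I$ when, informally, $\mathbb{P}(X_n \in A) \approx e^{-s_n \inf_{x \in A} I(x)}$ as $n \to \infty$; the specific speed and rate function attached to the network width depend on the architecture.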
In recent years, passive motion paradigms (PMPs), derived from the equilibrium point hypothesis and impedance control, have been utilised as manipulation methods for humanoid robots and robotic manipulators. These paradigms are typically achieved by creating a kinematic chain that enables the manipulator to perform goal-directed actions without explicitly solving the inverse kinematics. This approach leverages a kinematic model constructed through the training of artificial neural networks, aligning well with principles of cybernetics and cognitive computation by enabling adaptive and flexible control. Specifically, these networks model the relationship between joint angles and end-effector positions, facilitating the computation of the Jacobian matrix. Although this method does not require an accurate robot model, traditional neural networks often suffer from drawbacks such as overfitting and inefficient training, which can compromise the accuracy of the final PMP model. In this paper, we implement the method using a deep neural network and investigate the impact of activation functions and network depth on the performance of the kinematic model. Additionally, we propose a transfer learning approach to fine-tune the pre-trained model, enabling it to be transferred to other manipulator arms with different kinematic properties. Finally, we implement and evaluate the deep neural network-based PMP on a Universal Robots manipulator, comparing it with traditional kinematic controllers and assessing its physical interaction capabilities and accuracy.
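As a rough illustration of the Jacobian step described above, the sketch below differentiates a learned forward-kinematics network with automatic differentiation; the network architecture, joint dimension, and spring-like update are assumptions for the example, not details from the paper.

```python
# Hypothetical sketch: differentiating a learned forward-kinematics model to
# obtain the Jacobian used by a PMP-style controller. fk_net stands in for
# the trained network; all sizes here are illustrative assumptions.
import torch

n_joints, n_task = 6, 3  # e.g. a 6-DOF arm, 3-D end-effector position

fk_net = torch.nn.Sequential(
    torch.nn.Linear(n_joints, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, n_task),
)

q = torch.zeros(n_joints, requires_grad=True)      # current joint angles
J = torch.autograd.functional.jacobian(fk_net, q)  # shape (n_task, n_joints)

# PMP-style update: an attractive force toward the goal is mapped to joint
# space through J^T, so no explicit inverse kinematics is ever solved.
goal = torch.tensor([0.3, 0.1, 0.5])
force = goal - fk_net(q)                 # virtual spring in task space
q_dot = J.T @ force                      # joint velocity command
```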
Deep neural networks have become an important tool for use in actuarial tasks, due not only to the significant gains in accuracy provided by these techniques compared to traditional methods, but also to the close connection of these models to the generalized linear models (GLMs) currently used in industry. Although constraining GLM parameters relating to insurance risk factors to be smooth or to exhibit monotonicity is trivial, methods to incorporate such constraints into deep neural networks have not yet been developed. This is a barrier to the adoption of neural networks in insurance practice, since actuaries often impose these constraints for commercial or statistical reasons. In this work, we present a novel method for enforcing constraints within deep neural network models, and we show how these models can be trained. Moreover, we provide example applications using real-world datasets. We call our proposed method ICEnet to emphasize the close link of our proposal to the individual conditional expectation model interpretability technique.
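One generic way to softly enforce such a constraint is a penalty term added to the training loss, sketched below; this is an illustrative formulation, not the exact ICEnet construction from the paper, and the feature index and step size are assumptions.

```python
# Minimal sketch of a soft monotonicity constraint of the kind actuaries may
# require (e.g. claim frequency non-decreasing in a risk factor).
import torch

def monotonicity_penalty(model, x, feature_idx, delta=0.1):
    """Penalise decreases in the prediction when one feature is increased."""
    x_up = x.clone()
    x_up[:, feature_idx] = x_up[:, feature_idx] + delta
    violation = model(x) - model(x_up)      # positive where monotonicity fails
    return torch.relu(violation).mean()

# Example: total loss = task loss + lambda * penalty on feature 0.
net = torch.nn.Linear(4, 1)
x = torch.randn(32, 4)
penalty = monotonicity_penalty(net, x, feature_idx=0)
```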
Deep neural networks are said to be opaque, impeding the development of safe and trustworthy artificial intelligence, but where this opacity stems from is less clear. What are the sufficient properties for neural network opacity? Here, I discuss five common properties of deep neural networks and two different kinds of opacity. Which of these properties are sufficient for what type of opacity? I show how each kind of opacity stems from only one of these five properties, and then discuss to what extent the two kinds of opacity can be mitigated by explainability methods.
Edited by
Jong Chul Ye, Korea Advanced Institute of Science and Technology (KAIST); Yonina C. Eldar, Weizmann Institute of Science, Israel; Michael Unser, École Polytechnique Fédérale de Lausanne
We provide a short, self-contained introduction to deep neural networks that is aimed at mathematically inclined readers. We promote the use of a vector–matrix formalism that is well suited to the compositional structure of these networks and that facilitates the derivation and description of the backpropagation algorithm. We present a detailed analysis of supervised learning for the two most common scenarios, (i) multivariate regression and (ii) classification, which rely on the minimization of least squares and cross-entropy criteria, respectively.
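In such a formalism (the notation below is generic, not the chapter's own), a layer maps $x_{\ell} = \sigma(W_{\ell} x_{\ell-1} + b_{\ell})$ with pre-activation $z_{\ell} = W_{\ell} x_{\ell-1} + b_{\ell}$, and backpropagation is the recursion $\delta_{\ell-1} = W_{\ell}^{\top}\bigl(\sigma'(z_{\ell}) \odot \delta_{\ell}\bigr)$ with weight gradient $\partial \mathcal{L} / \partial W_{\ell} = \bigl(\sigma'(z_{\ell}) \odot \delta_{\ell}\bigr)\, x_{\ell-1}^{\top}$, where $\delta_{\ell}$ denotes the gradient of the loss $\mathcal{L}$ with respect to $x_{\ell}$.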
Understanding seasonal climatic conditions is critical for better management of resources such as water, energy, and agriculture. Recently, there has been great interest in utilizing the power of Artificial Intelligence (AI) methods in climate studies. This paper presents cutting-edge deep-learning models (UNet++, ResNet, PSPNet, and DeepLabv3) trained on output from state-of-the-art global CMIP6 models to forecast global temperatures a month ahead using the ERA5 reanalysis dataset. The ERA5 dataset was also used for fine-tuning, as well as for performance analysis on the validation dataset. Ten different setups (with CMIP6 and CMIP6 + ERA5 fine-tuning) including six meteorological parameters (i.e., 2 m temperature, 10 m eastward component of wind, 10 m northward component of wind, geopotential height at 500 hPa, mean sea-level pressure, and precipitation flux) and elevation were used with each of the four algorithms. For each model, 14 different sequential and non-sequential temporal settings were used. The mean absolute error (MAE) analysis revealed that UNet++ trained on CMIP6 with 2 m temperature + elevation and ERA5 fine-tuning, with the “Year 3 Month 2” temporal case, provided the best outcome, with an MAE of 0.7. Regression analysis over the validation dataset between the ERA5 data values and the corresponding AI model predictions revealed slope and $ {R}^2 $ values close to 1, suggesting very good agreement. The AI model predicts significantly better than the mean CMIP6 ensemble between 2016 and 2021. Both models predict the summer months more accurately than the winter months.
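For reference, the reported metric is $\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \lvert \hat{T}_i - T_i \rvert$, where $\hat{T}_i$ and $T_i$ denote the predicted and ERA5 temperatures over the $N$ validation grid points; the symbols here are assumed for illustration, not taken from the paper.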
Vision is one of the most complex proficiencies we possess, but its underpinnings are still shrouded in mystery. Many great scientific minds have been engaged in the enterprise of modeling vision. This chapter takes a look at some of the history of this effort, stretching from the times of the ancient Greeks to recent developments in neural networks, and discusses how current techniques may play a role in furthering our understanding of vision.
In this chapter, we review computer models of cognition that have focused on the use of neural networks. These architectures were inspired by research into how computation works in the brain. The approach is called connectionism because it proposes that processing is characterized by patterns of activation across simple processing units connected together into complex networks, with knowledge stored in the strength of the connections between units. We place connectionism in its historical context, describing the “three ages” of artificial neural network research: from the genesis of the first formal theories of computation in the 1930s and 1940s, to the parallel distributed processing (PDP) models of cognition of the 1980s and 1990s, and the advances in “deep” neural networks emerging in the mid-2000s. Transition between the ages has been triggered by new insights into how to create and train more powerful artificial neural networks. We discuss important foundational cognitive models that illustrate some of the key properties of connectionist systems, and indicate how the novel theoretical contributions of these models arose from their key computational properties. We consider how connectionist modeling has influenced wider theories of cognition, and how in the future, connectionist modeling of cognition may progress by integrating further constraints from neuroscience and neuroanatomy.
Deep neural networks (DNNs) have had extraordinary successes in classifying photographic images of objects and are often described as the best models of biological vision. This conclusion is largely based on three sets of findings: (1) DNNs are more accurate than any other model in classifying images taken from various datasets, (2) DNNs do the best job in predicting the pattern of human errors in classifying objects taken from various behavioral datasets, and (3) DNNs do the best job in predicting brain signals in response to images taken from various brain datasets (e.g., single cell responses or fMRI data). However, these behavioral and brain datasets do not test hypotheses regarding what features are contributing to good predictions and we show that the predictions may be mediated by DNNs that share little overlap with biological vision. More problematically, we show that DNNs account for almost no results from psychological research. This contradicts the common claim that DNNs are good, let alone the best, models of human object recognition. We argue that theorists interested in developing biologically plausible models of human vision need to direct their attention to explaining psychological findings. More generally, theorists need to build models that explain the results of experiments that manipulate independent variables designed to test hypotheses rather than compete on making the best predictions. We conclude by briefly summarizing various promising modeling approaches that focus on psychological data.
Building machines to converse with human beings through automatic speech recognition (ASR) and understanding (ASU) has long been a topic of great interest for scientists and engineers, and we have recently witnessed rapid technological advances in this area. Here, we first cast the ASR problem as a pattern-matching and channel-decoding paradigm. We then follow this with a discussion of the Hidden Markov Model (HMM), which is the most successful technique for modelling fundamental speech units, such as phones and words, in order to solve ASR as a search through a top-down decoding network. Recent advances using deep neural networks as parts of an ASR system are also highlighted. We then compare the conventional top-down decoding approach with the recently proposed automatic speech attribute transcription (ASAT) paradigm, which can better leverage knowledge sources in speech production, auditory perception and language theory through bottom-up integration. Finally we discuss how the processing-based speech engineering and knowledge-based speech science communities can work collaboratively to improve our understanding of speech and enhance ASR capabilities.
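The top-down decoding search described above is classically carried out with the Viterbi algorithm; the sketch below is a generic illustration of that dynamic program, with toy state and vocabulary sizes assumed for the example rather than taken from the article.

```python
# Illustrative Viterbi decoding over an HMM: find the most likely hidden
# state sequence (e.g. phone states) given a sequence of observation ids.
import numpy as np

def viterbi(log_A, log_B, log_pi, obs):
    """log_A: (S,S) transitions, log_B: (S,V) emissions, log_pi: (S,) priors."""
    S, T = log_A.shape[0], len(obs)
    dp = np.full((T, S), -np.inf)            # best log-probability per state
    back = np.zeros((T, S), dtype=int)       # backpointers for traceback
    dp[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = dp[t - 1][:, None] + log_A  # (prev, cur) transition scores
        back[t] = scores.argmax(axis=0)
        dp[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(dp[-1].argmax())]            # trace back the best sequence
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```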
Machine learning components such as deep neural networks are used extensively in cyber-physical systems (CPS). However, such components may introduce new types of hazards that can have disastrous consequences and need to be addressed for engineering trustworthy systems. Although deep neural networks offer advanced capabilities, they must be complemented by engineering methods and practices that allow effective integration in CPS. In this paper, we propose an approach for assurance monitoring of learning-enabled CPS based on the conformal prediction framework. In order to allow real-time assurance monitoring, the approach employs distance learning to transform high-dimensional inputs into lower-dimensional embedding representations. By leveraging conformal prediction, the approach provides well-calibrated confidence and ensures a bounded, small error rate while limiting the number of inputs for which an accurate prediction cannot be made. We demonstrate the approach using three datasets: a mobile robot following a wall, speaker recognition, and traffic sign recognition. The experimental results demonstrate that the error rates are well-calibrated and that the number of alarms is very small. Furthermore, the method is computationally efficient and allows real-time assurance monitoring of CPS.
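The sketch below shows the split-conformal machinery in its simplest form; the distance-to-centroid nonconformity score and all numbers are assumptions of this illustration, not the paper's exact construction.

```python
# Split conformal prediction: compare a new input's nonconformity score with
# scores from a held-out calibration set to form a prediction set.
import numpy as np

def p_value(cal_scores, score_new):
    """Conformal p-value: fraction of calibration scores at least as large."""
    return (np.sum(cal_scores >= score_new) + 1) / (len(cal_scores) + 1)

# Illustrative use: nonconformity = distance to the predicted class centroid
# in the learned embedding space (an assumption of this sketch).
cal_scores = np.random.default_rng(0).exponential(size=200)
candidate_classes = {0: 0.4, 1: 2.9}        # class -> nonconformity score
epsilon = 0.05                              # target error rate
pred_set = [c for c, s in candidate_classes.items()
            if p_value(cal_scores, s) > epsilon]
# An empty or multi-valued pred_set triggers an assurance alarm.
```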
In this study, prediction of aircraft Estimated Time of Arrival (ETA) is proposed using machine learning algorithms. Accurate prediction of ETA is important for the management of delay and air traffic flow, runway assignment, gate assignment, collaborative decision making (CDM), coordination of ground personnel and equipment, and optimisation of the arrival sequence. Machine learning is able to learn from experience and make predictions with weak assumptions or no assumptions at all. In the proposed approach, general flight information, trajectory data and weather data were obtained from different sources in various formats. Raw data were converted to tidy data and inserted into a relational database. To obtain the features for training the machine learning models, the data were explored, cleaned and transformed into convenient features. New features were also derived from the available data. Random forests and deep neural networks were used to train the machine learning models. Both models can predict the ETA with a mean absolute error (MAE) of less than 6 min after departure, and less than 3 min after terminal manoeuvring area (TMA) entrance. Additionally, a web application was developed to dynamically predict the ETA using the proposed models.
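A toy version of the random-forest branch is sketched below; the feature set and target are random placeholders standing in for the study's flight, trajectory, and weather features.

```python
# Toy sketch: train a random-forest regressor on placeholder features and
# report MAE, mirroring the evaluation metric used in the study.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))     # e.g. speed, altitude, distance-to-go, ...
y = rng.normal(size=1000)          # minutes to arrival (placeholder target)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:800], y[:800])
print("MAE:", mean_absolute_error(y[800:], model.predict(X[800:])))
```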
In this paper, we develop a methodology to automatically classify claims using the information contained in text reports (written when the claims are opened). From this automatic analysis, the aim is to predict whether a claim is expected to be particularly severe or not. The difficulty is the rarity of such extreme claims in the database, and hence the difficulty for classical prediction techniques, such as logistic regression, of accurately predicting the outcome. Since the data are unbalanced (too few observations are associated with a positive label), we propose different rebalancing algorithms to deal with this issue. We discuss the use of different embedding methodologies used to process text data, and the role of the architectures of the networks.
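As one simple member of the family of rebalancing algorithms, random oversampling of the minority class is sketched below; the data are placeholders, and the paper compares several such schemes rather than this one specifically.

```python
# Random oversampling: duplicate rare severe-claim rows until both classes
# have equal counts, so the classifier sees a balanced training set.
import numpy as np

def oversample(X, y, seed=0):
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)       # rare severe claims
    majority = np.flatnonzero(y == 0)
    extra = rng.choice(minority, size=len(majority) - len(minority),
                       replace=True)
    idx = rng.permutation(np.concatenate([majority, minority, extra]))
    return X[idx], y[idx]

X = np.random.default_rng(1).normal(size=(1000, 8))   # e.g. text embeddings
y = (np.random.default_rng(2).random(1000) < 0.02).astype(int)
X_bal, y_bal = oversample(X, y)
```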
This work presents a methodology for the generation of novel 3D objects resembling wireframes of building types. These result from the reconstruction of interpolated locations within the learnt distribution of variational autoencoders (VAEs), a deep generative machine learning model based on neural networks. The data set used features a scheme for geometry representation based on a ‘connectivity map’ that is especially suited to express the wireframe objects that compose it. Additionally, the input samples are generated through ‘parametric augmentation’, a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features of a given building type. In the experiments described in this paper, more than 150,000 input samples belonging to two building types were processed during the training of a VAE model. The main contribution of this paper is to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that the generation of interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.
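The latent-space interpolation at the heart of the method can be sketched as follows; the encoder, decoder, and input sizes are untrained placeholders assumed for the example, not the paper's architecture.

```python
# Sketch of latent interpolation in a VAE: encode two inputs, blend their
# latent codes, and decode the blend into a hybrid geometry.
import torch

latent_dim = 32
encoder = torch.nn.Linear(1024, 2 * latent_dim)   # outputs (mu, log_var)
decoder = torch.nn.Linear(latent_dim, 1024)       # back to a connectivity map

def encode(x):
    mu, log_var = encoder(x).chunk(2, dim=-1)     # reparameterisation trick
    return mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)

x_a, x_b = torch.randn(1024), torch.randn(1024)   # two flattened samples
z_a, z_b = encode(x_a), encode(x_b)
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    hybrid = decoder((1 - alpha) * z_a + alpha * z_b)  # interpolated geometry
```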
Dynamic movement primitives (DMPs) are motion building blocks suitable for real-world tasks. We suggest a methodology for learning the manifold of task and DMP parameters, which facilitates runtime adaptation to changes in task requirements while ensuring predictable and robust performance. For efficient learning, the parameter space is analyzed using principal component analysis and locally linear embedding. Two manifold learning methods, kernel estimation and deep neural networks, are investigated for a ball throwing task in simulation and in a physical environment. Low runtime estimation errors are obtained for both learning methods, with an advantage to kernel estimation when data sets are small.
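The parameter-space analysis step might look like the sketch below, using off-the-shelf PCA and locally linear embedding; the parameter vectors are random placeholders, not the ball-throwing data.

```python
# Analyse a task/DMP parameter space with PCA and locally linear embedding.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import LocallyLinearEmbedding

params = np.random.default_rng(0).normal(size=(500, 12))  # task + DMP params

explained = PCA(n_components=3).fit(params).explained_variance_ratio_
embedded = LocallyLinearEmbedding(n_components=2,
                                  n_neighbors=10).fit_transform(params)
```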
We develop a deep autoencoder architecture that can be used to find a coordinate transformation which turns a non-linear partial differential equation (PDE) into a linear PDE. Our architecture is motivated by the linearising transformations provided by the Cole–Hopf transform for Burgers’ equation and the inverse scattering transform for completely integrable PDEs. By leveraging a residual network architecture, a near-identity transformation can be exploited to encode intrinsic coordinates in which the dynamics are linear. The resulting dynamics are given by a Koopman operator matrix K. The decoder allows us to transform back to the original coordinates as well. Multiple time step prediction can be performed by repeated multiplication by the matrix K in the intrinsic coordinates. We demonstrate our method on a number of examples, including the heat equation and Burgers’ equation, as well as the substantially more challenging Kuramoto–Sivashinsky equation, showing that our method provides a robust architecture for discovering linearising transforms for non-linear PDEs.
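The prediction loop described above can be sketched as follows; all modules are untrained placeholders with assumed sizes, standing in for the trained autoencoder and learned Koopman matrix.

```python
# Koopman-autoencoder prediction: encode to intrinsic coordinates, advance
# linearly by repeated multiplication with K, and decode back.
import torch

n_state, n_latent = 128, 16
encoder = torch.nn.Linear(n_state, n_latent)
decoder = torch.nn.Linear(n_latent, n_state)
K = torch.eye(n_latent)                      # learned Koopman operator matrix

u0 = torch.randn(n_state)                    # discretised PDE state
z = encoder(u0)
trajectory = []
for _ in range(50):                          # multiple time-step prediction
    z = K @ z                                # linear dynamics in latent space
    trajectory.append(decoder(z))            # back to physical coordinates
```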
Optimal grasping points for a robotic gripper were derived, based on object and hand geometry, using deep neural networks (DNNs). The optimal grasping cost functions were derived using normal-distribution probability density functions for each local cost function. Using the DNN, the optimum height and width were set for the robot hand to grasp objects, whose geometric and mass centre points were also considered in obtaining the optimum grasping positions for the robot fingers and the object. The proposed algorithm was tested on 10 differently shaped objects and showed improved grip performance compared to conventional methods.
Deep neural networks (DNNs) have the same structure as the neocognitron proposed in 1979 but achieve much better performance, because DNNs include many heuristic techniques such as pre-training, dropout, skip connections, batch normalization (BN), and stochastic depth. However, why these techniques improve performance is not fully understood. Recently, two tools for theoretical analyses have been proposed. One is to evaluate the generalization gap, defined as the difference between the expected loss and the empirical loss, by calculating the algorithmic stability; the other is to evaluate the convergence rate by calculating the eigenvalues of the Fisher information matrix of DNNs. This overview paper briefly introduces these tools and demonstrates their usefulness by showing why skip connections and BN improve performance.
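In the generic notation assumed here (not the paper's own), the generalization gap of a trained network $f$ is $\mathbb{E}_{(x,y)\sim\mathcal{D}}[\ell(f(x),y)] - \frac{1}{n}\sum_{i=1}^{n} \ell(f(x_i),y_i)$, i.e. the expected loss over the data distribution $\mathcal{D}$ minus the empirical loss over the $n$ training examples; algorithmic stability bounds this quantity by how much the learned $f$ changes when a single training example is replaced.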
Machine learning techniques have proven to be increasingly useful in astronomical applications over the last few years, for example in object classification, redshift estimation and data mining. One example of object classification is classifying galaxy morphology. This is a tedious task to do manually, especially as the datasets become larger with surveys that have a broader and deeper search-space. The Kaggle Galaxy Zoo competition presented the challenge of writing an algorithm to find the probability that a galaxy belongs in a particular class, based on SDSS optical spectroscopy data. The use of convolutional neural networks (convnets) proved to be a popular solution to the problem, as they have also produced unprecedented classification accuracies in other image databases, such as the MNIST database of handwritten digits and the CIFAR database of images. We experiment with the convnets that comprised the winning solution, but using broad classifications. The effect of changing the number of layers is explored, as well as using a different activation function, to help in developing an intuition of how the networks function and to see how they can be applied to radio galaxy images.
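The two knobs varied in these experiments, depth and activation function, can be exposed in a small builder like the sketch below; the channel counts, image assumptions, and class count are placeholders, not the Galaxy Zoo configuration.

```python
# Toy convnet builder parameterised by the number of convolutional layers
# and the activation function, the two factors explored above.
import torch.nn as nn

def make_convnet(n_layers=3, activation=nn.ReLU, n_classes=3):
    layers, channels = [], 1                 # assume single-channel input
    for _ in range(n_layers):
        layers += [nn.Conv2d(channels, 16, kernel_size=3, padding=1),
                   activation(), nn.MaxPool2d(2)]
        channels = 16
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(),
               nn.Linear(16, n_classes)]
    return nn.Sequential(*layers)

model = make_convnet(n_layers=4, activation=nn.LeakyReLU)
```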