The chapter begins with a discussion of standard mechanisms for training spiking neural networks: (a) unsupervised spike-timing-dependent plasticity, (b) backpropagation through time (BPTT) using surrogate gradient techniques, and (c) conversion from conventional analog (non-spiking) networks. Subsequently, various local learning algorithms with different degrees of locality are discussed that have the potential to replace computationally expensive global learning algorithms such as BPTT. The chapter concludes with pointers to several emerging research directions in the neuromorphic algorithms domain, including stochastic computing, lifelong learning, and dynamical-systems-based approaches, among others. Finally, we underscore the need for hybrid neuromorphic algorithm design that combines principles of conventional deep learning while forging stronger connections with computational neuroscience.
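As a hedged illustration of mechanism (b), the sketch below shows how a surrogate gradient lets BPTT flow through the non-differentiable spike function of a leaky integrate-and-fire neuron. This is a minimal sketch assuming PyTorch; the fast-sigmoid surrogate, the reset scheme, and all names are illustrative assumptions, not the chapter's own code.

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, smooth surrogate in the backward pass."""
    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 0).float()                  # spike when membrane potential crosses threshold

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        # Fast-sigmoid surrogate replaces the true (Dirac delta) derivative
        return grad_out / (1.0 + v.abs()) ** 2

def lif_step(v, x, w, beta=0.9, threshold=1.0):
    """One leaky integrate-and-fire step; BPTT unrolls this over the time dimension."""
    v = beta * v + x @ w                        # leaky integration of weighted input
    spk = SurrogateSpike.apply(v - threshold)
    v = v - spk * threshold                     # soft reset after a spike
    return v, spk
```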
The chapter introduces fundamental principles of deep learning. We discuss supervised learning of feedforward neural networks by considering a binary classification problem. Gradient descent techniques and the backpropagation learning algorithm are introduced as the means of training neural networks. The impact of neuron activations and of convolutional and residual network architectures on learning performance is discussed. Finally, regularization techniques such as batch normalization and dropout are introduced for improving the accuracy of trained models. The chapter is essential for connecting advances in conventional deep learning algorithms to neuromorphic concepts.
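To make the gradient-descent discussion concrete, here is a minimal sketch, assuming NumPy, of a binary classifier trained by batch gradient descent on the cross-entropy loss; the learning rate and epoch count are arbitrary illustrative choices, not values from the chapter.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=500):
    """Binary classifier (labels y in {0, 1}) trained by batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)              # predicted probability of class 1
        grad_w = X.T @ (p - y) / len(y)     # gradient of mean cross-entropy w.r.t. w
        grad_b = np.mean(p - y)
        w -= lr * grad_w                    # step along the negative gradient
        b -= lr * grad_b
    return w, b
```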
Emphasizing how and why machine learning algorithms work, this introductory textbook bridges the gap between the theoretical foundations of machine learning and its practical algorithmic and code-level implementation. Over 85 thorough worked examples, in both Matlab and Python, demonstrate how algorithms are implemented and applied whilst illustrating the end result. Over 75 end-of-chapter problems empower students to develop their own code to implement these algorithms, equipping them with hands-on experience. Matlab coding examples demonstrate how a mathematical idea is converted from equations to code, and provide a jumping-off point for students, supported by in-depth coverage of essential mathematics including multivariable calculus, linear algebra, probability and statistics, numerical methods, and optimization. Accompanied online by instructor lecture slides, downloadable Python code and additional appendices, this is an excellent introduction to machine learning for senior undergraduate and graduate students in Engineering and Computer Science.
The paper introduces a deep-learning model fine-tuned for detecting authoritarian discourse in political speeches. Set up as a regression problem with a weak-supervision logic, the model is trained to classify segments of text as associated or not associated with authoritarian discourse. Rather than trying to define what authoritarian discourse is, the model builds on the assumption that authoritarian leaders inherently define it: in other words, authoritarian leaders talk like authoritarians. When their speeches are combined with the discourse of democratic leaders, the model learns which instances are more often associated with authoritarians on the one hand and democrats on the other. The paper discusses several evaluation tests of the model and advocates its usefulness across a broad range of research problems. It presents a new methodology for studying latent political concepts and positions as an alternative to more traditional research strategies.
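A minimal sketch of the weak-supervision logic described above: each text segment inherits its label from the regime type of the speaker, so no segment-level annotation of "authoritarian discourse" is ever required. The data layout and regime coding are illustrative assumptions, not the paper's actual pipeline.

```python
def weak_labels(speeches):
    """Label each text segment by its speaker's regime type, not by its content.

    `speeches` is assumed to be a list of dicts like
    {"leader": "...", "regime": "authoritarian" or "democratic", "segments": [...]}.
    """
    examples = []
    for speech in speeches:
        label = 1.0 if speech["regime"] == "authoritarian" else 0.0
        for segment in speech["segments"]:
            examples.append((segment, label))   # every segment inherits the speaker-level label
    return examples
```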
Pater's (2019) target article builds a persuasive case for establishing stronger ties between theoretical linguistics and connectionism (deep learning). This commentary extends his arguments to semantics, focusing in particular on issues of learning, compositionality, and lexical meaning.
Joe Pater's (2019) target article calls for greater interaction between neural network research and linguistics. I expand on this call and show how such interaction can benefit both fields. Linguists can contribute to research on neural networks for language technologies by clearly delineating the linguistic capabilities that can be expected of such systems, and by constructing controlled experimental paradigms that can determine whether those desiderata have been met. In the other direction, neural networks can benefit the scientific study of language by providing infrastructure for modeling human sentence processing and for evaluating the necessity of particular innate constraints on language acquisition.
The birthdate of both generative linguistics and neural networks can be taken as 1957, the year of the publication of foundational work by both Noam Chomsky and Frank Rosenblatt. This article traces the development of these two approaches to cognitive science, from their largely autonomous early development in the first thirty years, through their collision in the 1980s around the past-tense debate (Rumelhart & McClelland 1986, Pinker & Prince 1988) and their integration in much subsequent work up to the present. Although this integration has produced a considerable body of results, the continued general gulf between these two lines of research is likely impeding progress in both: on learning in generative linguistics, and on the representation of language in neural modeling. The article concludes with a brief argument that generative linguistics is unlikely to fulfill its promise of accounting for language learning if it continues to maintain its distance from neural and statistical approaches to learning.
This paper addresses the hypothesis that sequence-based long short-term memory (LSTM) architectures improve the prediction of the next 'days open' (DO) relative to a feed-forward multi-layer perceptron and a Cox model, using strictly temporally valid predictors. Modern dairy farming can benefit substantially from optimising days open, both for profitability and for animal welfare. Machine learning can forecast this metric, improving farm management, disease prevention, and culling decisions. This study used a dataset of 16,472 breeding records and compared the performance of feed-forward neural networks and two types of recurrent neural networks (RNNs). The results showed that the LSTM forecasted the next days open most accurately, demonstrating that RNN models, by capturing temporal patterns in the data, significantly outperform feed-forward and traditional statistical methods in terms of mean absolute error and concordance.
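A minimal sketch, assuming PyTorch, of the kind of sequence-to-one LSTM regressor compared in such a study: each animal's breeding history is a sequence of feature vectors, and the model regresses the next days-open value from the final hidden state. All dimensions and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DaysOpenLSTM(nn.Module):
    """Sequence-to-one regressor: breeding-record history -> next days open."""
    def __init__(self, n_features=8, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, records, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])       # regress from the last time step

model = DaysOpenLSTM()
loss_fn = nn.L1Loss()                      # mean absolute error, one of the reported metrics
```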
We present a critical survey on the consistency of uncertainty quantification used in deep learning and highlight partial uncertainty coverage and many inconsistencies. We then provide a comprehensive and statistically consistent framework for uncertainty quantification in deep-learning regression that accounts for all major sources of uncertainty: input data, training and testing data, neural network weights, and machine-learning model imperfections. We systematically quantify each source by applying Bayes' theorem and conditional probability densities and introduce a fast, practical implementation method. We demonstrate its effectiveness on a simple regression problem and a real-world application: predicting cloud autoconversion rates using a neural network trained on aircraft measurements from the Azores and guided by a two-moment bin model of the stochastic collection equation. In this application, uncertainty from the training and testing data dominates, followed by input data, neural network model, and weight variability. Finally, we highlight the practical advantages of this methodology, showing that explicitly modeling training data uncertainty improves robustness to new inputs that fall outside the training data and enhances model reliability in real-world scenarios.
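As a worked illustration of the conditional-probability machinery involved, a generic predictive density that marginalises over network weights and noisy inputs can be written as below; this is a schematic sketch for exposition, not the paper's exact decomposition.

```latex
% Generic predictive density combining weight and input uncertainty
% (illustrative only; the paper's conditioning structure may differ):
p(y \mid x, \mathcal{D})
  = \int p(y \mid \tilde{x}, w)\, p(\tilde{x} \mid x)\,
         p(w \mid \mathcal{D})\, \mathrm{d}\tilde{x}\, \mathrm{d}w,
\qquad
p(w \mid \mathcal{D}) \propto p(\mathcal{D} \mid w)\, p(w).
```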
Fine-grained mortality forecasting has gained momentum in actuarial research due to its ability to capture localized, short-term fluctuations in death rates. This paper introduces MortFCNet, a deep-learning method that predicts weekly death rates using region-specific weather inputs. Unlike traditional Serfling-based methods and gradient-boosting models, which rely on predefined fixed Fourier terms and manual feature engineering, MortFCNet learns patterns automatically from raw time-series data. Extensive experiments across over 200 NUTS-3 regions in France, Italy, and Switzerland demonstrate that MortFCNet consistently outperforms both a standard Serfling-type baseline and XGBoost in predictive accuracy. Our ablation studies further confirm its ability to uncover complex relationships in the data without feature engineering. Moreover, this work offers a new perspective on the use of deep learning for advancing fine-grained mortality forecasting.
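For context, a Serfling-type baseline of the kind MortFCNet is compared against typically models the (log) weekly death rate as a trend plus fixed annual Fourier terms. The generic form below is an illustrative assumption, not the paper's exact specification.

```latex
% A generic Serfling-type weekly mortality baseline with fixed Fourier terms:
\log m_t = \beta_0 + \beta_1 t
  + \sum_{k=1}^{K} \left[ \alpha_k \sin\!\left(\frac{2\pi k t}{52}\right)
  + \gamma_k \cos\!\left(\frac{2\pi k t}{52}\right) \right] + \varepsilon_t .
```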
In deep learning, interval neural networks are used to quantify the uncertainty of a pre-trained neural network. Suppose we are given a computational problem $P$ and a pre-trained neural network $\Phi_P$ that aims to solve $P$. An interval neural network is then a pair of neural networks $(\underline{\phi}, \overline{\phi})$ with the property that $\underline{\phi}(y) \leq \Phi_P(y) \leq \overline{\phi}(y)$ for all inputs $y$, where the inequalities hold componentwise. The pair $(\underline{\phi}, \overline{\phi})$ is specifically trained to quantify the uncertainty of $\Phi_P$, in the sense that the size of the interval $[\underline{\phi}(y),\overline{\phi}(y)]$ quantifies the uncertainty of the prediction $\Phi_P(y)$. In this paper, we investigate the phenomenon whereby algorithms cannot compute interval neural networks, in the setting of inverse problems. We show that in the typical setting of a linear inverse problem, the problem of constructing an optimal pair of interval neural networks is non-computable, even under the assumption that the pre-trained neural network $\Phi_P$ is an optimal solution. In other words, there exist classes of training sets $\Omega$ such that there is no algorithm, even a randomised one (succeeding with probability $p \geq 1/2$), that computes an optimal pair of interval neural networks for each training set ${\mathcal{T}} \in \Omega$. This holds even when we are given a pre-trained neural network $\Phi_{{\mathcal{T}}}$ that is optimal for $\mathcal{T}$. The phenomenon is intimately linked to instability in deep learning.
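A minimal sketch, assuming PyTorch, of what the interval property means in code: two bound networks are trained so that their outputs bracket the pre-trained model componentwise, with the interval width serving as the uncertainty estimate. The hinge penalty used to enforce the ordering and its weight are illustrative choices, not the paper's construction.

```python
import torch
import torch.nn as nn

def make_net(d_in, d_out, hidden=64):
    return nn.Sequential(nn.Linear(d_in, hidden), nn.ReLU(), nn.Linear(hidden, d_out))

d_in, d_out = 16, 4
phi_p = make_net(d_in, d_out)              # stand-in for the pre-trained network (frozen)
lower, upper = make_net(d_in, d_out), make_net(d_in, d_out)

def interval_loss(y):
    """Encourage lower(y) <= phi_p(y) <= upper(y) componentwise, with tight intervals."""
    with torch.no_grad():
        mid = phi_p(y)
    lo, hi = lower(y), upper(y)
    violation = torch.relu(lo - mid).mean() + torch.relu(mid - hi).mean()
    width = (hi - lo).abs().mean()          # interval size quantifies uncertainty
    return width + 10.0 * violation         # penalty weight is an arbitrary choice
```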
Advances in deep learning and representation learning have transformed item factor analysis (IFA) in the item response theory (IRT) literature by enabling more efficient and accurate parameter estimation. Variational autoencoders (VAEs) are widely used to model high-dimensional latent variables in this context, but the limited expressiveness of their inference networks can still hinder performance. We introduce adversarial variational Bayes (AVB) and an importance-weighted extension (IWAVB) as more flexible inference algorithms for IFA. By combining VAEs with generative adversarial networks (GANs), AVB uses an auxiliary discriminator network to frame estimation as a two-player game and removes the restrictive standard normal assumption on the latent variables. Theoretically, AVB and IWAVB can achieve likelihoods that match or exceed those of VAEs and importance-weighted autoencoders (IWAEs). In exploratory analyses of empirical data, IWAVB attained higher likelihoods than IWAE, indicating greater expressiveness. In confirmatory simulations, IWAVB achieved comparable mean-square error in parameter recovery while consistently yielding higher likelihoods, and it clearly outperformed IWAE when the latent distribution was multimodal. These findings suggest that IWAVB can scale IFA to complex, large-scale, and potentially multimodal settings, supporting closer integration of psychometrics with modern multimodal data analysis.
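For readers unfamiliar with AVB, the sketch below gives the usual form of its two-player game, following the standard adversarial variational Bayes construction; the notation is a generic illustration rather than this paper's exact formulation. A discriminator $T$ is trained to separate latent samples drawn from the inference network and from the prior, and its optimum recovers the log-density ratio that replaces the intractable term in the ELBO.

```latex
% Discriminator objective (its optimum T* recovers the log density ratio):
\max_{T}\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log \sigma\!\big(T(x,z)\big)\right]
        + \mathbb{E}_{p(z)}\!\left[\log\big(1 - \sigma(T(x,z))\big)\right],
\qquad
T^{*}(x,z) = \log q_\phi(z \mid x) - \log p(z).

% ELBO with the intractable ratio replaced by the discriminator:
\mathcal{L}(\theta,\phi)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z) - T^{*}(x,z)\right].
```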
This chapter educates the reader on the main ideas that have enabled various advancements in Artificial Intelligence (AI) and Machine Learning (ML). Using various examples, and taking the reader on a journey through history, it showcases how the main ideas developed by the pioneers of AI and ML are being used in our modern era to make the world a better place. It communicates that our lives are surrounded by algorithms that work based on a few main ideas. It also discusses recent advancements in Generative AI, including the main ideas that led to the creation of Large Language Models (LLMs) such as ChatGPT. The chapter also discusses various societal considerations in AI and ML and ends with technological advancements that could further improve our ability to use the main ideas.
This chapter offers an introduction to AI, including an overview of essential technologies such as machine learning and deep learning, and a discussion of generative AI and its potential limitations. The chapter explores AI's history, including its relationship to cybernetics, its role in codebreaking, periods of optimism and “AI winters,” and today's global development of generative AI. Chapter 1 also includes an analysis of AI's role in the international and national context, focusing on potential conflicts of goals and threats that can arise from the technology.
This chapter describes the important role of artificial intelligence (AI) in Big Data psychology research. First, we discuss the main goals of AI, and then delve into an example of machine learning and what is happening under the hood. The chapter then describes the Perceptron, a classic simple neural network, and how it has grown into the deep learning AI that has become increasingly popular in recent years. Deep learning can be used both for prediction and generation, and has a multitude of applications for psychology and neuroscience. This chapter concludes with the ethical quandaries around fake data generated by AI and the biases that exist in how we train systems, as well as some exciting clinical applications of AI relevant to psychology and neuroscience.
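Since the chapter walks through the Perceptron, here is a minimal sketch of the classic learning rule, in a generic textbook form assuming NumPy rather than the chapter's own code: weights are nudged toward each misclassified example until the data are separated.

```python
import numpy as np

def perceptron(X, y, epochs=100, lr=1.0):
    """Classic Perceptron rule: labels y must be +1/-1; X is (n_samples, n_features)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:     # misclassified (or on the boundary)
                w += lr * yi * xi          # nudge the weights toward the example
                b += lr * yi
    return w, b
```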
Extreme precipitation events are projected to increase both in frequency and intensity due to climate change. High-resolution climate projections are essential to effectively model the convective phenomena responsible for severe precipitation and to plan any adaptation and mitigation action. Existing numerical methods either lack the accuracy to capture the evolution of convective dynamical systems, due to low resolution, or are limited by the excessive computational demands required to achieve kilometre-scale resolution. To fill this gap, we propose a novel deep learning regional climate model (RCM) emulator called graph neural networks for climate downscaling (GNN4CD) to estimate high-resolution precipitation. The emulator is innovative in architecture and training strategy, using graph neural networks (GNNs) to learn the downscaling function through a novel hybrid imperfect framework. GNN4CD is initially trained to perform reanalysis-to-observation downscaling and is then used for RCM emulation during the inference phase. The emulator estimates precipitation at very high resolution in both space ($3$ km) and time ($1$ h), starting from lower-resolution atmospheric data ($\sim 25$ km). Leveraging the flexibility of GNNs, we tested the emulator's spatial transferability in regions unseen during training. The model trained on northern Italy effectively reproduces the precipitation distribution, seasonal diurnal cycles, and spatial patterns of extreme percentiles across all of Italy. When used as an RCM emulator for the historical, mid-century, and end-of-century time slices, GNN4CD shows a remarkable ability to capture the shifts in precipitation distribution, especially in the tail, where changes are most pronounced.
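A minimal sketch, assuming plain PyTorch, of the kind of message-passing layer a GNN downscaling emulator builds on: coarse atmospheric nodes pass messages to fine-resolution target nodes along predefined edges. The sum aggregation, layer sizes, and names are illustrative assumptions, not the GNN4CD architecture.

```python
import torch
import torch.nn as nn

class MessagePassing(nn.Module):
    """One round of edge-wise messages summed onto target nodes."""
    def __init__(self, d_node=32, d_msg=32):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * d_node, d_msg), nn.ReLU())
        self.upd = nn.Linear(d_node + d_msg, d_node)

    def forward(self, h, edge_index):
        # h: (n_nodes, d_node); edge_index: (2, n_edges), rows = [source, target]
        src, dst = edge_index
        m = self.msg(torch.cat([h[src], h[dst]], dim=-1))   # one message per edge
        agg = torch.zeros(h.size(0), m.size(-1), dtype=h.dtype, device=h.device)
        agg.index_add_(0, dst, m)                           # sum incoming messages
        return self.upd(torch.cat([h, agg], dim=-1))        # update node states
```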
Vibration control in structures is essential to mitigate undesired dynamic responses, thereby enhancing stability, safety, and performance under varying loading conditions. Mechanical metamaterials have emerged as effective solutions, enabling tailored dynamic properties for vibration attenuation. This study introduces a convolutional autoencoder framework for the inverse design of local resonators embedded in mechanical metamaterials. The model learns from the dynamic behaviour of primary structures coupled with ideal absorbers to predict the geometric parameters of resonators that achieve the desired vibration control performance. Unlike conventional approaches requiring full numerical models, the proposed method operates as a data-driven tool: the target frequency to be mitigated is provided as input, and the model directly outputs the resonator geometry. A large dataset, generated through physics-informed simulations of ideal absorber dynamics, supports training while incorporating both spectral and geometric variability. Within the architecture, the encoder maps input receptance spectra to resonator geometries, while the decoder reconstructs the target receptance response, ensuring dynamic consistency. Once trained, the framework predicts resonator configurations that satisfy predefined frequency targets with high accuracy, enabling efficient design of passive controllers of the tuned-mass type. This study specifically demonstrates the application of the methodology to resonators embedded in wind turbine metastructures, a critical context for mitigating structural vibrations and improving operational efficiency. Results confirm strong agreement between predicted and target responses, underscoring the potential of deep learning techniques to support on-demand inverse design of mechanical metamaterials for smart vibration control in wind energy and related engineering applications.
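A minimal sketch, assuming PyTorch and 1-D convolutions over a sampled receptance spectrum, of the encoder/decoder split described above: the encoder compresses the spectrum to a few geometric parameters, and the decoder reconstructs the spectrum from them, which is what enforces dynamic consistency. Layer sizes and the number of geometry parameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

N_FREQ, N_GEOM = 256, 4                     # spectrum samples, resonator parameters (assumed)

encoder = nn.Sequential(                     # receptance spectrum -> resonator geometry
    nn.Conv1d(1, 16, kernel_size=7, stride=2, padding=3), nn.ReLU(),
    nn.Conv1d(16, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * (N_FREQ // 4), N_GEOM),
)

decoder = nn.Sequential(                     # geometry -> reconstructed spectrum
    nn.Linear(N_GEOM, 128), nn.ReLU(),
    nn.Linear(128, N_FREQ),
)

spectrum = torch.randn(8, 1, N_FREQ)         # a batch of target receptance spectra
geometry = encoder(spectrum)                 # predicted resonator parameters
recon = decoder(geometry)                    # reconstruction enforces dynamic consistency
loss = nn.functional.mse_loss(recon, spectrum.squeeze(1))
```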
Misinformation on social media is a recognized threat to societies. Research has shown that social media users play an important role in the spread of misinformation. It is crucial to understand how misinformation affects user online interaction behavior and the factors that contribute to it. In this study, we employ an AI deep learning model to analyze emotions in user online social media conversations about misinformation during the COVID-19 pandemic. We further apply the Stimuli–Organism–Response framework to examine the relationship between the presence of misinformation, emotions, and social bonding behavior. Our findings highlight the usefulness of AI deep learning models to analyze emotions in social media posts and enhance the understanding of online social bonding behavior around health-related misinformation.
This Element provides a comprehensive guide to deep learning in quantitative trading, merging foundational theory with hands-on applications. It is organized into two parts. The first part introduces the fundamentals of financial time-series and supervised learning, exploring various network architectures, from feedforward networks to state-of-the-art designs. To ensure robustness and mitigate overfitting on complex real-world data, a complete workflow is presented, from initial data analysis to cross-validation techniques tailored to financial data. Building on this, the second part applies deep learning methods to a range of financial tasks. The authors demonstrate how deep learning models can enhance both time-series and cross-sectional momentum trading strategies, generate predictive signals, and be formulated as an end-to-end framework for portfolio optimization. Applications draw on data ranging from daily observations to high-frequency microstructure data, covering a variety of asset classes. Throughout, the authors include illustrative code examples and provide a dedicated GitHub repository with detailed implementations.