Diagnostic classification models are confirmatory in the sense that the relationship between the latent attributes and responses to items is specified or parameterized. Such models are readily interpretable, with each component of the model usually having a practical meaning. However, parameterized diagnostic classification models are sometimes too simple to capture all the data patterns, resulting in significant lack of fit. In this paper, we attempt to obtain a compromise between interpretability and goodness of fit by regularizing a latent class model. Our approach starts with minimal assumptions on the data structure, followed by suitable regularization to reduce complexity, so that a readily interpretable yet flexible model is obtained. An expectation–maximization-type algorithm is developed for efficient computation. It is shown that the proposed approach enjoys good theoretical properties. Results from simulation studies and a real application are presented.
Intensive longitudinal data (ILD) is an increasingly common data type in the social and behavioral sciences. Despite the many benefits these data provide, little work has been dedicated to realizing the potential such data hold for forecasting dynamic processes at the individual level. To address this gap in the literature, we present the multi-VAR framework, a novel methodological approach allowing for penalized estimation of ILD collected from multiple individuals. Importantly, our approach estimates models for all individuals simultaneously and is capable of adaptively adjusting to the amount of heterogeneity present across individual dynamic processes. To accomplish this, we propose a novel proximal gradient descent algorithm for solving the multi-VAR problem and prove the consistency of the recovered transition matrices. We evaluate the forecasting performance of our method in comparison with a number of benchmark methods and provide an illustrative example involving the day-to-day emotional experiences of 16 individuals over an 11-week period.
In multidimensional tests, the identification of the latent traits measured by each item is crucial. In addition to the item–trait relationship, differential item functioning (DIF) is routinely evaluated to ensure valid comparison among different groups. The two problems are investigated separately in the literature. This paper uses a unified framework for detecting the item–trait relationship and DIF in multidimensional item response theory (MIRT) models. By incorporating DIF effects in MIRT models, these problems can be considered as variable selection for latent/observed variables and their interactions. A Bayesian adaptive Lasso procedure is developed for variable selection, in which the item–trait relationship and DIF effects can be obtained simultaneously. Simulation studies show the performance of our method for parameter estimation, the recovery of the item–trait relationship, and the detection of DIF effects. An application is presented using data from the Eysenck Personality Questionnaire.
PCA is a popular tool for exploring and summarizing multivariate data, especially those consisting of many variables. PCA, however, is often not simple to interpret, as the components are linear combinations of the variables. To address this issue, numerous methods have been proposed to sparsify the coefficients in the components, including rotation-thresholding methods and, more recently, PCA methods subject to sparsity-inducing penalties or constraints. Here, we offer guidelines on how to choose among the different sparse PCA methods. The current literature lacks clear guidance on the properties and performance of the different sparse PCA methods, often relying on the misconception that the equivalence of the formulations for ordinary PCA also holds for sparse PCA. To guide potential users of sparse PCA methods, we first discuss several popular sparse PCA methods in terms of whether the sparseness is imposed on the loadings or on the weights, the assumed model, and the optimization criterion used to impose sparseness. Second, using an extensive simulation study, we assess each of these methods by means of performance measures such as squared relative error, misidentification rate, and percentage of explained variance for several data-generating models and conditions for the population model. Finally, two examples using empirical data are considered.
Chapter 7 is dedicated to regularized regression methods, which – by penalizing models that are too complex – are capable of providing a reasonable tradeoff between bias and variance. Ridge regression implements L2 regularization, which results in more generalizable models but does not perform any feature selection. The L1 penalty used by the lasso, however, allows for simultaneous regularization and feature selection. The elastic net algorithm combines the two approaches by applying both L1 and L2 penalties, which allows for solutions combining the advantages of both ridge regression and the lasso. The chapter concludes by discussing a general class of Lq-regularized least squares optimization problems.
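As a minimal sketch of how the three penalties differ, assume an orthonormal design matrix and the objective (1/2)||y − Xβ||² + l1·||β||₁ + (l2/2)·||β||². Under that assumption each penalized coefficient has a closed form in terms of its ordinary least squares value. The function names and the toy coefficients below are illustrative and are not taken from the chapter.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; any value with |z| <= t becomes exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_orthonormal(ols_coefs, l1=0.0, l2=0.0):
    """Closed-form elastic net solution, valid only for an orthonormal design.

    l1 > 0, l2 = 0 gives the lasso; l1 = 0, l2 > 0 gives ridge regression.
    """
    return [soft_threshold(b, l1) / (1.0 + l2) for b in ols_coefs]


# Hypothetical OLS coefficients: two strong effects, two weak ones.
ols = [3.0, -0.4, 1.5, 0.2]

ridge = elastic_net_orthonormal(ols, l2=1.0)          # shrinks all, zeroes none
lasso = elastic_net_orthonormal(ols, l1=0.5)          # zeroes the weak effects
enet = elastic_net_orthonormal(ols, l1=0.5, l2=1.0)   # selects and shrinks

print("ridge:", ridge)
print("lasso:", lasso)
print("elastic net:", enet)
```

In this toy case ridge keeps all four coefficients nonzero, whereas the lasso and the elastic net set the two weak coefficients exactly to zero, which is the feature-selection behaviour described above.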
Chapter 13 discusses neural networks and deep learning; included is a presentation of deep convolutional networks, which seem to have great potential in the classification of medical images.
Multivariate biomarker discovery is increasingly important in the realm of biomedical research, and is poised to become a crucial facet of personalized medicine. This will prompt the demand for a myriad of novel biomarkers representing distinct 'omic' biosignatures, allowing treatments to be selected and tailored to the individual characteristics of a particular patient. This concise and self-contained book covers all aspects of predictive modeling for biomarker discovery based on high-dimensional data, as well as modern data science methods for identification of parsimonious and robust multivariate biomarkers for medical diagnosis, prognosis, and personalized medicine. It provides a detailed description of state-of-the-art methods for parallel multivariate feature selection and supervised learning algorithms for regression and classification, as well as methods for proper validation of multivariate biomarkers and predictive models implementing them. This is an invaluable resource for scientists and students interested in bioinformatics, data science, and related areas.
The previous chapter introduced feed-forward neural networks and demonstrated that, theoretically, implementing the training procedure for an arbitrary feed-forward neural network is relatively simple. Unfortunately, neural networks trained this way suffer from several problems, such as instability of the training process – that is, slow convergence due to parameters jumping around a good minimum – and overfitting. In this chapter, we will describe several practical solutions that mitigate these problems. In particular, we discuss minibatching, multiple optimization algorithms, other activation and cost functions, regularization, dropout, temporal averaging, and parameter initialization and normalization.
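Two of the remedies listed above, minibatching and dropout, can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the chapter's implementation: it uses the common "inverted" dropout convention (kept activations are rescaled by 1/(1 − p_drop) during training), and the function names are hypothetical.

```python
import random


def minibatches(data, batch_size, rng):
    """Yield the data in shuffled chunks of at most batch_size items."""
    order = list(range(len(data)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [data[i] for i in order[start:start + batch_size]]


def dropout(activations, p_drop, rng, training=True):
    """Inverted dropout: zero each activation with probability p_drop and
    rescale the survivors so the expected value is unchanged."""
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    if not training or p_drop == 0.0:
        return list(activations)  # no-op at inference time
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
batches = list(minibatches(list(range(10)), batch_size=4, rng=rng))
dropped = dropout([1.0, 2.0, 3.0, 4.0], p_drop=0.5, rng=rng)
print(batches)
print(dropped)
```

Each epoch would normally reshuffle the data before splitting it into batches, and dropout would be disabled (training=False) when the network is evaluated.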
Building on Chapter 6, this chapter expands the discussion of neural networks to include networks that have more than one hidden layer. Common structures such as the convolutional neural network (CNN) and the Long Short-Term Memory network (LSTM) are explained, and Matlab’s Deep Network Designer (DND) App as well as Matlab scripts are used to implement and train such networks. Issues such as the vanishing or exploding gradient, normalization, and training strategies are discussed. Concepts that address overfitting and the vanishing or exploding gradient are introduced, including dropout and regularization. Transfer learning is discussed and showcased using Matlab’s DND App.
In this chapter we formulate the general regression problem relevant to function estimation. We begin with simple frequentist methods and quickly move to regression within the Bayesian paradigm. We then present two complementary mathematical formulations: one that relies on Gaussian process priors, appropriate for the regression of continuous quantities, and one that relies on Beta–Bernoulli process priors, appropriate for the regression of discrete quantities. In the context of the Gaussian process, we discuss more advanced topics including various admissible kernel functions, inducing point methods, sampling methods for nonconjugate Gaussian process prior-likelihood pairs, and elliptical slice samplers. For Beta–Bernoulli processes, we address questions of posterior convergence in addition to applications. Taken together, Gaussian processes and Beta–Bernoulli processes constitute our first foray into Bayesian nonparametrics. With end-of-chapter projects, we explore more advanced modeling questions relevant to optics and microscopy.
Edited by
Alik Ismail-Zadeh, Karlsruhe Institute of Technology, Germany, Fabio Castelli, Università degli Studi, Florence, Dylan Jones, University of Toronto, Sabrina Sanchez, Max Planck Institute for Solar System Research, Germany
Abstract: Data assimilation has always been a particularly active area of research in glaciology. While many properties at the surface of glaciers and ice sheets can be directly measured from remote sensing or in situ observations (surface velocity, surface elevation, thinning rates, etc.), many important characteristics, such as englacial and basal properties, as well as past climate conditions, remain difficult or impossible to observe. Data assimilation has been used for decades in glaciology in order to infer unknown properties and boundary conditions that have an important impact on numerical models and their projections. The basic idea is to use observed properties, in conjunction with ice flow models, to infer these poorly known ice properties or boundary conditions. There is, however, a great deal of variability among approaches. Constraining data can represent a snapshot in time or evolution over time. The complexity of the flow model can vary, from simple descriptions of lubrication flow or mass continuity to complex, continent-wide Stokes flow models encompassing multiple flow regimes. Methods can be deterministic, where only a best fit is sought, or probabilistic in nature. We present in this chapter some of the most common applications of data assimilation in glaciology, and some of the new directions that are currently being developed.
A good model aims to learn the underlying signal without overfitting (i.e. fitting to the noise in the data). This chapter has four main parts: The first part covers objective functions and errors. The second part covers various regularization techniques (weight penalty/decay, early stopping, ensemble, dropout, etc.) to prevent overfitting. The third part covers the Bayesian approach to model selection and model averaging. The fourth part covers the recent development of interpretable machine learning.
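Of the regularization techniques listed above, early stopping is the simplest to illustrate. The sketch below assumes a patience-based stopping rule (stop once the validation loss has failed to improve for a fixed number of consecutive epochs). The rule, the function name, and the loss values are illustrative assumptions, not taken from the chapter.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (best_epoch, best_loss) under a patience-based stopping rule.

    Training is considered stopped once the validation loss has not improved
    for `patience` consecutive epochs; the best epoch seen so far is kept.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # no improvement for `patience` epochs in a row
    return best_epoch, best_loss


# Made-up validation losses: improvement, then a plateau, then a late dip.
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.75, 0.6]
print(early_stopping_epoch(losses, patience=3))
print(early_stopping_epoch(losses, patience=5))
```

A small patience stops at the plateau and never sees the late improvement, while a larger patience trades extra training time for the chance to find it.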
Many applications in geosciences require solving inverse problems to estimate the state of a physical system. Data assimilation provides a strong framework to do so when the system is partially observed and its underlying dynamics are known to some extent. In the variational flavor, it can be seen as an optimal control problem where initial conditions are the control parameters. Such problems are often ill-posed, and regularization using explicit prior knowledge may be needed to enforce a satisfying solution. In this work, we propose to use a deep prior, a neural architecture that generates potential solutions and acts as implicit regularization. The architecture is trained in a fully unsupervised manner using the variational data assimilation cost, so that gradients are backpropagated through the dynamical model and then through the neural network. To demonstrate its use, we set up a twin experiment using a shallow-water toy model, where we test various variational assimilation algorithms on an ocean-like circulation estimation.
Researchers of time series cross-sectional data regularly face the change-point problem, which requires them to discern between significant parametric shifts that can be deemed structural changes and minor parametric shifts that must be considered noise. In this paper, we develop a general Bayesian method for change-point detection in high-dimensional data and present its application in the context of the fixed-effect model. Our proposed method, the hidden Markov Bayesian bridge model, jointly estimates high-dimensional regime-specific parameters and hidden regime transitions in a unified way. We apply our method to Alvarez, Garrett, and Lange’s (1991, American Political Science Review 85, 539–556) study of the relationship between government partisanship and economic growth and Allee and Scalera’s (2012, International Organization 66, 243–276) study of membership effects in international organizations. In both applications, we found that the proposed method successfully identifies substantively meaningful temporal heterogeneity in the parameters of regression models.
The jus temporis that is argued for in this chapter aims to explicate the value of human time that is to be found in the finite, irreversible, and unstoppable character of human time. To make the value of human time explicit, "rootedness" and "integration" are conceptually distinguished: the latter signifies qualified time, the former the mere lapse of human time. Rootedness simply signifies the entanglement of presence on a territory with the lapse of finite and irreversible human time. This conception of rootedness is at the heart of jus temporis, and its implications are not limited to questions of citizenship acquisition. It is argued that the value of rootedness equally applies to waiting time in procedures, endless forms of temporariness, and unlawful residence. Concretely, it is argued that this jus temporis implies two elements. The first is a certain openness to the future, the possibility that a certain situation will not last forever. The second element is that there should be end-terms at work in law: procedures may not last forever, temporariness may not continue eternally, and there should be a moment when long-term unlawful residences can become lawful.
In this paper, we are concerned with a backward problem for a nonlinear time-fractional wave equation in a bounded domain. By applying the properties of Mittag-Leffler functions and the method of eigenvalue expansion, we establish some results about the existence and uniqueness of the mild solutions of the proposed problem based on the compact technique. Due to the ill-posedness of the backward problem in the sense of Hadamard, a general filter regularization method is utilized to approximate the solution, and we further prove the convergence rate for the regularized solutions.
In shape-from-focus (SFF) methods, a single focus measure is used to compute the focus volume. However, it seems that a single focus measure operator is not capable of computing accurate focus values for images of diverse types of object shapes. Furthermore, most SFF methods try to improve the depth map without considering any additional structural or prior information. Consequently, the extracted shape of the object might lack important details. In this work, we address these problems and suggest a method in which depth hypotheses are combined through 3D weighted least squares to obtain a more accurate 3D shape. First, depth hypotheses are obtained by applying a number of focus operators. Then, a structural prior, or guidance volume, is extracted from the focus measure volumes. Finally, a 3D weighted least squares optimization technique is applied to the depth hypothesis volume, where weights are computed from the guidance volume. Thus, by introducing a structural prior, an improved depth map is obtained. The proposed method was tested using various image sequences of synthetic and microscopic real objects. Experimental results and comparative analysis demonstrated the effectiveness of the proposed method.
Wavelet theory is known to be a powerful tool for compressing and processing time series or images. It consists in projecting a signal on an orthonormal basis of functions that are chosen in order to provide a sparse representation of the data. The first part of this article focuses on smoothing mortality curves by wavelet shrinkage. A chi-square test and a penalized likelihood approach are applied to determine the optimal degree of smoothing. The second part of this article is devoted to mortality forecasting. Wavelet coefficients exhibit clear trends for the Belgian population from 1965 to 2015; they are therefore easy to forecast, which yields predicted future mortality rates. The wavelet-based approach is then compared with some popular actuarial models of Lee–Carter type fitted to Belgian, UK, and US populations. The wavelet model outperforms all of them.
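The idea of wavelet shrinkage can be sketched with the simplest orthonormal basis, the Haar wavelet: transform the signal, soft-threshold the detail coefficients, and transform back. This is only an illustration under assumptions. The article's wavelet family and its chi-square and penalized-likelihood rules for choosing the degree of smoothing are not reproduced here, the threshold is fixed by hand, and the input values are synthetic.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """Full orthonormal Haar decomposition (length must be a power of two).

    Returns (approx, details), with details ordered from finest to coarsest.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx, details = list(signal), []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward."""
    approx = list(approx)
    for level in reversed(details):
        rebuilt = []
        for s, d in zip(approx, level):
            rebuilt.extend([(s + d) / SQRT2, (s - d) / SQRT2])
        approx = rebuilt
    return approx


def soft(z, t):
    """Soft-thresholding: shrink z toward zero by t."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def haar_shrink(signal, threshold):
    """Smooth a signal by soft-thresholding its Haar detail coefficients."""
    approx, details = haar_forward(signal)
    shrunk = [[soft(d, threshold) for d in level] for level in details]
    return haar_inverse(approx, shrunk)


# Synthetic noisy curve (not real mortality data).
noisy = [1.0, 1.3, 0.9, 1.2, 2.1, 1.8, 2.2, 1.9]
print([round(v, 3) for v in haar_shrink(noisy, threshold=0.3)])
```

A threshold of zero returns the original signal, and a very large threshold collapses it to its mean, so the threshold plays the role of the smoothing parameter.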
In this chapter, we shall consider the design of neural nets, which are collections of perceptrons, or nodes, where the outputs of one rank (or layer) of nodes become the inputs to nodes at the next layer. The last layer of nodes produces the outputs of the entire neural net. The training of neural nets with many layers requires enormous numbers of training examples, but has proven to be an extremely powerful technique, referred to as deep learning, when it can be used. We also consider several specialized forms of neural nets that have proved useful for special kinds of data. These forms are characterized by requiring that certain sets of nodes in the network share the same weights. Since learning all the weights on all the inputs to all the nodes of the network is in general a hard and time-consuming task, these special forms of network greatly simplify the process of training the network to recognize the desired class or classes of inputs. We shall study convolutional neural networks (CNNs), which are specially designed to recognize classes of images. We shall also study recurrent neural networks (RNNs) and long short-term memory networks (LSTMs), which are designed to recognize classes of sequences, such as sentences (sequences of words).
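The weight sharing described above can be illustrated with a one-dimensional convolutional layer, in which every output node applies the same small kernel to a different window of the input. The sketch below is a toy example with hypothetical names and values, and it uses the usual deep learning convention (cross-correlation, with no kernel flip).

```python
def conv1d(inputs, kernel, bias=0.0):
    """Slide one shared kernel over the input; no padding, stride 1."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, inputs[i:i + k])) + bias
        for i in range(len(inputs) - k + 1)
    ]


def dense_weight_count(n_inputs, n_outputs):
    """Weights in a fully connected layer, excluding biases."""
    return n_inputs * n_outputs


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 0.0, -1.0]          # the only weights this layer has to learn
outputs = conv1d(signal, kernel)

print(outputs)
print("shared weights:", len(kernel))
print("dense weights for the same shape:", dense_weight_count(len(signal), len(outputs)))
```

Here three output nodes share three weights, whereas a fully connected layer mapping five inputs to three outputs would need fifteen, which is the saving in training effort that the passage refers to.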