I first review and critique the prevailing use of hypothesis tests to compare treatments. I then describe my application of statistical decision theory, comparing Bayes, maximin, and minimax regret decisions. Finally, I consider the choice of sample size in randomized trials from the minimax regret perspective.
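As a toy illustration of how these three criteria can disagree (an invented example, not taken from the text), consider a welfare table over two unknown states of nature; the numbers and the uniform prior are purely hypothetical.

```python
import numpy as np

# Illustrative welfare table: rows are candidate treatments, columns are
# states of nature; entries are expected welfare under each combination.
welfare = np.array([
    [5.0, 0.0],    # treatment A: excellent in state 1, useless in state 2
    [2.0, 2.0],    # treatment B: safe but unspectacular
    [3.5, 1.0],    # treatment C: intermediate
])
prior = np.array([0.5, 0.5])               # hypothetical prior over the states

# Bayes: maximize prior-expected welfare.
bayes = np.argmax(welfare @ prior)                      # -> treatment A

# Maximin: maximize the worst-case welfare across states.
maximin = np.argmax(welfare.min(axis=1))                # -> treatment B

# Minimax regret: regret is the shortfall from the best welfare attainable
# in each state; pick the treatment whose worst regret is smallest.
regret = welfare.max(axis=0) - welfare
minimax_regret = np.argmin(regret.max(axis=1))          # -> treatment C

print(bayes, maximin, minimax_regret)
```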
In Chapter 31 we study three commonly used techniques for proving minimax lower bounds, namely, Le Cam’s method, Assouad’s lemma, and Fano’s method. Compared to the results in Chapter 29, which are geared toward large-sample asymptotics in smooth parametric models, the approach here is more generic, less tied to mean-squared error, and applicable in non-asymptotic settings such as nonparametric or high-dimensional problems. The common rationale of all three methods is reducing statistical estimation to hypothesis testing.
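For orientation, one standard statement of the two-point (Le Cam) reduction, with metric loss d and total-variation distance TV, is the following; the exact constants vary with the formulation used.

```latex
\[
  \inf_{\hat\theta}\,
  \sup_{\theta \in \{\theta_0, \theta_1\}}
  \mathbb{E}_\theta\!\bigl[d(\hat\theta,\theta)\bigr]
  \;\ge\;
  \delta\,\bigl(1 - \mathrm{TV}(P_{\theta_0}, P_{\theta_1})\bigr)
  \qquad\text{whenever } d(\theta_0,\theta_1)\ge 2\delta .
\]
```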
In a variety of measurement situations, the researcher may wish to compare the reliabilities of several instruments administered to the same sample of subjects. This paper presents eleven statistical procedures which test the equality of m coefficient alphas when the sample alpha coefficients are dependent. Several of the procedures are derived in detail, and numerical examples are given for two. Since all of the procedures depend on approximate asymptotic results, Monte Carlo methods are used to assess the accuracy of the procedures for sample sizes of 50, 100, and 200. Both control of Type I error and power are evaluated by computer simulation. Two of the procedures are unable to control Type I errors satisfactorily. The remaining nine procedures perform properly, but three are somewhat superior in power and Type I error control.
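For readers who want the quantity being compared, here is a minimal sketch of the sample coefficient alpha for each of two instruments administered to the same subjects; the data, dimensions, and function name are hypothetical, and the dependent-alpha test statistics studied in the paper are not reproduced here.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_subjects, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # per-item sample variances
    total_var = items.sum(axis=1).var(ddof=1)      # variance of the total score
    return k / (k - 1) * (1.0 - item_vars.sum() / total_var)

# Two instruments administered to the same (hypothetical) sample of subjects,
# so the resulting alpha estimates are dependent.
rng = np.random.default_rng(0)
instrument_1 = rng.normal(size=(100, 10))          # placeholder item scores
instrument_2 = rng.normal(size=(100, 8))
alpha_1, alpha_2 = cronbach_alpha(instrument_1), cronbach_alpha(instrument_2)
print(alpha_1, alpha_2)
```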
Diagnostic classification models (DCMs) have seen wide applications in educational and psychological measurement, especially in formative assessment. DCMs in the presence of testlets have been studied in the recent literature. A key ingredient in the statistical modeling and analysis of testlet-based DCMs is the superposition of two latent structures, the attribute profile and the testlet effect. This paper extends the standard testlet DINA (T-DINA) model to accommodate the potential correlation between the two latent structures. Model identifiability is studied and a set of sufficient conditions is proposed. As a byproduct, the identifiability of the standard T-DINA model is also established. The proposed model is applied to a dataset from the 2015 Programme for International Student Assessment. Comparisons are made with the DINA and T-DINA models, showing a substantial improvement in goodness of fit. Simulations are conducted to assess the performance of the new method under various settings.
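As background, the DINA kernel that the testlet extensions build on can be written as follows, with slip parameter s_j, guessing parameter g_j, attribute profile α_i, and Q-matrix entries q_{jk}; in the testlet variants an additional latent testlet effect is layered on top of this kernel.

```latex
\[
  \eta_{ij} \;=\; \prod_{k} \alpha_{ik}^{\,q_{jk}},
  \qquad
  P\bigl(X_{ij}=1 \mid \boldsymbol\alpha_i\bigr)
  \;=\; (1-s_j)^{\eta_{ij}}\, g_j^{\,1-\eta_{ij}} .
\]
```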
In the social sciences we are often interested in comparing models specified by parametric equality or inequality constraints. For instance, when examining three group means $\{\mu_1, \mu_2, \mu_3\}$ through an analysis of variance (ANOVA), a model may specify that $\mu_1 < \mu_2 < \mu_3$, while another one may state that $\{\mu_1 = \mu_3\} < \mu_2$, and finally a third model may instead suggest that all means are unrestricted. This is a challenging problem, because it involves a combination of nonnested models, as well as nested models having the same dimension. We adopt an objective Bayesian approach, requiring no prior specification from the user, and derive the posterior probability of each model under consideration. Our method is based on the intrinsic prior methodology, suitably modified to accommodate equality and inequality constraints. Focussing on normal ANOVA models, a comparative assessment is carried out through simulation studies. We also present an application to real data collected in a psychological experiment.
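Schematically, the output of such an objective Bayesian comparison is the set of posterior model probabilities below, where m_k denotes the marginal likelihood of model M_k computed under the constraint-adapted intrinsic prior π^I_k; this is the generic form, not the paper's specific derivation.

```latex
\[
  P(M_k \mid y) \;=\;
  \frac{m_k(y)\, P(M_k)}{\sum_{l} m_l(y)\, P(M_l)},
  \qquad
  m_k(y) \;=\; \int_{\Theta_k} f(y \mid \theta_k, M_k)\, \pi^{I}_k(\theta_k)\, d\theta_k .
\]
```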
A paired composition is a response (upon a dependent variable) to the ordered pair <j, k> of stimuli, treatments, etc. The present paper develops an alternative analysis for the paired compositions layout previously treated by Bechtel's [1967] scaling model. The alternative model relaxes the previous one by including row and column scales that provide an expression of bias for each pair of objects. The parameter estimation and hypothesis testing procedures for this model are illustrated by means of a small group analysis, which represents a new approach to pairwise sociometrics and personality assessment.
A variety of distributional assumptions for dissimilarity judgments are considered, with the lognormal distribution being favored for most situations. An implicit equation is discussed for the maximum likelihood estimation of the configuration with or without individual weighting of dimensions. A technique for solving this equation is described and a number of examples offered to indicate its performance in practice. The estimation of a power transformation of dissimilarity is also considered. A number of likelihood ratio hypothesis tests are discussed and a small Monte Carlo experiment described to illustrate the behavior of the test of dimensionality in small samples.
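In generic form, the likelihood-ratio test of dimensionality compares the maximized log-likelihoods of nested configurations; the degrees of freedom shown are simply the usual parameter-count difference, and the small-sample behaviour of this approximation is what the Monte Carlo experiment probes.

```latex
\[
  \Lambda_r \;=\; 2\,\bigl(\ell_{r+1} - \ell_{r}\bigr)
  \;\overset{\text{approx.}}{\sim}\;
  \chi^2_{df},
  \qquad
  df = \dim(\Theta_{r+1}) - \dim(\Theta_{r}).
\]
```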
The recent surge of interest in cognitive assessment has led to the development of cognitive diagnosis models. Central to many such models is a specification of the Q-matrix, which relates items to latent attributes that have natural interpretations. In practice, the Q-matrix is usually constructed subjectively by the test designers. This could lead to misspecification, which could result in lack of fit of the underlying statistical model. To test possible misspecification of the Q-matrix, traditional goodness-of-fit tests, such as the Chi-square test and the likelihood ratio test, may not be applied straightforwardly due to the large number of possible response patterns. To address this problem, this paper proposes a new statistical method to test the goodness of fit of the Q-matrix, by constructing test statistics that measure the consistency between a provisional Q-matrix and the observed data for a general family of cognitive diagnosis models. Limiting distributions of the test statistics are derived under the null hypothesis, which can be used for obtaining the test p-values. Simulation studies as well as a real data example are presented to demonstrate the usefulness of the proposed method.
A method for externally constraining certain distances in multidimensional scaling configurations is introduced and illustrated. The approach defines an objective function which is a linear composite of the loss function of the point configuration X relative to the proximity data P and the loss of X relative to a pseudo-data matrix R. The matrix R is set up such that the side constraints to be imposed on X’s distances are expressed by the relations among R’s numerical elements. One then uses a double-phase procedure with relative penalties on the loss components to generate a constrained solution X. Various possibilities for constructing actual MDS algorithms are conceivable: the major classes are defined by the specification of metric or nonmetric loss for data and/or constraints, and by the various possibilities for partitioning the matrices P and R. Further generalizations are introduced by substituting R by a set of R matrices, R_i, i = 1, …, r, which opens the way for formulating overlapping constraints as, e.g., in patterns that are both row- and column-conditional at the same time.
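Schematically, the composite objective has the form below, where L(X; P) is the ordinary MDS loss on the proximities, L(X; R) is the loss relative to the pseudo-data matrix encoding the side constraints, and the relative penalty κ is raised across the two phases; this is a simplified rendering of the idea, not the paper's exact notation.

```latex
\[
  \sigma(X) \;=\; L(X;\,P) \;+\; \kappa\, L(X;\,R),
  \qquad \kappa \text{ increased across the two phases}.
\]
```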
The properties of nonmetric multidimensional scaling (NMDS) are explored by specifying statistical models, proving statistical consistency, and developing hypothesis testing procedures. Statistical models with errors in the dependent and independent variables are described for quantitative and qualitative data. For these models, statistical consistency often depends crucially upon how error enters the model and how data are collected and summarized (e.g., by means, medians, or rank statistics). A maximum likelihood estimator for NMDS is developed, and its relationship to the standard Shepard-Kruskal estimation method is described. This maximum likelihood framework is used to develop a method for testing the overall fit of the model.
Discusses statistical methods, covering random variables and variates, sample and population, frequency distributions, moments and moment measures, probability and stochastic processes, discrete and continuous probability distributions, return periods and quantiles, probability density functions, parameter estimation, hypothesis testing, confidence intervals, covariance, regression and correlation analysis, time-series analysis.
This chapter covers ways to explore your network data using visual means and basic summary statistics, and how to apply statistical models to validate aspects of the data. Data analysis can generally be divided into two main approaches, exploratory and confirmatory. Exploratory data analysis (EDA) is a pillar of statistics and data mining and we can leverage existing techniques when working with networks. However, we can also use specialized techniques for network data and uncover insights that general-purpose EDA tools, which neglect the network nature of our data, may miss. Confirmatory analysis, on the other hand, grounds the researcher with specific, preexisting hypotheses or theories, and then seeks to understand whether the given data either support or refute the preexisting knowledge. Thus, complementing EDA, we can define statistical models for properties of the network, such as the degree distribution, or for the network structure itself. Fitting and analyzing these models then recapitulates effectively all of statistical inference, including hypothesis testing and Bayesian inference.
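A minimal sketch of the exploratory side of this workflow, using networkx and a built-in example graph as a stand-in for real data; the Erdős–Rényi comparison at the end is a deliberately simple confirmatory-style check, not one of the models developed in the chapter.

```python
import networkx as nx
import numpy as np

# Exploratory summaries for a network (illustrative graph; substitute your data).
G = nx.karate_club_graph()

n, m = G.number_of_nodes(), G.number_of_edges()
degrees = np.array([d for _, d in G.degree()])
summary = {
    "nodes": n,
    "edges": m,
    "mean degree": degrees.mean(),
    "density": nx.density(G),
    "avg clustering": nx.average_clustering(G),
}

# A simple confirmatory-style check: compare the observed degree spread with
# that of an Erdos-Renyi random graph of the same density.
er = nx.gnp_random_graph(n, nx.density(G), seed=1)
er_degrees = np.array([d for _, d in er.degree()])
print(summary)
print(degrees.std(), er_degrees.std())
```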
This chapter elaborates on the calibration and validation procedures for the model. First, we describe our calibration strategy in which a customised optimisation algorithm makes use of a multi-objective function, preventing the loss of indicator-specific error information. Second, we externally validate our model by replicating two well-known statistical patterns: (1) the skewed distribution of budgetary changes and (2) the negative relationship between development and corruption. Third, we internally validate the model by showing that public servants who receive more positive spillovers tend to be less efficient. Fourth, we analyse the statistical behaviour of the model through different tests: validity of synthetic counterfactuals, parameter recovery, overfitting, and time equivalence. Finally, we make a brief reference to the literature on estimating SDG networks.
A quick introduction to the standard model of particle physics is given. The general concepts of elementary particles, interactions and fields are outlined. The experimental side of particle physics is also briefly discussed: how elementary particles are produced with accelerators or from cosmic rays and how to observe them with detectors via the interactions of particles with matter. The various detector technologies leading to particle identification are briefly presented. The way in which the data collected by the sensors are analysed is also presented: the most frequent probability density functions encountered in particle physics are outlined, and it is explained how measurements can be used to estimate a quantity from data, what constitutes the best estimate of that quantity, and how to assess its uncertainty. As measurements can also be used to test a hypothesis based on a particular model, the hypothesis testing procedure is explained.
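As a compact illustration of the hypothesis-testing step, here is a sketch of the classic counting experiment: the observed count and expected background are made-up numbers, and the one-sided Poisson p-value is converted into a significance in the conventional way.

```python
import numpy as np
from scipy.stats import poisson, norm

# Toy counting experiment (illustrative numbers): n_obs events observed,
# b events expected from background alone.  Test the background-only hypothesis.
n_obs, b = 12, 5.0

# p-value: probability of observing n_obs or more events under background only.
p_value = poisson.sf(n_obs - 1, mu=b)

# Conventional conversion of the one-sided p-value into a significance Z.
z_score = norm.isf(p_value)
print(p_value, z_score)
```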
Separation commonly occurs in political science, usually when a binary explanatory variable perfectly predicts a binary outcome. In these situations, methodologists often recommend penalized maximum likelihood or Bayesian estimation. But researchers might struggle to identify an appropriate penalty or prior distribution. Fortunately, I show that researchers can easily test hypotheses about the model coefficients with standard frequentist tools. While the popular Wald test produces misleading (even nonsensical) p-values under separation, I show that likelihood ratio tests and score tests behave in the usual manner. Therefore, researchers can produce meaningful p-values with standard frequentist tools under separation without the use of penalties or prior information.
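A minimal sketch of a likelihood-ratio test on made-up data in which x = 1 perfectly predicts y = 1; the separated coefficient has no finite maximum-likelihood estimate, but the maximized log-likelihood stays bounded, so a numerical optimizer's approximation of it suffices for the LR statistic. This illustrates the general idea only and is not the paper's code.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

# Hypothetical data with (quasi-complete) separation: y = 1 whenever x = 1.
x = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
z = np.array([0.5, -1.2, 0.3, 1.1, -0.4, 0.9, 0.2, -0.7])   # non-separating covariate
y = np.array([0, 1, 0, 0, 1, 1, 1, 1], dtype=float)

def negloglik(beta, X, y):
    """Negative log-likelihood of a logistic regression, written stably."""
    eta = X @ beta
    return np.sum(np.logaddexp(0.0, eta)) - y @ eta

X_full = np.column_stack([np.ones_like(x), x, z])     # intercept, x, z
X_null = np.column_stack([np.ones_like(x), z])        # H0: coefficient on x is 0

fit_full = minimize(negloglik, np.zeros(X_full.shape[1]), args=(X_full, y), method="BFGS")
fit_null = minimize(negloglik, np.zeros(X_null.shape[1]), args=(X_null, y), method="BFGS")

lr_stat = 2.0 * (fit_null.fun - fit_full.fun)          # 2 * (llf_full - llf_null)
p_value = chi2.sf(lr_stat, df=1)
print(lr_stat, p_value)
```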
For this book, we assume you’ve had an introductory statistics or experimental design class already! This chapter is a mini refresher of some critical concepts we’ll be using and lets you check that you understand them correctly. The topics include understanding predictor and response variables; the common probability distributions that biologists encounter in their data; and the common techniques, particularly ordinary least squares (OLS) and maximum likelihood (ML), for fitting models to data and estimating effects, including their uncertainty. You should be familiar with confidence intervals and understand what hypothesis tests and P-values do and don’t mean. You should recognize that we use data to decide, but these decisions can be wrong, so you need to understand the risk of missing important effects and the risk of falsely claiming an effect. Decisions about what constitutes an “important” effect are central.
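For readers checking their footing on the OLS side, here is a minimal refresher-style sketch on simulated data (the variables and numbers are invented): a fit, the coefficient estimates, their 95% confidence intervals, and the p-values for the usual tests of zero coefficients.

```python
import numpy as np
import statsmodels.api as sm

# Simulated example (not from the book): one predictor, one response.
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=50)

X = sm.add_constant(x)                 # design matrix with intercept
fit = sm.OLS(y, X).fit()               # ordinary least squares

print(fit.params)                      # point estimates of intercept and slope
print(fit.conf_int(alpha=0.05))        # 95% confidence intervals
print(fit.pvalues)                     # tests of zero coefficients
```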
Edited by Alik Ismail-Zadeh, Karlsruhe Institute of Technology, Germany; Fabio Castelli, Università degli Studi, Florence; Dylan Jones, University of Toronto; Sabrina Sanchez, Max Planck Institute for Solar System Research, Germany
Abstract: In this chapter, I discuss an alternative perspective on interpreting the results of joint and constrained inversions of geophysical data. Typically such inversions are performed based on inductive reasoning (i.e. we fit a limited set of observations and conclude that the resulting model is representative of the Earth). While this has seen many successes, it is less useful when, for example, the specified relationship between different physical parameters is violated in parts of the inversion domain. I argue that in these cases a hypothesis testing perspective can help to learn more about the properties of the Earth. I present joint and constrained inversion examples that show how we can use violations of the assumptions specified in the inversion to study the subsurface. In particular I focus on the combination of gravity and magnetic data with seismic constraints in the western United States. There I see that high velocity structures in the crust are associated with relatively low density anomalies, a possible indication of the presence of melt in a strong rock matrix. The concepts, however, can be applied to other types of data and other regions and offer an extra dimension of analysis to interpret the results of geophysical inversion algorithms.
Finding one’s niche in any scientific domain is often challenging, but there are certain tips and steps that can foster a productive research program. In this chapter, we use terror management theory (TMT) as an exemplar of what designing a successful line of research entails. To this end, we present an overview of the development and execution of our research program, including testing of original hypotheses, direct and conceptual replications, identification of moderating and mediating variables, and how efforts to understand failures to replicate mortality salience effects led to important conceptual refinements of the theory. Our hope is that recounting the history of terror management theory and research will be useful for younger scholars in their own research pursuits in the social and behavioral sciences.
Writing the paper is one of the most challenging aspects of a project, and learning to write the report well is one of the most important skills to master for the success of the project and for sustaining a scholarly career. This chapter discusses challenges in writing and ways to overcome these challenges in the process of writing papers in the social and behavioral sciences. Two main principles emphasized are that writing is (a) a skill and (b) a form of communication. Skills are developed through instruction, modeling, and practice. In terms of communication, the research report can be conceived as a narrative that tells a story. Sections of the chapter focus on identifying common barriers to writing and ways to overcome them, developing a coherent and appropriate storyline, understanding the essential elements of a research paper, and valuing and incorporating feedback.
This chapter discusses the key elements involved when building a study. Planning empirical studies presupposes a decision about whether the major goal of the study is confirmatory (i.e., tests of hypotheses) or exploratory in nature (i.e., development of hypotheses or estimation of effects). Focusing on confirmatory studies, we discuss problems involved in obtaining an appropriate sample, controlling internal and external validity when designing the study, and selecting statistical hypotheses that mirror the substantive hypotheses of interest. Building a study additionally involves decisions about the to-be-employed statistical test strategy, the sample size required by this strategy to render the study informative, and the most efficient way to achieve this so that study costs are minimized without compromising the validity of inferences. Finally, we point to the many advantages of study preregistration before data collection begins.
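As one concrete example of the sample-size step, the sketch below solves for the per-group n of a two-sample t test using statsmodels; the effect size, alpha, and power targets are placeholder planning assumptions, not recommendations from the chapter.

```python
from statsmodels.stats.power import TTestIndPower

# Hypothetical planning step: sample size per group for a two-sample t test,
# assuming a standardized effect size of 0.5, alpha = .05, and 80% power.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80,
                                    alternative="two-sided")
print(round(n_per_group))   # roughly 64 per group under these assumptions
```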