In this chapter we introduce developing and interpreting multilevel models. We first define multilevel models and explore how this approach is an improvement on disaggregation and aggregation of data across multiple levels. We then work through four different multilevel models. We provide examples of what kinds of questions can be answered by each model and how to interpret the statistical output. We then explore some additional issues in fitting multilevel models in Stata and consider additional applications of multilevel models.
Logistic regression is not limited to the modeling of binary dependent variables. It may be extended to the modeling of dependent variables with three or more categories that are either ordered or unordered. In this chapter we discuss logistic regression of a multi-categorical dependent variable with ordered categories. An ordinal variable is one that is multi-categorical and whose categories are ordered. For example, one’s quality of life might be classified as “excellent,” “very good,” “good,” “fair,” or “poor.” Although these categories might be coded consecutively, 1, 2, 3, 4, and so forth, the dependent variable is not continuous. The responses may be coded from 1 = “poor” to 5 = “excellent,” but we do not know that the distances between each contiguous pair of responses are the same. Even though the responses might be coded as 1 to 5, we should not use an OLS regression model to predict a dependent variable such as the person’s categorical response to a quality of life question. We should use a statistical model that does not assume that the distances between any pair of categories are the same. This chapter focuses on ordinal logistic regression.
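To make the point about unequal distances concrete, the following is a minimal sketch of the cumulative-logit (proportional odds) form commonly used for ordinal logistic regression. The cutpoint values, the linear predictor values, and the function names are hypothetical, chosen only for illustration and not taken from the chapter.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_probs(eta, cutpoints):
    """Category probabilities under a cumulative-logit model.

    eta is the linear predictor (x * beta); cutpoints are increasing
    thresholds, with P(Y <= j) = logistic(cutpoint_j - eta).
    """
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j-1)
        previous = value
    return probs


# Hypothetical, unequally spaced cutpoints for five categories
# (1 = "poor" ... 5 = "excellent"). The model estimates these thresholds
# instead of assuming the categories are one unit apart.
cutpoints = [-2.0, -0.5, 0.5, 2.0]
probs_low = ordinal_probs(0.0, cutpoints)
probs_high = ordinal_probs(2.0, cutpoints)
```

A larger linear predictor shifts probability toward the higher categories, while the spacing between categories is governed by the estimated cutpoints and not by the numeric codes 1 to 5.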
This review of mathematical and statistical concepts covers foundational material such as probability densities, Monte Carlo methods and Bayes’ rule. We provide concept reviews that supplement the previous chapters. We aim first to generate an intuitive understanding of statistical concepts and then, if the student is interested, to dive deeper into the mathematical derivations. For example, principal component analysis can be taught by deriving the equations and making the link with eigenvalue decomposition of the covariance matrix. Instead, we start from simple two- and three-dimensional datasets and appeal to the student’s insight into the geometrical aspect: the study of an ellipse, and how we can transform it to a circle. This geometric aspect is explained without equations, using plots and figures that appeal to geometric intuition. In general, it is our experience that students in the geosciences retain much more practical knowledge when presented with material starting from case studies and intuitive reasoning.
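The ellipse-to-circle idea can be sketched numerically. The example below is an illustration under stated assumptions, not the chapter's own material: it generates a hypothetical correlated two-dimensional point cloud, finds the principal axes of its sample covariance in closed form, and rescales each axis by the square root of its eigenvalue. All names and the synthetic data are assumptions.

```python
import math
import random


def covariance_2d(points):
    """Return the mean and the sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def whiten_2d(points):
    """Rotate onto the principal axes, then scale each axis to unit variance."""
    (mx, my), (a, b, c) = covariance_2d(points)
    # Orientation of the major axis of the covariance ellipse.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    # Eigenvalues of the symmetric 2x2 covariance matrix [[a, b], [b, c]].
    half_gap = math.hypot((a - c) / 2.0, b)
    lam1 = (a + c) / 2.0 + half_gap
    lam2 = (a + c) / 2.0 - half_gap
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    whitened = []
    for x, y in points:
        dx, dy = x - mx, y - my
        pc1 = cos_t * dx + sin_t * dy    # coordinate along the major axis
        pc2 = -sin_t * dx + cos_t * dy   # coordinate along the minor axis
        whitened.append((pc1 / math.sqrt(lam1), pc2 / math.sqrt(lam2)))
    return whitened


# Hypothetical elongated point cloud (an "ellipse" of data).
random.seed(0)
cloud = []
for _ in range(2000):
    u, v = random.gauss(0, 1), random.gauss(0, 1)
    cloud.append((3.0 * u + 1.0 * v, 1.0 * u + 0.5 * v))

_, (sxx, sxy, syy) = covariance_2d(whiten_2d(cloud))
```

After the transformation the sample covariance is the identity matrix up to rounding, which is the numerical counterpart of turning the ellipse into a circle.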
Dental occlusion is the way in which teeth fit into the mouth, side by side in the same jaw and between their chewing surfaces when the jaws are closed. It shows much variation in living people, many of whom have their occlusion adjusted by dentists. In archaeological and fossil remains there is much less variation and there has been considerable research and discussion aimed at finding the explanation. This chapter provides a concise introduction to the clinical background and outlines methods for recording occlusal variation. It follows this with a critical review of the evidence for a cause of the high prevalence of occlusal anomalies today.
Extreme Value Statistics focuses on predicting extremes larger than those observed in datasets. An important area of application is natural hazards. In the chapter, we use diamond sizes and volcano eruptions as two specific examples. We start this chapter by focusing on graphical techniques, in particular quantile–quantile plots to analyze extremal data. We show why the exponential quantile plot is an essential tool in extremal analysis. One challenge in extremal analysis is to select a suitable probability distribution model to estimate extremes. Instead of making derivations of theoretical models, we illustrate how these models emerge from intuitive Monte Carlo experiments. The key statistical parameter in extreme value statistical models is the extreme value index. We link this index to quantile plots such as the Pareto quantile plot. We conclude with practical examples of predicting rare large diamonds, as well as the return period of large volcano eruptions from a historical dataset.
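The link between the extreme value index and the Pareto quantile plot can be sketched with a small Monte Carlo experiment. This is an illustrative sketch under stated assumptions, not the chapter's own code: the data are simulated from a Pareto distribution with a known index, the index is estimated with the Hill estimator (a standard estimator that corresponds to the slope of the upper end of the Pareto quantile plot), and the sample size, the choice of k, and all function names are hypothetical.

```python
import math
import random


def pareto_quantile_plot(sample):
    """Coordinates (-log(1 - i/(n+1)), log x_(i)) for the sorted sample."""
    xs = sorted(sample)
    n = len(xs)
    return [(-math.log(1.0 - (i + 1) / (n + 1)), math.log(x))
            for i, x in enumerate(xs)]


def hill_estimate(sample, k):
    """Hill estimator of the extreme value index from the k largest values."""
    xs = sorted(sample)
    n = len(xs)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < len(sample)")
    log_threshold = math.log(xs[n - k - 1])  # the (n-k)-th order statistic
    return sum(math.log(x) - log_threshold for x in xs[n - k:]) / k


# Simulated heavy-tailed data with a known extreme value index of 0.5
# (Pareto shape parameter 2); these values are assumptions for illustration.
random.seed(1)
true_index = 0.5
data = [random.paretovariate(1.0 / true_index) for _ in range(5000)]

gamma_hat = hill_estimate(data, k=500)
coords = pareto_quantile_plot(data)
```

For strict Pareto data the plot is close to a straight line whose slope is the extreme value index, so `gamma_hat` should land near `true_index`; with real data the choice of k matters and is usually examined over a range of values.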
This first chapter introduces our unique approach to teaching statistics. We note that while we review the statistical formulas for each method, we focus on the practical component of statistical analysis. We teach the readers how to apply and interpret the statistical methods and results. We then briefly describe the book’s content, which includes a concise explanation of the statistical techniques covered in each chapter. We end the chapter with suggestions on using the book to gain maximum benefit.
This chapter covers how to develop and interpret statistical tables and cross-tabulations. We begin by exploring the basic structure and components of tables starting with univariate tables. We then describe how to develop and interpret bivariate tables and introduce multivariate tables. Finally, we conclude the chapter with general recommendations about table design and how best to communicate statistical information in table form.
This chapter reviews several methods for addressing the statistical problem of missing data. We first explain how missing data can affect different components of the study design and the statistical analyses in such a way that the validity of the findings may become questionable. We next describe several methods to address the missing data problem and show why some may be problematic. We explain why multiple imputation (MI) and maximum likelihood (ML) are the preferred methods for addressing missing data issues. We then present an example using Stata, focusing on one of the preferred methods, multiple imputation. Lastly, within the context of an analysis of adolescent pregnancy, we use several methods to handle missing data and show how the analysis results may differ depending on which missing data method is used.
Heavily worn teeth are one of the most prominent features of archaeological and fossil hominid dentitions. This chapter describes in detail the different patterns of wear and distinguishes between wear resulting from deliberate modification and wear resulting from contact with food and other residues in the mouth. It shows how different aspects of this wear can be measured and described, on both macroscopic and microscopic scales. In addition, it explores the possible mechanisms of wear and critically reviews the evidence that wear pattern provides for diet and non-dietary uses of teeth in the past.
The most common dental diseases in people today are dental caries, or decay, and periodontal disease. Evidence for them is sparse in fossil hominids, although more cases have been found in Upper Palaeolithic contexts. They become more common in Neolithic and later contexts, but only reached modern levels in post-industrial societies. Both conditions result from the presence of dental plaque on the teeth, so this chapter starts with a concise introduction to plaque biology. The deeper layers of a plaque accumulation become mineralised to form the deposits known as dental calculus or tartar. This has become a focus of recent anthropological research, particularly in relation to past diet. The chapter goes on to summarise recent clinical evidence for the way in which the lesions of dental caries and periodontal disease develop, and describes their pattern of occurrence in living people. This is contrasted with the archaeological pattern. The effect of diet, particularly the carbohydrate component, is discussed. Ancient jaws from older individuals show the combined effects of bone loss due to periodontal disease, loss due to infections which follow exposure of the pulp chamber by caries or fracturing, and the body’s compensation for tooth wear by remodelling of the bone in the jaws. The chapter explores the ways in which these different factors can be disentangled.
Geostatistics provides tools for spatio-temporal data analysis. The subsurface application we cover in this book is sustainable farming in Denmark. Readers will learn about geophysical techniques for inferring the redox conditions of the subsurface. A second application happens at the surface of the Earth: glaciers melting in Antarctica. We introduce a significant ongoing effort in radar imaging mapping of the Thwaites glacier in Antarctica. Both cases call for building spatial models from incomplete data: spatial interpolation. We cover geostatistical methods for capturing spatial variability with variograms and illustrate why variograms are essential to spatial interpolation by kriging. We introduce conditional simulation as a method for generating many interpolated maps that reproduce realistic variation. We show how these maps represent spatial uncertainty, and thereby affect prediction, such as predicting redox conditions in Danish agricultural areas. Finally, we introduce ways of interpolating spatially using training images. We show how using an existing training image, the exposed Arctic topography, can help us interpolate Antarctica.
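As a small illustration of how a variogram captures spatial variability, the sketch below computes the classical empirical semivariogram, half the mean squared difference between pairs of values separated by a given lag, for a regularly spaced one-dimensional transect. The transect values and the function name are hypothetical; real applications work with irregular two- or three-dimensional data and binned lag distances.

```python
def empirical_semivariogram(values, max_lag):
    """Empirical semivariogram for equally spaced 1-D samples.

    For each integer lag h, gamma(h) is the sum of squared differences
    between values h steps apart, divided by twice the number of pairs.
    """
    gammas = {}
    for lag in range(1, max_lag + 1):
        squared_diffs = [(values[i + lag] - values[i]) ** 2
                         for i in range(len(values) - lag)]
        if squared_diffs:  # skip lags with no available pairs
            gammas[lag] = sum(squared_diffs) / (2.0 * len(squared_diffs))
    return gammas


# Hypothetical transect with a steady linear trend.
transect = [0.0, 1.0, 2.0, 3.0]
gamma = empirical_semivariogram(transect, max_lag=2)
# gamma == {1: 0.5, 2: 2.0}
```

The semivariogram grows with lag here because values further apart differ more; this dependence of dissimilarity on separation is what kriging uses to weight nearby observations.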
Fully revised and updated, this third edition includes three new chapters on neural networks and deep learning including generative AI, causality, and the social, ethical and regulatory impacts of artificial intelligence. All parts have been updated with the methods that have been proven to work. The book's novel agent design space provides a coherent framework for learning, reasoning and decision making. Numerous realistic applications and examples facilitate student understanding. Every concept or algorithm is presented in pseudocode and open source AIPython code, enabling students to experiment with and build on the implementations. Five larger case studies are developed throughout the book and connect the design approaches to the applications. Each chapter now has a social impact section, enabling students to understand the impact of the various techniques as they learn them. An invaluable teaching package for undergraduate and graduate AI courses, this comprehensive textbook is accompanied by lecture slides, solutions, and code.