In this section, we will construct a gauge-invariant operator, the Wilson loop, whose vacuum expectation value (VEV for short) can diagnose whether or not a gauge theory exhibits confinement. A theory is confining if all finite-energy states are invariant under a global gauge transformation. U(1) gauge theory (quantum electrodynamics) is not confining, because there are finite-energy states (such as the state of a single electron) that have nonzero electric charge, and hence change by a phase under a global gauge transformation.
Confinement is a nonperturbative phenomenon; it cannot be seen at any finite order in the kind of weak-coupling perturbation theory that we have been doing. (This is why we had no trouble calculating quark and gluon scattering amplitudes.) In this section, we will introduce lattice gauge theory, in which spacetime is replaced by a discrete set of points; the inverse lattice spacing 1/a then acts as an ultraviolet cutoff (see section 29). This cutoff theory can be analyzed at strong coupling, and, as we will see, in this regime the VEV of the Wilson loop is indicative of confinement. The outstanding question is whether this phenomenon persists as we simultaneously lower the coupling and increase the ultraviolet cutoff (with the relationship between the two governed by the beta function), or whether we encounter a phase transition, signaled by a sudden change in the behavior of the Wilson loop VEV.
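For orientation, the standard continuum form of the Wilson loop and the two limiting behaviors of its VEV can be written as follows. This is a sketch in one common convention, not necessarily the notation of the chapter itself: the symbols C (a closed spacetime curve), g (the gauge coupling), 𝒫 (path ordering), τ (the string tension), and c (a constant) are assumptions introduced here, and factors of g and normalizations differ between references.

```latex
% Wilson loop for a closed curve C (path-ordered exponential of the gauge field)
W_C = \operatorname{Tr}\,\mathcal{P}\exp\!\left( i g \oint_C dx^\mu A_\mu(x) \right)

% Area law (signals confinement): A_C is the minimal area spanned by C
\langle 0 | W_C | 0 \rangle \sim \exp\!\left( -\tau A_C \right)

% Perimeter law (no confinement): P_C is the length of C
\langle 0 | W_C | 0 \rangle \sim \exp\!\left( -c\, P_C \right)
```

For a rectangular loop of spatial extent R and temporal extent T, the area law corresponds to a potential energy between static sources that grows linearly, V(R) ≈ τR, which is the usual sense in which the Wilson loop VEV diagnoses confinement.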
Consider an educational study with data from students in many schools, predicting in each school the students' grades y on a standardized test given their scores on a pre-test x and other information. A separate regression model can be fit within each school, and the parameters from these schools can themselves be modeled as depending on school characteristics (such as the socioeconomic status of the school's neighborhood, whether the school is public or private, and so on). The student-level regression and the school-level regression here are the two levels of a multilevel model.
In this example, a multilevel model can be expressed in (at least) three equivalent ways as a student-level regression:
A model in which the coefficients vary by school (thus, instead of a model such as y = α + βx + error, we have y = α_j + β_j x + error, where the subscripts j index schools),
A model with more than one variance component (student-level and school-level variation),
A regression with many predictors, including an indicator variable for each school in the data.
More generally, we consider a multilevel model to be a regression (a linear or generalized linear model) in which the parameters—the regression coefficients—are given a probability model. This second-level model has parameters of its own—the hyperparameters of the model—which are also estimated from data.
The two key parts of a multilevel model are varying coefficients, and a model for those varying coefficients (which can itself include group-level predictors).
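The two levels described above can be illustrated with a small simulation. The sketch below is only a two-step approximation (separate least-squares fits per school, followed by a regression of the estimated intercepts on a school-level predictor), not a full multilevel fit with partial pooling. All data are synthetic, and the names `u`, `gamma0`, `gamma1` and the numerical values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_schools, n_students = 30, 200

# Hypothetical school-level predictor (e.g. a standardized neighborhood index).
u = rng.normal(size=n_schools)

# Assumed hyperparameters of the school-level model for the intercepts.
gamma0, gamma1 = 50.0, 5.0

# School-level model: coefficients vary by school.
alpha = gamma0 + gamma1 * u + rng.normal(scale=1.0, size=n_schools)
beta = 0.8 + rng.normal(scale=0.1, size=n_schools)

alpha_hat = np.empty(n_schools)
beta_hat = np.empty(n_schools)

for j in range(n_schools):
    # Student-level model within school j: y = alpha_j + beta_j * x + error.
    x = rng.normal(size=n_students)
    y = alpha[j] + beta[j] * x + rng.normal(scale=2.0, size=n_students)

    # Step 1: separate least-squares regression within each school.
    X = np.column_stack([np.ones(n_students), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    alpha_hat[j], beta_hat[j] = coef

# Step 2: regress the estimated intercepts on the school-level predictor.
U = np.column_stack([np.ones(n_schools), u])
gamma_hat, *_ = np.linalg.lstsq(U, alpha_hat, rcond=None)

print("estimated (gamma0, gamma1):", gamma_hat)
print("mean estimated slope:", beta_hat.mean())
```

A full multilevel model would estimate both levels jointly, so that schools with few students borrow strength from the others; the two-step version is shown here only because it makes the two levels explicit.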
Statistical graphics are sometimes summarized as “exploratory data analysis” or “presentation” or “data display.” But these only capture part of the story. Graphs are a way to communicate graphical and spatial information to ourselves and others. Long before worrying about how to convince others, you first have to understand what's happening yourself.
Why to graph
Going back through the dozens of examples in this book, what are our motivations for graphing data and fitted models? Ultimately, the goal is communication (to self or others). More immediately, graphs are comparisons (to zero, to other graphs, to horizontal lines, and so forth). We “read” a graph both by pulling out the expected (for example, the slope of a fitted regression line, the comparisons of a series of confidence intervals to zero and each other) and the unexpected.
In our experience, the unexpected is usually not an “outlier” or aberrant point but rather a systematic pattern in some part of the data. For example, consider the binned residual plots in Section 5.6 for the well-switching models. There was an unexpectedly low rate of switching from wells that were just barely over the dangerous level for arsenic, possibly suggesting that people were moderating their decisions when in this ambiguous zone, or that there was other information not included in the model that could explain these decisions.
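A binned residual plot of the kind mentioned above can be computed along the following lines. This is a minimal sketch, assuming equal-count bins ordered by fitted value and approximate ±2 standard-error bounds; the function name `binned_residuals` and the synthetic data are hypothetical and are not taken from the well-switching example.

```python
import numpy as np

def binned_residuals(fitted, residuals, n_bins=10):
    """Average residuals within equal-count bins ordered by fitted value.

    Returns the bin means of the fitted values, the bin means of the
    residuals, and approximate 2-standard-error bounds for each bin.
    """
    fitted = np.asarray(fitted, dtype=float)
    residuals = np.asarray(residuals, dtype=float)

    order = np.argsort(fitted)             # indices sorted by fitted value
    bins = np.array_split(order, n_bins)   # contiguous, near-equal-size bins

    mean_fitted = np.array([fitted[b].mean() for b in bins])
    mean_resid = np.array([residuals[b].mean() for b in bins])
    two_se = np.array(
        [2 * residuals[b].std(ddof=1) / np.sqrt(len(b)) for b in bins]
    )
    return mean_fitted, mean_resid, two_se

# Synthetic check: binary outcomes drawn from the "fitted" probabilities,
# so the binned residuals should scatter around zero.
rng = np.random.default_rng(2)
p = rng.uniform(0.05, 0.95, size=5000)
y = rng.binomial(1, p)

mean_fitted, mean_resid, two_se = binned_residuals(p, y - p, n_bins=10)
for f, r, s in zip(mean_fitted, mean_resid, two_se):
    print(f"fitted={f:.2f}  mean residual={r:+.3f}  bound=±{s:.3f}")
```

A systematic pattern of the sort described in the text would show up as several adjacent bins falling outside the bounds on the same side, rather than as a single stray point.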
Analysis of variance (ANOVA) refers to a specific set of methods for data analysis and to a way of summarizing multilevel models:
As a tool for data analysis, ANOVA is typically used to learn the relative importance of different sources of variation in a dataset. For example, Figure 13.8 displays success rates of pilots at a flight simulator under five different treatments at eight different airports. How much of the variation in the data is explained by treatments, how much by airports, and how much remains after these factors have been included in a linear model?
If a multilevel model has already been fit, it can be summarized by the variation in each of its batches of coefficients. For example, in the radon modeling in Chapter 12, how much variation in radon levels is explained by floor of measurement and how much by geographical variation? Or, in the analysis of public opinion by state in Section 14.1, how much of the variation is explained by demographic factors (sex, age, ethnicity, education), and how much by states and regions?
These “analysis of variance” questions can be of interest even for models that are primarily intended for prediction, or for estimating particular regression coefficients.
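The first of these questions can be made concrete with the classical sum-of-squares decomposition for a balanced two-way layout. The sketch below uses synthetic data with five treatments and eight airports, one observation per cell; it does not reproduce the data in Figure 13.8, and the effect sizes are assumptions chosen for illustration. It shows the classical decomposition only, not the multilevel summary in terms of the variation of each batch of coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)

n_treat, n_airport = 5, 8

# Synthetic additive effects (assumed scales, for illustration only).
treat_eff = rng.normal(scale=0.5, size=n_treat)
airport_eff = rng.normal(scale=2.0, size=n_airport)

# One observation per (treatment, airport) cell.
y = (10.0
     + treat_eff[:, None]
     + airport_eff[None, :]
     + rng.normal(scale=1.0, size=(n_treat, n_airport)))

grand = y.mean()
treat_mean = y.mean(axis=1)     # mean for each treatment (rows)
airport_mean = y.mean(axis=0)   # mean for each airport (columns)

# Classical sums of squares for a balanced additive two-way layout.
ss_total = ((y - grand) ** 2).sum()
ss_treat = n_airport * ((treat_mean - grand) ** 2).sum()
ss_airport = n_treat * ((airport_mean - grand) ** 2).sum()
resid = y - treat_mean[:, None] - airport_mean[None, :] + grand
ss_resid = (resid ** 2).sum()

for name, ss, df in [
    ("treatment", ss_treat, n_treat - 1),
    ("airport", ss_airport, n_airport - 1),
    ("residual", ss_resid, (n_treat - 1) * (n_airport - 1)),
]:
    print(f"{name:10s} df={df:2d}  SS={ss:8.2f}  share={ss / ss_total:.2f}")
```

In a balanced design the three sums of squares add up exactly to the total, so the printed shares give one answer to "how much of the variation is explained by treatments, how much by airports, and how much remains."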
The sections of this chapter address the different roles of ANOVA in multilevel data analysis. We begin in Section 22.1 with a brief review of the goals and methods of classical analysis of variance, outlining how they fit into our general multilevel modeling approach.