1. Introduction
Beef processors procure fed cattle through the negotiated cash market or alternative marketing arrangements (AMAs). In the negotiated cash market, cattle quality is only partially observable, with buyers relying on signals such as breed, days on feed, and seller reputation to estimate carcass outcomes. In contrast, AMAs reflect cattle quality through premiums and discounts paid for carcasses. A feedlot often has more information about cattle quality than packers. This informational advantage can result in sorting higher-quality cattle into AMAs and lower-quality cattle into cash markets, thereby creating a disparity in the distribution of quality characteristics (Whitley, 2002; Liu et al., 2009; Hildebrand and Chung, 2023). A feedlot can use informational advantages to make non-random decisions regarding where to sell cattle. Unobservable factors, such as carcass quality, influence the likelihood of selling lots in a specific market. For instance, a feedlot may channel cattle perceived as lower quality to the cash market while directing higher-quality cattle to AMAs. Peel et al. (2020) argued that AMAs do not alter the overall supply or demand for cattle. However, the profit-maximizing behavior of feedlots, which allocate higher-quality cattle to AMAs and lower-quality cattle to cash markets, can shift the distribution of cattle across these markets. Cattle producer revenues are impacted when the quality of cattle sold does not reflect the broader quality distribution in the market (Akerlof, 1970; Koontz, 2015; Hildebrand and Chung, 2023).
Packers respond to carcass quality uncertainty by offering lower prices under less informative pricing systems, such as live- or dressed-weight markets (Fausti and Feuz, 1995). Fausti et al. (2013b) presented a micro-foundational model of seller behavior under asymmetric information. Their framework predicts that sellers sort cattle across marketing channels based on carcass quality and risk exposure expectations; in particular, their Proposition 3 and Corollary 1 formally demonstrate that increased quality uncertainty leads to more intense sorting and reduced use of grid-based pricing. Although cattle quality is a crucial component in determining price, the distribution of cattle quality within cattle procurement has received limited empirical attention (Koontz, 2010). Earlier livestock market studies modeled information asymmetry and adverse selection as a sample selection problem using Heckman’s and generalized Roy models (Wimmer and Chezum, 2003, 2006; Hildebrand and Chung, 2023). These studies analyzed adverse selection as a type of selectivity bias by examining the coefficient on the inverse Mills ratio (IMR) variable in the second stage of the Heckman and generalized Roy models. However, some authors have raised concerns about testing the statistical significance of the IMR coefficient as an indicator of selectivity bias, a common approach in earlier livestock market studies (Leung and Yu, 1996; Puhani, 2000; Guo and Fraser, 2014; Wooldridge, 2010; Certo et al., 2016). Wooldridge (2010) highlighted that omitted variables can introduce bias into the effects of IMR on the outcome variable.
The functional form of the outcome equation could also influence the reliability of the IMR coefficient as a measure of selection bias. Furthermore, the significance of the IMR coefficient does not necessarily indicate selection bias, as the heterogeneity in error correlations and the validity of exclusion restrictions (imposed in two-step estimation procedures) could also influence the significance of the IMR coefficient (Certo et al., 2016). Thus, while the coefficient on the IMR variable may be an indicator of selectivity bias in cattle procurement markets, it could also be confounded by other unobserved factors.
Another potential issue affecting the IMR and its role in identifying selection bias is multicollinearity. The approximate linearity of the IMR over a wide range of values can introduce collinearity issues because the IMR is often a function of many of the same covariates that appear in the outcome equation (Puhani, 2000). The multicollinearity problem can reduce the power of testing the significance of the IMR coefficient (Leung and Yu, 1996).
The objective of this study is to test for sample selection in the cattle procurement market by examining the empirical distribution of cattle quality in the AMA and cash markets using data from a single feedlot in Oklahoma. We extend earlier studies, in particular, Wimmer and Chezum (2003), Wimmer and Chezum (2006), Fausti et al. (2013b), and Hildebrand and Chung (2023), by directly testing distributions of cattle quality in the AMA and cash markets without relying on a significance test of the IMR coefficient. We use a heteroskedastic probit regression to generate the IMR, assuming that it is a proxy for cattle quality. Self-selection of lots into the cash market implies that unobservable quality factors influencing the probability of selection are captured in the IMR variable (Vella, 1998; Hildebrand and Chung, 2023). To demonstrate that the IMR is a proxy of unobserved cattle quality, we regress it on observed cattle quality variables to examine the extent to which these variables explain variation in the IMR.
To address concerns about the validity of the IMR as a proxy for cattle quality, we apply a residual regression approach and incorporate additional quality metrics that capture carcass characteristics. Following De Paula et al. (2013) and Jones et al. (2021), we use premiums and discounts from the AMA transactions as proxies for carcass quality and extend the analysis by predicting carcass quality for all transactions. These adjustments enable us to construct two refined measures of cattle quality that account for both observed and unobserved attributes. The second contribution to the literature involves the use of nonparametric procedures to detect differences in cattle quality between the AMA and cash markets. Nonparametric tests compare the distributions of cash and AMA quality variables without making strong distributional assumptions (Lachin, 2020; Faizi and Alvi, 2023). The study uses a rare dataset from a single feedlot that includes detailed cattle transaction information such as prices, quantities, and quality data. The feedlot’s AMA transactions are post-harvest prices directly linked to known carcass attributes, typically only known between the feedlot and the packer, and are not publicly available. This type of data is seldom available in studies of the cattle procurement market.
2. Literature review
Quality signals influence cattle pricing, and failure to measure quality accurately in a market results in lower prices for higher-quality producers (Whitley, 2002; Fausti et al., 2014). Whitley (2002) reported that price declines in negotiated cash markets result from producers removing higher-quality cattle from the market. Packers cannot directly observe cattle quality in the negotiated cash market, which potentially leads to persistent information asymmetry because feedlots are not required to disclose complete information about the quality of their cattle. In a market where sellers control information on quality, buyers price products only based on the average quality of products observed in the market (Wilson, 1980). A feedlot’s informational advantages include knowledge of calf genetics, origin, growth rate, and medical history. Commanding an informational advantage implies that a feedlot can decide in which market to sell lots based on perceived animal quality. Specifically, a feedlot can strategically sort pens into lower- and higher-quality lots and send the higher-quality lots to AMA markets to capture premiums, while sending the lower-quality lots to the cash market to avoid discounts. Feedlots channeling lower-quality cattle into the cash market while reserving higher-quality cattle for the AMA market can impact the underlying distribution of cattle quality in both markets (Whitley, 2002).
Fausti and Feuz (1995) developed a theoretical model that demonstrated packers respond to carcass quality uncertainty by offering lower prices (that is, they charge a risk premium) under less informative pricing mechanisms. Their work, grounded in the economics of uncertainty, illustrated how buyers adapt strategically to mitigate quality risk. Building on that study, Fausti et al. (2013b) developed a micro-foundational model of seller behavior under asymmetric information. Their framework showed that sellers sort cattle across marketing channels based on their private expectations of carcass quality and risk preferences. The model predicts that sellers will avoid grid-based pricing when quality uncertainty or perceived downside risk is high. Their empirical analysis supports this conjecture by showing that steers, which are typically lower-risk animals, are more often marketed on grids, whereas sellers disproportionately sold heifers on a live-weight basis.
Other previous studies on information asymmetry and adverse selection in agricultural markets focused on how these factors influence prices. Wimmer and Chezum (2003) examined the impact of adverse selection on the thoroughbred horse industry using Heckman’s selection model. Like many others, their approach used self-selection bias arguments to proxy adverse selection by drawing conclusions based on the IMR coefficient. In a subsequent study, Wimmer and Chezum (2006) used sample selection procedures to detect adverse selection, focusing on markets where sellers possessed informational advantages over buyers and influenced the distribution of horse quality. Wimmer and Chezum found a statistically significant negative coefficient on the IMR. They concluded that breeders focused on genetics related to racing traits tended to produce higher-quality horses, while sellers selectively marketed lower-quality ones. Hildebrand and Chung (2023) modeled selectivity bias in a cattle procurement market using Heckman’s and generalized Roy models to test for adverse selection. They examined the impact of cattle producers’ selectivity bias between the cash and AMA markets on cattle prices. The authors identified adverse selection in the cattle procurement market by examining the sign and significance of the IMR coefficient. Feedlots’ self-selection into the cash market negatively affected prices and decreased producer revenue. None of these earlier studies empirically tested cattle quality differences stemming from informational advantages and self-selection into markets.
Earlier studies on adverse selection in livestock markets predominantly used Heckman’s or generalized Roy models, which rely on the significance of the IMR variable to account for sample selection bias. It is difficult to statistically test the IMR coefficient for sample selection when the model is mis-specified and/or the exclusion restrictions are improperly imposed (Wooldridge, 2010). Certo et al. (2016) found that a significant IMR coefficient does not always indicate sample selection bias. In contrast, an insignificant IMR coefficient can still coincide with the presence of bias if the model is mis-specified. This issue is particularly evident when sample sizes are small or the exclusion restrictions are weak. In these cases, selection models can struggle to produce significant IMR coefficients when selection bias is present because the tests are underpowered. Lennox et al. (2012) suggest that conclusions regarding selection bias rely on researchers’ assumptions about the correct functional form and the choice of exclusion restrictions. Another limitation of using the IMR coefficient to test for adverse selection is the potential for multicollinearity between the IMR and instruments (Stolzenberg and Relles, 1990; Moffitt, 1999). Multicollinearity can also arise when there are no valid exclusion restrictions to identify the model (Leung and Yu, 1996; Vella, 1998; Bushway et al., 2007; Lennox et al., 2012; Certo et al., 2016).
To address the drawbacks of relying on the statistical significance of the IMR coefficient for sample selection bias, we assess whether systematic differences in cattle quality exist between the cash and AMA markets – differences that may result from feedlot selection behavior. By focusing on cattle quality distributions rather than relying exclusively on the significance of the IMR coefficient, we provide a more robust assessment of selection bias in cattle procurement.
2.1. Data
The dataset includes the transactions of a single feedlot in Oklahoma from November 2018 to July 2019. It includes 399 lots of cattle, totaling 18,097 head. It also includes information on transaction dates, buyer/packer location, cattle transaction prices, pay weights, average daily gains (ADGs), and post-carcass attributes from AMA transactions, such as quality and yield adjustments in dollars. The market choice was between the cash market and AMA, which encompasses all other cattle marketing options, excluding the cash market.
Table 1 presents the descriptive statistics for the key variables used in this study. A total of 163 lots were sold in the cash market and 236 in the AMA, constituting 7 and 93% of the total number of head of cattle, respectively (Footnote 1). The average lot size sold in the cash market and AMA was, respectively, 8 and 71 head, with an ADG of 1.561 lb and 3.082 lb. The weekly average price reported in the study region for cash market transactions, $122.20/cwt, was higher than the average AMA transaction price of $119.82/cwt. For cash market transactions, the lot price ranges from $10.41/cwt to $136.82/cwt, with an average lot price of $66.55/cwt. The minimum and maximum lot prices for AMA transactions are $99.32/cwt and $134.70/cwt, respectively, with an average lot price of $122.45/cwt. These prices suggest that the feedlot received higher prices per lot from AMA transactions than from cash market transactions. The differences in average lot prices and daily gains suggest quality differences between the two markets. Among the lots sold in the cash and AMA markets, 85 and 28%, respectively, had a final weight of less than 1200 lb or greater than 1500 lb, while 15 and 72% had a final weight between 1200 lb and 1500 lb. A higher percentage of lots sold in the cash market consisted of steers (56%). In comparison, 49% of lots sold in the AMA consisted of heifers. Most of the lots sold in both markets had a medium frame size.
Table 1. Descriptive statistics of the variables used in the regressions

* Weekly average price indicates the average price reported in the region at the time of the transaction.
2.2. Methods and procedures
We propose a method to directly test for sample selection concerning cattle quality between the cash and AMA markets. We estimate an IMR from a heteroskedastic probit regression to proxy cattle quality. We supplement this step using a residual regression approach to proxy unobservable characteristics related to cattle quality. We predict cattle quality using carcass information available in the AMA transactions. These expected values are further used to predict an additional proxy based on the IMR and residual values. The distributions of cattle quality in the cash and AMA markets are then tested using nonparametric procedures to identify sample selection bias. This approach has not been previously used to identify sample selection in the cattle procurement market.
2.3. Heteroskedastic probit regression
A first-stage probit regression generates an IMR for each lot in the cash and AMA markets. The dependent variable is the feedlot’s market choice for a lot, which can be either the cash market or the AMA market. The independent variables include factors hypothesized to influence this choice. The latent, linear response model is:
$$y_{i}^{*}={\bf x}_{i}\boldsymbol{\beta}+\varepsilon_{i},\tag{1}$$
where $y_{i}^{*}$ is the latent propensity to sell lot $i$ in the cash market, $i = 1, \ldots, N$ indexes the cattle lot, $\varepsilon_{i}\sim N(0,\sigma_{i}^{2})$ is an unobserved independent and identically distributed error term, ${\bf x}_{i}$ is a vector of covariates hypothesized to influence market choice, $\boldsymbol{\beta}$ are corresponding parameters to be estimated, and the variance term satisfies $E(\sigma_{i}^{2})=1$. The market choice $y_{i}=1$ if the lot is sold in the cash market, and $y_{i}=0$ if the lot is sold in the AMA market. The $1\times k$ covariate vector ${\bf x}_{i}$ includes lot size, ADG of lots, a dummy variable for lots consisting of steers, a dummy variable indicating whether the final weight of lots was outside the approved range (below 1200 lb or above 1500 lb), and quarterly dummy variables indicating when the transaction occurred (Quarter 3 is the reference category; see Footnote 2).
The above model assumes a homoskedastic error term distribution, given the specified expectation. Violation of this assumption results in a mis-specified model, which could lead to inconsistent and biased maximum likelihood estimates (Greene, 2011). The lot size may influence the probability of cattle being sold in the cash market. Smaller lots are more likely to be sold in the cash market, whereas larger lots are sold in the AMA, suggesting that lot size could affect the variance of the selection probabilities. Lot heterogeneity can be modeled by relaxing the expectation that the error variance of equation (1) is constant (i.e., 1) across observations and parametrizing the observation-level variances as:
$$\sigma_{i}^{2}=\left[\exp ({\bf z}_{i}\boldsymbol{\gamma})\right]^{2},\tag{2}$$
where ${\bf z}_{i}$ is a $1\times g$ vector of covariates with “1” included in the first position and the others hypothesized to affect the conditional variance for observation $i$ (Harvey, 1976), and $\boldsymbol{\gamma}$ are the corresponding parameters. When all the parameters in equation (2) equal zero, then $E(\sigma_{i}^{2})=1$. The probability that a lot is sorted into the cash market is:
$$\Pr (y_{i}=1\mid {\bf x}_{i},{\bf z}_{i})=\Phi \left({\bf x}_{i}\boldsymbol{\beta}/\!\exp ({\bf z}_{i}\boldsymbol{\gamma})\right).\tag{3}$$
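The selection probability and its log-likelihood can be sketched in a few lines of Python. This is a minimal illustration, not the estimation routine used in the study: it assumes the cash-market probability takes the form $\Phi({\bf x}_{i}\boldsymbol{\beta}/\exp({\bf z}_{i}\boldsymbol{\gamma}))$, which is the form implied by the scaled IMR defined later in this section. The function names (`cash_probability`, `log_likelihood`) and the toy numbers are hypothetical.

```python
from math import exp, log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def dot(row, coef):
    """Inner product of one covariate row with a coefficient vector."""
    return sum(r * c for r, c in zip(row, coef))


def cash_probability(x_i, beta, z_i, gamma):
    """P(y_i = 1) = Phi(x_i beta / exp(z_i gamma)) for a single lot."""
    return STD_NORMAL.cdf(dot(x_i, beta) / exp(dot(z_i, gamma)))


def log_likelihood(y, X, Z, beta, gamma):
    """Bernoulli log-likelihood summed over lots (y_i = 1 means cash market)."""
    eps = 1e-12  # keeps log() finite if a probability rounds to 0 or 1
    total = 0.0
    for y_i, x_i, z_i in zip(y, X, Z):
        p = min(max(cash_probability(x_i, beta, z_i, gamma), eps), 1.0 - eps)
        total += y_i * log(p) + (1 - y_i) * log(1.0 - p)
    return total


# Hypothetical toy data (not from the study): each row is (constant, lot size
# in hundreds of head); the same two columns are reused for the variance part.
X = [[1.0, 0.08], [1.0, 0.71], [1.0, 0.10]]
Z = [[1.0, 0.08], [1.0, 0.71], [1.0, 0.10]]
y = [1, 0, 1]
beta = [0.5, -2.0]   # illustrative values, not estimates
gamma = [0.0, 0.3]   # gamma = 0 everywhere recovers the ordinary probit
print(log_likelihood(y, X, Z, beta, gamma))
```

In practice the parameters would be chosen to maximize `log_likelihood` with a numerical optimizer; that step is omitted here.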
Our central hypothesis is twofold: (1) information advantages allow a feedlot to sort cattle into lower- and higher-quality lots, and (2) a feedlot allocates lower-performing cattle into smaller lots, which are sent to the cash market. Including lower-quality cattle in larger, higher-quality lots risks reducing the premiums received under AMA pricing systems.
As previous research suggests, the results of the Heckman and Roy sample selection models can be interpreted as evidence of adverse selection if the selection mechanism is based on private information that buyers cannot observe. ADG provides feedlots with private information that they can use to differentiate between lower-quality and higher-quality cattle. ADG is a crucial indicator of how well cattle perform on feed. Thompson et al. (2014) reported that ADG helps feedlots distinguish between low-performing and high-performing cattle, allowing them to make informed marketing decisions. Mandell et al. (1997) found that higher ADG is associated with increased carcass weight and better marbling, reinforcing ADG as a plausible metric for cattle performance. However, breed characteristics influence ADG and carcass quality because genetic variations impact feed efficiency and growth rate (Detweiler et al., 2019). Given these factors, we hypothesize that ADG has a negative relationship with the cash market because better-performing cattle are more likely to enter AMA markets for feedlots to capture premiums based on carcass traits.
Cattle sex influences both carcass quality and marketing behavior. Heifers generally produce beef with a higher intramuscular fat content, resulting in better marbling and higher-quality grades than steers (Moore et al., 2012). However, they also present greater financial risk due to higher pregnancy rates and dark cutter incidence, which can result in carcass discounts under grid pricing. Fausti et al. (2013a) find that steers are more likely to be marketed through grid-based systems. At the same time, heifers are more often sold on a live-weight basis to avoid this quality uncertainty. Therefore, despite their superior marbling, heifers are less likely to be marketed through pricing systems that penalize carcass quality variability, such as AMA, and more likely to be sold in the cash market.
Seasonal dummy variables are included in the regression to control for seasonal variations in profitability. Price dynamics influence producers’ choice of market when selling their cattle (Poss et al., 2022; Belasco et al., 2009). Fausti and Qasmi (2002) report that price differences between low- and high-quality cattle tend to widen in the third quarter of the calendar year. Their study also finds that selling cattle on the grid is more profitable during the third quarter. Therefore, the third quarter serves as the base quarter among the seasonal dummy variables.
2.4. IMR as a proxy for cattle quality
Self-selection is hypothesized to arise from unobservable private information owned by the feedlot. We hypothesize that the IMR captures a portion of this unobservability (Nguyen and Wang, 2013; Maiga, 2014; Tan et al., 2017). Private information creates a selection bias when informed parties self-select in favor of either low- or high-quality goods (Anton, 2018). The estimated IMR from the selection equation is related to the probability of selection. The IMR captures unobserved factors as a generalized residual (Gourieroux et al., 1987; Vella, 1993, 1998). Carcass quality is unobserved in the cattle procurement market, particularly in the cash market. Quality is a critical component of the cattle price and market selection process. Feedlots may exploit their private information about cattle quality in the procurement market by sorting low-quality cattle into the cash market and high-quality cattle into the AMA market. The IMR, derived from the selection process, is an imperfect proxy for the unobserved quality of cattle for each lot. In the cattle procurement market, it is a monotonically decreasing function of the probability of a lot being sorted into the cash market based on a feedlot’s information advantage (Heckman, 1976; Liu and Yu, 2022). Under the assumption that the IMR is a perfect proxy for quality, it increases when a lot is less likely to be sorted into the cash market. This outcome suggests that lots with a higher IMR are less likely to be sold in the cash market based on observed characteristics, potentially signaling higher perceived quality due to unobserved, favorable attributes.
Conversely, and assuming that the IMR is a perfect proxy for unobserved quality factors, lots with a high probability of being placed in the cash market will have a lower IMR, indicating that their observed characteristics increase their likelihood of selection into the cash market, possibly signaling lower perceived quality. These observed characteristics reflect feedlots’ informational advantage over packers in market selection.
Thus, the IMR for each lot is hypothesized to reflect the feedlot’s perceived cattle quality (information unavailable to packers at the time of sale) based on observable and unobservable factors. This proxy for cattle quality, derived from the heteroskedastic probit regression estimates, is used to test for sample selection. Based on our assumption of a non-constant variance of the error terms, we estimate a scaled IMR as the ratio of the standard normal probability density function to the standard normal cumulative distribution function:
$$\lambda \left({\bf x}_{i}\widehat{\boldsymbol \beta},{\bf z}_{i}\widehat{\boldsymbol \gamma}\right)={\phi ({\bf x}_{i}\widehat{\boldsymbol \beta} /\!\exp ({\bf z}_{i}\widehat{\boldsymbol \gamma})) \over \Phi ({\bf x}_{i}\widehat{\boldsymbol \beta} /\!\exp ({\bf z}_{i}\widehat{\boldsymbol \gamma}))},\tag{4}$$
where $\phi(\cdot)$ is the standard normal probability density function, $\Phi(\cdot)$ is the standard normal cumulative distribution function, and $\exp ({\bf z}_{i}\widehat{\boldsymbol \gamma})$ is the estimated heteroskedastic scale factor of the error term for lot $i$.
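As a worked illustration of the scaled IMR, the sketch below evaluates the ratio $\phi(\cdot)/\Phi(\cdot)$ at the scaled index ${\bf x}_{i}\widehat{\boldsymbol\beta}/\exp({\bf z}_{i}\widehat{\boldsymbol\gamma})$ for one lot. The function name `scaled_imr` and the inputs are hypothetical; the coefficient vectors are assumed to come from an already-estimated heteroskedastic probit.

```python
from math import exp
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def scaled_imr(x_i, beta_hat, z_i, gamma_hat):
    """Scaled inverse Mills ratio phi(w) / Phi(w), w = x_i b / exp(z_i g).

    Note: for extremely negative w, Phi(w) underflows to 0 and the division
    fails; this sketch does not handle that tail case.
    """
    xb = sum(x * b for x, b in zip(x_i, beta_hat))
    zg = sum(z * g for z, g in zip(z_i, gamma_hat))
    w = xb / exp(zg)
    return STD_NORMAL.pdf(w) / STD_NORMAL.cdf(w)


# Hypothetical lots: a higher index (more likely cash market) gives a lower IMR.
low_index = scaled_imr([1.0, 0.2], [-0.5, 1.0], [1.0], [0.0])
high_index = scaled_imr([1.0, 2.0], [-0.5, 1.0], [1.0], [0.0])
print(low_index > high_index)
```

The comparison at the end reflects the monotonicity described in the text: the ratio falls as the probability of selection into the cash market rises.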
2.5. Residual regression
The IMR is a generalized residual encompassing all unobservables, including, but not limited to, cattle quality. Assuming the IMR completely embodies all unobserved cattle quality is a heroic assumption. We use a residual regression procedure to address this issue. This auxiliary regression helps to partial out any confounding variation in IMR unrelated to cattle quality. We regress the IMR on independent variables that are not directly related to cattle quality but influence market selection. Then, we use the residuals from that regression as another proxy for cattle quality. McGranahan et al. (2011) employed Glaeser et al.’s (2001) residual regression to estimate the additional value that residents would pay for housing by regressing house value on homeowner income. The residuals from this regression were used to proxy the additional value of a residence unexplained by income, which they attribute to an outdoor amenity value. Jung et al. (2014) used a residual regression procedure in their study on financial reporting quality to proxy accounting quality. Ali and Zhang (2015) regressed the residuals from a firm-specific expense equation on assets, changes in capital, and revenue. They used these residuals to proxy managerial discretion.
As used here, the residual regression captures the portion of the IMR that remains unexplained by factors not directly related to observable quality variables. By regressing the IMR on non-quality-related variables, the effects of these variables are removed, ensuring that the residuals genuinely account for the portion of IMR that represents cattle quality. Additionally, the residuals from this regression capture the part of the IMR that is independent of non-quality-related factors and are assumed to indicate cattle quality. We specify a linear model for the residual regression as:
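$$\lambda_{i}={\bf D}_{i}\boldsymbol{\delta}+u_{i},\tag{5}$$
where $\lambda_{i}$ is the IMR for lot $i$, ${\bf D}_{i}$ is a vector of covariates unrelated to cattle quality but related to market choice, $\boldsymbol \delta$ are parameters to be estimated, and $u_{i}\sim N(0,\sigma_{u}^{2})$ is a random error term. Then, we consider the value of the residuals from equation (5) as an additional proxy for cattle quality:
$$\widehat{u}_{i}=\lambda_{i}-{\bf D}_{i}\widehat{\boldsymbol{\delta}}.\tag{6}$$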
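The residual regression step amounts to an ordinary least squares fit of the IMR on the non-quality covariates, keeping the residuals. The following sketch uses only the Python standard library and solves the normal equations directly; the helper names and the toy data are hypothetical, and the covariate rows are assumed to include a constant.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("covariates are perfectly collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols_residuals(imr, D):
    """Regress imr on the rows of D; return (delta_hat, residuals)."""
    k = len(D[0])
    DtD = [[sum(row[a] * row[b] for row in D) for b in range(k)] for a in range(k)]
    Dty = [sum(row[a] * y_i for row, y_i in zip(D, imr)) for a in range(k)]
    delta_hat = solve_linear_system(DtD, Dty)
    residuals = [y_i - sum(d * c for d, c in zip(row, delta_hat))
                 for row, y_i in zip(D, imr)]
    return delta_hat, residuals


# Hypothetical data: IMR values for four lots and a (constant, dummy) design.
imr = [0.2, 0.5, 0.9, 1.1]
D = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
delta_hat, u_hat = ols_residuals(imr, D)
print(delta_hat, u_hat)
```

By construction the residuals are uncorrelated with every column of the design, which is what allows them to be read as the part of the IMR not explained by the non-quality covariates.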
2.6. Predicting cattle quality metrics linked to carcass quality
Our next approach is to estimate two additional cattle quality metrics that directly reflect carcass quality, as evaluated at packing plants. Lots sold through AMA transactions receive either premiums or discounts based on carcass quality (quality adjustment and yield adjustment). First, following a similar approach to De Paula et al. (2013) and Jones et al. (2021), we predict carcass quality for the entire sample using a truncated regression of carcass quality metrics and premiums/discounts in AMA transactions on physical quality traits. The truncated regression is specified as:
$$q_{i}={\bf V}_{i}\boldsymbol{\alpha}+\omega_{i},\tag{7}$$
where $q_{i}$ is the observed carcass quality metric for each lot, ${\bf V}_{i}$ is a vector of covariates related to cattle physical quality traits, $\boldsymbol \alpha$ are parameters to be estimated, and $\omega_{i}\sim N(0,\sigma_{\omega}^{2})$ is a random error term.
A linear model is then specified to assess the relationships between the two earlier estimated carcass quality measures – IMR and residuals in equation (5) – and the predicted carcass quality variable, premiums and discounts, from equation (7). In this step, the IMR and residuals are regressed on a quality adjustment variable – premium and discount obtained from the AMA market after cattle slaughter – to determine how much the regressor explains the variation in the IMR and residuals. These models are:
$$\lambda_{i}=\theta_{0}+\theta_{1}\widehat{q}_{i}+v_{i},\tag{8}$$
$$\widehat{u}_{i}=\vartheta_{0}+\vartheta_{1}\widehat{q}_{i}+\xi_{i},\tag{9}$$
where $\lambda_{i}$ is the IMR and $\widehat{u}_{i}$ is the residual from the residual regression for lot $i$, $\boldsymbol \theta=(\theta_{0},\theta_{1})$ and $\boldsymbol \vartheta=(\vartheta_{0},\vartheta_{1})$ are parameters to be estimated, $\widehat{q}_{i}$ is the predicted value of premiums and discounts, as discussed above, and $v_{i}$ and $\xi_{i}$ are error terms. The variability in carcass quality may lead to heteroskedasticity in equations (8) and (9). The Breusch–Pagan test is used to detect it, and any heteroskedasticity found is corrected using a generalized least squares (GLS) procedure. Subsequently, predicting the IMR and residuals from equations (8) and (9) generates two additional metrics of cattle quality: the carcass quality-adjusted IMR and residuals. As a result, four estimated measures of cattle quality are used to test for selectivity bias in the cattle procurement market: the IMR and residuals derived from equations (4) and (6), along with the carcass quality-adjusted IMR and residuals obtained from equations (8) and (9).
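The heteroskedasticity check for a regression of a quality proxy on the predicted carcass quality can be sketched as follows. This is an illustration under stated assumptions: the regression is taken to have an intercept and a single regressor, and the test uses the studentized (Koenker) $nR^{2}$ form of the Breusch–Pagan statistic, since the text does not say which variant was applied. The function names and data are hypothetical, and the GLS correction itself is not shown.

```python
from math import erfc, sqrt


def simple_ols(y, x):
    """OLS of y on an intercept and one regressor; returns fit and residuals."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    slope = sum((xv - mean_x) * (yv - mean_y) for xv, yv in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * v for v in x]
    resid = [yv - fv for yv, fv in zip(y, fitted)]
    return intercept, slope, fitted, resid


def breusch_pagan(y, x):
    """Return (LM statistic, p-value) for H0: constant error variance."""
    n = len(y)
    _, _, _, resid = simple_ols(y, x)
    squared = [e * e for e in resid]
    # Auxiliary regression of squared residuals on the same regressor.
    _, _, fitted_sq, _ = simple_ols(squared, x)
    mean_sq = sum(squared) / n
    ss_total = sum((s - mean_sq) ** 2 for s in squared)
    ss_model = sum((f - mean_sq) ** 2 for f in fitted_sq)
    r_squared = ss_model / ss_total if ss_total > 0 else 0.0
    lm = n * r_squared
    # Upper tail of a chi-square with 1 degree of freedom: erfc(sqrt(x / 2)).
    return lm, erfc(sqrt(lm / 2.0))


# Hypothetical data: predicted quality adjustment (q_hat) and a quality proxy.
q_hat = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
proxy = [1.1, 1.9, 3.3, 3.6, 5.8, 5.0, 8.5, 6.2]
lm_stat, p_value = breusch_pagan(proxy, q_hat)
print(lm_stat, p_value)
```

A small p-value would point to non-constant error variance, in which case a weighted or feasible GLS fit would replace the plain OLS fit.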
2.7. Testing for sample selection
Testing the IMR coefficient for adverse selection using Heckman’s and generalized Roy models has limitations. Collinearity can inflate the standard error of the IMR estimate, which is sensitive to the variables excluded from the outcome equation (Bushway et al., 2007; Certo et al., 2016; Lennox et al., 2012). To address these limitations, we compare the distributions of cattle quality from the cash and AMA markets for sample selection using nonparametric procedures. A parametric test, such as a t-test, could be used to compare the distributions of cattle quality in cash and AMA markets. However, parametric tests rely on two key assumptions: cattle quality should be normally distributed in both markets, and the variance of cattle quality should be the same in the two markets (Portney and Watkins, 2009).
Fagerland (2012) suggests that studies investigating differences in distribution can use nonparametric tests, even when the assumptions of parametric tests are satisfied. Nonparametric tests compare differences in distributions without making strong assumptions about the population parameters, and they provide a valid test statistic even in small samples (Faizi and Alvi, 2023). This study uses two nonparametric two-sample tests: the Kolmogorov–Smirnov (KS) and Mood median tests.
We use the KS test to determine whether cattle quality in the cash and AMA markets (using the IMR, the residuals, and their predicted values as proxies for quality) is drawn from the same distribution. The test compares the cumulative frequency distributions of cattle quality for the lots sold in each market, using the same intervals for both distributions (Kolmogorov, 1993). The test statistic (D) is the maximum absolute difference between the empirical distribution functions (EDFs) of the two samples (Conover, 1999). Conclusions are based on the p-value of D: if the p-value is below a chosen threshold, say 0.05, we reject the null hypothesis that the two samples are drawn from the same population. The Mood median test uses Pearson’s chi-square statistic to evaluate the null hypothesis that the medians of two independent samples are identical, comparing the number of observations above and below the pooled median in each sample. In our case, the median cattle quality is hypothesized to be identical for cash and AMA markets. Like the KS test, the Mood median test is robust to skewness and outliers and makes no assumptions about the shape of the distribution (Chen, 2014).
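The two tests can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the KS p-value uses the asymptotic Kolmogorov series, the Mood test counts values equal to the pooled median as "not above" and applies no continuity correction, and the function names and integer samples are hypothetical stand-ins for the quality proxies.

```python
import math
from statistics import median

def ks_two_sample(a, b):
    """Return (D, asymptotic p-value) for the two-sample KS test."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    # Step through the pooled values and track the largest EDF gap.
    while i < n and j < m:
        v = min(a[i], b[j])
        while i < n and a[i] <= v:
            i += 1
        while j < m and b[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:  # the series is numerically 1 in this range
        return d, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(1.0, max(0.0, 2.0 * series))

def mood_median_test(a, b):
    """Return (Pearson chi-square, p-value) for Mood's two-sample median test.

    Assumes at least one pooled value lies strictly above the pooled median.
    """
    grand = median(list(a) + list(b))
    above = [sum(v > grand for v in s) for s in (a, b)]
    not_above = [len(s) - k for s, k in zip((a, b), above)]
    total = len(a) + len(b)
    chi2 = 0.0
    for row in (above, not_above):
        row_total = sum(row)
        for count, size in zip(row, (len(a), len(b))):
            expected = row_total * size / total
            chi2 += (count - expected) ** 2 / expected
    # One degree of freedom for a 2x2 table.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical proxy values: the second sample is shifted upward.
cash_demo = list(range(30))
ama_demo = list(range(15, 45))
d_demo, ks_p_demo = ks_two_sample(cash_demo, ama_demo)
chi2_demo, mood_p_demo = mood_median_test(cash_demo, ama_demo)
```

In this toy example both p-values fall below 0.01, so both null hypotheses would be rejected; with identical samples both functions return a p-value of 1.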
3. Results
Table 2 presents the heteroskedastic probit regression estimates for the factors that influence the selection of a lot into the cash market. The Wald chi-square statistic, which tests the joint significance of the coefficients in the mean equation, was significant at the 1% level, indicating that at least one of the coefficients in the mean equation differs significantly from zero. Belsley, Kuh, and Welsch’s collinearity condition index was used to detect collinearity among the covariates (Belsley et al., 1980; Kutner et al., 2004). The covariates in the heteroskedastic probit model had a collinearity condition index of 3.21, well below the conventional threshold of 30. We therefore concluded that collinearity does not compromise statistical inference.
Table 2. Heteroskedastic probit estimates of the factors influencing lot selection into the cash market

*, **, and *** indicate statistical significance at the 10, 5, and 1% levels, respectively.
a Marginal effects are computed as ${\partial \Pr \left(y_{i}=1|{\bf x}_{i},{\bf z}_{i}\right) \over \partial x_{i}}=\phi \left[{\bf x}_{i}\widehat{\boldsymbol \beta} /\!\exp ({\bf z}_{i}\widehat{\boldsymbol \gamma})\right]\left(1/\!\exp ({\bf z}_{i}\widehat{\boldsymbol \gamma})\right)\left[\widehat{\boldsymbol \beta_{\boldsymbol{i}}}-\widehat{\boldsymbol{\gamma }_{\boldsymbol{i}}} ({\bf x}_{i}\widehat{\boldsymbol \beta})\right]$.
b Standard errors were calculated using the delta method.
c Belsley, Kuh, and Welsch’s collinearity index (Belsley et al., 1980).
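The marginal-effect expression in note a of Table 2 can be evaluated numerically as follows. This is a hedged sketch: the function name, argument layout, and all coefficient and covariate values are hypothetical, and the expression applies to a covariate that enters both the mean and the variance equations.

```python
import math
from statistics import NormalDist

def hetprobit_marginal_effect(x, z, beta, gamma, beta_k, gamma_k):
    """Marginal effect of one covariate in a heteroskedastic probit.

    x, beta  : covariates and coefficients of the mean equation
    z, gamma : covariates and coefficients of the variance equation
    beta_k, gamma_k : the covariate's own coefficients in each equation
    """
    xb = sum(xi * bi for xi, bi in zip(x, beta))
    zg = sum(zi * gi for zi, gi in zip(z, gamma))
    sigma = math.exp(zg)  # standard deviation implied by the variance equation
    density = NormalDist().pdf(xb / sigma)
    return density / sigma * (beta_k - gamma_k * xb)

# Hypothetical values: with gamma = 0 the expression collapses to the
# ordinary probit marginal effect, phi(x'beta) * beta_k.
me_demo = hetprobit_marginal_effect(
    x=[1.0, 2.0], z=[2.0], beta=[0.5, -0.25], gamma=[0.0],
    beta_k=-0.25, gamma_k=0.0,
)
```

Setting the variance coefficients to zero gives a convenient sanity check, since the result must then match the standard probit formula.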
The marginal effect of lot size indicates that a one-head increase in lot size is associated with a 0.4 percentage point decrease in the probability of selecting a lot for sale in the cash market. Although the per-head effect is relatively small, the cumulative effect becomes more meaningful over realistic changes in lot size. For instance, a 25-head increase in lot size would correspond to a 0.10 decrease (25 × 0.004) in the probability of cash market selection, which may be economically relevant for feedlots managing large pens. This result aligns with the hypothesis that this feedlot strategically sorts low-quality cattle from high-quality pens into small lots and sells them separately in the cash market, because including these cattle in AMA transactions can reduce the premiums received for a pen. The estimate of ADG is significant at the 10% level. The probability of a lot being sold in the cash market decreases by 0.047 with a 1-pound increase in the ADG of the lot. This result suggests that lots with high ADG are more likely to be sold in the AMA market. In contrast, lots that do not gain much weight on average are sold in the cash market to avoid discounts based on carcass quality traits.
The coefficient of steer lots was positive, suggesting that steer lots are more likely to be sold in the cash market than heifer and mixed lots. This result is consistent with Moore et al. (2012), who reported that beef from heifers tends to be of higher quality than that from steers due to greater intramuscular fat content in heifers, which could result in steers being marketed through the cash market. Our finding contradicts the findings of Fausti et al. (2013a), who found that steers are more likely to be marketed through grid-based pricing systems, while heifers – despite often receiving higher-quality grades due to greater marbling – are more likely to be sold on a live-weight basis to avoid carcass quality risk resulting from pregnancy and dark cutter discounts.
3.1. Truncated regression results
Table 3 presents the truncated regression results of the carcass quality metric (quality adjustment) on physical quality traits, which are used to predict the two additional proxies: carcass quality-adjusted IMR and residuals. The Wald chi-square statistic indicated that the truncated regression model predicting carcass quality for the entire sample was statistically significant at the 1% level, resulting in the rejection of the null hypothesis that all the regression coefficients are jointly equal to zero. The predictors had a collinearity condition index of 6.42, indicating that collinearity did not influence inferences. Lots consisting of steers received lower premiums than heifer lots, which suggests that steer carcasses are of lower quality relative to heifers. A one-pound increase in the ADG of a lot is associated with higher premiums. Additionally, lots with final weights below 1200 lb or above 1500 lb are associated with lower premiums.
Table 3. Truncated regression estimates of carcass quality on physical traits

a Final weight below 1200 lb or above 1500 lb.
b Belsley, Kuh, and Welsch’s collinearity index (Belsley et al., 1980).
** and *** indicate statistical significance at the 5 and 1% levels, respectively.
3.2. GLS regression results
Table 4 reports the regressions of the IMR and residual values from equation (4) on the predicted carcass quality variable (premium/discount) from the AMA transactions. The predicted quality adjustment explains 38% of the variation in the IMR and 34% of the variation in the residuals. Additionally, the positive coefficient of predicted quality adjustment indicates that higher-quality cattle are more likely to be selected into the AMA, which aligns with our earlier discussion in the Methods and Procedure section. Notably, the coefficients of predicted quality adjustment in the IMR and residual equations are statistically significant at the 1% level, indicating that the IMR and residuals are informative measures of cattle quality.
Table 4. Regression of IMR and residuals on predicted carcass quality traits

IMR = inverse Mills ratio.
*** indicates statistical significance at the 1% level.
3.3. Sample selection test
We compared the distributions of the four quality proxies in the cash and AMA markets graphically using their EDFs. Figure 1 shows that the EDFs of the IMR and predicted IMR for the AMA market always lie to the right of those for the cash market, indicating that the distribution of cattle quality in the AMA market stochastically dominates that in the cash market at the first order. The EDFs of the residuals and predicted residuals follow a similar pattern, with the AMA EDFs generally lying to the right of the cash market EDFs. Overall, the EDFs of all four proxies (IMR, predicted IMR, residuals, and predicted residuals) show that the distributions differ between the two markets, with the AMA distribution dominating the cash market distribution at the first order. First-order stochastic dominance means that, for any given threshold value of cattle quality, the probability of obtaining a value above that threshold is at least as high under the AMA as under the cash market. By the proxies used in this study, cattle sold through the AMA are therefore of consistently higher quality than cattle sold in the cash market, which provides strong evidence that this feedlot strategically sorts higher-quality cattle into the AMA market.
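The first-order dominance condition discussed here can also be checked numerically rather than by eye. The sketch below is illustrative: it tests whether one sample's EDF lies at or below the other's at every pooled observation, with at least one strict inequality. The function names and the small samples are hypothetical, and the check is descriptive rather than a formal statistical test.

```python
from bisect import bisect_right

def edf(sorted_sample, t):
    """Empirical distribution function: share of observations <= t."""
    return bisect_right(sorted_sample, t) / len(sorted_sample)

def first_order_dominates(a, b):
    """True if sample a empirically dominates sample b at the first order.

    Dominance holds when the EDF of a never exceeds the EDF of b and is
    strictly below it for at least one pooled value.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    grid = sorted(set(a_sorted) | set(b_sorted))
    gaps = [edf(b_sorted, t) - edf(a_sorted, t) for t in grid]
    return all(g >= 0 for g in gaps) and any(g > 0 for g in gaps)

# Hypothetical proxy values for lots sold in each market.
ama_demo = [2, 3, 4, 5]
cash_demo = [1, 2, 3, 4]
ama_dominates = first_order_dominates(ama_demo, cash_demo)
cash_dominates = first_order_dominates(cash_demo, ama_demo)
```

When the two EDFs cross, neither sample dominates and both calls return False, which is the case the graphical comparison is meant to rule out.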

Figure 1. Empirical density plots comparing the distribution of cattle quality variables by market. Kolmogorov–Smirnov and Mood median test statistics are reported within each panel. *** denotes statistical significance at the 1% level.
We also tested for statistical differences in the distributions of all four proxy variables between the AMA and cash markets. All test results are statistically significant at the 1% level, indicating that the distributions and medians of cattle quality, as measured by the IMR, predicted IMR, residuals, and predicted residuals, differ between the two markets. These results confirm the graphical evidence in Figure 1. Using the IMR and predicted IMR as proxies for quality, the KS test rejects the null hypothesis that cattle quality in the AMA and cash markets is drawn from the same population. Mood’s median test likewise rejects the null hypothesis that median cattle quality, using the IMR and predicted IMR as proxies, is the same in both markets, corroborating the KS test. Tests using the residuals and predicted residuals as proxies further support these findings, confirming that the distribution of cattle quality differs between the markets. These results provide evidence of quality differences in cattle procurement in the market analyzed, driven by the feedlot’s superior information about lot quality relative to packers (see footnote 3). This informational advantage leads to sorting lots based on quality, with lower-quality lots being sold in the cash market to avoid discounts and higher-quality lots being sold in the AMA to capture premiums.
4. Conclusions
This study tests for sample selection in the cattle procurement market using cattle quality variables generated from heteroskedastic probit, residual, truncated, and GLS regressions. This research extends earlier work by directly testing sample selection using estimated cattle quality variables, without relying on the coefficient of the IMR, an approach used in earlier studies that has several limitations. Stochastic dominance is assessed graphically for the distribution of these quality variables using EDFs. Nonparametric tests are used to evaluate differences in the distribution and median quality of cattle between the cash and AMA markets.
Comparing the EDF plots of the IMR and residual values, along with their respective predicted values, highlighted differences in the distribution of cattle quality between the cash and AMA markets, with the cash market exhibiting a concentration of lower-quality cattle. Nonparametric tests on the IMR, residual values, and their predicted values also indicated that cattle quality differed between the two markets. The difference in cattle quality between the AMA and cash markets was statistically significant, providing evidence of sample selection. This feedlot strategically sells lower-quality lots in the cash market and higher-quality ones in the AMA. The results corroborate earlier studies (Hildebrand and Chung, 2023; Fausti et al., 2013b; Whitley, 2002), which reported that feedlots strategically sort cattle between markets to maximize profit. By extension, we found that this feedlot’s sorting behavior is based on unobserved quality, which highlights a potential pathway to adverse selection in cattle procurement. Additionally, our use of nonparametric methods to test first- and second-order stochastic dominance offers a novel methodological contribution for testing sample selection in the cattle procurement market. Our findings extend the literature on information asymmetry by providing evidence of how feedlot informational advantages lead to sorting cattle between markets, thereby influencing the distribution of cattle quality.
The findings have implications for understanding the distribution of quality in cattle procurement. First, a feedlot that strategically sorts cattle during procurement may distort the distribution of cattle quality across markets, which is likely to impact prices in the cattle procurement market. AMA prices are determined based on the cash market price, plus any applicable premiums or discounts. If lower-quality cattle are predominantly traded in the cash market, the base price for higher-quality lots sold via the AMA may be relatively low. Hildebrand and Chung (2023) provided evidence of how the revenue of feedlots, especially smaller feedlots, is affected by feedlots self-selecting lots into markets. If a feedlot strategically sorts cattle into markets, a randomly selected lot in a market cannot be used to accurately determine price, because cattle quality will not be evenly distributed across markets. The sorting behavior observed in this feedlot’s data can influence not only prices but also the profit margins of feedlots, which incur additional costs to meet the AMA’s quality standards. Second, the generally low quality of cattle sold in the cash market could impact small feedlots that produce higher-quality cattle but lack the volume to enter into agreements under the AMA. These feedlots could suffer revenue losses if the overall prices do not accurately reflect the quality of the cattle they produce. Future studies could explore how a feedlot’s sorting of cattle into markets based on unobserved quality impacts prices and revenue distributions.
Private information about cattle quality held by a feeder facilitates strategic sorting, underscoring the importance of improving information flow to packers in the cattle procurement market. We therefore recommend initiatives that encourage feedlots to provide more quality-related information on the lots they sell in the cash market. This could be achieved through quality certification programs or appropriate policy measures to enhance price discovery and accuracy. Such efforts would improve the competitiveness and accuracy of cash market transactions, facilitating the flow of market information and leading to more efficient price formation in both the cash and AMA markets.
The findings are based on the use of IMR and residual values (adjusted IMR), together with carcass quality-adjusted IMR and residual values, as proxies for cattle quality, which limits the scope of this study to some extent. These proxies are composite measures of cattle quality. They may not fully capture the multidimensional aspects of quality, such as genetic traits, feed efficiency, or health status, all of which are critical in determining market outcomes (Feuz et al., 1993). Future studies could consider semiparametric methods to estimate latent cattle quality based on the feedlot’s market choice (Bajari et al., 2014).
While our analysis primarily focuses on sample selection issues due to unobserved cattle quality, it is important to acknowledge that real-world cattle transactions often involve repeated interactions between sellers and buyers, typically feedlots and nearby packers. Frequent and repeated transactions foster reputation building, significantly mitigating information asymmetries. Hence, in practical settings, buyer–seller relationships developed through repeated business interactions can substantially reduce the relevance or severity of sample selection issues. Although our approach emphasizes the implications of asymmetric information, future studies could explore in greater detail how relational contracting and reputation mechanisms influence market dynamics, potentially diminishing sample selection effects in cattle markets.
Additionally, IMR and residual values depend on the specification of the heteroskedastic probit regression, which assumes a normal distribution of the unobserved error term. If this assumption is violated, it could undermine the robustness of the results. Future research could address this issue using a copula approach to model the error distribution more flexibly (Winkelmann, 2012). Finally, our study is based on data from a single feedlot in Oklahoma, which is not representative of the broader US cattle feeding industry. Future research could address this shortcoming by combining datasets from multiple feedlots across different geographical regions and periods to test for sample selection and validate our findings.
Data availability statement
Due to confidentiality agreements, the data supporting the findings of this study cannot be shared.
Author contributions
Conceptualization, C.C., D.M.L.; methodology, C.C., D.M.L.; formal analysis, W.M.T.; data curation, C.C., W.M.T.; writing – original draft, W.M.T.; writing – review and editing, C.C., D.M.L.; supervision, C.C.; funding acquisition, C.C.
Financial support
This study was supported by the Oklahoma State University Experiment Station and USDA/NIFA Hatch OKL03523.
Competing interests
The authors declare no competing interests.
AI contributions to research
AI was not used in any way in the generation of this paper.



