Radiocarbon dating is a widely used method in archaeology and the earth sciences, but the precision of calibrated dates from single radiocarbon measurements can be difficult to anticipate. This study investigates how the precision of calibrated radiocarbon dates depends on the uncertainty of the measurement and the details of the calibration curve. Using data for the Holocene epoch and the IntCal20 calibration curve, over 1,000,000 hypothetical radiocarbon measurements were calibrated and analyzed. The study shows that high-precision measurements can yield calibrated date ranges from less than 50 years to more than 200 years (at the 95.4% probability level), depending on the specifics of the calibration curve. This research may serve as a tool for planning future studies and for assessing whether high-precision measurements are beneficial for a proposed case.
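As an illustration of the calibration step underlying this kind of analysis, the sketch below calibrates one hypothetical measurement against a single-year calibration curve and reports the span of the 95.4% highest-probability set. The curve, the measurement values, and the function names are invented (a toy stand-in for IntCal20), so this is a generic sketch rather than the study's actual pipeline.

```python
import numpy as np

def calibrate(c14_age, c14_err, cal_year, curve_mean, curve_err):
    """Probability of each calendar year given one radiocarbon measurement
    and a calibration curve sampled on a yearly grid."""
    sigma2 = c14_err**2 + curve_err**2                  # combined uncertainty
    prob = np.exp(-0.5 * (c14_age - curve_mean)**2 / sigma2) / np.sqrt(sigma2)
    return prob / prob.sum()

def hpd_span(cal_year, prob, level=0.954):
    """Overall span (in years) of the most probable calendar years whose
    cumulative probability reaches `level`; may cover several sub-ranges."""
    order = np.argsort(prob)[::-1]
    n_keep = np.searchsorted(np.cumsum(prob[order]), level) + 1
    kept = cal_year[order[:n_keep]]
    return kept.max() - kept.min()

# Toy single-year curve with artificial "wiggles" (not IntCal20)
cal_year = np.arange(-5000, 1951)
curve_mean = 1950 - cal_year + 30 * np.sin(cal_year / 50.0)
curve_err = np.full(cal_year.shape, 8.0)

prob = calibrate(c14_age=3500, c14_err=15,
                 cal_year=cal_year, curve_mean=curve_mean, curve_err=curve_err)
print("95.4% range span (years):", hpd_span(cal_year, prob))
```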
Plague and famine are two of the worst killers in human history. Both struck the Czech lands in the Middle Ages not long after each other (the famine of 1318 CE and the plague of 1348–1350 CE). The aim of our study was to try to relate the mass graves found in the vicinity of the Chapel of All Saints with an ossuary at the Kutná Hora–Sedlec site to these two specific events. For this purpose, we used stratigraphic and archaeological data, radiocarbon dating, and Bayesian modeling of 172 calibrated AMS ages obtained from the teeth and bones of 86 individuals buried in the mass graves. Based on the stratigraphic and archaeological data, five mass graves were interpreted as famine graves and eight as plague graves. Using these data and the calibration of the radiocarbon results of the tooth–bone pairs of each individual, we constructed a Bayesian model to interpret the remaining eight mass graves for which no contextual information was available. The Bayesian model fits the stratigraphic data in 23 out of 34 cases, and in all seven cases based on the calibration data. To validate the model results on archaeologically and stratigraphically uninterpreted data, ancient DNA analysis is required to identify Yersinia pestis.
Deep geological repositories are critical for the long-term storage of hazardous materials, where understanding the mechanical behavior of emplacement drifts is essential for safety assurance. This study presents a surrogate modeling approach for the mechanical response of emplacement drifts in rock salt formations, utilizing Gaussian processes (GPs). The surrogate model serves as an efficient substitute for high-fidelity mechanical simulations in many-query scenarios, including time-dependent sensitivity analyses and calibration tasks. By significantly reducing computational demands, this approach facilitates faster design iterations and enhances the interpretation of monitoring data. The findings indicate that only a few key parameters are sufficient to accurately reflect in-situ conditions in complex rock salt models. Identifying these parameters is crucial for ensuring the reliability and safety of deep geological disposal systems.
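As a rough illustration of the surrogate-modelling idea (not the study's actual rock-salt model), the sketch below fits a Gaussian-process regressor to a handful of runs of a stand-in "high-fidelity" function of two material parameters and then queries it cheaply. The parameter names and functional form are invented for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)

def high_fidelity_model(x):
    """Stand-in for an expensive mechanical simulation: maps two material
    parameters (say, a creep coefficient and a stress exponent) to a
    scalar response such as drift closure after a given time."""
    creep, exponent = x[:, 0], x[:, 1]
    return creep * np.exp(0.5 * exponent) + 0.01 * rng.normal(size=len(x))

# A small design of experiments over the (normalized) parameter space
X_train = rng.uniform(0.0, 1.0, size=(40, 2))
y_train = high_fidelity_model(X_train)

# Fit the GP surrogate: smooth anisotropic RBF kernel with a fitted signal variance
kernel = ConstantKernel(1.0) * RBF(length_scale=[0.2, 0.2])
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_train, y_train)

# The surrogate now answers many-query tasks (sensitivity analysis,
# calibration) at negligible cost compared with the full simulation.
X_query = rng.uniform(0.0, 1.0, size=(5, 2))
mean, std = gp.predict(X_query, return_std=True)
print(np.column_stack([mean, std]))
```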
Sudden annual rises in radiocarbon concentration have proven to be valuable assets for achieving exact-year calibration of radiocarbon measurements. These extremely precise calibrations have usually been obtained through the use of classical χ2 tests in conjunction with a local calibration curve of single-year resolution encompassing a rapid change in radiocarbon levels. As the latest Northern Hemisphere calibration curve, IntCal20, exhibits single-year resolution over the last 5000 years, in this study we investigate the possibility of performing calibration of radiocarbon dates using the classical χ2 test and achieving high-precision dating more extensively, examining scenarios without the aid of such abrupt changes in radiocarbon concentration. In order to perform a broad analysis, we simulated 171 sets of radiocarbon measurements over the last two millennia, with different set lengths and sample spacings, and tested the effectiveness of the χ2 test compared to the most commonly used Bayesian wiggle-matching technique for temporally ordered sequences of samples such as tree-ring sequences, the OxCal D_Sequence. The D_Sequence always produces a date range, albeit in certain cases very narrow; the χ2 test proves to be a viable alternative to Bayesian wiggle-matching, as it achieves calibrations of comparable precision while also providing a highest-likelihood estimate within the uncertainty range.
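A minimal sketch of the χ2 scan described above, assuming a short sequence of measurements at known ring separations and a toy single-year curve: for each candidate end year, the statistic is compared against a χ2 critical value. The degrees of freedom are taken here as the number of samples (no fitted offset), which is an assumption of this sketch rather than a prescription from the paper, and all data are simulated.

```python
import numpy as np
from scipy.stats import chi2

def chi2_scan(ring_offsets, c14_ages, c14_errs, cal_year, curve_mean, curve_err):
    """For each candidate calendar year of the last ring, compute the chi-square
    statistic of the measured sequence against the calibration curve."""
    stats = []
    for t_last in cal_year:
        years = t_last - ring_offsets          # calendar year of each dated ring
        mu = np.interp(years, cal_year, curve_mean)
        sig2 = c14_errs**2 + np.interp(years, cal_year, curve_err)**2
        stats.append(np.sum((c14_ages - mu)**2 / sig2))
    return np.asarray(stats)

# Toy single-year curve and a toy 5-sample sequence spaced 10 rings apart
cal_year = np.arange(0, 2001)
curve_mean = 2000 - cal_year + 20 * np.sin(cal_year / 30.0)
curve_err = np.full(cal_year.shape, 5.0)

ring_offsets = np.array([40, 30, 20, 10, 0])
truth_years = 1500 - ring_offsets
c14_ages = np.interp(truth_years, cal_year, curve_mean) \
           + np.random.default_rng(1).normal(0, 10, 5)
c14_errs = np.full(5, 10.0)

stats = chi2_scan(ring_offsets, c14_ages, c14_errs, cal_year, curve_mean, curve_err)
# df = number of samples (no parameters fitted here; an assumption of this sketch)
accepted = cal_year[stats < chi2.ppf(0.95, df=len(c14_ages))]
print("best year:", cal_year[np.argmin(stats)],
      "| years passing the 95% test:", accepted.size)
```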
When large achievement tests are conducted regularly, items need to be calibrated before being used as operational items in a test. Methods have been developed to optimally assign pretest items to examinees based on their abilities. Most of these methods, however, are intended for situations where examinees arrive sequentially to be assigned to calibration items. In several calibration tests, examinees take the test simultaneously or in parallel. In this article, we develop an optimal calibration design tailored for such parallel test setups. Our objective is both to investigate the efficiency gain of the method and to demonstrate that this method can be implemented in real calibration scenarios. For the latter, we have employed this method to calibrate items for the Swedish national tests in Mathematics. In this case study, like in many real test situations, items are of mixed format and the optimal design method needs to handle that. The method we propose works for mixed-format tests and accounts for varying expected response times. Our investigations show that the proposed method considerably enhances calibration efficiency.
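To give a concrete flavour of ability-matched item assignment, the sketch below greedily assigns each examinee in a parallel session the still-available pretest item that is most informative at their provisional ability estimate, under a 2PL model. This is a simplified illustration, not the optimal design method developed in the article (which also handles mixed item formats and expected response times); all item parameters and abilities are made up.

```python
import numpy as np

def fisher_info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

def assign_items(thetas, a, b, capacity):
    """Greedy ability-matched assignment: each examinee (processed in random
    order, since all take the test in parallel) gets the still-available
    pretest item with the highest Fisher information at their ability."""
    remaining = np.full(len(a), capacity)
    assignment = np.empty(len(thetas), dtype=int)
    for i in np.random.default_rng(0).permutation(len(thetas)):
        info = fisher_info_2pl(thetas[i], a, b)
        info[remaining == 0] = -np.inf      # item's pretest slots are used up
        j = int(np.argmax(info))
        assignment[i] = j
        remaining[j] -= 1
    return assignment

# Illustrative pretest items (discrimination a, difficulty b) and examinees
a = np.array([1.2, 0.8, 1.5, 1.0])
b = np.array([-1.0, 0.0, 0.5, 1.5])
thetas = np.random.default_rng(1).normal(size=200)
assignment = assign_items(thetas, a, b, capacity=60)
print("examinees per pretest item:", np.bincount(assignment, minlength=len(a)))
```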
A distinction is proposed between measures and predictors of latent variables. The discussion addresses the consequences of the distinction for the true-score model, the linear factor model, Structural Equation Models, longitudinal and multilevel models, and item-response models. A distribution-free treatment of calibration and error-of-measurement is given, and the contrasting properties of measures and predictors are examined.
The conventional method of measuring ability, which is based on items with assumed true parameter values obtained from a pretest, is compared to a Bayesian method that deals with the uncertainties of such items. Computational expressions are presented for approximating the posterior mean and variance of ability under the three-parameter logistic (3PL) model. A 1987 American College Testing Program (ACT) math test is used to demonstrate that the standard practice of using maximum likelihood or empirical Bayes techniques may seriously underestimate the uncertainty in estimated ability when the pretest sample is only moderately large.
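The posterior summaries discussed here can be illustrated with a simple grid approximation under fixed item parameters; the five items and the response pattern below are invented, and the paper's point, that integrating over pretest-based item-parameter uncertainty widens this posterior, is only noted in a comment rather than implemented.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response at ability theta."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def posterior_ability(responses, a, b, c, grid=np.linspace(-4, 4, 401)):
    """Grid approximation to the posterior mean and variance of ability,
    with a standard-normal prior and item parameters treated as known.
    (Propagating item-parameter uncertainty, as the paper does, would
    widen this posterior.)"""
    like = np.ones_like(grid)
    for u, ai, bi, ci in zip(responses, a, b, c):
        p = p_3pl(grid, ai, bi, ci)
        like *= p if u == 1 else (1.0 - p)
    post = like * np.exp(-0.5 * grid**2)        # prior is N(0, 1)
    post /= post.sum()                          # uniform grid: sums suffice
    mean = np.sum(grid * post)
    var = np.sum((grid - mean) ** 2 * post)
    return mean, var

# Hypothetical 5-item test and one examinee's response pattern
a = np.array([1.0, 1.2, 0.8, 1.5, 1.1])    # discriminations
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])  # difficulties
c = np.full(5, 0.2)                        # guessing parameters
mean, var = posterior_ability([1, 1, 1, 0, 0], a, b, c)
print(f"posterior mean {mean:.2f}, posterior SD {var ** 0.5:.2f}")
```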
Overconfidence plays a role in a large number of individual decision biases and has been considered a ‘meta-bias’ for this reason. However, since overconfidence is measured behaviorally with respect to particular tasks (in which performance varies across individuals), it is unclear whether people generally vary in terms of their general overconfidence. We investigated this issue using a novel measure: the Generalized Overconfidence Task (GOT). The GOT is a difficult perception test that asks participants to identify objects in fuzzy (‘adversarial’) images. Critically, participants’ estimated performance on the task is not related to their actual performance. Instead, variation in estimated performance, we argue, arises from generalized overconfidence, that is, people indicating a cognitive skill for which they have no basis. In a series of studies (total N = 1,293), the GOT was more predictive of a broad range of behavioral outcomes than two other overestimation tasks (cognitive and numeracy) and did not display substantial overlap with conceptually related measures (Studies 1a and 1b). In Studies 2a and 2b, the GOT showed superior reliability in a test–retest design compared to the other overconfidence measures (i.e., cognitive and numeracy measures), particularly when collecting confidence ratings after each image and an estimated performance score. Finally, the GOT is a strong predictor of a host of behavioral outcomes, including conspiracy beliefs, bullshit receptivity, overclaiming, and the ability to discern news headlines.
Emotion recognition in conversation (ERC) faces two major challenges: biased predictions and poor calibration. Classifiers often disproportionately favor certain emotion categories, such as neutral, due to the structural complexity of classifiers, the subjective nature of emotions, and imbalances in training datasets. This bias results in poorly calibrated predictions where the model’s predicted probabilities do not align with the true likelihood of outcomes. To tackle these problems, we introduce the application of conformal prediction (CP) into ERC tasks. CP is a distribution-free method that generates set-valued predictions to ensure marginal coverage in classification, thus improving the calibration of models. However, inherent biases in emotion recognition models prevent baseline CP from achieving a uniform conditional coverage across all classes. We propose a novel CP variant, class spectrum conformation, which significantly reduces coverage bias in CP methods. The methodologies introduced in this study enhance the reliability of prediction calibration and mitigate bias in complex natural language processing tasks.
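As context, the baseline split-conformal step that the paper builds on can be sketched as follows; the softmax outputs are simulated, the function names are illustrative, and the class-spectrum conformation variant proposed in the paper is not reproduced here.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split conformal prediction for classification: the nonconformity
    score is 1 minus the predicted probability of the true class, and the
    threshold is the finite-sample-corrected (1 - alpha) quantile."""
    scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    n = len(scores)
    q = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(scores, q)

def prediction_set(probs, threshold):
    """All classes whose nonconformity score is within the threshold."""
    return np.where(1.0 - probs <= threshold)[0]

# Simulated softmax outputs of a 4-emotion classifier on held-out calibration data
rng = np.random.default_rng(0)
cal_probs = rng.dirichlet(np.ones(4) * 2, size=500)
cal_labels = np.array([rng.choice(4, p=p) for p in cal_probs])

tau = conformal_threshold(cal_probs, cal_labels, alpha=0.1)
test_probs = rng.dirichlet(np.ones(4) * 2)     # one new utterance
print("prediction set:", prediction_set(test_probs, tau))
```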
This study suggests that there may be considerable difficulties in providing accurate calendar age estimates in the Roman period in Europe, between ca. AD 60 and ca. AD 230, using the radiocarbon calibration datasets that are currently available. Incorporating the potential for systematic offsets between the measured data and the calibration curve using the ΔR approach suggested by Hogg et al. (2019) only marginally mitigates the biases observed in calendar date estimates. At present, it clearly behoves researchers in this period to “caveat emptor” and validate the accuracy of their calibrated radiocarbon dates and chronological models against other sources of dating information.
This handbook provides a comprehensive, practical, and independent guide to all aspects of making weather observations. The second edition has been fully updated throughout with new material, new instruments and technologies, and the latest reference and research materials. Traditional and modern weather instruments are covered, including how best to choose and to site a weather station, how to get the best out of your equipment, how to store and analyse your records and how to share your observations. The book's emphasis is on modern electronic instruments and automatic weather stations. It provides advice on replacing 'traditional' mercury-based thermometers and barometers with modern digital sensors, following implementation of the UN Minamata Convention outlawing mercury in the environment. The Weather Observer's Handbook will again prove to be an invaluable resource for both amateur observers choosing their first weather instruments and professional observers looking for a comprehensive and up-to-date guide.
Instrument calibrations are both one of the most important, and yet sometimes one of the most neglected, areas of weather measurement. This chapter describes straightforward methods to check and adjust calibrations for the most common meteorological instruments – precipitation (rainfall), temperature, humidity and air pressure sensors. To reduce uncertainty in the measurements themselves, meteorological instruments need to be accurately calibrated, or at least regularly compared against instruments of known calibration to quantify and adjust for any differences, or error. Calibrations can and do drift over time, and therefore instrumental calibrations should be checked regularly, and adjusted if necessary.
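A minimal example of the kind of comparison check described here, assuming paired readings from the station sensor and a reference instrument of known calibration (the numbers are invented): a linear fit quantifies the gain and offset error, and the fitted correction can then be applied to routine observations and re-checked periodically.

```python
import numpy as np

# Paired readings taken side by side: the station thermometer vs. a
# recently calibrated reference (values in °C, invented for illustration)
sensor = np.array([1.3, 5.1, 10.4, 15.2, 20.6, 25.9])
reference = np.array([1.0, 4.9, 10.0, 14.9, 20.1, 25.3])

# Fit a linear correction: reference ≈ gain * sensor + offset
gain, offset = np.polyfit(sensor, reference, 1)
corrected = gain * sensor + offset
residual = corrected - reference

print(f"gain {gain:.3f}, offset {offset:+.2f} °C, "
      f"max residual after correction {np.abs(residual).max():.2f} °C")

# Applying the same correction to routine observations removes the
# systematic error; repeat the check regularly, since calibrations drift.
```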
Variable-Value axiologies avoid Parfit’s Repugnant Conclusion while satisfying some weak instances of the Mere Addition principle. We apply calibration methods to two leading members of the family of Variable-Value views conditional upon: first, a very weak instance of Mere Addition and, second, some plausible empirical assumptions about the size and welfare of the intertemporal world population. We find that such facts calibrate these two Variable-Value views to be nearly totalist, and therefore imply conclusions that should seem repugnant to anyone who opposes Total Utilitarianism only due to the Repugnant Conclusion.
Solvency II requires that firms with Internal Models derive the Solvency Capital Requirement directly from the probability distribution forecast generated by the Internal Model. A number of UK insurance undertakings do this via an aggregation model consisting of proxy models and a copula. Since 2016 there have been a number of industry surveys on the application of these models, with the 2019 Prudential Regulation Authority (“PRA”)-led industry-wide thematic review identifying a number of areas for enhancement. This review concluded that there was no uniform best practice. While there have been many competing priorities for insurers since 2019, the Working Party expects that firms will have either already made changes to their proxy modelling approach in light of the PRA survey, or will have plans to do so in the coming years. This paper takes the PRA feedback into account and explores potential approaches to calibration and validation, taking into consideration the different heavy models used within the industry and the relative materiality of business lines.
A novel thinned antenna element distribution for cancelling grating lobes (GLs) as well as for reducing phase shifters (PSs) is presented for a two-dimensional phased-array automotive radar application. First, an efficient clustering technique of vertically adjacent elements is employed with array thinning for a PS reduction of 66.7%. In the proposed distribution, several single-element radiators (non-clustered antenna elements) are placed in the vertical direction with specific spacing in a grid of 16 × 12 (192) elements with λ/2 pitch. This disrupts the periodicity of the phase centers after element clustering and acts as a steerable GL canceller capable of tracking and nullifying the GL at any scan angle. The proposed distribution enables beam steering up to ±60° in the azimuth plane, as well as ±25° in the elevation plane, with cancelled GLs and sidelobes. Furthermore, the proposed distribution has been efficiently calibrated with all elements activated by introducing the code division multiple access technique. To the best of the authors’ knowledge, this work represents the first fully calibrated state-of-the-art thinned-distribution phased array including a novel steerable GL canceller to track and nullify GLs.
This chapter elaborates on the calibration and validation procedures for the model. First, we describe our calibration strategy in which a customised optimisation algorithm makes use of a multi-objective function, preventing the loss of indicator-specific error information. Second, we externally validate our model by replicating two well-known statistical patterns: (1) the skewed distribution of budgetary changes and (2) the negative relationship between development and corruption. Third, we internally validate the model by showing that public servants who receive more positive spillovers tend to be less efficient. Fourth, we analyse the statistical behaviour of the model through different tests: validity of synthetic counterfactuals, parameter recovery, overfitting, and time equivalence. Finally, we make a brief reference to the literature on estimating SDG networks.
Wiggle-match dating of tree-ring sequences is particularly promising for achieving high-resolution dating across periods with reversals and plateaus in the calibration curve, such as the entire post-Columbian period of North American history. Here we describe a modified procedure for wiggle-match dating that facilitates precise dating of wooden museum objects while minimizing damage due to destructive sampling. We present two case studies, a dugout canoe and a wooden trough, both expected to date to the 18th–19th century. (1) Tree rings were counted and sampled for dating from exposed, rough cross-sections in the wood, with no or minimal surface preparation, to preserve these fragile objects; (2) dating focused on the innermost and outermost portions of the sequences; and (3) due to the crude counting and sampling procedures, the wiggle-match was approximated using a simple ordered Sequence, with gaps defined as Intervals. In both cases, the outermost rings were dated with a precision of 30 years or better, demonstrating the potential of wiggle-match dating for post-European Contact canoes and other similar objects.
This study aimed to investigate the influence of calibration field size on the gamma passing rate (GPR) in patient-specific quality assurance (PSQA).
Methods:
Two independent detectors, PTW OCTAVIUS 4D (4DOCT) and Arc Check, were utilised in volumetric modulated arc therapy plans for 26 patients (14 with Arc Check and 12 with 4DOCT). Plans were delivered on a Varian Unique machine (with 4DOCT) and a Varian TrueBeam (with Arc Check), each employing calibration factors (CFs) obtained at 4 × 4, 6 × 6, 8 × 8, 10 × 10, 12 × 12 and 15 × 15 cm² field sizes. Gamma analysis was conducted with 2%2mm, 2%3mm and 3%3mm gamma criteria.
Results:
GPR varied across the different CFs, showing an increasing trend for CFs below 10 × 10 cm² and a decreasing trend above 10 × 10 cm². Both detectors exhibited similar GPR patterns. The correlation between 4DOCT and Arc Check was strong for the tighter criteria (2%2mm; R² = 0·9957) and the moderate criteria (2%3mm; R² = 0·9868), but reduced for the liberal criteria (3%3mm; R² = 0·4226).
Conclusion:
This study demonstrates that the calibration field size significantly influences GPR in PSQA. It recommends that a plan-specific calibration field be obtained to calibrate the QA devices for modulated plans.
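For orientation, the gamma comparison that underlies the GPR can be sketched in one dimension; the dose profiles below are synthetic, the implementation is a simple global gamma (dose difference normalised to the reference maximum), and a clinical PSQA system evaluates this in two or three dimensions with the detector calibrated at the chosen field size.

```python
import numpy as np

def gamma_pass_rate(ref_dose, eval_dose, positions, dose_tol=0.02, dist_tol=2.0):
    """1D global gamma analysis (e.g., 2%/2 mm).
    dose_tol is a fraction of the reference maximum; dist_tol is in mm."""
    d_norm = dose_tol * ref_dose.max()
    gammas = []
    for xr, dr in zip(positions, ref_dose):
        dose_term = (eval_dose - dr) / d_norm
        dist_term = (positions - xr) / dist_tol
        gammas.append(np.sqrt(dose_term**2 + dist_term**2).min())
    gammas = np.asarray(gammas)
    return 100.0 * np.mean(gammas <= 1.0)   # percentage of points with gamma <= 1

# Synthetic planned vs. measured dose profiles (positions in mm)
x = np.arange(-50.0, 50.0, 1.0)
planned = np.exp(-x**2 / (2 * 20.0**2))
measured = 1.02 * np.exp(-(x - 0.5)**2 / (2 * 20.0**2))   # 2% high, 0.5 mm shift
print(f"GPR (2%/2 mm): {gamma_pass_rate(planned, measured, x):.1f}%")
```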
A laser stripe sensor has two kinds of calibration methods. One is based on the homography model between the laser stripe plane and the image plane, and is called the one-step calibration method. The other is based on the simple triangulation method, and is called the two-step calibration method. However, the geometrical meaning of each element in the one-step calibration method is not as clear as that in the two-step calibration method. A novel mathematical derivation is presented to reveal the geometrical meaning of each parameter in the one-step calibration method; a comparative study of the one-step and two-step calibration methods is then carried out and their intrinsic relationship is derived. Furthermore, a one-step calibration method is proposed with 7 independent parameters rather than 11. Experiments are conducted to verify the accuracy and robustness of the proposed calibration method.
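To illustrate the homography model referred to above, the sketch below estimates the mapping from image pixels to coordinates on the laser stripe plane from a few synthetic correspondences using a generic direct linear transform; it is not the specific 7- or 11-parameter formulation analysed in the paper, and the calibration points and noise are invented.

```python
import numpy as np

def fit_homography(img_pts, plane_pts):
    """Direct linear transform: fit the 3x3 homography H mapping image
    points (pixels) to coordinates on the laser stripe plane, from
    calibration points whose plane coordinates are known.
    (Point normalization is omitted for brevity.)"""
    rows = []
    for (u, v), (x, y) in zip(img_pts, plane_pts):
        rows.append([u, v, 1, 0, 0, 0, -x * u, -x * v, -x])
        rows.append([0, 0, 0, u, v, 1, -y * u, -y * v, -y])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def map_point(H, u, v):
    """Apply the homography to one image point."""
    x, y, w = H @ np.array([u, v, 1.0])
    return x / w, y / w

# Hypothetical calibration data: image pixels of the stripe on a target
# whose plane coordinates (mm) are known, with a little pixel noise
plane_pts = np.array([[0, 0], [40, 0], [40, 30], [0, 30], [20, 15], [10, 25]], float)
true_H = np.array([[2.0, 0.1, 100.0], [0.05, 1.8, 80.0], [0.0002, 0.0001, 1.0]])
img_h = (true_H @ np.c_[plane_pts, np.ones(len(plane_pts))].T).T
img_pts = img_h[:, :2] / img_h[:, 2:3] \
          + np.random.default_rng(0).normal(0, 0.2, (len(plane_pts), 2))

H = fit_homography(img_pts, plane_pts)
print(map_point(H, *img_pts[0]))   # should recover approximately (0, 0) mm
```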