Students will develop a practical understanding of data science with this hands-on textbook for introductory courses. This new edition is fully revised and updated, with numerous exercises and examples in the popular data science tool R, a new chapter on using R for statistical analysis, and a new chapter that demonstrates how to use R within a range of cloud platforms. The many practice examples, drawn from real-life applications, range from small to big data and come to life in a new end-to-end project in Chapter 11. New 'Data Science in Practice' boxes highlight how the concepts introduced work within an industry context, and many chapters include new sections on AI and Generative AI. A suite of online material for instructors provides a strong supplement to the book, including lecture slides, solutions, additional assessment material and curriculum suggestions. Datasets and code are available for students online. This entry-level textbook is ideal for readers from a range of disciplines wishing to build a practical, working knowledge of data science.
Students will develop a practical understanding of data science with this hands-on textbook for introductory courses. This new edition is fully revised and updated, with numerous exercises and examples in the popular data science tool Python, a new chapter on using Python for statistical analysis, and a new chapter that demonstrates how to use Python within a range of cloud platforms. The many practice examples, drawn from real-life applications, range from small to big data and come to life in a new end-to-end project in Chapter 11. New 'Data Science in Practice' boxes highlight how the concepts introduced work within an industry context, and many chapters include new sections on AI and Generative AI. A suite of online material for instructors provides a strong supplement to the book, including lecture slides, solutions, additional assessment material and curriculum suggestions. Datasets and code are available for students online. This entry-level textbook is ideal for readers from a range of disciplines wishing to build a practical, working knowledge of data science.
This chapter assesses the potential of technological tools to ensure voluntary compliance without coercion and to improve the predictability of trustworthiness, focusing on the ethical challenges that such differentiation among the regulated might create.
Cutting-edge computational tools like artificial intelligence, data scraping, and online experiments are leading to new discoveries about the human mind. However, these new methods can be intimidating. This textbook demonstrates how Big Data is transforming the field of psychology, in an approachable and engaging way that is geared toward undergraduate students without any computational training. Each chapter covers a hot topic, such as social networks, smart devices, mobile apps, and computational linguistics. Students are introduced to the types of Big Data one can collect, the methods for analyzing such data, and the psychological theories we can address. Each chapter also includes discussion of real-world applications and ethical issues. Supplementary resources include an instructor manual with assignment questions and sample answers, figures and tables, and varied resources for students such as interactive class exercises, experiment demos, articles, and tools.
This textbook reflects the changing landscape of water management by combining the fields of satellite remote sensing and water management. Divided into three major sections, it begins by discussing the information that satellite remote sensing can provide about water, and then moves on to examine how it can address real-world management challenges, focusing on precipitation, surface water, irrigation management, reservoir monitoring, and water temperature tracking. The final part analyses governance and social issues that have recently been given more attention as the world reckons with social justice and equity aspects of engineering solutions. This book uses case studies from around the globe to demonstrate how satellite remote sensing can improve traditional water practices and includes end-of-chapter exercises to facilitate student learning. It is intended for advanced undergraduate and graduate students in water resource management, and as a reference for researchers and professionals.
Critics from across the political spectrum attack social media platforms for invading personal privacy. Social media firms famously suck in huge amounts of information about individuals who use their services (and sometimes others as well), and then monetize this data, primarily by selling targeted advertising. Many privacy advocates object to the very collection and use of this personal data by platforms, even if not shared with third parties. In addition, there is the ongoing (and reasonable) concern that the very existence of Big Data creates a risk of leaks. Further, aside from the problem of Big Data, the very existence of social media enables private individuals to invade the privacy of others by widely disseminating personal information. That social media firms’ business practices compromise privacy cannot be seriously doubted. But it is also true that Big Data lies at the heart of social media firms’ business models, permitting them to provide users with free services in exchange for data which they can monetize via targeted advertising. So unless regulators want to take free services away, they must tread cautiously in regulating privacy.
The area where social media has undoubtedly been most actively regulated is in their data and privacy practices. While no serious critic has proposed a flat ban on data collection and use (since that would destroy the algorithms that drive social media), a number of important jurisdictions including the European Union and California have imposed important restrictions on how websites (including social media) collect, process, and disclose data. Some privacy regulations are clearly justified, but insofar as data privacy laws become so strict as to threaten advertising-driven business models, the result will be that social media (and search and many other basic internet features) will stop being free, to the detriment of most users. In addition, privacy laws (and related rules such as the “right to be forgotten”) by definition restrict the flow of information, and so burden free expression. Sometimes that burden is justified, but especially when applied to information about public figures, suppressing unfavorable information undermines democracy. The chapter concludes by arguing that one area where stricter regulation is needed is protecting children’s data.
Physiologic data streaming and aggregation platforms such as Sickbay® and Etiometry are increasingly used in the paediatric acute care setting. As these platforms gain popularity in clinical settings, there has been a parallel growth in scholarly interest. The primary aim of this study is to characterise research productivity utilising high-fidelity physiologic streaming data with Sickbay® or Etiometry in the acute care paediatric setting.
Methods:
A systematic review of the literature was conducted to identify paediatric publications using data from Sickbay® or Etiometry. The resulting publications were reviewed to characterise them and to identify trends.
Results:
A total of 41 papers have been published over 9 years using either platform. This involved 179 authors across 21 institutions. Most studies utilised Sickbay®, involved cardiac patients, were single-centre, and did not utilise machine learning or artificial intelligence methods. The number of publications has been significantly increasing over the past 9 years, and the average number of citations for each publication was 7.9.
Conclusion:
A total of 41 papers have been published over 9 years using Sickbay® or Etiometry data in the paediatric setting. Although the majority of these are single-centre and pertain to cardiac patients, the rise in publication volume suggests growing utilisation of high-fidelity physiologic data beyond clinical applications. Multicentre efforts may help increase the number of centres that can do such work and help drive improvements in clinical care.
In the previous chapters, we built the basic foundation of satellite remote sensing. In this chapter we will explore a relatively recent innovation in information technology called cloud computing that has dramatically improved data accessibility and the practicality of applying large satellite remote sensing datasets for water management. Future chapters on specific targets and water management themes will have hands-on examples and assignments based on actual satellite data. Most of these chapters will assume prior knowledge of cloud computing for understanding and completing assignments. Since cloud computing is gradually proliferating in all walks of water management practice, the aim of this chapter is to introduce readers to cloud computing concepts and specific tools currently available for dealing with the very large satellite data sets on water.
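As a concrete taste of what these tools look like in practice, the sketch below uses the Google Earth Engine Python API, one widely used cloud platform for large satellite datasets, to total a season of satellite-derived rainfall over a region without downloading any imagery. The dataset ID (CHIRPS daily precipitation), region, dates, and scale are illustrative assumptions, not choices prescribed by this chapter.

```python
# Minimal Google Earth Engine sketch: seasonal rainfall over a region,
# computed entirely on the cloud platform. Dataset, region, and dates are
# illustrative placeholders.
import ee

# ee.Authenticate()  # one-time browser sign-in; recent API versions may also
ee.Initialize()      # need a Cloud project, e.g. ee.Initialize(project="my-project")

# CHIRPS daily precipitation, June-August 2020
chirps = (ee.ImageCollection("UCSB-CHG/CHIRPS/DAILY")
          .filterDate("2020-06-01", "2020-09-01")
          .select("precipitation"))

season_total = chirps.sum()  # mm of rain per pixel over the season

# Average the per-pixel totals over an arbitrary rectangular region
region = ee.Geometry.Rectangle([75.0, 15.0, 80.0, 20.0])  # lon/lat bounds
stats = season_total.reduceRegion(
    reducer=ee.Reducer.mean(), geometry=region, scale=5000, maxPixels=1e9)
print(stats.getInfo())  # e.g. {'precipitation': <mean seasonal total, mm>}
```

The point is that no pixel ever touches the local machine: the reduction runs server-side, which is what makes continental-scale water analyses practical.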
This is the first chapter of the book. Its goal is to introduce the growing importance of using satellite remote sensing to manage our water. We will try to understand this in the context of the underlying challenges and the new global forces shaping this century that are expected to make traditional ways of managing water with in-situ data more challenging.
Retrofitting aircraft cabins is characterized by a large number of documents created and required, most of which are currently processed manually. Engineers need to identify which documents contain the information required for a specific task. This paper proposes an approach that builds on a digital knowledge base and moves towards automatically processing the documents on hand, reducing the work required to identify the relevant documents without the labour-intensive creation of the knowledge base beforehand. After describing the scenario this work faces, comparable approaches and promising techniques are discussed. A process chain that builds on these fundamentals is presented, including a selection of feasible techniques and algorithms. Finally, the steps towards an implementation as part of the transformation towards a data-driven value chain are presented.
This work develops a method to integrate operational data into system models following MBSE principles. Empirical analysis reveals significant obstacles to data-driven development, including heterogeneous and non-transparent data structures, poor metadata documentation, insufficient data quality, lack of references, and limited data-driven mindset. A method based on the RFLP chain links operating data structures to logical-level elements. Data analyses are aligned with specific requirements or functional/physical elements, enabling systematic data-driven modeling. This method improves efficiency, fosters system knowledge development, and connects technical systems with operational data.
Gracia de Luna conducted experiments in an HMD virtual environment in which human subjects were presented with surprise distractions. The data he collected for the head, dominant hand, and non-dominant hand comprised 6-DOF human subject trajectories. This paper examines the data from 57 human subject responses to those surprise virtual environment distractions using statistical trajectory clustering algorithms. The data are organized and processed with a Dynamic Time Warping (DTW) algorithm and then analyzed using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. The K-means method was used to determine the appropriate number of clusters. A chi-squared goodness-of-fit test was used to determine statistical significance. For five of the data sets, a p-value of less than 0.05 was found. These five data sets were found to have a limited relationship to the measured variables.
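For readers unfamiliar with this pipeline, here is a minimal, self-contained sketch of the general technique: pairwise DTW distances between multivariate trajectories fed to DBSCAN through a precomputed distance matrix. The trajectory data are random placeholders, and the eps and min_samples values are illustrative, not the paper's actual parameters.

```python
# Sketch of trajectory clustering: DTW distances between 6-DOF trajectories,
# then DBSCAN on the precomputed matrix. All data and parameters are toy values.
import numpy as np
from sklearn.cluster import DBSCAN

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Classic O(len(a)*len(b)) dynamic-programming DTW.
    a, b: (T, 6) arrays of 6-DOF samples; step cost = Euclidean distance."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# trajectories: one (T, 6) array per subject response (random placeholders here)
trajectories = [np.random.rand(50, 6) for _ in range(57)]
n = len(trajectories)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = dist[j, i] = dtw_distance(trajectories[i], trajectories[j])

labels = DBSCAN(eps=5.0, min_samples=3, metric="precomputed").fit_predict(dist)
print(labels)  # -1 marks noise; other integers are cluster IDs
```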
No study has evaluated the relationship between heavy rain disasters and influenza by comparing victims and non-victims; we therefore investigated the association between the 2018 western Japan heavy rain disaster and influenza.
Methods
All patients registered in the National Health Insurance Claims Database and treated in the Hiroshima, Okayama, and Ehime prefectures were included in this retrospective cohort study conducted 1-year post-disaster. A multivariate mixed-effects logistic regression analysis was used to assess the association between the disaster and anti-influenza drug prescribing. A difference-in-differences analysis was conducted to assess anti-influenza drug use for the 4-month period immediately before and every 4 months for a year post-disaster.
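As a simplified illustration of the difference-in-differences idea (not the study's actual mixed-effects specification, which also included random effects and covariates), a logistic model with a group-by-period interaction can be sketched as follows; all column names and data are synthetic.

```python
# Toy difference-in-differences setup for a binary outcome
# (anti-influenza drug use). The study used a mixed-effects model;
# this sketch omits random effects, and all data are synthetic.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 10_000
df = pd.DataFrame({
    "victim": rng.integers(0, 2, n),  # 1 = flood victim
    "post": rng.integers(0, 2, n),    # 1 = post-disaster 4-month window
})
# Synthetic outcome with a built-in interaction effect, for illustration only
logit_p = -2.5 + 0.1 * df["victim"] + 0.2 * df["post"] + 1.0 * df["victim"] * df["post"]
df["used_drug"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

# The victim:post coefficient is the difference-in-differences estimate
model = smf.logit("used_drug ~ victim * post", data=df).fit()
print(np.exp(model.params["victim:post"]))  # odds ratio for the interaction term
```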
Results
This study included 6 176 300 individuals (victims: 36 076 [0.60%]); 2573 (7.1%) and 458 157 (7.4%) in the victim and non-victim groups, respectively, used anti-influenza drugs in the year following the flood. The victims were significantly more likely than non-victims to use anti-influenza drugs (risk ratio 1.18; 95% confidence interval [CI] 1.12-1.42). The victims used significantly more anti-influenza drugs in the 4 months immediately post-disaster compared with just before the disaster (odds ratio 3.62; 95% CI 1.77-7.41).
Conclusions
Anti-influenza drug use was higher among victims of the 2018 Western Japan heavy rain disaster than among non-victims.
Adults with mood and/or anxiety disorders have increased risks of comorbidities, chronic treatments and polypharmacy, increasing the risk of drug–drug interactions (DDIs) with antidepressants.
Aims
To use primary care records from the UK Biobank to assess DDIs with citalopram, the most widely prescribed antidepressant in UK primary care.
Method
We classified drugs with pharmacokinetic or pharmacodynamic DDIs with citalopram, then identified prescription windows for these drugs that overlapped with citalopram prescriptions in UK Biobank participants with primary care records. We tested for associations of DDI status (yes/no) with sociodemographic and clinical characteristics and with cytochrome 2C19 activity, using univariate tests, then fitted multivariable models for variables that reached Bonferroni-corrected significance.
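The core data step, finding prescription windows for interacting drugs that overlap a citalopram window, can be sketched as follows; the column names, example drugs, and toy records are hypothetical illustrations, not actual UK Biobank fields.

```python
# Hypothetical sketch of flagging co-prescription windows that overlap a
# citalopram prescription. Columns, drugs, and records are illustrative only.
import pandas as pd

rx = pd.DataFrame({
    "participant": [1, 1, 1, 2],
    "drug": ["citalopram", "omeprazole", "aspirin", "citalopram"],
    "start": pd.to_datetime(["2015-01-01", "2015-01-10", "2016-05-01", "2015-03-01"]),
    "end": pd.to_datetime(["2015-03-01", "2015-02-10", "2016-06-01", "2015-04-01"]),
})
interacting = {"omeprazole"}  # drugs classified as having a DDI with citalopram

cit = rx[rx.drug == "citalopram"]
other = rx[rx.drug.isin(interacting)]
pairs = cit.merge(other, on="participant", suffixes=("_cit", "_ddi"))
# Two windows overlap when each starts before the other ends
overlap = (pairs.start_ddi <= pairs.end_cit) & (pairs.end_ddi >= pairs.start_cit)
ddi_participants = set(pairs.loc[overlap, "participant"])
print(ddi_participants)  # {1}: participant 1 has an overlapping DDI window
```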
Results
In UK Biobank primary care data, 25 508 participants received citalopram prescription(s), of whom 11 941 (46.8%) had at least one DDI, with an average of 1.96 interacting drugs. The drugs most commonly involved were proton pump inhibitors (40% of co-prescription instances). Individuals with DDIs were more often female and older, had more severe and less treatment-responsive depression, and had higher rates of psychiatric and physical disorders. In the multivariable models, treatment resistance and markers of severity (e.g. history of suicidal and self-harm behaviours) were strongly associated with DDIs, as well as comorbidity with cardiovascular disorders. Cytochrome 2C19 activity was not associated with the occurrence of DDIs.
Conclusions
The high frequency of DDIs with citalopram in fragile groups confirms the need for careful consideration before prescribing and periodic re-evaluation.
The secrecy of intelligence institutions might give the impression that intelligence is an ethics-free zone, but this is not the case. In The Ethics of National Security Intelligence Institutions, Adam Henschke, Seumas Miller, Andrew Alexandra, Patrick Walsh, and Roger Bradbury examine the ways that liberal democracies have come to rely on intelligence institutions for effective decision-making and look at the best ways to limit these institutions’ power and constrain the abuses they have the potential to cause. In contrast, the value of Amy Zegart’s and Miah Hammond-Errey’s research, in their respective books, Spies, Lies, and Algorithms: The History and Future of American Intelligence and Big Data, Emerging Technologies and Intelligence: National Security Disrupted, is the access each of them provides to the thoughts and opinions of the intelligence practitioners working in these secretive institutions. What emerges is a consensus that the fundamental moral purpose of intelligence institutions should be truth telling. In other words, intelligence should be a rigorous epistemic activity that seeks to improve decision-makers’ understanding of a rapidly changing world. Moreover, a key ethical challenge for intelligence practitioners in liberal democracies is how to do their jobs effectively in a way that does not undermine public trust. Measures recommended include better oversight and accountability mechanisms, adoption of a ‘risk of transparency’ principle, and greater understanding of and respect for privacy rights.
Since the 1990s, big data has rapidly grown, influencing business, government, and healthcare. Fueled by networked devices, social media, and affordable cloud storage, it features voluminous datasets with diverse types, rapid updates, and accuracy concerns. Applications span retail, manufacturing, transportation, finance, and education, yielding benefits like data-driven decisions, optimization, personalized marketing, scientific progress, and fraud detection. However, challenges arise from complexity, necessitating interdisciplinary collaboration, privacy issues, potential cyberattacks, and the need for robust data protections. Accurate interpretation is crucial, given the risk of costly misinterpretations. Moreover, significant resources for storage, processing, and analysis raise environmental concerns, while legal and ethical considerations add complexity. Overreliance on data may lead to missed opportunities, underscoring the importance of balancing insights with human judgment. In conclusion, big data offers immense potential but poses significant challenges. Navigating this landscape requires a nuanced understanding, fostering responsible data practices to unlock its potential for informed decision-making and advancements across diverse fields.
This chapter addresses how one could quantify and explore the impact of geopolitics on global businesses. Computational geopolitics is an attempt to integrate quantitative methods and geopolitical analysis to understand and predict trends. The explosive growth of data, improvements in computational power, and access to cloud computing have led to a proliferation of computational methods in analyzing geopolitics and its impact on companies. The chapter explores some tools and techniques used in computational geopolitics, including events-based approaches to measuring geopolitical tensions, textual approaches, and empirical approaches. In addition, it provides examples of ways in which analysts can quantify the impact of geopolitics on trade and foreign direct investment. It also introduces experimental methods to assess the effectiveness of companies’ strategic responses to geopolitical tensions. Large language models (LLMs) can be used for sentiment analysis, spotting trends, scenario building, risk assessment, and strategic recommendations. While these methods offer advances in quantifying the impact of geopolitics on global businesses, analysts should also be cautious about data quality and availability as well as the complexity of the phenomenon and the geopolitics of AI. The chapter concludes by pointing the reader to some widely used data sources for computational geopolitics.
Nowadays, researchers and clinicians alike have to deal with increasingly large datasets, not least in the context of mental health data. Sophisticated tools for visualizing information from various item-based instruments, such as questionnaires, digital applications, or clinical documentation, are still lacking, particularly tools that integrate data at multiple levels and support both data organization and the construction of datasets for valid use in subsequent analyses.
Methods
Here, we introduce ItemComplex, a Python-based framework for ex-post visualization of large datasets. The method recognizes alignments between instruments and identifies new content networks and graphs based on item similarities and on shared versus distinct conceptual bases within and across datasets and studies.
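ItemComplex itself is described rather than published here, so the following is only a generic sketch of the kind of item-similarity network the framework builds on: item texts are embedded, and items whose similarity exceeds a threshold are linked. All instrument items, names, and the threshold are illustrative assumptions.

```python
# Generic item-similarity network sketch (not the ItemComplex implementation):
# TF-IDF embeddings of item texts, cosine similarity, thresholded graph edges.
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

items = {  # illustrative questionnaire items
    "PHQ9_01": "Little interest or pleasure in doing things",
    "PHQ9_02": "Feeling down, depressed, or hopeless",
    "GAD7_01": "Feeling nervous, anxious, or on edge",
}
ids = list(items)
tfidf = TfidfVectorizer().fit_transform(items[k] for k in ids)
sim = cosine_similarity(tfidf)

G = nx.Graph()
G.add_nodes_from(ids)
THRESHOLD = 0.2  # arbitrary cut-off for drawing an edge
for i in range(len(ids)):
    for j in range(i + 1, len(ids)):
        if sim[i, j] >= THRESHOLD:
            G.add_edge(ids[i], ids[j], weight=float(sim[i, j]))

print(G.edges(data=True))  # clusters of similar items hint at shared constructs
```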
Results
The ItemComplex framework was evaluated using four existing large datasets from four different cohort studies and demonstrated successful data visualization across multi-item instruments within and across studies. ItemComplex enables researchers and clinicians to navigate through big datasets reliably, informatively, and quickly. Moreover, it facilitates the extraction of new insights into construct representations and concept identifications within the data.
Conclusions
The ItemComplex app is an efficient tool for big data management and analysis, addressing the growing complexity of modern datasets and helping to harness the potential hidden within these extensive collections of information. It is also easily adjustable to individual datasets and user preferences, in both research and clinical settings.
A distinction between types of methods (understanding and explanation) that generate different kinds of evidence relevant to the psychiatric assessment is characterised. The distinction is animated with both non-clinical and clinical examples and exercises. Scepticism about the distinction is addressed, and three influential systems of psychiatric knowledge which collapse understanding and explanation in different ways are discussed. The argument is made that the distinction (analogous to the romantic/classic distinction) resurfaces and is compelling. However, another challenge becomes important – holism in psychiatric assessment – which the understanding/explanation distinction leaves in an unsatisfactory state.