This study explores how ChatGPT affects the completeness of collaborative computer-aided design (CAD) tasks requiring varying types of engineering knowledge. In an experiment involving 22 pairs of mechanical engineering students, three different collaborative CAD tasks were undertaken with and without ChatGPT support. The findings indicate that ChatGPT support hinders completeness in collaborative CAD-specific tasks reliant on CAD knowledge but demonstrates limited potential in assisting open-ended tasks requiring domain-specific engineering expertise. While ChatGPT mitigates task-specific challenges by providing general engineering knowledge, it fails to improve overall task completeness. The results underscore the complementary roles of AI and human knowledge.
The emergence of ChatGPT as a leading artificial intelligence language model developed by OpenAI has sparked substantial interest in the field of applied linguistics, due to its extraordinary capabilities in natural language processing. Research on its use in service of language learning and teaching is on the horizon and is anticipated to grow rapidly. In this review article, we aim to capture its nascency, drawing on a literature corpus of 71 papers of a variety of genres – empirical studies, reviews, position papers, and commentaries. Our narrative review takes stock of current research on ChatGPT’s application in foreign language learning and teaching, uncovers both conceptual and methodological gaps, and identifies directions for future research.
The proliferation of Artificial Intelligence (AI) is significantly transforming conventional legal practice. The integration of AI into legal services is still in its infancy and faces challenges such as privacy concerns, bias, and the risk of fabricated responses. This research evaluates the performance of the following AI tools: (1) ChatGPT-4, (2) Copilot, (3) DeepSeek, (4) Lexis+ AI, and (5) Llama 3. Based on their comparison, the research demonstrates that Lexis+ AI outperforms the other AI solutions. All these tools still produce hallucinations, despite claims that utilizing the Retrieval-Augmented Generation (RAG) model has resolved this issue. The RAG system is not the driving force behind the results; it is one component of the AI architecture that influences but does not solely account for the problems associated with the AI tools. This research explores RAG architecture and its inherent complexities, offering viable approaches for improving the performance of AI-powered tools.
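Since this abstract's argument turns on what RAG does and does not fix, a concrete picture of the pipeline helps. The following is a minimal, generic sketch in Python, not the architecture of any tool tested in the study: it uses TF-IDF retrieval over a hypothetical three-document corpus (production systems typically use dense embeddings and a vector store) and shows how retrieved passages are spliced into the prompt. The final answer is still free-text generation, which is where hallucinations can persist despite retrieval.

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# The corpus and query are hypothetical placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "Article 12: A tenant may terminate the lease with 30 days' notice.",
    "Article 45: Security deposits must be returned within 14 days.",
    "Article 78: Landlords must give 24 hours' notice before entry.",
]
query = "How quickly must a security deposit be refunded?"

# Step 1 (retrieval): rank passages by similarity to the query.
vectorizer = TfidfVectorizer().fit(corpus + [query])
scores = cosine_similarity(vectorizer.transform([query]),
                           vectorizer.transform(corpus))[0]
top_k = scores.argsort()[::-1][:2]

# Step 2 (augmentation): splice the retrieved passages into the prompt.
context = "\n".join(corpus[i] for i in top_k)
prompt = (
    "Answer using ONLY the sources below. If they do not contain the "
    f"answer, say so.\n\nSources:\n{context}\n\nQuestion: {query}"
)
print(prompt)  # Step 3 (generation) would send this prompt to the LLM.
```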
This empirical study explores three aspects of engagement (affective, behavioral, and cognitive) in language learning within an English as a Foreign Language context in Japan, examining their relationship with AI utilization. Previous research has demonstrated that motivation positively influences AI usage. This study expands on that by connecting motivation with engagement, where AI usage serves as an intermediary construct. A total of 174 students participated in the study. Throughout the semester, they were required to use Generative AI (GenAI) to receive feedback on their writing. To prevent overreliance or plagiarism, carefully crafted prompts were selected. Students were tasked with collaboratively constructing essays during the semester using GenAI. At the end of the semester, students completed a survey measuring their motivation and engagement. Structural Equation Modeling was employed to reaffirm the previous finding that motivation influences AI usage. The results showed that AI usage impacts all three aspects of engagement. Based on these findings, the study suggests the pedagogical feasibility of implementing GenAI in writing classes with proper teacher guidance. Rather than being a threat, the use of this technological tool complements the role of human teachers and supports learning engagement.
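For readers unfamiliar with the method, a mediation model of the shape described here (motivation → AI usage → three engagement factors) can be specified compactly. The sketch below uses Python's semopy package purely as an illustration; the study does not state its software, and the data file, column names, and the use of observed composite scores instead of full latent measurement models are all assumptions.

```python
# Hedged sketch of the motivation -> AI usage -> engagement path model.
import pandas as pd
import semopy

spec = """
ai_usage   ~ motivation
affective  ~ ai_usage
behavioral ~ ai_usage
cognitive  ~ ai_usage
"""

# Hypothetical file with one composite score per construct per student.
df = pd.read_csv("survey_scores.csv")

model = semopy.Model(spec)
model.fit(df)
print(model.inspect())           # path coefficients and p-values
print(semopy.calc_stats(model))  # fit indices such as CFI and RMSEA
```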
Generative artificial intelligence (AI) systems, notably ChatGPT, have emerged in legal practice, facilitating the completion of tasks ranging from electronic communications to the drafting of documents. The generative capabilities of these systems underscore the duty of lawyers to competently represent their clients by keeping abreast of technological developments that can enhance the efficiency and effectiveness of their work. At the same time, the processing of clients’ information through generative AI systems threatens to compromise their confidentiality if disclosed to third parties, including the systems’ providers. The present paper aims to determine the impact of the use of generative AI systems by lawyers on the duties of competence and confidentiality. The findings derive from the application of doctrinal and empirical research on the legal practice and its digitalisation in Luxembourg. The paper finally reflects on the integration of generative AI systems in legal practice to raise the quality of legal services for clients.
After its launch on 30 November 2022, ChatGPT (or Chat Generative Pre-Trained Transformer) quickly became the fastest-growing app in history, gaining one hundred million users in just two months. Developed by the US-based artificial-intelligence firm OpenAI, ChatGPT is a free, text-based AI system designed to interact with the user in a conversational way. Capable of answering complex questions with sophistication and of conversing in a breezy and impressively human style, ChatGPT can also generate outputs in a seemingly endless variety of formats, from professional memos to Bob Dylan lyrics, HTML code to screenplays and five-alarm chilli recipes to five-paragraph essays. Its remarkable capability relative to earlier chatbots gave rise to both astonishment and concern in the tech sector. On 22 March 2023, a group of more than one thousand scientists and entrepreneurs published an open letter calling for a six-month moratorium on further human-competitive AI development – a moratorium that was not observed.
Since the publication of “What is the Current and Future Status of Digital Mental Health Interventions?”, the exponential growth and widespread adoption of ChatGPT have underscored the importance of reassessing its utility in digital mental health interventions. This review critically examined the potential of ChatGPT, particularly focusing on its application within clinical psychology settings as the technology has continued evolving through 2023 and 2024. Alongside this, our literature review spanned US Medical Licensing Examination (USMLE) validations, assessments of the capacity to interpret human emotions, and analyses concerning the identification of depression and its determinants at treatment initiation; we report our findings from this literature here. Our review evaluated the capabilities of GPT-3.5 and GPT-4.0 separately in clinical psychology settings, highlighting the potential of conversational AI to overcome traditional barriers such as stigma and accessibility in mental health treatment. Each model displayed different levels of proficiency, indicating a promising yet cautious pathway for integrating AI into mental health practices.
This study explored the effects of interacting with ChatGPT 4.0 on L2 learners’ motivation to write English argumentative essays. Conducted at a public university in a non-English-speaking country, the study had an experimental and mixed-methods design. It utilized both quantitative and qualitative data analyses to inform the development of effective AI-enhanced tailored interventions for teaching L2 essay writing. Overall, the results revealed that interacting with ChatGPT 4.0 had a positive lasting effect on learners’ motivation to write argumentative essays in English. However, a decline in their motivation at the delayed post-intervention stage suggested the need to maintain a balance between utilizing ChatGPT as a writing support tool and enhancing their independent writing capabilities. Learners attributed the increase in their motivation to several factors, including their perceived improvement in essay writing skills, the supportive learning environment created by ChatGPT as a tutor, positive interactions with it, and the development of meta-cognitive awareness by addressing their specific writing issues. The study highlights the potential of AI-based tools in enhancing L2 learners’ motivation in English classrooms.
The advent of generative artificial intelligence (AI) models holds potential for aiding teachers in the generation of pedagogical materials. However, numerous knowledge gaps concerning the behavior of these models hinder the development of research-informed guidance for their effective use. Here, we assess trends in prompt specificity, variability, and weaknesses in foreign language teacher lesson plans generated by zero-shot prompting in ChatGPT. Iterating a series of prompts that increased in complexity, we found that output lesson plans were generally high quality, though additional context and specificity in a prompt did not guarantee a concomitant increase in quality. Additionally, we observed extreme cases of variability in outputs generated by the same prompt. In many cases, this variability reflected a conflict between outdated (e.g. reciting scripted dialogues) and more current research-based pedagogical practices (e.g. a focus on communication). These results suggest that the training of generative AI models on classic texts concerning pedagogical practices may bias generated content toward teaching practices that have been long refuted by research. Collectively, our results offer immediate translational implications for practicing and training foreign language teachers on the use of AI tools. More broadly, these findings highlight trends in generative AI output that have implications for the development of pedagogical materials across a diversity of content areas.
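The variability finding suggests a simple measurement recipe: sample the same prompt repeatedly and score how much the outputs differ. Below is a hedged sketch, assuming the OpenAI Python client, a placeholder model name, and TF-IDF cosine similarity as a rough proxy for content overlap; none of these choices come from the paper.

```python
# Quantify run-to-run variability of lesson plans from one fixed prompt.
from itertools import combinations
from openai import OpenAI
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

client = OpenAI()  # assumes OPENAI_API_KEY is set
prompt = "Write a 50-minute lesson plan for beginner Spanish on ordering food."

outputs = []
for _ in range(5):  # several generations of the identical prompt
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    outputs.append(resp.choices[0].message.content)

# Lower mean pairwise similarity = higher variability between runs.
vecs = TfidfVectorizer().fit_transform(outputs)
sims = [cosine_similarity(vecs[i], vecs[j])[0, 0]
        for i, j in combinations(range(len(outputs)), 2)]
print(f"mean pairwise similarity: {sum(sims) / len(sims):.2f}")
```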
Recent advances in large language models (LLMs), such as GPT-4, have spurred interest in their potential applications across various fields, including actuarial work. This paper introduces the use of LLMs in actuarial and insurance-related tasks, both as direct contributors to actuarial modelling and as workflow assistants. It provides an overview of LLM concepts and their potential applications in actuarial science and insurance, examining specific areas where LLMs can be beneficial, including a detailed assessment of the claims process. Additionally, a decision framework for determining the suitability of LLMs for specific tasks is presented. Case studies with accompanying code showcase the potential of LLMs to enhance actuarial work. Overall, the results suggest that LLMs can be valuable tools for actuarial tasks involving natural language processing or structuring unstructured data and as workflow and coding assistants. However, their use in actuarial work also presents challenges, particularly regarding professionalism and ethics, for which high-level guidance is provided.
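One of the concrete use cases named here, structuring unstructured data, is easy to illustrate. The sketch below asks an LLM to turn a free-text claim note into typed JSON fields; the model name, field schema, and claim text are illustrative rather than taken from the paper's case studies, and the OpenAI Python client is assumed.

```python
# Hedged sketch: extract structured claim fields from free text.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

claim_note = (
    "Insured reports a burst pipe in the kitchen on 12 March; water "
    "damage to flooring, repair quote of 4,200 EUR attached."
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system",
         "content": ("Extract claim fields. Reply with JSON only, keys: "
                     "loss_date, peril, damaged_items, claimed_amount, "
                     "currency. Use null for anything not stated.")},
        {"role": "user", "content": claim_note},
    ],
    response_format={"type": "json_object"},
)
claim = json.loads(resp.choices[0].message.content)
print(claim)  # downstream actuarial code can now consume typed fields
```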
Many books have been written on the topic of second language assessment, but few are easily accessible for both students and practicing language teachers. This textbook provides an up-to-date and engaging introduction to this topic, using anecdotal and real-world examples to illustrate key concepts and principles. It seamlessly connects qualitative and quantitative approaches and the use of technologies, including generative AI, to language assessment development and analysis for students with little background in these areas. Hands-on activities, exercises, and discussion questions provide opportunities for application and reflection, and the inclusion of additional resources and detailed appendices cements understanding. Ancillary resources are available including datasets and videos for students, PowerPoint teaching slides and a teacher's guide for instructors. Packed with pedagogy, this is an invaluable resource for both first and second language speakers of English, students on applied linguistics or teacher education courses, and practicing teachers of any language.
Multimodal imaging is crucial for diagnosis and treatment in paediatric cardiology. However, the proficiency of artificial intelligence chatbots, like ChatGPT-4, in interpreting these images has not been assessed. This cross-sectional study evaluates the precision of ChatGPT-4 in interpreting multimodal images for paediatric cardiology knowledge assessment, including echocardiograms, angiograms, X-rays, and electrocardiograms. One hundred multiple-choice questions with accompanying images from the textbook Pediatric Cardiology Board Review were randomly selected. The chatbot was prompted to answer these questions with and without the accompanying images. Statistical analysis was done using χ², Fisher’s exact, and McNemar tests. Results showed that ChatGPT-4 answered 41% of questions with images correctly, performing best on those with electrocardiograms (54%) and worst on those with angiograms (29%). Without the images, ChatGPT-4’s performance was similar at 37% (difference = 4%, 95% confidence interval (CI) –9.4% to 17.2%, p = 0.56). The chatbot performed significantly better when provided the image of an electrocardiogram than without (difference = 18%, 95% CI 4.0% to 31.9%, p < 0.04). In cases of incorrect answers, ChatGPT-4 was more inconsistent with an image than without (difference = 21%, 95% CI 3.5% to 36.9%, p < 0.02). In conclusion, ChatGPT-4 performed poorly in answering image-based multiple-choice questions in paediatric cardiology. Its accuracy in answering questions with images was similar to without, indicating limited multimodal image interpretation capabilities. Substantial training is required before clinical integration can be considered. Further research is needed to assess the clinical reasoning skills and progression of ChatGPT in paediatric cardiology for clinical and academic utility.
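The paired analysis behind these numbers is easy to reproduce. The sketch below runs McNemar's test with statsmodels on a hypothetical 2×2 table whose cells were chosen only to match the reported 41% vs. 37% margins; the study's actual per-question pairings are not public.

```python
# McNemar's test on paired with-image / without-image correctness.
from statsmodels.stats.contingency_tables import mcnemar

#             without: correct | without: wrong
table = [[25, 16],   # with image: correct  (25 + 16 = 41)
         [12, 47]]   # with image: wrong    (25 + 12 = 37 correct without)

# Exact binomial test on the 16 vs. 12 discordant pairs.
result = mcnemar(table, exact=True)
print(f"p = {result.pvalue:.2f}")  # ~0.57 with these illustrative counts
```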
The responsibilities and liability of the persons and organisations involved in the development of AI systems are not clearly identified. The assignment of liability will need government to move from a risk-based to a responsibility-based system. One possible approach would be to establish a pan-EU compensation fund for damages caused by digital technologies and AI, financed by the industry and insurance companies.
Large language models (LLMs) offer new research possibilities for social scientists, but their potential as “synthetic data” is still largely unknown. In this paper, we investigate how accurately the popular LLM ChatGPT can recover public opinion, prompting the LLM to adopt different “personas” and then provide feeling thermometer scores for 11 sociopolitical groups. The average scores generated by ChatGPT correspond closely to the averages in our baseline survey, the 2016–2020 American National Election Study (ANES). Nevertheless, sampling by ChatGPT is not reliable for statistical inference: there is less variation in responses than in the real surveys, and regression coefficients often differ significantly from equivalent estimates obtained using ANES data. We also document how the distribution of synthetic responses varies with minor changes in prompt wording, and we show how the same prompt yields significantly different results over a 3-month period. Altogether, our findings raise serious concerns about the quality, reliability, and reproducibility of synthetic survey data generated by LLMs.
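The prompting design is straightforward to sketch: assign the model a persona, elicit a 0–100 thermometer score, and aggregate across personas. In the example below, the personas, target group, and model name are placeholders, and the OpenAI Python client is assumed.

```python
# Hedged sketch of persona-conditioned feeling thermometer scores.
import re
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

personas = [
    "a 45-year-old conservative farmer from rural Iowa",
    "a 23-year-old liberal graduate student from Seattle",
]
target_group = "labor unions"

scores = []
for persona in personas:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": f"Answer as {persona}."},
            {"role": "user",
             "content": (f"On a 0-100 feeling thermometer, how warmly do "
                         f"you feel toward {target_group}? Reply with a "
                         f"number only.")},
        ],
    )
    match = re.search(r"\d+", resp.choices[0].message.content)
    if match:
        scores.append(int(match.group()))

print(f"mean thermometer score: {sum(scores) / len(scores):.1f}")
# Caveat from the abstract: such samples show too little variance for
# valid statistical inference, even when the means track the ANES.
```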
This study provides insights into the capabilities and performance of generative AI, specifically ChatGPT, in engineering design. ChatGPT participated in a 48-hour hackathon by directing two participants who carried out its instructions, successfully designing and prototyping a NERF dart launcher that finished second among six teams. The paper highlights the potential and limitations of generative AI as a tool for ideation, decision-making, and optimization in engineering tasks, demonstrating that it can generate viable design solutions under real-world constraints.
The general public and scientific community alike are abuzz over the release of ChatGPT and GPT-4. Among many concerns being raised about the emergence and widespread use of tools based on large language models (LLMs) is the potential for them to propagate biases and inequities. We hope to open a conversation within the environmental data science community to encourage the circumspect and responsible use of LLMs. Here, we pose a series of questions aimed at fostering discussion and initiating a larger dialogue. To improve literacy on these tools, we provide background information on the LLMs that underpin tools like ChatGPT. We identify key areas in research and teaching in environmental data science where these tools may be applied, and discuss limitations to their use and points of concern. We also discuss ethical considerations surrounding the use of LLMs to ensure that as environmental data scientists, researchers, and instructors, we can make well-considered and informed choices about engagement with these tools. Our goal is to spark forward-looking discussion and research on how as a community we can responsibly integrate generative AI technologies into our work.
OpenAI is a research organization founded by, among others, Elon Musk, and supported by Microsoft. In November 2022, it released ChatGPT, an incredibly sophisticated chatbot, that is, a computer system with which humans can converse. The capability of this chatbot is astonishing: as well as conversing with human interlocutors, it can answer questions about history, explain almost anything you might think to ask it, and write poetry. This level of achievement has provoked interest in questions about whether a chatbot might have something similar to human intelligence and even whether one could be conscious. Given that the function of a chatbot is to process linguistic input and produce linguistic output, we consider that the most interesting question in this direction is whether a sophisticated chatbot might have inner speech. That is: might it talk to itself, internally? We explored this via a conversation with ‘Playground’, a chatbot which is very similar to ChatGPT but more flexible in certain respects. We put to it questions which, plausibly, can only be answered if one first produces some inner speech. Here, we present our findings and discuss their philosophical significance.
Incarceration is a significant social determinant of health, contributing to high morbidity, mortality, and racialized health inequities. However, incarceration status is largely invisible to health services research due to inadequate clinical electronic health record (EHR) capture. This study aims to develop, train, and validate natural language processing (NLP) techniques to more effectively identify incarceration status in the EHR.
Methods:
The study population consisted of adult patients (≥18 years old) who presented to the emergency department between June 2013 and August 2021. The EHR database was filtered for notes containing specific incarceration-related terms, and then a random selection of 1,000 notes was annotated for incarceration and further stratified into the specific statuses of prior history, recent, and current incarceration. For NLP model development, 80% of the notes were used to train the Longformer-based and RoBERTa-based models. The remaining 20% of the notes underwent analysis with GPT-4.
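As a rough illustration of the transformer branch of these methods (the annotated notes are not public), the sketch below fine-tunes a binary RoBERTa classifier with Hugging Face transformers; toy texts stand in for the 80% training split, and the Longformer variant differs mainly in accepting much longer inputs.

```python
# Hedged sketch: fine-tune RoBERTa to flag incarceration evidence.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2)  # incarceration evidence: yes / no

train_texts = ["Patient recently released from prison, presenting with...",
               "No relevant social history documented."]  # toy stand-ins
train_labels = [1, 0]

class NoteDataset(torch.utils.data.Dataset):
    def __init__(self, texts, labels):
        # RoBERTa truncates at 512 tokens; Longformer allows 4,096.
        self.enc = tokenizer(texts, truncation=True, padding=True,
                             max_length=512)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3),
    train_dataset=NoteDataset(train_texts, train_labels),
)
trainer.train()
```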
Results:
There were 849 unique patients across 989 visits in the 1,000 annotated notes. Manual annotation revealed that 559 of 1,000 notes (55.9%) contained evidence of incarceration history. ICD-10 codes (sensitivity: 4.8%, specificity: 99.1%, F1-score: 0.09) demonstrated inferior performance to RoBERTa NLP (sensitivity: 78.6%, specificity: 73.3%, F1-score: 0.79), Longformer NLP (sensitivity: 94.6%, specificity: 87.5%, F1-score: 0.93), and GPT-4 (sensitivity: 100%, specificity: 61.1%, F1-score: 0.86).
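These F1 scores can be sanity-checked from the reported sensitivity and specificity, assuming the evaluation data mirrors the overall prevalence of 559 positive notes in 1,000; small discrepancies are expected because the inputs are themselves rounded.

```python
# Recover F1 from sensitivity, specificity, and assumed prevalence.
def f1_from_sens_spec(sens, spec, n_pos=559, n_neg=441):
    tp = sens * n_pos              # true positives
    fp = (1 - spec) * n_neg        # false positives
    precision = tp / (tp + fp)
    return 2 * precision * sens / (precision + sens)  # recall == sens

print(f"RoBERTa    F1 = {f1_from_sens_spec(0.786, 0.733):.2f}")  # 0.79
print(f"Longformer F1 = {f1_from_sens_spec(0.946, 0.875):.2f}")  # 0.93
print(f"GPT-4      F1 = {f1_from_sens_spec(1.000, 0.611):.2f}")  # ~0.87 vs. reported 0.86
```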
Conclusions:
Our advanced NLP models demonstrate a high degree of accuracy in identifying incarceration status from clinical notes. Further research is needed to explore their scaled implementation in population health initiatives and assess their potential to mitigate health disparities through tailored system interventions.
This article examines the idea of mind-reading technology by focusing on an interesting case of applying a large language model (LLM) to brain data. On the face of it, experimental results appear to show that it is possible to reconstruct mental contents directly from brain data by processing via a ChatGPT-like LLM. However, the author argues that this apparent conclusion is not warranted. Through examining how LLMs work, it is shown that they are importantly different from natural language. The former operate on the basis of nonrational data transformations based on a large textual corpus. The latter has a rational dimension, being based on reasons. Using this as a basis, it is argued that brain data does not directly reveal mental content, but can be processed to ground indirect predictions about mental content. The author concludes that this is impressive but different in principle from technology-mediated mind reading. The applications of LLM-based brain data processing are nevertheless promising for speech rehabilitation or novel communication methods.
Usage of large language models and chatbots will almost surely continue to grow, since they are so easy to use, and so (incredibly) credible. I would be more comfortable with this reality if we encouraged more evaluations with humans-in-the-loop to come up with a better characterization of when the machine can be trusted and when humans should intervene. This article describes a homework assignment in which I asked my students to use tools such as chatbots and web search to write a number of essays. Even after considerable discussion in class on hallucinations, many of the essays were full of misinformation that should have been fact-checked. Apparently, it is easier to believe ChatGPT than to be skeptical. Fact-checking and web search are too much trouble.