The rise of visually driven platforms like Instagram has reshaped how information is shared and understood. This study examines the role of social, cultural, and political (SCP) symbols in Instagram posts during Taiwan’s 2024 election, focusing on their influence in anti-misinformation efforts. Using large language models (LLMs)—GPT-4 Omni and Gemini Pro Vision—we analyzed thousands of posts to extract and classify symbolic elements, comparing model performance in consistency and interpretive depth. We evaluated how SCP symbols affect user engagement, perceptions of fairness, and content spread. Engagement was measured by likes, while diffusion patterns followed the SEIZ epidemiological model. Findings show that posts featuring SCP symbols consistently received more interaction, even when follower counts were equal. Although political content creators often had larger audiences, posts with cultural symbols drove the highest engagement, were perceived as more fair and trustworthy, and spread more rapidly across networks. Our results suggest that symbolic richness influences online interactions more than audience size. By integrating semiotic analysis, LLM-based interpretation, and diffusion modeling, this study offers a novel framework for understanding how symbolic communication shapes engagement on visual platforms. These insights can guide designers, policymakers, and strategists in developing culturally resonant, symbol-aware messaging to combat misinformation and promote credible narratives.
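As a concrete illustration of the diffusion component mentioned above, the sketch below integrates the SEIZ (Susceptible, Exposed, Infected, Skeptic) compartmental equations with standard numerical tools; the parameter values and population size are illustrative assumptions, not estimates reported in the study.

```python
# Minimal SEIZ (Susceptible-Exposed-Infected-Skeptic) diffusion sketch.
# All parameter values below are illustrative placeholders, not fitted values.
import numpy as np
from scipy.integrate import odeint

def seiz(y, t, N, beta, b, rho, eps, p, l):
    S, E, I, Z = y
    dS = -beta * S * I / N - b * S * Z / N                      # susceptibles meet adopters or skeptics
    dE = ((1 - p) * beta * S * I / N + (1 - l) * b * S * Z / N  # undecided (exposed) users
          - rho * E * I / N - eps * E)
    dI = p * beta * S * I / N + rho * E * I / N + eps * E       # users who adopt/share the content
    dZ = l * b * S * Z / N                                      # skeptics who ignore it
    return [dS, dE, dI, dZ]

N = 10_000
y0 = [N - 10, 0, 10, 0]                 # start with 10 initial sharers
t = np.linspace(0, 30, 300)             # 30 days
sol = odeint(seiz, y0, t, args=(N, 0.9, 0.3, 0.2, 0.1, 0.7, 0.5))
print("peak number of sharers:", int(sol[:, 2].max()))
```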
The capabilities of large language models (LLMs) have advanced to the point where entire textbooks can be queried using retrieval-augmented generation (RAG), enabling AI to integrate external, up-to-date information into its responses. This study evaluates the ability of two OpenAI models, GPT-3.5 Turbo and GPT-4 Turbo, to create and answer exam questions based on an undergraduate textbook. Fourteen exams were created, each with four true-false, four multiple-choice, and two short-answer questions derived from an open-source Pacific Studies textbook. Model performance was evaluated with and without access to the source material using text-similarity metrics such as ROUGE-1, cosine similarity, and word embeddings. Fifty-six exam scores were analyzed, revealing that RAG-assisted models significantly outperformed those relying solely on pre-trained knowledge. GPT-4 Turbo also consistently outperformed GPT-3.5 Turbo in accuracy and coherence, especially in short-answer responses. These findings demonstrate the potential of LLMs in automating exam generation while maintaining assessment quality. However, they also underscore the need for policy frameworks that promote fairness, transparency, and accessibility. Given regulatory considerations outlined in the European Union AI Act and the NIST AI Risk Management Framework, institutions using AI in education must establish governance protocols, bias mitigation strategies, and human oversight measures. The results of this study contribute to ongoing discussions on responsibly integrating AI in education, advocating for institutional policies that support AI-assisted assessment while preserving academic integrity. The empirical results suggest not only performance benefits but also actionable governance mechanisms, such as verifiable retrieval pipelines and oversight protocols, that can guide institutional policies.
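For readers unfamiliar with the text-similarity metrics named above, the following minimal sketch scores a candidate answer against a reference answer using ROUGE-1 F1 and bag-of-words cosine similarity; it is a simplified stand-in for the study's evaluation pipeline, and the example sentences are invented.

```python
# Illustrative scoring of a model answer against a reference answer with
# ROUGE-1 F1 and bag-of-words cosine similarity (not the paper's exact pipeline).
from collections import Counter
import math

def rouge1_f1(candidate: str, reference: str) -> float:
    c, r = Counter(candidate.lower().split()), Counter(reference.lower().split())
    overlap = sum((c & r).values())          # unigram overlap
    if not overlap:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

def cosine_sim(candidate: str, reference: str) -> float:
    c, r = Counter(candidate.lower().split()), Counter(reference.lower().split())
    dot = sum(c[w] * r[w] for w in c)
    norm = (math.sqrt(sum(v * v for v in c.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    return dot / norm if norm else 0.0

ref = "The Pacific Islands were settled through successive waves of voyaging."
ans = "Settlement of the Pacific Islands occurred in successive voyaging waves."
print(round(rouge1_f1(ans, ref), 3), round(cosine_sim(ans, ref), 3))
```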
The rise of large language models (LLMs) has marked a substantial leap toward artificial general intelligence. However, the utilization of LLMs in the (re)insurance sector remains a challenging problem because of the gap between general capabilities and domain-specific requirements. Two prevalent methods for domain specialization of LLMs are prompt engineering and fine-tuning. In this study, we aim to evaluate the efficacy of LLMs, enhanced with prompt engineering and fine-tuning techniques, on quantitative reasoning tasks within the (re)insurance domain. It is found that (1) compared to prompt engineering, fine-tuning with a task-specific calculation dataset provides a remarkable leap in performance, even exceeding the performance of larger pre-trained LLMs; (2) when available task-specific calculation data are limited, supplementing LLMs with a domain-specific knowledge dataset is an effective alternative; and (3) enhanced reasoning capabilities, rather than mere computational skills, should be the primary focus for LLMs when tackling quantitative tasks. Moreover, the fine-tuned models demonstrate a consistent aptitude for common-sense reasoning and factual knowledge, as evidenced by their performance on public benchmarks. Overall, this study demonstrates the potential of LLMs to serve as powerful AI assistants and to solve quantitative reasoning tasks in the (re)insurance sector.
The rapid development of generative artificial intelligence (AI) systems, particularly those fuelled by increasingly advanced large language models (LLMs), has raised concerns among policymakers globally about their potential risks. In July 2023, Chinese regulators enacted the Interim Measures for the Management of Generative AI Services (“the Measures”). The Measures aim to mitigate various risks associated with public-facing generative AI services, particularly those concerning content safety and security. At the same time, Chinese regulators are seeking the further development and application of such technology across diverse industries. Tensions between these policy objectives are reflected in the provisions of the Measures that entail different types of obligations on generative AI service providers. Such tensions present significant challenges for the implementation of the regulation. As Beijing moves towards establishing a comprehensive legal framework for AI governance, legislators will need to further clarify and balance the responsibilities of diverse stakeholders.
The AI Act contains some specific provisions dealing with the possible use of artificial intelligence for discriminatory purposes or in discriminatory ways, in the context of the European Union. The AI Act also regulates generative AI models. However, these two respective sets of rules have little in common: provisions concerning non-discrimination tend not to cover generative AI, and generative AI rules tend not to cover discrimination. Based on this analysis, the chapter considers the current EU legal framework on discriminatory output of generative AI models, and concludes that expressions already prohibited by anti-discrimination law certainly remain prohibited after the approval of the AI Act, while discriminatory content not covered by EU non-discrimination legislation will remain lawful. For the moment, the AI Act has not brought any particularly relevant innovation on this specific matter, but the picture might change in the future.
This chapter deals with the use of Large Language Models (LLMs) in the legal sector from a comparative law perspective. It explores their advantages and risks; the pertinent question of whether the deployment of LLMs by non-lawyers can be classified as an unauthorized practice of law in the US and Germany; what lawyers, law firms, and legal departments need to consider when using LLMs under professional rules of conduct, especially the American Bar Association Model Rules of Professional Conduct and the Charter of Core Principles of the European Legal Profession of the Council of Bars and Law Societies of Europe; and, finally, how the recently published AI Act will affect the legal tech market, specifically the use of LLMs. A concluding section summarizes the main findings and points out open questions.
This study aims to explore the feasibility and accuracy of utilizing large language models (LLMs) to assess the risk of bias (ROB) in cohort studies. We conducted a pilot and feasibility study in 30 cohort studies randomly selected from the reference lists of published Cochrane reviews. We developed a structured prompt to guide ChatGPT-4o, Moonshot-v1-128k, and DeepSeek-V3 in assessing the ROB of each cohort twice. We used the ROB results assessed by three evidence-based medicine experts as the gold standard, and then evaluated the accuracy of the LLMs by calculating the correct assessment rate, sensitivity, specificity, and F1 scores at the overall and item-specific levels. The consistency of the overall and item-specific assessment results was evaluated using Cohen’s kappa (κ) and prevalence-adjusted bias-adjusted kappa. Efficiency was estimated by the mean assessment time required. The three LLMs showed distinct performance across the eight assessment items. Overall accuracy was comparable (80.8%–83.3%). Moonshot-v1-128k showed superior sensitivity in population selection (0.92 versus ChatGPT-4o’s 0.55, P < 0.001). In terms of F1 scores, Moonshot-v1-128k led in population selection (F1 = 0.80 versus ChatGPT-4o’s 0.67, P = 0.004). ChatGPT-4o demonstrated the highest consistency (mean κ = 96.5%), with perfect agreement (100%) in outcome confidence. ChatGPT-4o was 97.3% faster per article (32.8 seconds versus 20 minutes manually) and outperformed Moonshot-v1-128k and DeepSeek-V3 by 47–50% in processing speed. The efficient and accurate assessment of ROB in cohort studies by ChatGPT-4o, Moonshot-v1-128k, and DeepSeek-V3 highlights the potential of LLMs to enhance the systematic review process.
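The agreement statistics cited above can be computed from two rating runs as in the sketch below; the binary ratings are made-up placeholders, and the functions follow the standard definitions of Cohen's kappa and PABAK rather than the study's exact implementation.

```python
# Sketch of the agreement statistics named in the abstract: Cohen's kappa and
# prevalence-adjusted bias-adjusted kappa (PABAK) for two rating runs.
# The ratings below are synthetic placeholders, not study data.
from collections import Counter

def cohens_kappa(r1, r2):
    n = len(r1)
    p_o = sum(a == b for a, b in zip(r1, r2)) / n               # observed agreement
    c1, c2 = Counter(r1), Counter(r2)
    p_e = sum(c1[k] * c2[k] for k in set(r1) | set(r2)) / (n * n)  # chance agreement
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0

def pabak(r1, r2):
    # For binary ratings, PABAK reduces to 2 * observed agreement - 1.
    p_o = sum(a == b for a, b in zip(r1, r2)) / len(r1)
    return 2 * p_o - 1

run1 = ["low", "high", "low", "low", "high", "low", "low", "low"]
run2 = ["low", "high", "low", "high", "high", "low", "low", "low"]
print(round(cohens_kappa(run1, run2), 3), round(pabak(run1, run2), 3))
```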
Generative artificial intelligence has a long history but surged into global prominence with the introduction in 2017 of the transformer architecture for large language models. Based on deep learning with artificial neural networks, transformers revolutionised the field of generative AI for the production of natural language outputs. Today’s large language models, and other forms of generative artificial intelligence, now have unprecedented capability and versatility. The emergence of these highly capable forms of generative AI poses many legal issues and questions, including consequences for intellectual property, contracts and licences, liability, data protection, use in specific sectors, potential harms, and of course ethics, policy, and regulation of the technology. To support the discussion of these topics in this Handbook, this chapter gives a relatively non-technical introduction to the technology of modern artificial intelligence and generative AI.
The advent and momentum gained by Generative AI erupted into the EU regulatory scene, signalling a significant paradigm shift in the AI landscape. The AI Act has struggled to keep pace with the eruption and extraordinary popularity of Generative AI, but has managed to provide specific solutions designed for these models. Nonetheless, there are legal and regulatory implications of Generative AI that may exceed the proposed solutions. Understanding the paradigm shift that Generative AI is likely to bring will allow us to assess the sufficiency and adequacy of the measures adopted and to identify possible shortcomings and gaps in the current EU framework. Generative AI raises specific problems for compliance with AI Act obligations and for the application of liability rules, and these have to be acknowledged and properly addressed. Multimodality, emergence, scalability, or generality of tasks may not match the assumptions underlying the obligations and requirements laid down for AI systems. The chapter explores whether the current ecosystem of existing and still-to-be-adopted rules on AI systems fully and adequately addresses the distinctive features of Generative AI, with special consideration of the interaction between the AI Act and the liability rules provided for in the draft AILD and the revPLD.
The philosophy of linguistics reflects on multiple scientific disciplines aimed at the understanding of one of the most fundamental aspects of human existence, our ability to produce and understand natural language. Linguistics, viewed as a science, has a long history but it was the advent of the formal (and computational) revolution in cognitive science that established the field as both scientifically and philosophically appealing. In this Element, the topic will be approached as a means for understanding larger issues in the philosophy of science more generally.
Systematic reviews (SRs) synthesize evidence through a rigorous, labor-intensive, and costly process. To accelerate the title–abstract screening phase of SRs, several artificial intelligence (AI)-based semi-automated screening tools have been developed to reduce workload by prioritizing relevant records. However, their performance is primarily evaluated for SRs of intervention studies, which generally have well-structured abstracts. Here, we evaluate whether screening tool performance is equally effective for SRs of prognosis studies that have larger heterogeneity between abstracts. We conducted retrospective simulations on prognosis and intervention reviews using a screening tool (ASReview). We also evaluated the effects of review scope (i.e., breadth of the research question), number of (relevant) records, and modeling methods within the tool. Performance was assessed in terms of recall (i.e., sensitivity), precision at 95% recall (i.e., positive predictive value at 95% recall), and workload reduction (work saved over sampling at 95% recall [WSS@95%]). The WSS@95% was slightly worse for prognosis reviews (range: 0.324–0.597) than for intervention reviews (range: 0.613–0.895). The precision was higher for prognosis (range: 0.115–0.400) compared to intervention reviews (range: 0.024–0.057). These differences were primarily due to the larger number of relevant records in the prognosis reviews. The modeling methods and the scope of the prognosis review did not significantly impact tool performance. We conclude that the larger abstract heterogeneity of prognosis studies does not substantially affect the effectiveness of screening tools for SRs of prognosis. Further evaluation studies including a standardized evaluation framework are needed to enable prospective decisions on the reliable use of screening tools.
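The workload metric reported above, WSS@95%, can be derived from a tool's priority ranking as in the following sketch; the ranked labels are synthetic, and the function uses the common definition of work saved over sampling rather than ASReview's internal code.

```python
# Illustrative computation of work saved over sampling at 95% recall (WSS@95%)
# and precision at 95% recall from a screening tool's ranked record order.
# The ranking below is synthetic, not taken from the evaluated reviews.

def wss_at_recall(ranked_labels, target_recall=0.95):
    n = len(ranked_labels)
    n_relevant = sum(ranked_labels)
    needed = target_recall * n_relevant
    found = 0
    for i, label in enumerate(ranked_labels, start=1):
        found += label
        if found >= needed:
            screened = i                     # records screened to reach target recall
            break
    precision = found / screened             # precision at 95% recall
    wss = (n - screened) / n - (1 - target_recall)
    return wss, precision

# 1 = relevant record, 0 = irrelevant, ordered by the tool's priority ranking
ranking = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
print(wss_at_recall(ranking))
```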
This article proposes a new approach for measuring the quality of answers in political question-and-answer sessions. We assess the quality of an answer based on how easily and accurately it can be recognized among a random set of candidate answers given the question’s text. This measure reflects the answer’s relevance and depth of engagement with the question. Drawing a parallel with semantic search, we can implement this approach by training a language model on the corpus of observed questions and answers without additional human-labeled data. We showcase and validate our methodology within the context of the Question Period in the Canadian House of Commons. Our analysis reveals that while some answers only have a weak semantic connection to questions, suggesting some evasion or obfuscation, they are generally at least moderately relevant, far exceeding what we would expect from random replies. We also find meaningful correlations between the quality of answers and the party affiliation of the members of Parliament asking the questions.
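A minimal sketch of the recognition idea is given below: the actual answer is ranked among random candidate answers by its embedding similarity to the question. The pretrained bi-encoder is an assumed stand-in rather than the authors' model trained on the Question Period corpus, and the example texts are invented.

```python
# Sketch of the proposed quality measure: how easily the actual answer can be
# recognized among random candidate answers given the question text.
# The pretrained model below is an assumed stand-in, not the authors' model.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def answer_rank(question: str, true_answer: str, distractors: list[str]) -> int:
    candidates = [true_answer] + distractors
    q_emb = model.encode([question], normalize_embeddings=True)[0]
    c_emb = model.encode(candidates, normalize_embeddings=True)
    scores = c_emb @ q_emb                   # cosine similarities to the question
    # Rank 1 means the true answer is the candidate most similar to the question.
    return 1 + int((scores > scores[0]).sum())

print(answer_rank(
    "What is the government's plan to address housing affordability?",
    "We are investing in new affordable housing units across the country.",
    ["The honourable member should check the record.",
     "Our trade relationships remain strong."],
))
```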
Against the backdrop of the proliferation of large language model (LLM)-based Artificial Intelligence (AI) products such as ChatGPT and Gemini, and their increasing use in professional communication training, researchers, including applied linguists, have cautioned that these products (re)produce cultural stereotypes due to their training data. However, there is a limited understanding of how humans navigate the assumptions and biases present in the responses of these LLM-powered systems and the role humans play in perpetuating stereotypes during interactions with LLMs. In this article, we use Sequential-Categorial Analysis, which combines Conversation Analysis and Membership Categorization Analysis, to analyze simulated interactions between a human physiotherapist and three LLM-powered chatbot patients of Chinese, Australian, and Indian cultural backgrounds. Coupled with analysis of information elicited from the LLM chatbots and the human physiotherapist after each interaction, we demonstrate that users of LLM-powered systems are highly susceptible to becoming interactionally entrenched in culturally essentialized narratives. We use the concepts of interactional instinct and interactional entrenchment to argue that whilst human–AI interaction may be instinctively prosocial, LLM users need to develop Critical Interactional Competence for human–AI interaction through appropriate and targeted training and intervention, especially when LLM-powered tools are used in professional communication training programs.
Protest event analysis (PEA) is the core method for understanding spatial patterns and temporal dynamics of protest. We show how Large Language Models (LLMs) can be used to automate the classification of protest events, and of political event data more broadly, with levels of accuracy comparable to humans, while reducing necessary annotation time by several orders of magnitude. We propose a modular pipeline for the automation of PEA (PAPEA) based on fine-tuned LLMs and provide publicly available models and tools which can be easily adapted and extended. PAPEA makes it possible to go from newspaper articles to PEA datasets with high levels of precision without human intervention. A use case based on a large German news corpus illustrates the potential of PAPEA.
Recent studies highlight the potential of large language models (LLMs) in citation screening for systematic reviews; however, the efficiency of individual LLMs for this application remains unclear. This study aimed to compare accuracy, time-related efficiency, cost, and consistency across four LLMs—GPT-4o, Gemini 1.5 Pro, Claude 3.5 Sonnet, and Llama 3.3 70B—for literature screening tasks. The models screened for clinical questions from the Japanese Clinical Practice Guidelines for the Management of Sepsis and Septic Shock 2024. Sensitivity and specificity were calculated for each model based on conventional citation screening results for qualitative assessment. We also recorded the time and cost of screening and assessed consistency to verify reproducibility. A post hoc analysis explored whether integrating outputs from multiple models could enhance screening accuracy. GPT-4o and Llama 3.3 70B achieved high specificity but lower sensitivity, while Gemini 1.5 Pro and Claude 3.5 Sonnet exhibited higher sensitivity at the cost of lower specificity. Citation screening times and costs varied, with GPT-4o being the fastest and Llama 3.3 70B the most cost-effective. Consistency was comparable among the models. An ensemble approach combining model outputs improved sensitivity but increased the number of false positives, requiring additional review effort. Each model demonstrated distinct strengths, effectively streamlining citation screening by saving time and reducing workload. However, reviewing false positives remains a challenge. Combining models may enhance sensitivity, indicating the potential of LLMs to optimize systematic review workflows.
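The post hoc ensemble described above can be expressed as a simple inclusion rule across model votes, as in the sketch below; the records and votes are synthetic placeholders, and the "include if any model includes" rule is one plausible reading of the combination strategy.

```python
# Sketch of the ensemble idea: a record passes title/abstract screening if any
# model votes to include it, which raises sensitivity at the cost of more false
# positives that still need human review. Votes below are synthetic placeholders.

def ensemble_include(votes: dict[str, bool]) -> bool:
    # "Include if any model includes" (logical OR across models).
    return any(votes.values())

records = {
    "rec-001": {"gpt-4o": False, "gemini-1.5-pro": True,
                "claude-3.5-sonnet": True, "llama-3.3-70b": False},
    "rec-002": {"gpt-4o": False, "gemini-1.5-pro": False,
                "claude-3.5-sonnet": False, "llama-3.3-70b": False},
}

for rec_id, votes in records.items():
    decision = "include for full-text review" if ensemble_include(votes) else "exclude"
    print(rec_id, "->", decision)
```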
This chapter addresses how one could quantify and explore the impact of geopolitics on global businesses. Computational geopolitics is an attempt to integrate quantitative methods and geopolitical analysis to understand and predict trends. The explosive growth of data, improvements in computational power, and access to cloud computing have led to a proliferation of computational methods in analyzing geopolitics and its impact on companies. The chapter explores some tools and techniques used in computational geopolitics, including events-based approaches to measuring geopolitical tensions, textual approaches, and empirical approaches. In addition, it provides examples of ways in which analysts can quantify the impact of geopolitics on trade and foreign direct investment. It also introduces experimental methods to assess the effectiveness of companies’ strategic responses to geopolitical tensions. Large language models (LLMs) can be used for sentiment analysis, spotting trends, scenario building, risk assessment, and strategic recommendations. While these methods offer advances in quantifying the impact of geopolitics on global businesses, analysts should also be cautious about data quality and availability as well as the complexity of the phenomenon and the geopolitics of AI. The chapter concludes by pointing the reader to some widely used data sources for computational geopolitics.
Human language is increasingly written rather than just spoken, primarily due to the proliferation of digital technology in modern life. This trend has enabled the creation of generative artificial intelligence (AI) trained on corpora containing trillions of words extracted from text on the internet. However, current language theory inadequately addresses digital text communication’s unique characteristics and constraints. This paper systematically analyzes and synthesizes existing literature to map the theoretical landscape of digitized language. The evidence demonstrates that, parallel to spoken language, features of written communication are frequently correlated with the socially constructed demographic identities of writers, a phenomenon we refer to as “digital accents.” This conceptualization raises complex ontological questions about the nature of digital text and its relationship to social identity. The same line of questioning, in conjunction with recent research, shows how generative AI systematically fails to capture the breadth of expression observed in human writing, an outcome we call “homogeneity-by-design.” By approaching text-based language from this theoretical framework while acknowledging its inherent limitations, social scientists studying language can strengthen their critical analysis of AI systems and contribute meaningful insights to their development and improvement.
Background:
Biostatisticians increasingly use large language models (LLMs) to enhance efficiency, yet practical guidance on responsible integration is limited. This study explores current LLM usage, challenges, and training needs to support biostatisticians.
Methods:
A cross-sectional survey was conducted across three biostatistics units at two academic medical centers. The survey assessed LLM usage across three key professional activities: communication and leadership, clinical and domain knowledge, and quantitative expertise. Responses were analyzed using descriptive statistics, while free-text responses underwent thematic analysis.
Results:
Of 208 eligible biostatisticians (162 staff and 46 faculty), 69 (33.2%) responded. Among them, 44 (63.8%) reported using LLMs; of the 43 who answered the frequency question, 20 (46.5%) used them daily and 16 (37.2%) weekly. LLMs improved productivity in coding, writing, and literature review; however, 29 of 41 respondents (70.7%) reported significant errors, including incorrect code, statistical misinterpretations, and hallucinated functions. Key verification strategies included expertise, external validation, debugging, and manual inspection. Among 58 respondents providing training feedback, 44 (75.9%) requested case studies, 40 (69.0%) sought interactive tutorials, and 37 (63.8%) desired structured training.
Conclusions:
LLM usage is notable among respondents at two academic medical centers, though response patterns likely reflect early adopters. While LLMs enhance productivity, challenges like errors and reliability concerns highlight the need for verification strategies and systematic validation. The strong interest in training underscores the need for structured guidance. As an initial step, we propose eight core principles for responsible LLM integration, offering a preliminary framework for structured usage, validation, and ethical considerations.
The emergence of large language models (LLMs) has made it increasingly difficult to protect and enforce intellectual property (IP) rights in a digital landscape where content can be easily accessed and utilized without clear authorization. First, we explain why LLMs make it uniquely difficult to protect and enforce IP, creating a ‘tragedy of the commons.’ Second, drawing on theories of polycentric governance, we argue that non-fungible tokens (NFTs) could be effective tools for addressing the complexities of digital IP rights. Third, we provide an illustrative case study that shows how NFTs can facilitate dispute resolution of IP on the blockchain.
Audits of multilingual resources are reporting shockingly poor quality: “less than 50% … acceptable quality.” There is too much translationese in too many of our multilingual resources, e.g., Wikipedia, XNLI, FLORES, WordNet. We view translationese as a form of noise that makes it hard to generalize from a benchmark based on translation to a real task of interest that does not involve translation. Worse, too much of this translationese is in the “wrong” direction. Directionality matters. Professional translators translate from their weaker language into their stronger language. Unfortunately, many of our resources translate in the other direction, from a stronger (higher-resource) language into a weaker (lower-resource) language. In Wikipedia, for example, there is more translation out of English than into English. We recommend more investments in high-quality data, and less in translation, especially in the “wrong” direction.