1. Introduction
Automobiles have been central to transportation since their mass production in the 1910s. Technological advances have made them more functional and safer. Despite these improvements, road fatalities remain high, with 1.19 million deaths globally in 2021 (World Health Organization, 2023), highlighting the importance of safety-critical automotive systems. The technological evolution in software and hardware has enhanced the functionality and safety of automotive systems while increasing their complexity (Berger & Carlson, 2022). Sensors, GPS, and wireless networks, along with data collection, are transforming the transportation system into a system of systems (SoS) (Zhang et al., 2011). An SoS comprises independently functioning, heterogeneous, and interdependent systems (Jamshidi, 2017). Examples include networked IoT devices, aviation, transportation, and energy systems. These are Complex Products and Systems (CoPS) (Hobday, 1998), which are hierarchical, technology-focused, and customized critical infrastructure systems.
Systems engineering (SE) is an interdisciplinary approach for developing and managing CoPS systematically (Sage & Rouse, 2011). To address the complexities of SE development, models that represent different aspects of the system and its evolution are becoming commonplace (Madni et al., 2023; Ramos et al., 2012). Model-based approaches alone are insufficient, as models require updates and modifications in response to changing conditions observed during system operation over time (Rhodes & Ross, 2010). Characterizing system behavior through output data measurement is an important part of developing such dynamic systems (Ljung, 2010). With advancements in AI and machine learning, opportunities arise for organizations to employ data-driven approaches in systems development (Hutchinson, 2021). This opening allows CoPS to expand their functionality and capabilities, transforming into Complex Intelligent Systems (CoIS) (Lakemond et al., 2024).
However, few studies have explored the intertwining of model-based and data-driven approaches in CoIS development. Through a comparative case study of embedded and cloud-based systems within an automotive system developing organization, the purpose of this research is to understand this intertwining of model-based and data-driven approaches and the evolution of emerging CoIS. The key research questions are:
- How do model-based and data-driven approaches contribute to the evolution of emerging CoIS?
- How do these approaches coexist and complement each other?
In the rest of the paper, Section 2 introduces the theoretical background of CoPS development, the role of models and data, and perspectives on CoIS. Section 3 outlines the methodology and research choices made. Section 4 presents empirical findings from interviews and secondary data. Section 5 compares the two types of systems and explores the deep intertwining of the approaches. Finally, Section 6 highlights insights and contributions to understanding CoIS.
2. Theoretical background
The paper relies on three theoretical strands. First, it examines the nature of Complex Products and Systems (CoPS) in terms of the complexity of these systems and the range of activities that span from requirements gathering to project management. The second theoretical basis is derived from systems engineering, complemented by a framework of the process of engineering CoPS involving models and data. The third theoretical framework addresses the incorporation of AI and data-driven approaches in the development of intelligence embedded in the modeling and design of Complex Intelligent Systems (CoIS).
2.1. Complex products and systems and model-based systems engineering
Complex products and systems (CoPS) are high-cost, engineering-intensive products developed in small batches or as single units (Hobday, 1998, p. 690). CoPS consist of interconnected subsystems with complex, hierarchical architectures and a high degree of novelty and customization (Davies & Hobday, 2005). Increasing complexity creates emergent behaviors and feedback loops for system redesign (Nightingale, 2000). Further, in systems development, the inclusion of embedded software increases uncertainties, necessitating dynamic adaptation of system architectures (Davies & Hobday, 2005; Takeishi & Fujimoto, 2011). The development of CoPS encompasses design, engineering, integration, and project management (Hobday, 1998).
SE is defined as a transdisciplinary and integrative approach that covers the lifecycle of an engineered system using technological and management methods rooted in system principles (INCOSE, 2023). It involves sequential, iterative and evolutionary approaches to mature a system using process models such as the Vee, incremental spiral or DevOps (INCOSE, 2023), covering development phases such as requirements definition, architecture design, analysis, integration, verification, validation and implementation (Forsberg et al., 2005). The formalized application of modeling throughout the development and later phases of the lifecycle is defined as Model-Based Systems Engineering (MBSE), where models facilitate collaboration across disciplines and stakeholders (Madni et al., 2023).
2.2. Importance of data and complex intelligent systems
MBSE addresses system hierarchy and system-subsystem interrelationships (structural and behavioral aspects). However, it does not adequately cover the more dynamic contextual, temporal, and perceptual aspects (Rhodes & Ross, 2010). This limitation is significant in the development of CoPS when data from testing and prior operations are not incorporated (Ramos et al., 2012; Rhodes & Ross, 2010). Models need to be dynamically maintained, enabled by the data gathered from sensors using data analytics to support the development lifecycle (Jiang et al., 2021; Zimmerman et al., 2019).
Data-driven approaches use statistical methods to build models from data. They are used in developing control systems and in critical aspects of complex modeling. For example, aviation systems often use flight test data for system identification of subsystem models (Ljung, 2010). There is an increased focus on models that are dynamically updated in complex software, including those for safety-critical SoS, to build self-adaptation and contextual awareness (Bencomo et al., 2019). The boundaries between development-time and run-time are blurring (Baresi & Ghezzi, 2010), resulting in a shift from reliance on traditional processes such as the Vee model to more evolutionary and agile approaches that facilitate continuous integration and deployment (Balachandran et al., 2024). Increased computational power and AI have enabled data-driven approaches such as machine learning (Bishop & Nasrabadi, 2006) to dynamically learn new models from data (Pillonetto et al., 2025), aiding the digital transformation of CoPS (Lakemond et al., 2024).
The digital transformation of CoPS into complex intelligent systems (CoIS) has increased the intertwining of technology and management in system development (Lakemond et al., 2024). In aviation, digital transformation has led to complex platform-based architectures for resource sharing and reuse between modules (Lakemond & Holmberg, 2022). Safety-critical CoPS and emerging CoIS potentially expand their boundaries while keeping critical functionalities stable (Yu et al., 2024).
To explore the intertwining of model-based and data-driven approaches in the development of CoIS, the paper reports an empirical study situated in an automotive systems developing organization. The study aims to understand how the two approaches coexist and complement each other in addressing the safety-criticality of automobiles and the maintenance of the road infrastructure as a CoIS. In the reported empirical study, both are considered constituents of the transportation system of systems.
3. Research design
This section outlines the methodology of this study, the data sources, case context and data analysis.
3.1. Research methodology and design
This research intends to explore the intertwining of model-based and data-driven approaches in CoIS. A phenomenon-based approach is adopted, in which AI in general (Von Krogh, 2018) and CoIS in particular (Yu et al., 2024) are treated as phenomena impacting organizations. This approach connects existing and new theoretical perspectives, with epistemic opportunities through abductive reasoning (Sætre & Van de Ven, 2022).
We use a case study methodology to examine the phenomenon up close and in depth, in a real-world setting (Yin, 2018). The main selection criteria for the case were: i) both models and data are vital parts of the system, ii) it is an emerging complex intelligent system, and iii) it is linked to the future of transportation and mobility. A comparative case design was adopted to study model-based and data-driven approaches, improving the robustness of the study and relying on replication logic to unravel the anticipated as well as unexpected, contrasting results (Yin, 2018).
3.2. Case context
Takeishi and Fujimoto (2011) predicted that information and communication systems would enhance modularization by separating hardware and software in automobiles. Over time, increasing digitalization in automobiles with advanced electronics, AI, and sensors has resulted in software-defined vehicles (Panchal & Wang, 2023). The increased complexity of technologies has made development challenging, emphasizing continuous integration and evolving architecture (Berger & Carlson, 2022). To manage the new digitalization challenges, systems engineering practices have become essential in the automotive industry (O’Niel, 2023). Today, vehicles can communicate with other vehicles, people, and infrastructure (Coppola & Silvestri, 2019), making automobiles complex and intelligent systems.
NIRA Dynamics, Linköping, Sweden, focuses on high-tech systems and innovative solutions for vehicle safety, driver support, and road maintenance. The company uses sensor data from various parts of the automobile, in conjunction with OEMs, to support its products and systems. Both model-based and data-driven approaches are employed in their development. These solutions are categorized into two types: embedded systems and cloud-based systems. The embedded systems comprise four products: tire pressure indicator (TPI), loose wheel indicator (LWI), tread wear indicator (TWI) and tire grip indicator (TGI), detailed in Section 4.1. Cloud-based systems, detailed in Section 4.2, utilize aggregated data collected from two million vehicles. The first product, road surface alerts (RSA), provides drivers with precise, location-specific hazard warnings, including safety alerts such as aquaplaning or black ice. The second product, road surface conditions (RSC), uses data to map road conditions and degradation, providing real-time insights for road maintenance contractors and municipalities. See Figure 1 for details.

Figure 1. Embedded and cloud-based systems
3.3. Data collection and analysis
To understand complex and emergent phenomena, engaging reflective practitioners is essential (Van de Ven, 2007). Interviews are well suited to capturing their technical and managerial expertise (Flick, 2009). Semi-structured interviews constitute the primary data of this study. The selection of the interview participants was guided by purposeful sampling, taking into consideration their role, expertise, experience, and involvement in the product development. See Table 1 for details.
Table 1. Interview details

The selected respondent profiles include product manager, product and system architects, design lead, software developer, interface developer, test lead, and team manager to ensure diversity of perspectives. To avoid bias, the selection list ensured a balance between the representatives from embedded and cloud-based systems. Questions focused on aspects such as system architecture, detailed product description, system interconnections, feedback loops, and the use of models and data in development and operation.
A key strength of case study research is its reliance on multiple sources of evidence to enhance the validity of the study (Eisenhardt, 1989). For this purpose, secondary data (Table 2) were collected. Apart from information from the company website, product and seminar videos were analyzed to understand the systems and context. In addition, research articles based on NIRA’s embedded and cloud-based systems were reviewed. Patents were considered an important data source due to their technical and highly structured nature. The selection process of patents was guided by interview insights to ensure relevance to embedded and cloud-based systems. From an initial screening of 49 patents, 11 were shortlisted. Five patents, covering the two types of systems, were chosen based on their coverage of aspects such as the models, data, networked vehicles, use of machine learning and feedback loops.
Table 2. Secondary data

Analysis of the interviews followed the structured process proposed by Gioia et al. (2013). We use this process to identify the second-order concepts and aggregate dimensions from the first-order codes of the interview data. A sample of the data structure is shown in Table 3. The emerging concepts were analyzed and organized thematically to guide the secondary data analysis.
Table 3. Data structure

4. Findings
In this section, we describe the key characteristics of embedded and cloud-based systems and interlinkages. We discuss these interconnections emphasizing the critical role of model-based and data-driven approaches in relation to the evolution of CoIS.
4.1. Embedded systems – reliance on models and data
Automobile safety relies on embedded systems, with modern vehicles using up to 100 electronic control units (ECUs) for various functions, including braking. Among them, the brake ECU plays a key role in ensuring vehicle stability through the anti-lock braking system (ABS) which detects wheel lockup using signals from the wheel speed sensor (WSS). Embedded systems, utilizing data from Original Equipment Manufacturer’s (OEM) hardware such as the brake ECU, offer a cost-effective alternative to physical sensors. They also enable the incorporation of future technologies like autonomous driving and inter-vehicle communication. Empirical data provide an overview of these systems, highlighting how they embody model-based and data-driven approaches and how these approaches are interlinked.
TPI detects tire pressure loss indirectly through relative roll radius measurement and spectral analysis of wheel speed signals, using WSS and inertial sensor data. By measuring relative wheel speed and the decrease in rolling radius, the tire is modeled as a spring-damper system. Frequency variations in the measurement indicate pressure loss in a tire. Spectrum analysis of the wheel speeds is also employed to further analyze the frequencies and create a model of vibration. This can be used to scan eigenfrequencies and monitor resonance changes. A preprocessing step facilitates the two types of measurements which are subsequently combined to evaluate and create a warning signal. LWI detects anomalous behavior of a wheel by comparing the wheel speed from two reference signals. It detects the frequencies and identifies the amplified signals that correspond to a loose wheel through pattern matching of the speed spectrum. By measuring the variance against these detection signals, a threshold violation is used to send a warning signal.
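The relative roll radius principle described above can be illustrated with a minimal sketch. It is not NIRA's algorithm: the function names, the wheel labels, and the 0.3% threshold are hypothetical, and the sketch assumes straight-line driving on a uniform surface, so that all four wheels would otherwise rotate at the same speed. A tire that loses pressure has a smaller rolling radius and therefore rotates slightly faster than the others.

```python
from statistics import mean


def relative_speed_deviation(wheel_speeds):
    """Relative deviation of each wheel's angular speed from the mean of the other wheels.

    wheel_speeds: dict mapping a wheel label (hypothetical, e.g. "FL") to its
    angular speed from the wheel speed sensor, in any consistent unit.
    """
    deviations = {}
    for name, omega in wheel_speeds.items():
        others = [w for n, w in wheel_speeds.items() if n != name]
        reference = mean(others)
        deviations[name] = (omega - reference) / reference
    return deviations


def pressure_loss_suspects(wheel_speeds, threshold=0.003):
    """Wheels spinning faster than the others by more than `threshold` (assumed value)."""
    deviations = relative_speed_deviation(wheel_speeds)
    return sorted(name for name, dev in deviations.items() if dev > threshold)
```

A production system would additionally have to compensate for cornering, load, and torque effects, which is where the spectral analysis and the fusion of both methods mentioned in the text come in.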
TGI continuously monitors friction between the road and tire through signal analysis. It compares the signals with a guideline friction map that considers the tire properties and road conditions. TGI measures the slip from the WSS data and tire stiffness value by iteratively incorporating slip data and tire characteristics. It uses a friction algorithm that dynamically adapts the friction estimation based on real-time data from GPS, traction control, and networked vehicles. The tire models can thus be dynamically updated to accommodate the road conditions. TWI estimates tread wear by measuring roll radius from tire pressure, wheel speed, and GPS data. It calculates the relative roll radius, using the tire pressure and vertical load on the tire. It also compensates for temperature and force applied to tires, during acceleration and braking. The system model uses data from other vehicles in the network to self-adapt and estimate real behavior.
A model-based approach is ideal for modeling physical phenomena involving friction, speed, position and vehicle characteristics. The respondents claim that deterministic model-based approaches are necessary for ensuring system safety. They emphasize the automotive industry’s reliance on safety standards, highlighting the importance of verifying and validating system functionalities, often using a V process model. Application development involves breaking down requirements to model functional components represented by codes in the algorithms, which are later verified and validated to ensure predictable system behavior.
From the empirical accounts, the embedded systems use a model-based approach as a foundation with a critical role for data-driven approaches. In the case of TPI, relative roll radius, one of the indirect methods of measurement, is model-based. There are several physical phenomena that affect the relative roll radius such as the twisting of the axle due to engine torque on wheels. In the case of spectrum-based analysis, the indirect method is partly model-based and partly data-driven. Here instead of the tire, vibration is modeled from measured data. The final output is combined from the two methods using neural networks to detect threshold signals and to issue warnings. The limitations of traditional indirect TPI monitoring can be addressed using machine learning models trained on vehicle and tire pressure data, capturing complex relationships for better assessment and adaptability. The respondents highlight the challenges of using purely model-based approaches and the need to complement them with data-driven approaches.
“The most dominant factor in our product is the tire, and the tire is very difficult to model.” - Respondent C
“When we have a model but with unknown parameters, then we use the data to tune that. And I think that if you have a model that is somewhat correct, or at least quite good, why not use that? Because you can still be data-driven in how you adapt the parameters.” - Respondent I
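Respondent I's point about tuning unknown parameters from data can be sketched as follows. The example assumes, purely for illustration, a linear tire model in which the normalized traction force equals a stiffness parameter k times the wheel slip; the function name and this model choice are hypothetical and do not come from the case company. The model structure is fixed in advance, and only k is estimated from measurements by least squares.

```python
def fit_slip_stiffness(slips, normalized_forces):
    """Least-squares estimate of k in the assumed model: force = k * slip.

    For a line through the origin, the closed-form solution is
    k = sum(s * f) / sum(s * s).
    """
    if len(slips) != len(normalized_forces):
        raise ValueError("slips and normalized_forces must have the same length")
    numerator = sum(s * f for s, f in zip(slips, normalized_forces))
    denominator = sum(s * s for s in slips)
    if denominator == 0:
        raise ValueError("at least one non-zero slip sample is required")
    return numerator / denominator
```

The design choice mirrors the quote: the physics-based structure constrains what can be learned, while the data determines the parameter value, so the estimate can be refreshed as new measurements arrive.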
Embedded system development, while model-based, benefits from data-driven approaches using extensive testing and operational data to refine algorithms. For example, TGI uses cloud data from other vehicles to adapt the model, enhancing safety and customer value beyond a single automobile. Such a combined approach highlights how the system is evolving through a combination of approaches. The cloud-based systems extend this evolution by harnessing real-time data to expand the system.
4.2. Cloud-based systems
Cloud-based systems use aggregated data to generate warnings for vehicle users and alerts for infrastructure maintenance. Data from embedded systems like TGI form a major part of the input to the cloud systems. The specification of the products defines the time window for creating customer value dynamically, affecting the architectural requirements of the system. RSA require quick response times with precise alerts, while RSC rely on data gathered over longer periods, such as a day. Depending on the customer, data aggregation varies. Aggregation functions hold the state of each individual road, and the aggregated data can be used in different aspects and timescales, enabling outputs to be tailored either for immediate vehicle alerts or for long-term roughness analysis, depending on customer needs. Respondent D, an architect for data streaming, explains the logic in this quote:
“In live sessions you don’t care about if the road was slippery yesterday, you want to know if it’s slippery now. But for infrastructure, you want to know if this road becomes slippery every half an hour, every day during the entire winter because you maybe need to do some maintenance on the road to decrease the slipperiness recurrence.”
The system architect explains that raw data gathered from the network of vehicles, covering 6 million roads, is uploaded to the cloud, which NIRA accesses and aggregates into a baseline. A preprocessing step filters out junk data, removes broken signals and produces data sets based on aspects such as roughness, friction or weather. The data is then refined and recombined with additional metadata from vehicles, such as wiper activity and engine intake humidity, to further enhance the analysis and build relationships between heterogeneous data types. Subsequently, map matching is performed to link events to specific locations. Depending on the use case (automotive or infrastructure), customer value is created from the data. The aggregation time window can be a day or 10 minutes, depending on whether it is for RSC or RSA, respectively. For example, the output could be used for winter maintenance or to alert vehicles about aquaplaning.
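The role of the aggregation time window can be made concrete with a small sketch. It is an assumption-laden illustration rather than the company's pipeline: the tuple layout (road identifier, timestamp in seconds, friction estimate), the function name, and the use of a simple mean are all hypothetical. The same map-matched observations are aggregated per road with a 10-minute window for an RSA-style alert view and with a one-day window for an RSC-style maintenance view.

```python
from collections import defaultdict

TEN_MINUTES = 600    # seconds; short window for alert-style outputs
ONE_DAY = 86_400     # seconds; long window for maintenance-style outputs


def aggregate_by_window(observations, window_seconds):
    """Mean value per (road_id, window index).

    observations: iterable of (road_id, timestamp_seconds, value) tuples,
    assumed to be already filtered and map-matched.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, sample count]
    for road_id, timestamp, value in observations:
        key = (road_id, int(timestamp // window_seconds))
        totals[key][0] += value
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}
```

Calling `aggregate_by_window(data, TEN_MINUTES)` and `aggregate_by_window(data, ONE_DAY)` on the same input yields two views of one data stream, which reflects the distinction Respondent D draws between knowing whether a road is slippery now and knowing how often it becomes slippery.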
Respondent G, a platform architect focusing on algorithms, highlights that model-based approaches are foundational for the cloud systems. However, cloud-based systems like RSA, being predominantly data-driven, are much more complex than the model-based embedded systems. This is reflected in the quote from respondent G, who has also worked in embedded systems in the past.
“There is a very large system, it is complex and not one person has an overview of the entire system. And this is quite different from the onboard (embedded) systems that I work with before. There you could be one person, and you can know the entire system and then make a plan for integration.”
Systems such as RSA and RSC have many potential use cases beyond those described earlier. Given the varied scope, cloud-based systems continue to evolve while remaining intricately interlinked with embedded systems. One key advantage of cloud-based systems is their enhanced computational power and scalability independent of OEM’s hardware limitations. In contrast, embedded systems rely on the limited computational power of the ECUs. Furthermore, data-driven development provides greater adaptability, enabling the system to evolve further by harnessing the potential of AI.
5. Discussion
Based on a comparative study of embedded and cloud-based systems in the automotive domain, this paper explores the emergence of CoIS and the intertwining of model-based and data-driven approaches in their development. The study highlights important interconnections between models and data. From the case study, it is apparent that model-based and data-driven approaches play a vital role in the development of complex and increasingly intelligent systems.
Embedded systems are an important part of automotive control systems and are intricately connected to the hardware. They perform safety-critical functions by providing predictability, based on models that encapsulate physical phenomena. From the case study, it is evident that model-based approaches are vital to ensure that the system performs the intended functions and their inter-relations, covering the structural and behavioral aspects (Rhodes & Ross, 2010). However, the dynamic nature of systems makes it challenging to model all aspects, requiring data-driven approaches, as implied by the empirical data and highlighted by Ramos et al. (2012) and Jiang et al. (2021). Complementing models with data is necessary to support the structural and behavioral aspects such as external adaptation and incorporating emergent properties (Rhodes & Ross, 2010; Ljung, 2010). Data from embedded systems enhances the algorithms and the underlying models, improving understanding within the automotive system’s boundaries and aiding self-adaptation and contextual awareness (cf. Bencomo et al., 2019). The examples of TGI and TPI, where data enriches the tire model and the pressure estimation, highlight this.
In contrast, cloud-based systems capture aggregated data from networked automobiles to model road conditions and integrate heterogeneous data (e.g. additional sensor data, GPS and weather data) for short-term alerts and long-term road condition analysis, addressing the contextual, temporal and perceptual aspects of the complex systems (cf. Rhodes & Ross, 2010). Contextual aspects in this case refer to the dynamic road conditions such as change in friction due to surface degradation in winter. Temporal aspects refer to the different timescales for which data is produced and consumed, either for alerts or for actionable maintenance insights. The perceptual aspects involve multiple stakeholders associated with the system such as the automobile users and the road infrastructure contractors. This points to not only data production and consumption but also the generation of insights into the stakeholders’ preferences (e.g. road use or traffic behavior) where humans and the system are coupled.
The complex software functionality in automobiles and the use of layered architecture (Berger & Carlson, 2022) facilitate reuse and recombination of data and resources. The networked automobile system here represents an emerging complex intelligent system (CoIS) (cf. Lakemond & Holmberg, 2022). The cloud platform not only facilitates the recombination of heterogeneous data (e.g. weather data, TGI and TPI data, GPS and additional sensor data) and resources (e.g. embedded system storage and cloud storage) but also acts as a core platform that offers service to the transportation system. This is in line with the findings of Yu et al. (2024) in the context of platform-based research arenas.
The comparative case study revealed how CoIS, specifically the network of intelligent automobiles, is evolving and supporting the larger transportation system. It highlights the foundational role of model-based approaches and the equally critical role of data-driven approaches in facilitating the evolution of the system. On the one hand, model-based approaches aid data gathering to scope system behavior in embedded systems, playing a crucial role in the safety of the subsystem (the automobile). On the other hand, cloud-based systems facilitate the composition of data from multiple sources to enhance the underlying models and safety-criticality at a system level. In effect, cloud-based systems rely on model-based approaches to generate accurate data about road conditions, while the data-driven approaches improve the capability of the CoIS to dynamically adapt to the environmental context.
This case study illustrates how model-based and data-driven approaches contribute to the development of CoIS and its evolution. The case study has demonstrated this through the comparison of embedded systems that operate at the subsystem level and cloud-based systems that operate at the CoIS level.
6. Conclusions
Through a case study, this paper explored the intertwining of model-based and data-driven approaches in automotive systems representing an emerging CoIS constituting an important part of a larger transportation system. The findings highlight how model-based and data-driven approaches coexist and complement each other, at various levels of the system and contribute to the evolution of CoIS.
Model-based approaches in embedded systems support the safety-critical functions by increasing the predictability of system behavior and generate data in the process. Data-driven approaches make use of the flexibility of the cloud-based platform and use the data generated to provide feedback to embedded systems, the network (e.g. connected vehicles) and road infrastructure, serving different purposes for various end users. It is apparent that model-based approaches are dominant at the subsystem level, in this case embedded systems. Data-driven approaches play a dual role of complementing model-based approaches at sub-system level and enabling the expansion of the system, in this case cloud-based systems. While the two approaches complement each other, they are also autonomous in their functionality, demonstrating the adaptability of CoIS as they evolve.
Although this study focused on the automotive sector and a single case, a set of carefully selected experts, patents, and additional sources enrich the understanding of the phenomenon. While such a focus may introduce some selection bias, careful triangulation was employed to enhance the validity of this study. The findings indicate the significant role of model-based and data-driven approaches in the evolution of CoIS. Furthermore, the platform-based strategies identified in this study are consistent with those reported in aviation systems and research arenas, increasing the generalizability of the findings. This provides a basis for further exploration of integrated model-based and data-driven approaches that evolve CoIS towards systems of systems.
Acknowledgements
This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program – Humanity and Society (WASP-HS) funded by the Marianne and Marcus Wallenberg Foundation.