1. Introduction
Competitive pressures arising from globalization and rapid technological advancements require companies to continuously broaden their product portfolios. Such expansion is essential for optimizing existing systems and for fostering the creation of technically innovative inventions that offer a competitive edge (Gorodnichenko et al., 2010).
Interdisciplinary collaboration contributes to maintaining a competitive position by integrating insights from diverse fields. This approach enables the development of solutions that are innovative, efficient, and capable of meeting complex requirements, thus ensuring that mechatronic systems remain robust and adaptive (Braun et al., 2007).
With regard to future developments, resilience in complex and individualized operating environments is essential. The targeted collection and analysis of usage data provide valuable insights for product development and system optimization. Analyzing usage data significantly contributes to reducing uncertainties and risks while enhancing the efficiency and quality of systems. This process supports data-driven decision-making that facilitates predictive maintenance, resource optimization, and continuous system adaptation in response to evolving user needs and operational conditions (Rüßmann et al., 2015; Wagenmann et al., 2022a). Nevertheless, data is frequently not collected systematically enough and is not actively used in development processes. The present paper focuses on the challenges associated with data-driven system modelling and proposes an approach to facilitate efficient data management in system models for developers. The method is initially validated in the case studies conducted.
2. State of research
2.1. System models in Systems Engineering
Systems Engineering (SE) is a transdisciplinary and integrative approach that focuses on the holistic design and life cycle management of technical systems. This approach encompasses not only the design and implementation but also the operation and eventual decommissioning of systems. SE relies on the application of system methods to ensure that all aspects of a system’s life cycle are addressed in a coherent and structured manner (GfSE, 2024).
A key element of SE is the application of systems thinking. This perspective emphasizes the importance of understanding interactions and relationships among various system components as well as between these components and their surrounding environment. Recognizing these interdependencies facilitates a comprehensive view of how changes in one part of the system can affect the whole, thus supporting more informed decision-making and risk management (Ott, 2009). Standardized processes and a transparent lexicon play a critical role in SE by enhancing interdisciplinary communication. A common language and set of procedures ensure that experts from different fields can collaborate effectively, reducing the potential for misunderstandings and inefficiencies. This structured communication framework also helps to streamline the development process by minimizing the amount of additional interdisciplinary input required during later stages of the project (Haberfellner et al., 2019). Another central component of SE is the continuous, cross-disciplinary validation of the entire system. This ongoing evaluation ensures that the system meets its intended requirements at every stage, from initial design through to final implementation. Regular validation helps identify and correct issues early, thereby reducing overall risks and increasing the reliability of the system (Kossiakoff et al., 2020).
Model-based systems engineering (MBSE) represents an evolution of the traditional SE approach. MBSE is founded on the concept of a unified system model that integrates all disciplines and covers the entire system life cycle. Under the Single Source of Truth principle, this unified model serves as the definitive repository for all SE activities, ensuring that every stakeholder has access to consistent and up-to-date information (Madni & Purohit, 2019). The primary objective of MBSE is to facilitate the centralized and homogeneous mapping, storage, and management of information within the system model. This approach contrasts with traditional document-based methods that can lead to fragmented and inconsistent data. By maintaining a single, integrated model, MBSE enhances cross-disciplinary collaboration and streamlines information exchange across various domains (Albers et al., 2019a).
To ensure consistency and precision in the modelling process, MBSE employs specific modelling methods. These methods provide clear guidelines for defining model elements and establishing the procedures by which models are constructed and maintained. One structured approach involves the RFLP chain (Requirements, Functional Architecture, Logical Architecture, Physical Architecture). This framework ensures that system requirements are systematically translated into functional specifications, then into logical frameworks, and finally into physical implementations. Such a structured progression promotes a traceable and methodical development process that supports the evolution of complex systems from initial specifications to final implementation (Albers et al., 2014; Kleiner & Kramer, 2013).
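The traceability that the RFLP chain is meant to provide can be illustrated with a minimal Python sketch. It is not taken from the cited modelling methods: the class names, the `realized_by` link, and the spindle example are hypothetical and serve only to show how a requirement can be followed down to the physical elements that implement it.

```python
from dataclasses import dataclass, field
from enum import Enum


class Level(Enum):
    REQUIREMENT = "R"
    FUNCTIONAL = "F"
    LOGICAL = "L"
    PHYSICAL = "P"


@dataclass
class Element:
    name: str
    level: Level
    # Elements on the next RFLP level that realize this element (assumed link type).
    realized_by: list["Element"] = field(default_factory=list)


def trace(element: Element) -> list[list[str]]:
    """Return every path from `element` down to elements without further realization."""
    if not element.realized_by:
        return [[element.name]]
    paths = []
    for child in element.realized_by:
        for tail in trace(child):
            paths.append([element.name, *tail])
    return paths


# Hypothetical example chain for a machine tool spindle.
motor = Element("Spindle motor", Level.PHYSICAL)
drive = Element("Spindle drive", Level.LOGICAL, [motor])
rotate = Element("Rotate tool", Level.FUNCTIONAL, [drive])
requirement = Element("Reach required spindle speed", Level.REQUIREMENT, [rotate])

for path in trace(requirement):
    print(" -> ".join(path))
```

In a real MBSE tool these links would be typed relations in the system model rather than Python lists; the sketch only shows the direction of the trace from requirement to physical implementation.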
2.2. Development of systems in generations
System Generation Engineering (SGE) is an iterative further development and redefinition of the earlier Product Generation Engineering (PGE), in which the development of a new product generation is fundamentally based on a reference system. This reference system is a comprehensive framework whose elements are derived from existing or planned socio-technical systems, including their associated documentation. In essence, the reference system provides a well-established foundation and a clear starting point for the development of a new product generation, ensuring that previously gained experience and knowledge are systematically leveraged. Each reference system within a given product generation encapsulates a detailed summary of all its reference system elements (RSE). These elements include both technical specifications and process-related insights that have been accumulated from previous iterations or planned developments. During the product development process, the reference system is not static; it can be adapted based on new information or the outcomes of validation processes. Such adaptability ensures that the reference system remains relevant and continues to reflect the latest understanding of technical requirements and market conditions (Albers et al., 2019b; Albers et al., 2022). The utilization of SGE in the product development process offers multiple benefits. First, it facilitates the targeted deployment of accumulated experience and prior knowledge from reference systems. This structured approach not only enhances overall efficiency but also provides a more robust framework for managing innovation.
Additionally, by relying on established reference systems, potential risks emerging during development can be identified and minimized at an early stage, thereby reducing uncertainties and ensuring a smoother transition from concept to market (Albers et al., 2017). Furthermore, within the context of SGE, the systematic integration of usage data from RSE offers significant advantages for validation purposes. Data-driven validation involves analyzing actual customer usage patterns of the reference system, which provides critical insights into its performance in real-world scenarios. This empirical approach facilitates more objective decision-making, supports the continuous refinement of system elements, and ultimately contributes to reducing market uncertainties. By grounding validation in real usage data, companies can better understand how customers interact with their systems, enabling a more informed and adaptive development process (Hünemeyer et al., 2023).
2.3. Data-driven development in Advanced Systems Engineering
In the field of Advanced Systems Engineering, data-driven development is becoming an increasingly pivotal methodology for addressing the growing intricacy, dynamism and volatility of today’s markets, coupled with the rising uncertainty and unpredictability of developments (Rashedi, 2022). The term “Advanced Systems” encompasses future systems that are distinguished by a high degree of autonomy, dynamic networking, artificial intelligence, and intelligent user interactions. The advent of new technologies and the concomitant reduction in product life cycles have resulted in a notable increase in system complexity (Dumitrescu et al., 2021). In this context, the deployment of a system model as the principal artefact in MBSE represents a pivotal strategy for navigating complexity. MBSE is strongly oriented towards technological feasibility; however, its applicability is contingent upon the acceptance of developers (Albers & Lohmeyer, 2012). Advanced Engineering represents a further evolution of traditional product development methods, incorporating digital technologies such as knowledge graphs and generative Artificial Intelligence. For example, Wang et al. (2023) underline the value gained for engineering purposes by visualizing knowledge resources. Furthermore, innovative development approaches, such as SGE, are employed to facilitate the dynamic adaptation of system developments to market dynamics (Dumitrescu et al., 2021). The data accumulated and stored during the middle phase of the life cycle of a mechatronic system can be classified in various ways.
The presented study uses the data classification proposed by Meyer et al. (2022), which distinguishes the five use phase data clusters shown in Figure 1.

Figure 1. Dendrogram highlighting the five identified use phase data clusters (Meyer et al., 2022)
3. Research questions and methodology
The integration of usage data from technical systems and the analysis of this data within system models provides a central information basis for development processes. This approach enables a holistic analysis of technical systems and a deeper understanding of interdependencies in the real-world use of systems. In addition, the early integration of data analyses into the product development process allows targeted data collection based on the functional structure of the technical system to be integrated into the architecture. Thus, the goal of this work is to develop an approach for data-driven system modelling within the interdisciplinary system generation development of mechatronic systems. This will be achieved through the integration of usage data into a system model within the framework of MBSE, thereby supporting the implementation of data analyses based on usage data in product development processes. In doing so, the following research questions are answered:
- RQ1: What are the challenges of data-driven development and system modelling in the interdisciplinary system generation development of mechatronic systems?
- RQ2: How can data-driven development in the interdisciplinary system generation development of mechatronic systems be supported by data-driven system modelling?
- RQ3: What added value does data-driven system modelling have for data-driven development in the interdisciplinary system generation development of mechatronic systems?
This work is structured according to the Design Research Methodology (Blessing & Chakrabarti, 2009). In the descriptive study I, research question 1 is answered with the help of an empirical analysis of the research environment to generate a deeper understanding of the current challenges in data-driven development processes and system modelling in the context of mechatronic systems. In the prescriptive study, research question 2 is answered by developing a method for data-driven system modelling in the interdisciplinary system generation development of mechatronic systems using Systems Engineering to integrate usage data into holistic system models. In the descriptive study II, research question 3 is answered by validating the method with regard to its efficiency and its applicability for different user groups. The results presented in this research work are developed in the context of a German machine tool manufacturer. Further, the evaluation is carried out as part of a student project with the Albstadt University of Applied Science. In this project, 14 Data Science students work in groups on three different use cases over a period of 12 weeks. The study is accompanied by data experts from the research environment. The project follows a three-week sprint schedule. After each sprint, all students are obliged to submit a laboratory report and a quantitative survey for a continuous evaluation over the course of the project.
4. Analysis of the challenges in data-driven system modelling
As part of the empirical analysis, semi-structured interviews are conducted with 25 developers from the research environment to identify the current challenges regarding data-driven system modelling. The interviews yield a list of challenges pertaining to the data-driven development of mechatronic systems (Table 1). The right column shows the number of mentions during the interviews.
Table 1. Challenges in data-driven development of mechatronic systems

The implementation of data-driven development in module teams is constrained by a number of challenges. The heterogeneity and lack of transparency regarding the data structures, as well as the lack of metadata documentation, make it challenging to utilize the stored usage data effectively. The reason for the heterogeneous data structures of usage data (C1) is the large number and complexity of the data sources in a mechatronic system as well as unstructured data transfer. Further, C2 identifies the absence of centralized metadata documentation, which hinders effective data-driven development because data searches are inefficient and depend on expert knowledge. Moreover, the absence of data quality standards in conjunction with ambiguous responsibilities makes it impossible to anchor data-driven approaches within the development processes (C4). The complexity of collecting and analyzing usage data, coupled with the limited data analysis skills within development teams, frequently necessitates the involvement of data experts external to the development teams. As a result, developers from traditional engineering disciplines lack straightforward access to data-driven development. Despite the process model for data-driven validation of the system of objectives, which aims to empower developers for data-driven development (Wagenmann et al., 2022b), few centralized information and training resources are available to developers from traditional disciplines within the research environment. Thus, many development teams currently show little intrinsic motivation to analyze usage data (C5). The absence of a data-driven mindset leads to the de-prioritization of a data-driven approach in development projects (C6).
The low prioritization of a data-driven approach also impedes the willingness to invest in the IT infrastructure on the technical systems. Additionally, the absence of centralized storage structures and their meticulous maintenance impedes the targeted utilization of references for the data-driven validation of elements of the system of objectives in the development of mechatronic systems (C7). A further challenge is the lack of adequate compatibility between the databases (C8). Moreover, as a foundation for data transmission, it is essential to guarantee sufficient data protection of customer knowledge (C9). Consequently, the capacity to conduct data analysis is considerably constrained by C8 and C9, thereby diminishing its utility in the context of development projects. Finally, a lack of process allocation of the data in data-driven development is identified as a significant issue (C10). Consequently, the informative value of the usage data with regard to the actual operating processes of the technical systems is limited. In summary, with regard to data-driven development, the research environment is characterized by a high degree of heterogeneity and a lack of transparency, as well as a lack of organizational and personnel anchoring. With regard to data-driven system modelling, the interviews also led to the identification of a series of challenges associated with system modelling in the development of mechatronic systems (Table 2).
Table 2. Challenges in system modelling in the development of mechatronic systems

C11 describes the lack of comprehensive documentation regarding the system’s knowledge base. A further challenge is the absence of cross-disciplinary modularization (C12) which includes a multitude of individual module interfaces and a lack of clarity regarding the definition of system boundaries. Another challenge is the absence of a comprehensive and cross-disciplinary central system model (C13). In addition, the management of complex changes and cancellations poses a major challenge (C14). This is due to the complex and opaque relationships within the system and the manual implementation of changes and discontinuations in the heterogeneous development documentation. The inadequate documentation of system knowledge in document-based filing structures, the lack of interdisciplinary modularization and the complex change and discontinuation management lead to inefficient processes. Moreover, the absence of transparency regarding the internal variance and dependencies within the system hinders effective cross-disciplinary collaboration (C15).
The analysis in this chapter has revealed a number of challenges reducing efficiency in interdisciplinary mechatronic system development. These include the existence of implicit expert knowledge that is confined to specific domains. Heterogeneous data structures and the incomplete documentation of system knowledge further complicate the effective management of data and knowledge, which in turn gives rise to knowledge gaps and reduces efficiency in development teams. In order to address these challenges, it is necessary to implement a comprehensive documentation strategy that encompasses the collection, storage, and integration of usage data with the technical system in the development of mechatronic systems.
5. Method for integrating usage data in system models
The aim of holistic system modelling in the development of mechatronic systems is to extend the classic system modelling according to the RFLP chain to include modelling of the data structures of usage data and data analyses based on this data. Thus, this chapter proposes a procedure for integrating structures and analyses of usage data into holistic system models of mechatronic systems. Based on the existing frameworks named above and the challenges presented in Chapter 4, a five-step method, shown in Figure 2, is proposed for the integration of usage data into holistic system models of mechatronic systems:

Figure 2. Method for integrating usage data in holistic system models
Figure 3 illustrates a high-level example of a system model constructed according to the RFLP chain. The modelling is extended by two levels, namely data structures and data analyses. In the system model, usage data is linked to the elements of the logical level to facilitate access to system-level knowledge. This makes it possible to distinguish between different solution principles and to increase the level of detail in the evaluation of an analysis. To facilitate the transparent documentation of the origin of usage data, a link between the data structures and the control system can also be implemented. If the analysis procedure is more complex, it is also possible to model the procedure in an expanded view. This enhances the transparency and traceability of the analysis and the results. Data analyses are linked to the requirements, the functional level, or the physical level, depending on the RSE to be validated. This enables data-driven system modelling:

Figure 3. Categories of data analyses in data-driven system modelling
The first category encompasses data analyses related to requirements. These analysis objects are typically derived from the project management domain or new development projects. The results of data analyses related to requirements typically inform a decision, which in turn gives rise to a specific change in the system model. This may be due to a change or discontinuation, provided that a sufficiently meaningful data basis is available. Furthermore, they provide a robust foundation for defining novel requirements for emerging product generations and for identifying pertinent target groups and markets for new developments. The data analyses in the first category typically pertain to a multitude of systems or encompass the entirety of a product line.
The second category contains data analyses related to the physical level of the system model and is employed to ascertain the underlying causes of errors or wear on the customer’s system. These analysis objects are typically generated by the development teams themselves, based on feedback from service users or customers. Consequently, the analyses are frequently constrained to a single system or a small number of systems. The database available for analysis is therefore typically smaller and depends on the willingness of the affected customers to share usage data. In this instance, the physical level is appropriate for consistent linking, given that error or wear analyses conducted at the customer’s premises pertain to a particular component. Furthermore, incoming notifications, e.g. from service, are also structured by component. In addition to identifying specific error cases, data analyses can also be used to identify other customers who utilize certain functions in a similar manner and who may experience similar problems in the future. This is based on a comprehensive analysis of usage behavior.
The third category of data analysis is related to the functional level of the system model. This category of data analysis encompasses the conventional architectural work undertaken in the development of mechatronic systems. The objective is to construct a comprehensive repository of knowledge pertaining to the functional level and its characteristics as experienced by the customer. It is not uncommon for these analyses to fail to yield a decision that is directly implemented in the system model. Conversely, these data analyses serve to substantiate design decisions for the efficient utilization of resources. To illustrate, a well-founded identification of maxima and limit areas can facilitate more precise component sizing for the requisite service life. Furthermore, targeted product segmentation can be conducted within the architectural framework.
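The linking rules described in this chapter can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the system model of the research environment: all class and attribute names, the example elements, and the decision to reject analyses linked to the logical level are hypothetical. The category numbers follow the order in which the three categories are introduced above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Level(Enum):
    REQUIREMENT = "requirement"
    FUNCTIONAL = "functional"
    LOGICAL = "logical"
    PHYSICAL = "physical"


# Category numbering follows the order of the text: requirements, physical, functional.
CATEGORY_BY_LEVEL = {
    Level.REQUIREMENT: 1,
    Level.PHYSICAL: 2,
    Level.FUNCTIONAL: 3,
}


@dataclass(frozen=True)
class ModelElement:
    name: str
    level: Level


@dataclass
class DataStructure:
    """Usage data structure, linked to an element of the logical level."""
    name: str
    logical_element: ModelElement
    control_source: Optional[str] = None  # optional link documenting the data origin

    def __post_init__(self) -> None:
        if self.logical_element.level is not Level.LOGICAL:
            raise ValueError("usage data is linked to elements of the logical level")


@dataclass
class DataAnalysis:
    """Data analysis linked to the model element (RSE) it is meant to validate."""
    name: str
    target: ModelElement
    inputs: list[DataStructure] = field(default_factory=list)

    @property
    def category(self) -> int:
        # Assumption of this sketch: analyses are never linked to the logical level.
        if self.target.level not in CATEGORY_BY_LEVEL:
            raise ValueError("analysis must target a requirement, function, or physical element")
        return CATEGORY_BY_LEVEL[self.target.level]


# Hypothetical example: a wear analysis on a physical component.
drive = ModelElement("Spindle drive", Level.LOGICAL)
bearing = ModelElement("Spindle bearing", Level.PHYSICAL)
speed_log = DataStructure("spindle_speed_log", drive, control_source="machine control")
wear = DataAnalysis("Bearing wear analysis", bearing, [speed_log])
print(wear.name, "-> category", wear.category)
```

Deriving the category from the linked level, rather than storing it separately, keeps the analysis classification consistent with the links in the model.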
6. Method evaluation for integrating usage data in system models
6.1. Evaluation of the efficiency
To assess efficiency, the initial time required for conducting data analyses is quantified through a series of use cases. The evaluation compares these time intervals with the duration developers require to execute the proposed method across ten distinct use cases. This time measurement encompasses both the period necessary to thoroughly comprehend the extended system model and the time needed to create and establish the appropriate links between the data analysis, the corresponding data sets, and the relevant elements within the system model. The findings of this evaluation are detailed in Figure 4.

Figure 4. Results of the evaluation of the efficiency in the research environment
The evaluation revealed a notable enhancement in the efficiency of data-driven development through data-driven system modelling. The initial time measurement (left) showed an average duration of 1.7 weeks. In contrast, the application of the method for data-driven system modelling reduced this average to 41.1 minutes (right). Three success factors were identified for this increase in efficiency:
First, a centralized and holistic documentation of system knowledge supports a fundamental understanding of data analysis and its integration within the system model. This documentation enables developers from all engineering disciplines to readily access the structures of the usage data associated with the modelled technical system. Consequently, the system model provides transparency regarding the data’s provenance by linking the usage data to the system control or programming system, thereby enhancing developers’ ability to assess the intrinsic value of the usage data in addressing the initial research question.
Second, the holistic system model includes links to additional information in the form of metadata and analysis documentation. The evaluation demonstrates that the presence of metadata documentation, along with its association with the database element in the system model, is a critical determinant of operational efficiency.
Third, the ability to access reference systems facilitates the development of a fundamental understanding of the advantages inherent to data analysis. This capability, in turn, promotes a data-driven mindset among developers and increases the prioritization of data-driven approaches.
Nevertheless, several factors impeding the anticipated efficiency gains were also identified. On one hand, the system model is currently at a relatively immature stage of development and only represents the central elements of the technical system. On the other hand, efficiency is closely correlated with the usability of the modelling tool and the developers’ experience in utilizing the tool.
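To put the two reported averages on a common scale, the following short Python sketch converts the baseline of 1.7 weeks into minutes and relates it to the 41.1 minutes measured with the method. The paper does not state how a week is counted, so both conversions used here (a 40-hour working week and full calendar time) are assumptions, and the function name is hypothetical.

```python
MINUTES_PER_HOUR = 60


def speedup(baseline_weeks: float, method_minutes: float, hours_per_week: float) -> float:
    """Ratio of baseline duration to method duration, both expressed in minutes."""
    baseline_minutes = baseline_weeks * hours_per_week * MINUTES_PER_HOUR
    return baseline_minutes / method_minutes


# Reported averages: 1.7 weeks (initial) and 41.1 minutes (with the method).
working_time = speedup(1.7, 41.1, hours_per_week=40)       # assumed 40 h working week
calendar_time = speedup(1.7, 41.1, hours_per_week=7 * 24)  # assumed calendar weeks

print(f"working-time assumption:  factor {working_time:.0f}")
print(f"calendar-time assumption: factor {calendar_time:.0f}")
```

Under the working-time assumption the reduction corresponds to a factor of roughly 99, under the calendar-time assumption to roughly 417. The order of magnitude of the improvement therefore depends on how the initial lead time was measured.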
6.2. Evaluation of the applicability
The applicability of the developed approach in the research environment is evaluated using a rating system, where test subjects assign a point value on a scale from 0 (not applicable) to 5 (very applicable). The average applicability rating from developers is 2.8 out of 5. The ease of handling and clarity of the system model positively affect applicability. Additionally, developers from engineering disciplines can leverage their existing system expertise to connect the technical system with the usage data, thereby enhancing applicability. In contrast, the integration of new usage data, long-term maintenance, ensuring the system model’s correctness, and the incorporation of model-based procedures into development projects are rated negatively. Currently, the system model includes only those databases used for analytical purposes in this research. Incorporating new databases and maintaining their current status requires considerable effort and necessitates a comprehensive understanding of both the usage data and the technical system components. Identifying the necessary competency profile is challenging in the research context, as many developers’ expertise is concentrated in a specific sub-area. Moreover, ensuring the long-term applicability of the approach requires clearly defined responsibilities and processes to maintain the system model’s currency and accuracy.
Within the student project, applicability is evaluated quantitatively through a survey administered at the conclusion of each sprint, covering four review periods. Three questions are posed regarding the extent of system understanding, the support provided by the system model, and the effort involved in using the system model. Each question is rated on a numerical scale from 1 (not at all) to 10 (very strongly).
The first question assesses how well the holistic system model facilitates comprehension of the technical system over the project’s duration. The average student rating is 5.2 out of 10. Initially, the extended system model is used primarily to develop an understanding of the fundamental structure and functionality. However, the system’s high complexity and the model’s low maturity pose obstacles to comprehension.
The second question evaluates the extent to which the extended system model supports data analyses. The average student rating is 3.6 out of 10. In practice, the system model offers substantial support during the initial sprint by assisting in the assignment of usage data to the technical system elements. In the final sprint, the model facilitates data-driven optimization by referencing modelled analysis results and evaluating their impact on the technical system. In contrast, the system model is seldom used for data analysis activities during the middle sprints.
The third question measures the effort required to utilize the system model. The average student rating is 4.6 out of 10. This perception of effort correlates positively with the intensity of system model use, suggesting that its utilization is less intuitive for students than for developers, likely due to the students’ limited expert knowledge of the technical system.
7. Discussion and outlook
A systematic, data-driven system modelling approach that employs a holistic system model within the interdisciplinary SGE of mechatronic systems offers significant potential for efficient data-driven development processes. Integrating usage data into system models makes data structures and analyses accessible to developers from traditional engineering disciplines, thereby promoting the acceptance of data-driven approaches within development teams. However, obstacles in both data-driven development and system modelling continue to impede this potential in the research environment. For example, the lack of data quality and of structured data collection undermines the informative value of analyses used to validate elements of the system model. Additionally, there is a clear discrepancy between the fundamental recognition of the need for data-driven approaches and the availability of necessary resources and capacities for their systematic implementation. Furthermore, a function-oriented modularization of the technical system is essential for successful system modelling, as it is a prerequisite for decision-making and the identification of optimization potential derived from operating data analyses, which can be implemented by modifying individual modules.
Future research will focus on establishing optimal structures for data collection to enhance the efficiency of data analysis in development projects and to facilitate continuous improvements in data quality through comprehensive data and system architectures. Given that the positive effects of engineering practices are often only observable over extended periods, additional research with an explicit focus on the application of the presented framework is warranted.