1. Introduction
Research Data Management (RDM) is increasingly gaining importance in data-intensive engineering research and has become an integral part of the overall research process. In engineering design processes, data can form the basis for subsequent product lines and, in the context of technical inheritance, the foundation for data-driven product development (Lachmayer and Mozgova, 2021; Meyer zu Westerhausen et al., 2024). To be able to use and reuse data efficiently, a defined RDM is required that encompasses all strategies and activities for the efficient administration, archiving, and protection of generated data in research projects. It ensures that data is systematically recorded, comprehensively documented, and made accessible. In research processes, the added value of RDM lies in the structured organization and storage of data, which improves the reproducibility, accessibility, efficiency, and quality of work in collaborative projects and within the research community (Mozgova et al., 2020). A secure RDM enables better traceability, supports data sharing, and ensures the long-term use of data, making it a key requirement for research projects.
When implementing RDM, researchers themselves are mainly responsible for developing and implementing strategies for handling and managing research data as part of the specific project (Clements and McCutcheon, 2014). As a result of RDM, researchers are faced with a variety of new activities and requirements that need to be implemented in research projects (Briney et al., 2020). It is becoming apparent that approaches and practices for managing research data vary widely and are often viewed as additional tasks to the actual research. These shortcomings occur particularly in the handling of research data during the research process, which can lead to inconsistent application and potential inefficiencies in the research process (Iglezakis and Schembera, 2018). Therefore, a consistent and standardized implementation of the RDM is crucial to ensure consistency, data quality, and efficiency of the research as well as to improve collaboration and traceability in research. Integrating RDM into research projects and processes is essential to ensure the long-term efficiency and effectiveness of research. In addition to integrating RDM into the research processes and projects, it is also important to ensure the implementation of RDM and to provide support for researchers.
In this paper, a target process is developed that provides a research-oriented integration of RDM into research processes and projects. The process visualization is intended to support researchers in the implementation of RDM in their research projects and to contribute to a holistic RDM. To ensure the traceability and reusability of data with a qualitative implementation of RDM in research projects, a maturity level integration is carried out in addition to the actual process visualization. The maturity integration enables an assessment that analyzes the current state of RDM practices and identifies the potential for improvement based on defined goals. The different maturity levels classify the identified RDM activities in order to provide researchers with decision support for the implementation of RDM. As a use case, the maturity assessment is applied in a collaborative research project to highlight the challenges and potential inherent in this approach. In Section 2, an overview of engineering research and the research process is given, the fundamentals and existing approaches of RDM are discussed, and the use of maturity models for RDM is described. The approach and the methodology on which the results of this work are based are described in Section 3. In Section 4, the visualized research process within a research project is described. Based on this, in Section 5 the maturity level integration into the process is described in an activity-oriented manner. This is then applied to a use case from engineering research. A summary of the results of this work and an outlook for further work are given in Section 6.
2. Background
Initiatives such as the international consortium European Open Science Cloud (EOSC) and the national consortium NFDI4Ing will further integrate RDM into engineering research and create a data infrastructure to make data access and reuse more efficient and sustainable (Schmitt et al., 2020). RDM covers the entire duration of the project and runs alongside the research process, aiming to handle all types of data generated during the research process (Wawer et al., 2024). Structured RDM provides the basis for knowledge transfer within research projects and enables project participants to exchange data efficiently and sustainably.
2.1. Engineering research processes
In data-intensive engineering research, a data-driven research process for investigating technical problems and developing new technologies is often divided into five main phases: data collection, data processing, data analysis, publication (presentation of results), and archiving of the consolidated database (Iglezakis and Schembera, 2018; Schembera et al., 2019). Design research, as part of engineering research, also aims to develop methods, tools, and recommendations for industry to support designers in overcoming current problems (Blessing, 1996). Many of these design processes are characterized by a generic core of stages and are problem- or solution-driven. These step-by-step, iterative processes are based on analysis and synthesis sequences: identifying a need, analyzing the task, various design phases, and implementation (Gericke and Blessing, 2011). The research processes are defined as a structured and methodical procedure aimed at investigating specific scientific questions and systematically expanding existing knowledge. Research approaches can be divided into quantitative and qualitative methods, offering different approaches for data collection and analysis, as well as synthesis processes (Blessing and Chakrabarti, 2009). A particular challenge with engineering data is its heterogeneity, resulting in specific requirements for RDM (Müller et al., 2024; Joo and Kim, 2017). The research model by Kowalczyk (2017) presents a data-driven research process from collection to storage of data in data repositories, highlighting the need for research-accompanying data management, which must be active in all phases.
From data collection, which consists of the generation and gathering of data, to research content development (content phase), to storing and making data available in relevant data repositories, it is necessary to document the resulting data and consider data formats.
2.2. Fundamentals in data management
The research approaches described are characterized by the fact that the results and findings are based on digital objects and their further processing in the research process. Digital objects include all types of research data (e.g., literature, machine data, software) that are generated and processed in the research process. This can be machine data for further analysis, or publications to identify the state of the art and form hypotheses. Digital objects consist of defined components (metadata, repository, collection, persistent ID) according to the core model, which are to be designed as FAIR digital objects as part of the RDM (Schultes and Wittenburg, 2019). The acronym FAIR refers to Findable, Accessible, Interoperable, and Reusable and aims at a standardized handling of research data. In this context, a total of 15 principles have been defined that must be fulfilled in the context of RDM when handling digital objects to ensure that they are traceable and reusable (Wilkinson et al., 2016). Metadata is used to further describe the data and is essential for FAIR data. In addition to general bibliographic metadata, it is important to define research-specific metadata (Altun et al., 2023). RDM is carried out in research projects and includes planning, data collection, analysis, archiving, access, and reuse. This is often represented by data lifecycle models, which show the relevant phases mentioned above (Wissik and Ðurčo, 2015). The handling of research data must be ensured, as different dimensions of quality are affected and influenced by RDM. Different dimensions play a key role in different research phases (Kindling, 2013).
Data quality in a holistic sense is strongly dependent on research content and data management content and has interactions that run through the research process and build on each other (Müller et al., 2023).
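The four components of a digital object named above (metadata, repository, collection, persistent ID) can be illustrated with a small data structure. The following is only a minimal sketch: the class name `DigitalObject`, its field names, and the example values are assumptions made for illustration and are not taken from the core model itself. The check only tests whether each component is present; it does not evaluate the 15 FAIR principles.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Illustrative container for the four components of a digital object."""
    persistent_id: str   # e.g. a DOI; empty string if not yet assigned
    repository: str      # repository in which the object is stored
    collection: str      # collection the object belongs to
    metadata: dict[str, str] = field(default_factory=dict)

    def missing_components(self) -> list[str]:
        """Return the names of components that are still empty."""
        components = {
            "persistent_id": self.persistent_id,
            "repository": self.repository,
            "collection": self.collection,
            "metadata": self.metadata,
        }
        return [name for name, value in components.items() if not value]


# Hypothetical example: a data set without a persistent ID so far.
obj = DigitalObject(
    persistent_id="",
    repository="institutional-repository",
    collection="work-package-1",
    metadata={"title": "Tensile test series", "creator": "Jane Doe"},
)
print(obj.missing_components())
```

Such a presence check could serve as a first, coarse indicator before more detailed quality dimensions are considered.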
2.3. Maturity models
For quality assurance, maturity models can be used, which allow a mostly qualitative assessment of a reference system defined on discrete maturity levels. The reference systems are usually processes and organizations. These models are strongly problem-oriented and are defined in relation to a target state. One of the most cited and reused maturity models is the Capability Maturity Model (CMM), originally developed by Paulk et al. (1993). There are also a large number of defined models in the area of RDM, which address different reference systems (Lehmann and Odebrecht, 2023; Proença and Borbinha, 2018; Oppenländer et al., 2017). These models address the quality of research data directly, or the implementation in institutions or entire universities with associated stakeholders. A direct reference to researchers as the executing unit and the implementation directly in the projects is not addressed in a targeted manner in the existing models.
3. Methodology
In order to integrate RDM into the research process, a process survey and process visualization are conducted in the present work, which are based on the engineering research process and the objectives of RDM. To represent a comprehensive RDM, a target process is developed using a top-down approach according to Schmelzer and Sesselmann (2020) and visualized using Business Process Model and Notation (BPMN). This provides an activity-driven representation of the objectives of the RDM and enables integration into the research process. The top-down approach is intended to ensure that the objectives of the RDM are taken into account in the process model and that it can be applied to different research approaches and project types. For this purpose, the objectives of the RDM are analyzed and synthesized into activity descriptions, which are then integrated into the process. The visualized process serves as the basis for a maturity-based process integration of RDM in the research process. To this end, the activity descriptions are assigned to different maturity levels according to a defined maturity level characteristic. The activities are classified according to their impact on the quality of the RDM implementation. A subsequent use case from engineering research illustrates the application of the maturity-based process model and the associated challenges and opportunities.
4. Data management process integration
To address the challenges described above, the process representation is implemented by representing a research level and data management level, visualizing the integration of RDM into existing research processes and projects. Within engineering research projects, different project phases (according to Project Management Institute (2021)) and process phases in which RDM is implemented are defined. The process includes all relevant activities of a research project, ranging from project planning to project completion, and is aimed at researchers who are working on a research project and are responsible for the implementation of RDM. Figure 1 illustrates the interaction between the two levels and the structure of the project and process phases.

Figure 1. Defined process phases within a research project
Since the initiation phase generally includes only content-based research activities and no data management-related decisions are made in this project phase, the process and the integration of the data management level start with the project planning phase.
In this model, the research process in the project execution, and monitoring and control phase is defined as result-driven. It is visualized as a repeating process section in this phase until all research objectives have been fulfilled and completed based on defined work packages focusing on project execution and incidental controlling and monitoring. The research process is divided into data collection, including planning, analysis/synthesis for further processing of data up to the research result, as well as the subsequent publication of the results. To cover various research areas of engineering sciences, the research level is presented at a high level and is not described in a method-specific manner. The process phases of access and archiving at the data management level are part of the publication phase and project completion phase and can be initiated in both, as digital artifacts are archived and made accessible during project controlling and at project closing.
To emphasize the interaction between the research level and data management level, the activities are assigned to the two levels in a phase-oriented manner. All research-related activities are defined at the research level, including research planning and its execution to achieve research objectives during the controlling phase. At the data management level, all data management-related activities necessary for a comprehensive RDM are described in a process-oriented manner.
The activities of these two levels interact with each other and build upon one another, necessitating the implementation of RDM alongside the research to ensure complete and qualitative execution. Figure 2 provides an excerpt from the entire process visualization of a research project.
The following sections offer a more detailed examination of the individual process phases and the interaction between the data management level and the research level, with reference to the project phases.

Figure 2. Visualization of activities and process integration between the data management level and the research level (excerpt)
4.1. Project planning
In the planning phase of a research project, the research proposal is written and the research project is further refined. At this stage, RDM is integrated to ensure the structured and sustainable handling of the research data.
Planning
Project planning of the initiated and defined project starts with the definition of the research project and the objectives and proceeds with the preparation and submission of the proposal document. The proposal is orientated towards the requirements of the funding body and includes university and industry-related research projects. At the data management level, the main goal is to develop a data management plan (DMP) that considers the requirements of funding bodies, project partners, and scientific communities. It contains specifications for the data structure, metadata, data formats, and strategies for securing and versioning the data within the project duration and beyond. This plan is prepared concurrently with the proposal and subsequently attached to the documents. Upon approval, the project is transferred to the control phase at the commencement of the project.
4.2. Project execution
The execution phase encompasses the content-related processing of the project and represents the research process. It is based on the processing of work packages, which are defined in a results-driven manner to generate new scientific findings.
Collection (Planning)
Once a work package has begun processing, the data collection phase commences. This phase comprises several steps: the selection of research methods, the identification of existing data, the planning of a data structure, and the definition of metadata to document the collection processes. Data collection is conducted through the generation of new data, the reuse of existing data, or a combination of both, and includes their initial processing. As part of the RDM, the processes and the defined metadata are documented during data collection, thereby ensuring data quality. Once the data collection is complete, the data is saved together with the metadata as a database.
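Saving collected data together with its metadata can be illustrated as follows. This is a minimal sketch under stated assumptions: the function name `save_with_metadata`, the file naming scheme (a CSV file plus a JSON sidecar file), and all example values are hypothetical and do not describe the storage format used in the project.

```python
import json
import tempfile
from pathlib import Path


def save_with_metadata(data_rows, metadata, directory, stem):
    """Write data rows as CSV text plus a JSON metadata sidecar file."""
    directory = Path(directory)
    data_path = directory / f"{stem}.csv"
    meta_path = directory / f"{stem}.meta.json"
    # One comma-separated line per row; values are converted to strings.
    data_path.write_text(
        "\n".join(",".join(str(v) for v in row) for row in data_rows) + "\n",
        encoding="utf-8",
    )
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return data_path, meta_path


# Hypothetical example values, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    data_file, meta_file = save_with_metadata(
        data_rows=[("force_N", "strain"), (100.0, 0.001), (200.0, 0.002)],
        metadata={"method": "tensile test", "instrument": "hypothetical-machine-01"},
        directory=tmp,
        stem="collection_run_1",
    )
    stored = json.loads(meta_file.read_text(encoding="utf-8"))
    print(stored["method"])
```

Keeping the metadata in a separate file next to the data is only one possible design; embedding it in a self-describing data format would serve the same documentation purpose.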
Analysis/Synthesis
In the analysis/synthesis phase, the database is prepared, aggregated, and analyzed. At the research level, methodological approaches and tools are employed to analyze and synthesize scientific findings. At the data management level, data processing is documented, and traceability is ensured through the use of workflow models. The results are merged, the data structure finalized and compiled as research results.
Publication/Access/Archiving
Upon completion of the analysis, the publication procedure may then be initiated. This involves the dissemination of research results and the assignment of a digital object identifier (DOI). Concurrently, access to the underlying data structure can be ensured or archiving can be conducted. In this context, the underlying data to be made accessible or archived must be defined at the data management level.
Technical reusability is taken into account and additional metadata is created. The data is uploaded to repositories and, if public, is assigned a DOI to ensure that it can be cited and found.
4.3. Project closure
The final phase of the project cycle entails the completion of the project and verification of the archiving or accessibility of all relevant data.
Completion
Upon completion of the work packages, the archiving or accessibility of the research data is finally verified. If data is missing, the processes are initiated anew. Concurrently, a final report is prepared and submitted. Upon completion of these activities, the project attains its target status.
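The final verification described above can be sketched as a simple check over all data sets of a project. The class `DataSet`, the function `unresolved_data_sets`, and the data set names are assumptions for illustration only; the process model itself does not prescribe such an implementation.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    """Illustrative record of the archiving and access status of a data set."""
    name: str
    archived: bool = False
    accessible: bool = False


def unresolved_data_sets(data_sets):
    """Return the names of data sets that are neither archived nor accessible."""
    return [d.name for d in data_sets if not (d.archived or d.accessible)]


# Hypothetical project state at closure.
project_data = [
    DataSet("wp1_measurements", archived=True),
    DataSet("wp2_simulation", accessible=True),
    DataSet("wp3_images"),
]
open_items = unresolved_data_sets(project_data)
print(open_items)
```

A non-empty result would correspond to the case in which the access and archiving processes are initiated anew.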
5. Maturity based data management
A maturity-based analysis of the data management level is used to provide researchers with assistance and potential for improvement in the implementation of RDM in research projects. To achieve this, activities are classified in accordance with an RDM-specific maturity model from previous research. In that earlier study, a maturity level characteristic was developed for the implementation of RDM in research projects, and individual maturity level models were created for each process area (Wawer et al., 2023). The defined maturity levels and their characteristics, as well as their color coding in the process visualization, are shown in Table 1. This approach allows RDM to be considered in a differentiated and goal-oriented manner, thereby enabling a research-guided integration and improvement of RDM in research projects.
Table 1. Maturity model characteristics (cf. Wawer et al., 2023)

5.1. Maturity based activity allocation
At the data management level, the activities of a specific process phase have a direct influence on the quality and the associated RDM maturity level. Based on goal definitions from the maturity models, activities are assigned to the defined maturity levels. Thus, the process illustration can demonstrate the direct influence of activities on determining the maturity level. Figure 3 shows the maturity level classification for activities in the planning phase. The colors indicate the assignment of activities to increasing maturity levels. To reach a higher maturity level, all corresponding activities and those from previous maturity levels must be fulfilled. The “Fill in the DMP” activity at Maturity Level 1 is the basic activity of this phase. It provides an intuitive implementation of data management planning in project planning, even if no detailed content planning or identification of relevant requirements has been done beforehand. At Maturity Level 2, activities are added that cover the planning of the DMP content in the project. During the planning phase, this involves identifying the funding agency’s requirements and project-oriented planning of the data management content. To achieve the RDM goal of a standardized implementation, corresponding activities are added at Maturity Level 3. In the planning phase, this involves the subsequent use of established templates or the selection of systems to ensure the efficiency and effectiveness of the RDM. A quality-assured implementation of the RDM is planned at Maturity Level 4. In the case of the planning phase, this refers to the quality of the data management plan and includes the relevance of the content as well as the timeliness and completeness of the plan. Maturity Level 5 is not assigned to direct activities. This level includes fundamental optimizations of RDM solutions, which can involve technologies or best practices and methods for these phases.
Furthermore, some activities can also be assigned to two maturity levels and thus be designed in two colors. In most cases, this addresses the project-specific organization at Maturity Level 2 and, as the maturity level increases, the simultaneous consideration of community-specific standards (Maturity Level 3).
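The cumulative rule stated above, that a maturity level is reached only if all activities of that level and of all previous levels are fulfilled, can be sketched in a few lines. The activity identifiers below are illustrative paraphrases of the planning-phase activities, not the labels used in Figure 3, and the return value 0 for "no level reached" is an assumption of this sketch. Maturity Level 5 is omitted because it is not assigned to direct activities.

```python
# Illustrative activity sets per maturity level for the planning phase.
PLANNING_ACTIVITIES = {
    1: {"fill_in_dmp"},
    2: {"identify_funder_requirements", "plan_dmp_content"},
    3: {"use_established_template"},
    4: {"check_dmp_quality"},
}


def achieved_level(completed, activities_by_level):
    """Return the highest level whose activities, and all lower ones, are done."""
    level = 0  # assumption: 0 means that no maturity level has been reached
    for candidate in sorted(activities_by_level):
        # Subset test: every activity of this level must be completed.
        if activities_by_level[candidate] <= completed:
            level = candidate
        else:
            break  # a gap at this level blocks all higher levels
    return level


# Hypothetical sub-project that has not yet introduced a quality check.
done = {
    "fill_in_dmp",
    "identify_funder_requirements",
    "plan_dmp_content",
    "use_established_template",
}
print(achieved_level(done, PLANNING_ACTIVITIES))
```

Because the loop stops at the first incomplete level, completing a Level 3 activity alone does not raise the result while Level 1 or Level 2 activities are still open.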
5.2. Use case: RDM planning in a CRC project
The maturity-based process model is being applied to an ongoing research project to examine the support options. The project in question is the Collaborative Research Center (CRC) 1153, which aims at the development of novel process chains for the production of hybrid solid components (Behrens and Uhe, 2021). This externally funded project comprises 20 sub-projects from various disciplines that work closely together and exchange and reuse their results. Within the project, there is a sub-project that deals with data management within the CRC and develops solutions for it. For this purpose, an RDM system was developed. Project-specific requirements were elaborated for data management planning and further requirements of the funding body were identified. Based on these, a template was implemented in the existing RDM system, thus creating a data management plan template for researchers to fill out. To maintain community standards, the template is based on existing engineering-specific templates.

Figure 3. Maturity based activity visualization of the planning phase
With reference to Figure 3, these measures have prepared the basis for the fulfillment of maturity level 3. During the development of the RDM system, the necessary resources are considered at project level (maturity level 2) and are based on domain-specific standards and guidelines (maturity level 3). Especially for the planning phase, maturity level 3 (red, yellow, blue activities) is achieved with the help of the template when filling out the DMP. The actual completion of the DMP by project team members then ensures that the maturity level is achieved for each sub-project. To reach level 4 or even 5, activities can be introduced to ensure the quality of the content of the DMP. For example, DMPs can be checked for completeness directly in the system, or feedback can be provided by RDM specialists on the relevance of the content. As the planning must be kept up to date, such improvements must also be considered throughout the entire project duration and extend beyond the planning phase itself. For Level 5, further automation measures can be implemented, and project results can be shared with the engineering community as best practices to promote RDM across the project.
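A completeness check of the kind mentioned above could, for instance, look as follows. This is a minimal sketch and not the check implemented in the CRC's RDM system: the section names are assumptions derived from the DMP contents listed in Section 4.1 (data structure, metadata, data formats, securing, and versioning), and the function name `completeness_report` is hypothetical.

```python
# Assumed DMP sections, derived from the contents named in Section 4.1.
REQUIRED_SECTIONS = (
    "data_structure",
    "metadata",
    "data_formats",
    "backup",
    "versioning",
)


def completeness_report(dmp):
    """Return (share of filled sections, list of empty or missing sections)."""
    missing = [s for s in REQUIRED_SECTIONS if not str(dmp.get(s, "")).strip()]
    share = (len(REQUIRED_SECTIONS) - len(missing)) / len(REQUIRED_SECTIONS)
    return share, missing


# Hypothetical draft with one empty and one missing section.
draft_dmp = {
    "data_structure": "One folder per work package",
    "metadata": "Project-specific schema",
    "data_formats": "CSV, HDF5",
    "backup": "",
}
share, missing = completeness_report(draft_dmp)
print(share, missing)
```

Such a check covers only the completeness criterion of Maturity Level 4; relevance and timeliness of the content would still require review, for example by RDM specialists.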
5.3. Discussion
The use case illustrates that predefined infrastructures provide solutions and can therefore achieve higher maturity levels for efficient RDM in research projects. The maturity levels can be used as a guide so that further optimizations can be made for the DMP. They provide support for reflecting on an RDM-conscious approach and can provide incentives for optimization. This support illustrates the potential of the maturity-based process model. On the other hand, the use case makes it clear that the project members are responsible for completing the DMPs and thus for implementation. It is crucial to differentiate between the offer and its actual use. Developed infrastructures can provide RDM offerings with a high degree of maturity, but their implementation is the responsibility of the research team and depends on its awareness of the sustainable handling of research data.
6. Conclusion
RDM is essential for efficient and effective research in collaborative projects. Standardized and quality-assured data management results in added value for internal project success. However, reusable data also provides added value for the entire research community in the context of engineering research. As it is usually the researchers themselves who are responsible for implementation in the projects, it is important to further support implementation in these projects and integration into the research process. For this reason, a process model was developed using BPMN, which visualizes a research-accompanying implementation of data management in engineering research projects. To be able to demonstrate a quality-differentiated implementation, the activities were assigned to the data management level based on a maturity level characteristic. This maturity based process model supports researchers in maintaining the objectives of the RDM. The application to the use case demonstrates that improvements to the RDM can be realized as the maturity level increases, thus providing researchers with assistance for qualitative RDM. Future work will address the development of an assessment method and the implementation of an assessment tool in the existing RDM system. This should simplify the application of the maturity models and also enable the evaluation of further process phases. In addition to implementation in the project’s own RDM system, a stand-alone tool is also to be developed. In this way, researchers from different engineering research projects will be supported in the quality assurance of the RDM.
Acknowledgements
The authors would like to thank the Federal Government and the Heads of Government of the Länder, as well as the Joint Science Conference (GWK), for their initiative within the framework of the NFDI4Ing consortium (German Research Foundation (DFG) - project number 442146713) and CRC 1153 (German Research Foundation (DFG) - project number 252662854).
Data availability
The underlying research data is available in the public research data repository of the Leibniz Universität Hannover. The accessible data set contains the complete process visualisation of a research project, as well as the corresponding maturity level visualisation of all phases. DOI: https://doi.org/10.25835/6u9tg5ma