1. Introduction
Text-to-Image Generative AI (GenAI) models that produce sophisticated images from user prompts have sparked ongoing interest from the design community for their potential to provide inspiration, generate concepts, and support designers in visualising their concepts (Berni et al., 2024). Crucially, text-to-image GenAI platforms offer distinctly different and collaborative modes of creating visualisations via text prompts. These platforms provide functions that vary and iterate outcomes, unlike traditional visualisation approaches such as sketching or 3D modelling. Given their infancy, there is currently little to no knowledge or best practice on how to use these tools to support the design process. This has motivated studies of interaction patterns with text-to-image GenAI, investigating its influence on and support for creativity (Ranscombe et al., 2024; Torricelli et al., 2024; Xie et al., 2023), design practices (Li et al., 2024) and co-creation (Turchi et al., 2023). From a research perspective, a particular advantage of digital tools is that the interaction between the user and the tool can be logged for later interrogation. Designer (human)-computer interaction with digital design tools has received growing and sustained interest. For example, Gopsill et al. (2016), Celjak et al. (2023), and Šklebar et al. (2024) have all investigated designer interactions with CAD tools and have been able to gain insights into the design process and designers' behaviours. These works set a precedent for using interaction logs to study the design process and designer behaviour. They highlight how such data can provide a rich understanding of tool use and also function as a proxy when examining designers' behaviour during a project (Šklebar et al., 2024).
The contribution of this paper is a study analysing the human-computer interaction of designers using text-to-image GenAI tools. The study explores the potential of using GenAI interaction data to analyse designer behaviour. This in turn can inform our understanding of how designers approach AI tools, including their strategies, preferences, and ultimately best practices for using GenAI to visualise designs. The scope of the research reported in this article is to: 1) develop a set of analyses based on interactions/mechanics within the image GenAI workflow, 2) apply these analyses to data generated during a text-to-image GenAI design activity, and 3) identify patterns in the data that characterise different designers' behaviours during the design task. The paper continues with a discussion of related work on inspiring designers during the early design phases, studying designer-digital tool interaction behaviour and applications of text-to-image GenAI in design (Section 2). This is followed by a description of the study that was performed (Section 3). Section 4 details the results of the study, followed by a discussion in Section 5. Section 6 concludes the paper with the key findings from the study.
2. Related work
This section outlines related research to highlight the different behaviours designers exhibit during inspiration finding, concept generation and concept development when visualising. It then outlines the core interactions with text-to-image GenAI platforms used to generate images, which are analysed in this study.
2.1. Designers’ behaviours during inspiration-seeking and ideation
Inspiration-seeking and ideation are well understood as core activities undertaken by industrial designers in the early phases of product development. These activities are supported by the collection and creation of visual material that represents ideas to be developed into product solutions (referred to as visualisation). The remainder of this section outlines key visualisation-related behaviours that support inspiration-seeking and ideation, and which will be characterised via analysis of interaction data.
In early design and conceptual phases, designers often seek inspiration (typically from visual material) “without having a specific direction” and “might be dependent on randomly finding relevant stimuli in an opportunistic manner” (Gonçalves et al., 2016, p. 20). Such broad explorations for inspiration – i.e. finding distantly related stimuli – can help designers be more creative in their ideas. These inspirational sources do not always exhibit clear surface-level similarities with the design brief. The link between inspiration and design problem is not always apparent (Gonçalves et al., 2016) and consequently visualisations are more diverse and may appear more abstract or high-level.
When designers progress to ideation, they become more goal-oriented in their explorations (Taura et al., 2013) and supporting visual material (typically sketches) becomes less abstract, depicting narrower representations of possible product solutions (Goldschmidt, 1991). However, the degree of focus and detail in visualisations varies within this phase. During the concept generation stage, designers make broader explorations of the solution space, visualising and analysing multiple solutions at a lower level of detail. In contrast, as they progress to the concept development stage, they become more goal-focused, developing a narrower set of solutions. Here a breadth of multiple ideas narrows to visualising more specific details of a singular optimal solution (Taura et al., 2013).
2.2. Explanation of human-computer interactions with text-to-image GenAI
Text-to-Image Generative AI (GenAI) tools have a wide range of capabilities. At their core, they commonly offer 1) a means for users to prompt the tool with text, and 2) follow-up functions to generate variations of the initial output. Text prompts are usually the starting point, where users input text describing the image they are seeking to generate. This description can include details about the subject, visual qualities, and stylistic cues. The words used within a text prompt are the user's primary means of controlling the resulting images, whereas parameter instructions dictate the style of the images. At a high level, the more words that feature within the prompt, the more specificity and control the user has in directing the resulting image. For example, prompting for “apple” leaves the AI unconstrained to define the characteristics of the image. Conversely, “photograph of a green apple growing in an orchard” prompts the AI with a much higher degree of specificity. It should be noted that overly long text prompts can reach a saturation point, causing results to lose specificity. An image prompt – where the GenAI is provided an image as a reference – augments the text prompt, helping the user to better control the generated output via a visual prompt in addition to a verbal one. Variation functions allow users to generate further iterations of an AI-generated image. These functions prompt the AI to recreate images based on the same text prompt while introducing variations, which can differ in degree based on user input or system defaults. These variations arise from the inherent probabilistic nature of the generative algorithm, enabling diverse outputs. Different text-to-image GenAI platforms present different opportunities to control the extent of variation. For example, Midjourney offers: “simple” variations producing further variations of the chosen image, “strong” variations making significant changes, “subtle” variations making subtle changes, and variations focused on specific areas of the image (“Vary Region”).
In summary, the key insights drawn from the review of related work are as follows. Regarding designers' behaviours, we emphasise the transition from broad exploratory behaviours during inspiration-seeking, characterised by diverse and abstract imagery, to progressively narrowing behaviours in ideation. These are marked by a breadth of multiple ideas represented at lower levels of detail, transitioning to less varied and more detailed visualisations exhibiting subtle refinements of specific details. Additionally, we identify the core interactions involved in using GenAI to generate images, which form the basis for the analysis described in the following section.
3. Study procedure
The procedure for the Midjourney-supported design task is now outlined. It begins by rationalising the choice of Midjourney as the text-to-image GenAI platform adopted for the study, then describes the design task undertaken, the interactions recorded and their analysis.
3.1. Text-to-image GenAI platform selection, design task and participants
Midjourney was selected as the text-to-image GenAI platform for this study. Like many other text-to-image GenAI platforms, Midjourney accepts text-based prompts and offers a range of variation functions to iteratively modify the images produced by the AI. It was selected over other platforms because, at the time of writing, it strikes a middle ground between highly specialised tools such as Vizcom (targeted at concept art), which incorporates functions like direct painting and editing, and more limited platforms like DALL-E, which offer fewer interactive and refinement options. Finally, Midjourney's integration with the Discord platform facilitates detailed and accessible recording of prompts. This versatility makes Midjourney suitable for our research: it offers high-resolution, realistic image outputs while balancing the complex and limited functionality found in alternatives. We direct the reader to Berni et al. (2024) for a more extensive review of GenAI tools and their support for designers.

Figure 1. Screenshots of the Midjourney interface within Discord highlighting text prompts and variation functions
A one-day design task was used to investigate how interaction data from text-to-image GenAI platforms might expose patterns in designer behaviour. The design brief given to participants was to: “Design an innovative emergency product that can significantly improve the safety and survival chances of individuals at risk of forest fires”. This brief was chosen as it aligns with both our participants' expertise (industrial designers) and the specific phases of the design process under investigation, namely inspiration and concept development. Additionally, the broad scope of the brief encourages participants to begin with initial exploratory behaviours, as it avoids prescribing specific details that might prompt participants to prematurely adopt a detailed design approach.
Participants were provided with Midjourney for inspiration and to visualise ideas, and were also permitted to sketch in a logbook (analysis of these logbooks is outside the scope of this article). The task deliverables were an image that captures their inspiration, images that represent the concepts generated, and images that visualise the finalised design and its details. The task began with a short introduction to Midjourney in which participants were taught, and practised, the various interactions and functions available within the platform. Following training, participants used the platform over three distinct 120-minute sessions: “inspiration finding”, where they were instructed to use Midjourney to gather inspirational material and create their inspiration image; “concept generation”, where they were instructed to produce a range of design ideas that address the project brief; and “concept development”, where they were instructed to produce images representing the final design and key details.
Eleven participants were recruited for this study from a cohort of 3rd and 4th-year industrial design undergraduates. This cohort was selected as having design experience and a sound understanding of typical expectations and behaviours during inspiration finding and ideation. This study employs a small sample size as its objective is not to generalise across a broader population, which would require a larger sample. Instead, the focus is on exploratory analysis to assess the potential of interaction data in capturing behavioural patterns. As such, we prioritise depth over breadth, examining a smaller sample over an extended study duration of one day. To this end, for this article we used maximum variation sampling to describe in depth the behaviours of four participants based on their comparatively different approaches to using text-to-image GenAI. The four participants analysed were selected based on the range of differences observed in key interaction metrics, described in Table 1. The sample thus captures contrasting interactions with the GenAI, ensuring it reflects the broadest observable behavioural diversity within the cohort. While the four selected participants do not encompass the full range of possible design behaviours, their selection was intentional in illustrating distinct and contrasting interactions, thereby achieving the study's aim to characterise varied design behaviours.
Table 1. Summary of interaction analysis and insights drawn

3.2. Interaction data and analysis
Interaction data were captured via the prompts submitted to Midjourney by participants, along with the resulting images generated. Each interaction (prompt) comprised a timestamp, prompt text, image aspect ratio, Midjourney version number, “codes” describing any variation functions, and a user ID. For example:
**A product shot of a normal person using an emergency portable satellite to create a distress signal from a forest fire and receive help from a rescue helicopter --ar 16:9 --v 6.0** - Variations (Region) by <@1264553447651278911> (fast)
The above prompt includes text describing the desired aspect ratio (shown as “--ar 16:9”), the version of Midjourney used (version 6, shown as “--v 6.0”), and that this prompt was part of a Vary Region function, shown as “Variations (Region)”.
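To illustrate how such a logged interaction can be decomposed, the following minimal sketch (Python) parses the line above into structured fields. It is not the study's actual pipeline; the function and field names are our own assumptions.

```python
import re

# Hypothetical parser for one logged Midjourney prompt line (not the study's
# actual pipeline). Field names are illustrative assumptions.
LOG_LINE = ("**A product shot of a normal person using an emergency portable "
            "satellite to create a distress signal from a forest fire and "
            "receive help from a rescue helicopter --ar 16:9 --v 6.0** "
            "- Variations (Region) by <@1264553447651278911> (fast)")

def parse_interaction(line: str) -> dict:
    """Split a logged prompt into text, parameters, variation code and user ID."""
    body = re.search(r"\*\*(.+?)\*\*", line).group(1)        # content between ** **
    aspect = re.search(r"--ar\s+(\S+)", body)                # e.g. "--ar 16:9"
    version = re.search(r"--v\s+(\S+)", body)                # e.g. "--v 6.0"
    variation = re.search(r"\*\*\s*-\s*(.+?)\s+by\b", line)  # e.g. "Variations (Region)"
    user = re.search(r"<@(\d+)>", line)                      # Discord user ID
    text = re.sub(r"--\w+\s+\S+", "", body).strip()          # prompt text without flags
    return {
        "prompt_text": text,
        "aspect_ratio": aspect.group(1) if aspect else None,
        "version": version.group(1) if version else None,
        "variation": variation.group(1) if variation else None,
        "user_id": user.group(1) if user else None,
    }

print(parse_interaction(LOG_LINE))
```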
Prompts were downloaded from each session, collated and stored as a CSV. Table 1 outlines the analyses in terms of data, measures and insights drawn. Together, these analyses achieve our first objective to develop a preliminary set of analyses/measures based on interactions/mechanics within the image GenAI workflow. We acknowledge that Midjourney has further functions beyond those described above that are not analysed within this study. This is because they are either procedural (e.g. “Upscale” or “Zoom/Pan” image), or pertain to controlling the artistic qualities of images produced rather than manipulating the content or ideas represented (e.g. controlling the image aspect ratio, “--ar 16:9”).
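As an illustration of the kinds of measures listed in Table 1, the sketch below computes interaction count, mean prompt word count, vocabulary size and consecutive prompt similarity for one session. The similarity metric (token-set Jaccard, expressed as a percentage) and the CSV file/column names are assumptions; the paper does not specify the exact formula used.

```python
import csv

# Sketch of per-session measures in the spirit of Table 1. The similarity
# metric (token-set Jaccard) is an assumption, not the paper's stated formula.
def tokens(text: str) -> set:
    return set(text.lower().split())

def jaccard_pct(a: set, b: set) -> float:
    return 100 * len(a & b) / len(a | b) if (a | b) else 0.0

def session_measures(prompts: list[str]) -> dict:
    assert prompts, "expects at least one prompt per session"
    word_counts = [len(p.split()) for p in prompts]
    vocabulary = set().union(*(tokens(p) for p in prompts))
    # Consecutive prompt similarity: values near 100% suggest a simple
    # variation/upscale; values near 0% suggest an entirely new prompt.
    similarity = [jaccard_pct(tokens(a), tokens(b))
                  for a, b in zip(prompts, prompts[1:])]
    return {
        "interactions": len(prompts),
        "mean_word_count": sum(word_counts) / len(word_counts),
        "vocabulary_size": len(vocabulary),
        "mean_similarity_pct": (sum(similarity) / len(similarity)
                                if similarity else None),
    }

# "session_prompts.csv" and its "prompt_text" column are hypothetical names.
with open("session_prompts.csv", newline="", encoding="utf-8") as f:
    prompts = [row["prompt_text"] for row in csv.DictReader(f)]
print(session_measures(prompts))
```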
4. Results
A total of 493 interactions across the four sampled participants were recorded: 166 in the inspiration phase, 157 in the concept generation phase and 170 in the concept development phase. The data arising from the analysis of each interaction are presented for each design phase, focusing on distinct trends in the data and the corresponding images generated.
4.1. Inspiration phase
Figure 2 presents a table summarising interactions alongside graphs of prompt word count and similarity. Designer A primarily engages in short, concise prompts (averaging 6 words with 60% prompt similarity), while Designer D uses longer prompts (averaging 22 words with higher similarity). Also notable are the comparative differences in the number of interactions (18 versus 59), the vocabulary used (34 words versus 157) and the use of the Vary Region function (0 versus 20 uses). Further inspecting interactions against time for this phase (see graphs in Figure 2) shows that Designer D exhibits two phases of gradually increasing prompt length while maintaining relatively high prompt similarity, reflecting use of the Vary Region function with minor additions to prompt text. Conversely, Designer A uses shorter prompts, exhibiting a pattern of oscillating between 0% and 100% similarity, where 100% indicates a “simple” variation or an “upscale” to enlarge an image, and a very low or zero similarity indicates an entirely new prompt.

Figure 2. Interaction data and a sample of images generated by Designer A and Designer D during the inspiration phase
These patterns of interaction suggest distinct design strategies for Designer A and Designer D. Designer A's approach is characterised by broader exploration. The frequent shifts between new prompts and variations of existing ones indicate a strategy aimed at quickly generating a diverse range of ideas, which is reflected in the diverse imagery generated (see top row of sample images in Figure 2). In contrast, Designer D demonstrates a strategy of narrow exploration coupled with incremental detailing, characterised by consistent use of longer prompts, high prompt similarity and extensive engagement with the Vary Region function. The bottom row of images in Figure 2 illustrates how this pattern results in the generation of highly similar images.
4.2. Concept generation phase
Figure 3 presents a table summarising interactions alongside graphs of prompt word count and similarity collected during the concept generation phase. Key differences emerge in the interaction patterns of Designer A and Designer B. Designer A relies on shorter prompts, averaging 15.5 words with a smaller vocabulary (172 unique words), while Designer B uses longer prompts (averaging 58.1 words) and a broader vocabulary (257 unique words). Despite this, prompt similarity remains comparable between the two (63.9% for Designer A and 70.4% for Designer B). Notably, Designer A makes greater use of image prompts (6 instances versus 1 for Designer B). Over time, Designer A shows a fluctuating pattern between high and low similarity, indicating shifts between refining existing ideas (using a range of variation functions) and introducing new ideas via new text prompts (see left graph in Figure 3), whereas Designer B maintains a steady succession of high-similarity interactions (see right graph in Figure 3).

Figure 3. Interaction data and a sample of images generated by Designer A and Designer B during the concept generation phase
These patterns suggest distinct design strategies. Designer A's approach indicates a strategy aimed at generating diversity within an overall concept theme, as evidenced in the resulting images comprising multiple variations of harness designs differing in configuration, style, and colour (see top row of sample images in Figure 3). Designer B, on the other hand, exhibits a narrower exploration with incremental detailing via longer and similar prompts focused on refining the contents of their concept, a survival kit (see bottom row of images in Figure 3).
4.3. Concept development phase
Figure 4 presents a table summarising interactions alongside graphs of prompt word count and similarity during the concept development phase. Summary data and graphs comparing Designer C and Designer D show distinct interaction patterns. Designer C uses a larger vocabulary (212 words vs. 123 for Designer D) and relies more on re-prompting text, indicating a continued focus on generating designs through descriptive inputs. This leads to a pattern of longer prompts fluctuating in similarity (see graphs on the left of Figure 4). In contrast, Designer D frequently uses image prompts (23 vs. 8) and the Vary Region function (13 vs. 5), suggesting a shift towards refining specific visual elements, akin to image editing. This pattern corresponds to shorter prompts fluctuating in similarity (see graphs on the right of Figure 4). The sample of images representing these patterns (see Figure 4) illustrates how Designer C's designs show more diversity, linked to varying text prompts, suggesting an exploration of different details. Meanwhile, Designer D's controlled and consistent visuals result from focused use of image prompts and region-based variations, aimed at refining a specific design detail. These patterns reveal Designer C's strategy of exploring multiple design details through text generation versus Designer D's incremental refinement of a single detail through image-based adjustments (image prompt and Vary Region functions).

Figure 4. Interaction data and a sample of images generated by Designer C and Designer D during the concept development phase
5. Discussion: to what extent can patterns in interaction data signify designers’ behaviours?
The user interaction study successfully identified patterns that signify exploratory and narrowing behaviours occurring across different phases of the design task. Throughout both the inspiration and concept generation phases, exploratory behaviours are characterised by shorter prompts, lower consecutive prompt similarity, and more frequent fluctuations in similarity. The difference in interaction data between inspiration and generation is the noticeable increase in prompt length and consecutive similarity. Here we contend that increasing prompt length, increased use of image prompts and detail-focused functions like Vary Region reflect a desire for greater specificity. This is consistent with the way concept generation seeks to depict and visualise concrete solutions, as compared with inspirational imagery that represents higher-level ideas yet to be actualised. This is represented by the images depicting diverse classes of products in the top row of Figure 2 versus those depicting configurations of a specific product class (harnesses) in the top row of Figure 3. In contrast, Designer D in the inspiration phase and Designer B in the concept generation phase exhibit patterns indicative of narrowing behaviours not befitting the inspiration finding or concept generation phases. Longer prompts that steadily increase in length, consistently high prompt similarity with little fluctuation, and extensive reliance on functions like Vary Region reflect a focused, iterative approach aimed at refining a specific design. This pattern aligns with Xie et al. (2023), who highlight greater “detail” in prompts as signifying more precise (narrowing) intentions. Patterns of increasing use of image prompts and the Vary Region function also signify narrowing behaviours in the concept development phase. However, patterns of increasing prompt length and consistently high consecutive prompt similarity do not hold as indicators of narrowing during this phase. Here, image prompts in conjunction with the Vary Region function led to relatively short and dissimilar text prompts generating highly similar images. As such, we claim an original contribution in identifying patterns in GenAI interactions that characterise the distinct behaviours of the designers studied during the three phases of the design task. We thus fulfil the study's aim of characterising the observable and distinct behaviours of designers in terms of GenAI interactions. Furthermore, analysis of GenAI interactions demonstrates that, despite receiving the same design education, designers do not necessarily use text-to-image GenAI uniformly.
A crucial limitation of our findings is that we can only claim to have identified interaction patterns that characterise certain designer behaviours observed in the data, rather than an exhaustive mapping of all possible behaviours or their varying intensities. More exhaustive and controlled studies could enable the identification of behavioural gradients within broad and narrow exploration patterns, offering deeper insights and metrics for how behaviours shift dynamically over the course of the design process. Such studies could also identify additional behaviours, such as design fixation. Likewise, they could provide deeper insights into how changes in interaction patterns correspond to different degrees of exploratory and convergent thinking. At the same time, we acknowledge that these metrics alone may not fully capture the complexity of design behaviours. As per extant research into the analysis of CAD interactions (Gopsill et al., 2016; Šklebar et al., 2024), we underscore the value of using quantitative interaction data alongside observational and interview-based methods to triangulate findings, ensuring a more comprehensive understanding of how designers engage with GenAI tools.
Demonstrating the use of GenAI interaction analysis as a proxy for analysing designers' behaviours is a significant contribution to the research community. This is because it enables a non-intrusive approach to collecting rich data on a topic of growing significance and interest to the design community. We also contend that identifying these patterns has important implications for the design community and AI developers by presenting analytics for best practices. Specifically, for designer-AI collaboration, it opens the opportunity to integrate feedback for designers on which operations to use and how to structure prompts that best reflect ideal design behaviours at different stages of the design process. For example, during the inspiration phase, the GenAI tool could suggest shorter prompts when a pattern of consistently long or similar prompts is detected. Similarly, avoiding certain detail-focused functions could be advised. This is significant because, while the perceived effort of using the AI is relatively low, the mode of interaction is novel and how it translates to a typical design process is not well understood by the design community. Indeed, our experiment shows very clearly that users can use the platform in an inefficient manner for the task at hand during the inspiration and concept generation phases. The narrowing behaviours of Designers B and D in the inspiration and concept generation phases, described in Sections 4.1 and 4.2, demonstrate this.
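To make this concrete, the sketch below shows one hypothetical form such feedback could take: a heuristic that monitors recent prompt length and consecutive similarity during exploratory phases. The thresholds are illustrative only and are not derived from our data.

```python
# Hypothetical feedback heuristic; thresholds are illustrative, not derived
# from the study's data.
def suggest_feedback(word_counts: list[int], similarities: list[float],
                     phase: str, window: int = 5) -> str | None:
    """Flag narrowing behaviour during phases meant to be exploratory.

    word_counts: word count of each prompt so far in the session.
    similarities: consecutive prompt similarity (0-100) between prompts.
    """
    if phase not in ("inspiration", "concept generation"):
        return None
    if len(word_counts) < window:
        return None  # not enough interactions yet to judge a pattern
    consistently_long = min(word_counts[-window:]) > 20
    consistently_similar = min(similarities[-(window - 1):]) > 70
    if consistently_long or consistently_similar:
        return ("Recent prompts are consistently long or highly similar. "
                "Consider shorter, more varied prompts to broaden exploration.")
    return None
```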
6. Conclusion
The paper reports a study exploring the extent to which user interaction data from text-to-image GenAI can describe patterns of designers' behaviours when engaging in inspiration finding, concept generation and concept development. We analysed 493 interactions from four participants using the Midjourney text-to-image GenAI tool. Findings reveal two design behaviours: exploratory and narrowing. We identify how shorter prompts with varied content correspond to exploratory behaviours. Conversely, increasing prompt length and similarity, combined with increased use of more precise variation functions (Vary Region) and image prompts, reflect a transition toward narrowing behaviours for focused design detail development. We found that the use of image prompts and the Vary Region function further indicate narrowing during concept development; however, text prompt data were less indicative of behaviours during this phase. As such, this article contributes an approach to utilise data from GenAI interactions to characterise and subsequently track designers' exploratory and narrowing behaviours when using text-to-image GenAI.
Acknowledgement
This research was conducted by the ARC Centre for Next-Gen Architectural Manufacturing and funded by the Australian government (ARC IC220100030).