1. Introduction
In recent years, the stellarator community has developed novel numerical optimization algorithms to quickly design stellarators that are quasisymmetric. Quasisymmetry (QS) is a property of the magnetic field, where the field strength on a given magnetic surface can be written as a linear combination of the Boozer angles (Boozer Reference Boozer1981a ) on the surface (Helander Reference Helander2014). It can be shown that such fields confine guiding centre orbits (Boozer Reference Boozer1981b ; Nührenberg & Zille Reference Nührenberg and Zille1988; Burby, Kallinikos & MacKay Reference Burby, Kallinikos and MacKay2020; Rodriguez, Helander & Bhattacharjee Reference Rodriguez, Helander and Bhattacharjee2020 Gradient-based optimization algorithms have facilitated the design of magnetic fields with the QS property (Landreman & Paul Reference Landreman and Paul2022), as well as the electromagnetic coils that generate them (Wechsung et al. Reference Wechsung, Landreman, Giuliani, Cerfon and Stadler2022b).
There is interest in the stellarator community to explore the landscape of QS stellarators. Examples of such surveys can be found in the literature, including a set of four stellarators with highly precise volumetric QS in Landreman & Paul (Reference Landreman and Paul2022), a similarly generated data set of more than 150 devices by Buller et al. (Reference Buller, Landreman, Kappel and Gaur2024) and one of over 30 generated exploiting adjoint-based algorithms by Nies et al. (Reference Nies, Paul, Panici, Hudson and Bhattacharjee2024). An even larger data set of 500 000 stellarators was presented in Landreman (Reference Landreman2022), which leverages a model of quasisymmetric stellarators near their magnetic axes (Garren & Boozer Reference Garren and Boozer1991; Landreman, Sengupta & Plunk Reference Landreman, Sengupta and Plunk2019; Landreman & Sengupta Reference Landreman and Sengupta2019; Rodríguez et al. Reference Rodríguez, Sengupta and Bhattacharjee2023). The devices in the data sets mentioned above all were released without the coil sets that generate them. For data sets that include their coils we must turn to the examples of over 40 stellarator coil sets studied by Kappel, Landreman & Malhotra (Reference Kappel, Landreman and Malhotra2024) or the work of Jorge, Giuliani & Loizu (Reference Jorge, Giuliani and Loizu2024), which includes a set of over 10 devices with simple and flexible coils.
A globalized coil design workflow was recently described in Giuliani (Reference Giuliani2024), and was used to generate a large database of quasiaxisymmetric (QA) stellarators and their associated coil sets. The database is called QUASR for ‘quasisymmetric stellarator repositor’ and can be explored online at quasr.flatironinstitute.org. In this work, we extend these algorithms to search for quasihelically (QH) symmetric devices and augment QUASR so that it now contains over 300 000 devices with associated coil sets. Using this large data set, we explore the landscape of stellarators that are generated by modular electromagnetic coils. To start, we present a few devices in QUASR and analyse their physics properties. We then apply a variety of techniques to visualize and analyse this large data set. The first uses the near-axis model and its landscape of excellent and poor QS (Rodriguez, Sengupta & Bhattacharjee Reference Rodriguez, Sengupta and Bhattacharjee2022b ; Rodríguez et al. Reference Rodríguez, Sengupta and Bhattacharjee2023), which we call the near-axis QS landscape, to frame some configurations in the data set. The second, and as an alternative, we employ a dimensionality reduction approach called principal component analysis (PCA) (Jolliffe Reference Jolliffe1990). This is a well-known technique frequently used in data science for dimensionality reduction and visualization (Bishop Reference Bishop2006). It enables the practitioner to project the high-dimensional data set onto a linear manifold, e.g. a two-dimensional plane or three-dimensional volume, to visualize relationships and possible formation of clusters. We compute two quantitative scores that measure how faithful the lower-dimensional representation is to the higher-dimensional data.
To summarize, the main goals of this paper are twofold: (i) we extend QUASR to include both QA and QH stellarators (§§ 2 and 3); and (ii) we use a couple of visualization techniques to explore the data set and gain insight on the devices in it (§§ 4 and 5).
2. The coil design workflow
Our goal is to find coil geometries and currents that produce magnetic fields with the QS property (either QA or QH). To achieve this, we use a globalized coil design workflow that comprises three phases, wrapped in a globalization algorithm. Since this workflow and its constituent algorithms have been described before (Eriksson et al. Reference Eriksson, Pearce, Gardner, Turner and Poloczek2019; Giuliani et al. Reference Giuliani, Wechsung, Cerfon, Stadler and Landreman2022a ,Reference Giuliani, Wechsung, Stadler, Cerfon and Landreman b ; Giuliani et al. Reference Giuliani, Wechsung, Cerfon, Landreman and Stadler2023, Giuliani Reference Giuliani2024), we only provide a brief overview here, summarized in figure 1. The first phase searches for stellarator coils that are quasisymmetric near their axis. This algorithm is robust, but may only find devices with nested flux surfaces on a small region in the neighbourhood of the magnetic axis. These devices are then provided as an initial guess to Phases II and III which heal generalized chaos, optimize for nested flux surfaces and polish for precise QS. The three phases are wrapped in a globalization algorithm (Eriksson et al. Reference Eriksson, Pearce, Gardner, Turner and Poloczek2019), ensuring that the objective landscape is sufficiently well-explored.

Figure 1. Globalized coil design workflow.
To proceed with the volumetric refinement of QS of Phases II and III, we must define some measure
$f_{\mathrm{QS}}$
for the field generated by the electromagnetic coils to minimize. To do so, we must take into account the two flavours of QS we are interested in: quasiaxisymmetry and quasihelical symmetry. The approaches for both symmetries are very similar, so we explain the process for quasiaxisymmetry, and then show how it can be extended to quasihelical symmetry.
Quasiaxisymmetry is characterized by
$\theta$
isolines of
$B(\varphi ,\theta ) = \left\| {{\boldsymbol{B}}({\boldsymbol{\unicode{x1D6F4}} _s}(\varphi ,\theta ))} \right\|$
, on magnetic flux surfaces
$\boldsymbol{\unicode{x1D6F4}} _s$
parametrized in Boozer coordinates
$(\varphi , \theta )$
(Boozer Reference Boozer1981a
). The field’s deviation from exact quasiaxisymmetry on the magnetic surface can be computed by evaluating

where

is a least squares projection of
$B(\varphi , \theta )$
onto the space of functions that do not depend on
$\varphi$
. These integrals can easily be evaluated using the quadrature points on the tensor product grid
$(\varphi _i, \theta_{\!j}) \in [0, 1/n_{\text{fp}}) \times [0, 1)$
, where
$\varphi _i = i/(N_{\varphi }n_{\text{fp}})$
and
$\theta_{\!j} = j/N_{\theta }$
for
$i=0, \cdots , N_{\varphi }-1$
,
$j = 0, \cdots , N_{\theta }-1$
and
$N_{\varphi }, N_{\theta }$
are the number of quadrature points in the toroidal and poloidal directions, respectively. These points correspond to the periodic trapezoidal rule, which is spectrally accurate (Trefethen & Weideman Reference Trefethen and Weideman2014). This measure of QS is similar to the commonly used sum over the symmetry breaking components of
$B$
(defined as
$f_B$
in Rodriguez, Paul & Bhattacharjee (Reference Rodriguez, Paul and Bhattacharjee2022a
)). To extend
$f_{\mathrm{QS}}$
to the QH symmetric case, we note that the difference is in the isolines of
$B(\varphi , \theta )$
, which now correspond to lines of constant
$\theta - Nn_{\text{fp}}\varphi$
, for
$N\in \mathbb{Z}_{\neq 0}$
. Most devices of interest have isolines with a slope exactly equal to
$\pm n_{\text{fp}}$
(Landreman Reference Landreman2022; Rodriguez et al. Reference Rodriguez, Sengupta and Bhattacharjee2022b
; Rodríguez et al. Reference Rodríguez, Sengupta and Bhattacharjee2023). The same formulae and quadrature points used for QA can also be used in the case of QH by applying (2.1)–(2.2) in a rotated frame.
To generate multiple devices through the phases described above, we scan over a number of physics and engineering target characteristics. In terms of physics properties, we target various values of mean rotational transform
$\iota$
, aspect ratio and total coil length, while keeping the major radius of 1 m. The engineering characteristics on the coils impose limits on their curvature and mean squared curvature, which may not exceed
$5$
and
$5\,\text{m}^{-2}$
, respectively. The coils are also designed to have uniform incremental arclength. We refer to these as engineering constraints. The goal of these parameter scans is to explore the space of quasisymmetric stellarators that can be generated by coil sets, and observe possible trade-offs between competing physics and engineering characteristics. Whenever ‘quality of QS’ for a device is provided in the manuscript, it is calculated by taking the square root of the average of (2.1) on a number of surfaces of the magnetic field, which we may refer to as
$\bar {f}_{\mathrm{QS}}$
. Note that by construction, the surfaces on which we optimize for QS in the final stage of the workflow are tangent the magnetic field at a fixed number of quadrature points on the surface. As a result,
$\boldsymbol{B} \boldsymbol{\cdot} \boldsymbol{n}$
error is numerically zero when queried at those same quadrature points. This is notably different from the approach in (Jorge et al. Reference Jorge, Goodman, Landreman, Rodrigues and Wechsung2023), where the
$\boldsymbol{B} \boldsymbol{\cdot} \boldsymbol{n}$
error is explicitly minimized.
The stellarator design problem amounts to a constrained optimization problem, where the ODE (Phase I) or PDE (Phases II and III) constraint is satisfied to numerical precision at each iteration of the optimization loop. This was not necessarily the case for the physics and engineering constraints. However, all the target physics and engineering constraints were satisfied to a precision of
$0.1\,\%$
by using a penalty method. After a fixed budget of iterations of the minimization algorithm, we increased by a factor of 10 the penalty weights associated with the constraints that were violated by more than this accuracy threshold. Then, with these new set of weights, the minimization algorithm is restarted. This ensures that these constraints are satisfied to the desired accuracy at the end of the optimization workflow. After a fixed number of restarts, devices that do not satisfy the engineering and physics constraints to
$0.1\,\%$
accuracy, even after the penalty reweighting, are discarded.
3. A few QH designs in QUASR
In this section, we present a few devices in the data set, summarized in table 1 and plotted in figure 2. The table also includes a link to a device manifest page, where additional physics and engineering information is provided. We estimate the collisionless guiding-centre (de Blank Reference de Blank2004) losses of alpha particles in vacuum magnetic generated the coils of these devices after scaling them up to the ARIES-CS (Najmabadi et al. Najmabadi, Reference Najmabadi2008) minor radius and spawning 5 000 α particles isotropically at half-radius in SIMSOPT (Landreman et al. Reference Landreman, Medasani, Wechsung, Giuliani, Jorge and Zhu2021) then tracing until 0.2 seconds. All devices have favourable particle loss estimates despite deviations from QS, most likely owing to the narrow orbit widths in QH devices (Landreman & Paul Reference Landreman and Paul2022; Paul et al. Reference Paul, Bhattacharjee, Landreman, Alex, Velasco and Nies2022; Buller et al. Reference Buller, Landreman, Kappel and Gaur2024). Except for the two field period device, all presented devices have reasonable elongation of flux surfaces in the
$RZ$
plane. Note that this measure of shaping (especially when large) can be somewhat misleading in devices with a large helical excursion of the magnetic axis (see § 5.2 for additional discussion on this). In such cases it is more instructive to measure the elongation in the plane perpendicular to the magnetic axis; for the
$n_{\text{fp}}=2$
device, the maximum elongation in such plane reduces to
$6.85$
, which is still significant. This quantity was numerically computed by fitting a near-axis model to the field (Appendix A), and providing the parameters to pyQSC (Landreman et al. Reference Landreman, Jorge, Rodriguez and Dudt2024).
Table 1. Characteristics of some QH devices from the QUASR data set. The maximum elongation is computed in the
$RZ$
plane. The final column of the table contains links to a device manifest page that contain more physics and engineering information about each design.


Figure 2. Notable QH devices and their coil sets in the QUASR.
The first device is a two field period QH field with a ‘crossing point’ with little space for coils, for which coils may nevertheless be found. Although it has the highest particle losses compared with the other devices presented here, the device is reminiscent of the original figure-eight stellarator design (Spitzer Reference Spitzer1958).
The
$n_{\text{fp}}=4$
device is quite similar to the Landreman–Paul precise QH configuration (Landreman & Paul Reference Landreman and Paul2022), but was discovered here independently with coils. Coils for the precise QH device have been presented in Wiedman, Buller & Landreman (Reference Wiedman, Buller and Landreman2024), and found using a stage II algorithm (Zhu et al. Reference Zhu, Hudson, Song and Wan2017). Both device
$\text{ID}=1630198$
and the Wiedman et al. (Reference Wiedman, Buller and Landreman2024) device use five coils per half-period and a total coil length of approximately 104 m, when scaled to a major radius of 1 m. All the coil sets in QUASR were designed to have a minimum coil–surface and coil–coil distance of
$0.1\, \text{m}$
, and maximum coil curvature and mean squared curvature of
$5\, \text{m}^{-1}$
and
$35\, \text{m}^{-2}$
, respectively. These were arbitrary choices, and (Wiedman et al. Reference Wiedman, Buller and Landreman2024) used different thresholds (some more strict and some less). Both designs though have remarkable estimated alpha particle losses (
$0.00\, \%$
) at reactor scale. Modular coils that wrap poloidally around the magnetic surfaces are just one coil topology and one might conceive of designs for the Landreman–Paul precise QH device with dipole (Kaptanoglu et al. Reference Kaptanoglu, Wiedman, Halpern, Hurwitz, Paul and Landreman2025) or helical (Kaptanoglu, Langlois & Landreman Reference Kaptanoglu, Langlois and Landreman2024) coils.
The
$n_{\mathrm{fp}}=6$
device shows an example of a configuration with a high number of field periods, which are configurations that have proved challenging to find and provide a set of coils for (Landreman Reference Landreman2022).
A note must be made on the coils presented for these configurations here. The filamentary coils sometimes appear quite long, which might not make the coil sets particularly practical. However, having such coils remains useful at least as an initial starting point for further refinement.
4. The near-axis QS landscape
The landscape of QS quality in terms of the coil degrees of freedom is complex and high-dimensional, which can be difficult to visualize. Using a near-axis framework, though, this landscape gains significant structure when parametrized in terms of the magnetic axis shape, owing to the key role played by topological properties of these (Rodriguez et al. Reference Rodriguez, Sengupta and Bhattacharjee2022b ). A representation of a representative subset of these axes described only by a few parameters (and thus lower dimensionality) can then provide a simple way to assess and visualize QS fields (Rodríguez et al. Reference Rodríguez, Sengupta and Bhattacharjee2023).
To construct such a space, we start by providing the elements necessary to characterize a near-axis quasisymmetric vacuum field (Landreman & Sengupta Reference Landreman and Sengupta2019; Rodríguez et al. Reference Rodríguez, Sengupta and Bhattacharjee2023). Such a field is defined by providing the geometry of the magnetic axis (customarily given as a set of Fourier harmonics
$\boldsymbol{R}, \boldsymbol{Z}$
describing the axis shape in cylindrical coordinates), and two scalars
$\overline \eta$
and
$B_{2c}$
. Given such inputs, one can evaluate a function
$\Delta B_{20}(\boldsymbol{R}, \boldsymbol{Z}, \overline \eta , B_{2c})$
which measures the magnitude of the symmetry breaking of
$B$
at second order in the distance from the magnetic axis (Landreman & Sengupta Reference Landreman and Sengupta2019; Rodríguez et al. Reference Rodríguez, Sengupta and Bhattacharjee2023). That is, if
$\Delta B_{20}=0$
, then the near-axis vacuum field considered has ‘perfect’ QS to second order. This scalar function of
$(\boldsymbol{R}, \boldsymbol{Z}, \overline \eta , B_{2c}) \in \mathbb{R}^{2N+1}$
variables is too high-dimensional for straightforward plotting. Using the approach in Rodríguez et al. (Reference Rodríguez, Sengupta and Bhattacharjee2023), one may nevertheless reduce the dimensionality in such a way that the critical structure of the problem is retained. This reduction of the space draws from the following logic.
-
(i) The main structural element in the field description is the axis shape, which will remain our free parameters. For visualization purposes, the axis geometry is restricted to a few harmonics (this is key only to visualization, but not to the concept itself), in this case
$\boldsymbol{R} = (1, R_{n_{\text{fp}}}, R_{2n_{\text{fp}}}, R_{3n_{\text{fp}}})$ , and
$\boldsymbol{Z} = (Z_{n_{\text{fp}}}, Z_{2n_{\text{fp}}}, Z_{3n_{\text{fp}}})$ .
-
(ii) For a given axis shape,
$\overline \eta = \overline {\eta }^{\dagger }(\boldsymbol{R}, \boldsymbol{Z})$ is chosen uniquely to roughly minimize elongation of surfaces, formally
(4.1)where\begin{align}{{\bar \eta }^\dagger } = \mathop{\text{argmax}}\limits_{\bar \eta } |\iota (\bar \eta ) - N|,\end{align}
$\iota$ is the on-axis rotational transform and
$N$ is the helicity (see Rodríguez et al. (Reference Rodríguez, Sengupta and Bhattacharjee2023) for further details and proofs). Changing the value of
$\bar {\eta }$ will directly affect elongation of surfaces as well as the rotational transform.
-
(iii) The
$\boldsymbol{Z}$ harmonics and
$B_{2c}$ are then optimized over to minimize
$\Delta B_{20}$ .
This results in a scalar function

which is only a function of three independent variables
$(R_{n_{\text{fp}}}, R_{2n_{\text{fp}}}, R_{3n_{\text{fp}}})$
, and it can easily be visualized by evaluating it on a three-dimensional tensor product grid using the pyQSC code (Landreman et al. Reference Landreman, Jorge, Rodriguez and Dudt2024). The minimization problem over the
$Z$
harmonics (
$\boldsymbol{Z}$
) uses local optimization algorithms, where the
$Z$
harmonics provided to the optimization are initialized to have
$\boldsymbol{Z} := \boldsymbol{R}$
. This results in magnetic axes with
$Z$
harmonics that scale with
$R$
as the optimization algorithm does not stray far from the initial guess. The resulting plots have structure thanks to the topological properties of the magnetic axis (Rodriguez et al. Reference Rodriguez, Sengupta and Bhattacharjee2022b
). This landscape can be divided into phases (or regions) where either QA and QH devices are found, figure 3, corresponding to axis shapes with different self-linking numbers (Rodriguez et al. Reference Rodriguez, Sengupta and Bhattacharjee2022b
; Moffatt & Ricca Reference Moffatt and Ricca1992; Fuller Jr Reference Fuller1999; Oberti & Ricca Reference Oberti and Ricca2016). At the origin, we have the QA phase, surrounded by QH phases. As the magnitude of the third harmonic increases (and thus the axis shaping increases), the QA region shrinks; there is a limit on how much an axis can be shaped before it gains a non-zero helicity and represent a QH device. The east and west QH regions have helicity
$n_{\text{fp}}$
, and the north and south regions have helicity
$2n_{\text{fp}}$
.

Figure 3. Panel (a) shows the landscape of quasisymmetric stellarators for
$n_{\text{fp}}=4$
, the red dot in (a i) corresponds to the ‘Helically Symmetric eXperiment’ (HSX) device. There are four QS phases delineated by poorly quasisymmetric (dark colour) devices. Devices optimized for QA and QH in QUASR plotted on top of the landscape with blue and red dots, respectively. Panels (b) and (c) correspond to
$n_{\text{fp}}=4$
and 5, respectively.
In Rodríguez et al. (Reference Rodríguez, Sengupta and Bhattacharjee2023), two branches of QH stellarators with favourable QS were observed: the HSX branch, as well as a new QH branch that had not yet been explored. The HSX branch is the one on which the HSX device lies, and the ‘new QH’ branch is the one below it (see figure 3a ). To visualize the devices in QUASR on this landscape, we fit the near-axis model to the magnetic fields generated by the coil sets. The algorithm to do this is described in Appendix A.
4.1. The QUASR on the near-axis QS landscape
Following the procedure in Appendix A, QUASR devices may be then approximated by near-axis fields (a form of reduced representation of the configuration) and represented in the reduced quasisymmetric phase-space discussed above. However, we must note that some of the simplifying assumptions behind such a reduced phase space plot in figure 3 do not necessarily hold for all the devices in QUASR.
First, note that the optimization that was used to design the devices in QUASR did not directly target the devices’ elongation. As a result, we are not guaranteed that
$\overline {\eta }^*$
will be close to
$\overline {\eta }^\dagger$
as the phase space construction assumes. Around 40 % of
$n_{\text{fp}}=4,5$
devices have
$|\overline {\eta }^*-\overline {\eta }^\dagger |/|\overline {\eta }^\dagger |\lt 0.1$
(roughly a change of
$\sim 20\,\%$
in elongation) and this increases to around 70 % when the upper bound is relaxed to 0.2. This illustrates that there is a non-negligible portion of devices in QUASR with possibly unrealistic elongations, which motivates the addition of constraints on surface elongation to the design workflow in figure 1.
Second, for illustration purposes, we focused on configurations whose axes had three dominant harmonics. A device is deemed to have
$N$
dominant Fourier harmonics if
${R_{k{\kern 1pt} {n_{{\rm{fp}}}}}}( {1 + {k^2}n_{{\rm{fp}}}^2} ),{Z_{k{\kern 1pt} {n_{{\rm{fp}}}}}}( {1 + {k^2}n_{{\rm{fp}}}^2} ) \lt 0.1$
for
$k\geq N+1$
. These conditions rely on rescaled Fourier harmonics that are dimensionless and naturally measure relevance of axis harmonics, introduced in Rodriguez et al. (Reference Rodriguez, Sengupta and Bhattacharjee2022b
). Around 50 % of devices have three dominant harmonics, and this number increases to around 70 % when four dominant harmonics are allowed. During the globalization of the near-axis phase of the workflow, the axis is described only using two Fourier harmonics. Then, the solution is subsequently polished and the number of harmonics used to describe the axis is increased. This continuation procedure may have influenced these statistics. Additionally, note that the choice to represent magnetic axes in cylindrical coordinates and Fourier harmonics might not be the optimal choice, and an alternative parametrization of the phase space could be better suited (even within the near-axis framework).
With the above in mind, we present on the phase space in figure 3 those devices in QUASR that satisfy
-
(i) the device’s
$\overline {\eta }^*$ has a relative error of at most 10 % with respect to
$\overline {\eta }^\dagger$ , i.e.
$|\overline {\eta }^*-\overline {\eta }^\dagger |/|\overline {\eta }^\dagger | \lt 0.1$ ;
-
(ii) the magnetic axis has only three dominant harmonics, i.e.
${R_{k{\kern 1pt} {n_{{\rm{fp}}}}}}( {1 + {k^2}n_{{\rm{fp}}}^2} ),{Z_{k{\kern 1pt} {n_{{\rm{fp}}}}}}( {1 + {k^2}n_{{\rm{fp}}}^2} )\lt0.1$ for
$k\geq 4$ .
It is also ensured that all plotted devices have
$R_{3n_{\text{fp}}}\gt 0$
. If this is not the case, then the device is rotated by a half-period, which flips the sign of the odd harmonics:
$R_{n_{\text{fp}}}, R_{3n_{\text{fp}}}$
. A device’s coordinate on the landscape is computed from the harmonics of its magnetic axis.
We observe that the QA devices lie as expected in the QA region. This QA region shrinks as the magnitude of the third harmonic increases, which corroborates the observation that QA devices typically do not have highly shaped magnetic axes. In the QH region, QUASR configurations appear to cluster around the HSX branch, noticeably leaving the ‘new QH’ branch, which at the time of publication the authors in Rodríguez et al. (Reference Rodríguez, Sengupta and Bhattacharjee2023) noted that had not yet been explored. This is so even when optimization efforts were launched directly from the new branch, cases in which optimization trajectories left the branch to find better QS elsewhere. Whether the struggle to find such configurations comes from the difficulty to find appropriate coils under the current engineering constraints, some intrinsic limitations on volumetric QS or the existence of a higher-dimensional path in the phase space towards better QS, remains a subject of study.
While this comparison with the near-axis model is interesting and insightful, this approach has some important drawbacks. The landscape plots we obtain here are restricted to two or three dimensions before the visualizations become laborious. Despite this, it might still be possible to measure a device’s distance to a favourable branch in higher dimensions, permitting some additional higher-dimensional insights. A better representation of the axes could also be fitting. In addition, the filters on
$|\overline {\eta }^*-\overline {\eta }^\dagger |/|\overline {\eta }^\dagger |$
may also be unnecessarily restrictive, although as we have argued, this filter is not completely artificial if elongation of flux surfaces is taken into consideration. In the following section, we will use a different approach that applies to the entire data set.
5. Principal component analysis
Principal component analysis is a dimensionality reduction technique that projects
$D$
-dimensional data points
$\boldsymbol{x}_i \in \mathbb R^D$
onto a lower-dimensional manifold (Bishop Reference Bishop2006). The vector
$\boldsymbol{x}_i$
is also called a feature vector. For the following explanation, we focus on reducing the dimensionality of the stellarators to two-dimensional points on the plane, though one is free to retain as many dimensions as desired.
The PCA finds a plane that lies closest to the data points, and then projects the data points on this plane, thereby reducing its dimensionality. Any projected point can be expressed as
${\alpha _1}{{\boldsymbol{d}}_1} + {\alpha _2}{{\boldsymbol{d}}_2} + \boldsymbol{\gamma}$
, for some
$\alpha _1, \alpha _2 \in \mathbb R$
, where
$\boldsymbol{d}_1$
and
$\boldsymbol{d}_2$
are orthonormal. The vectors
$\boldsymbol{d}_1, \boldsymbol{d}_2$
and
$\boldsymbol{\gamma}$
are chosen to minimize the error,

where
$\tilde {\boldsymbol{x}}_i$
is the projection of the original point
$\boldsymbol{x}_i$
onto the plane of the form

From the orthonormality of
$\boldsymbol{d}_1$
and
$\boldsymbol{d}_2$
, it can be shown that
$\alpha _{1,i} = \boldsymbol{d}_1 \boldsymbol{\cdot} ( \boldsymbol{x}_i-\boldsymbol{\gamma} )$
and
$\alpha _{2,i} = \boldsymbol{d}_1 \boldsymbol{\cdot} ( \boldsymbol{x}_i-\boldsymbol{\gamma} )$
, and the optimal choice of
$\boldsymbol{\gamma}$
is the mean of the data, i.e.
$\boldsymbol{\gamma} ^* =\overline {\boldsymbol{x}}= {1}/{N}\sum _{i = 1}^N\boldsymbol{x}_i$
. The Karush–Kuhn–Tucker conditions of the minimization problem (Bishop Reference Bishop2006) in (5.1) reveal that the optimal choice of vectors
$\boldsymbol{d}^*_1, \boldsymbol{d}^*_2$
are given by the first two eigenvectors of the covariance matrix

The vectors
$\boldsymbol{d}^*_1, \boldsymbol{d}^*_2$
are called, respectively, the first and second principal components. For this choice of
$\boldsymbol{\gamma} ^*, \boldsymbol{d}^*_1, \boldsymbol{d}^*_2$
, the reconstruction error is

which is the sum of the
$D-2$
remaining eigenvalues of the covariance matrix. The ratios
$\lambda _1/(\lambda _1 + \cdots + \lambda _D)$
and
$\lambda _2/(\lambda _1 + \cdots + \lambda _D)$
measure by how much the first two principal components reduced the reconstruction error. If the sum of these two quantities is close to unity, then little information was lost in the dimensionality reduction. One is free to include as many principal components as desired, but for simplicity of plotting and ease of understanding, we restrict ourselves to two or three principal components.

Figure 4. Data
$\boldsymbol{x}_i \in \mathbb R^3$
from three Gaussians are shown in (a), and the optimal projection
$\tilde {\boldsymbol{x}}_i$
onto a two-dimensional plane is shown in (b). The transparent ellipsoids correspond to the 95 % confidence interval of each Gaussian.
Once a PCA is completed, each high-dimensional data point
$\boldsymbol{x}_i$
will have a corresponding two-dimensional coordinate
$(\boldsymbol{d}^*_1 \boldsymbol{\cdot} ( \boldsymbol{x}_i-\boldsymbol{\gamma} ^*), \,\boldsymbol{d}^*_2 \boldsymbol{\cdot} ( \boldsymbol{x}_i-\boldsymbol{\gamma} ^*)) \in \mathbb R^2$
. We illustrate this procedure with a three-dimensional set of data points
$\boldsymbol{x}_{i} \in \mathbb R^3$
in figure 4(a), projected down to a two-dimensional plane on the right. Projected points far away from each other on the plane will also be far away from each other in the higher dimensional space. In other words, dissimilar devices in the low-dimensional representation are dissimilar in higher dimensions. Note that the converse of this statement is not necessarily true.
The advantages of PCA is that it is an unsupervised technique that does not require labels or prior knowledge about the data. It identifies important features in a low-dimensional subspace that describe the data set and permit its visualization. By doing so, it reduces the complexity and noise in the data and gives an idea of its intrinsic dimensionality, which is revealed by examining the eigenvalues of the covariance matrix. The PCA is often used as a preprocessing step for machine learning and optimization algorithms to expose the most important components of the data (Kobak & Berens Reference Kobak and Berens2019). It also does not have any parameters to adjust, which makes it straightforward to apply to a data set.
However, PCA can also have disadvantages. By reducing the dimensionality of the data, we may lose some important information. The transformation to low-dimensional space may distort the geometric relationships that existed in the original space. In addition, PCA is useful when the data are well approximated on a linear manifold of dimension less than
$D$
, which is certainly not always the case. For example, consider a data set that lies on a three-dimensional helix. The intrinsic dimension of the data are one, since the helix can be unravelled to parameterize any point on it by a single coordinate. But applying PCA to the data set distorts it, since its topology is ill-approximated on a linear manifold of dimension less than three. In cases like this, nonlinear dimensionality reduction techniques can be useful, and the literature on these approaches is rich. For example, isomap (Tenenbaum, Silva & Langford Reference Tenenbaum, Silva and Langford2000), and t-distributed stochastic neighbour embedding (Van der Maaten & Hinton, Reference Van der Maaten and Hinton2008) are all well-known algorithms to accomplish this task. The disadvantage of nonlinear techniques is that they may have user-defined tuning parameters, which can be difficult to select. Since the underlying topology of our data set is not known, and to avoid the difficulties associated with nonlinear techniques, we use PCA in this work. Auto encoders have also been proposed as interesting methods for compression or dimensionality reduction. Bourlard & Kamp (Reference Bourlard and Kamp1988) showed that auto encoders perform a similar operation to singular value decomposition or low rank matrix approximation. Since PCA is a type of low rank matrix approximation, this is an interesting connection to what is done in this work.
In light of the above discussion, we use a global and local measure of accuracy to evaluate how faithful the data in lower dimensions is to the original.
-
(i) The cumulative principal component (PC) ratios. As mentioned above, this score quantifies how much information was lost in the projection onto the lower-dimensional linear manifold, corresponding to the sum of the PC ratios:
$\lambda _1/ (\lambda _1 + \cdots + \lambda _{D})$ when projecting onto a one-dimensional manifold,
$(\lambda _1 + \lambda _2 )/ (\lambda _1 + \cdots + \lambda _{D})$ when projecting onto two-dimensional manifold, and
$(\lambda _1 + \lambda _2 + \lambda _3)/ (\lambda _1 + \cdots + \lambda _{D})$ when projecting onto a three-dimensional manifold. The closer this value is to unity, the less information is lost in the projection. If more PCs are included, then the corresponding eigenvalues are added to the numerator. This is a global measure of accuracy.
-
(ii) The k-nearest neighbours (KNN) fraction. The fraction of KNN in the high-dimensional data set that remain in the two-dimensional projected data points (Lee & Verleysen Reference Lee and Verleysen2009; Kobak & Berens Reference Kobak and Berens2019). The purpose of this quantity is to measure how accurately local relationships are preserved after dimensionality reduction. Note that for very large
$k$ , this fraction approaches unity, so some reasonable choice of
$k$ has to be picked. This is a local measure of accuracy.
We present these at work in a simple example of Gaussian processes in Appendix B. In what follows, we define how we represent the different devices in the QUASR dataset, and then analyse them using the tools above.
5.1. Feature vectors
In order to apply PCA to our set of stellarators, we must define how a device’s feature vector is constructed. A device’s feature vector is defined to have
$D=663$
entries, where each entry is a Fourier harmonic of a stellarator symmetric surface in the device’s vacuum magnetic field, using
$m_{\text{pol}}, n_{\text{tor}}=10$
, where
$m_{\text{pol}}, n_{\text{tor}}$
are, respectively, the number of poloidal and toroidal harmonics used to represent the surface geometry. We use the special surface representation (Giuliani et al. Reference Giuliani, Wechsung, Stadler, Cerfon and Landreman2022b
):



where


and

Since we are only working with stellarator symmetric devices, the symmetry-breaking basis functions in the sums of (5.5)–(5.9) are skipped. This representation uses three truncated Fourier series to represent the Cartesian coordinates of the surface:
$(x(\varphi , \theta ), y(\varphi , \theta ), z(\varphi , \theta ))$
. This is in contrast to ‘Variational Moments Equilibrium Code’ (VMEC) (Hirshman & Whitson Reference Hirshman and Whitson1983a
) and DESC (Panici et al. Reference Panici, Conlin, Dudt, Unalmis and Kolemen2023), which use a cylindrical representation
$(R(\phi , \theta ), \phi , Z(\phi , \theta ))$
, with
$\phi$
the standard cylindrical toroidal angle.
The feature vector associated with a device contains all the Fourier harmonics
$\hat x_{i, j}, \hat y_{i, j}$
and
$\hat {z}_{i, j}$
associated with the surface, without any additional normalization. We emphasize this lack of normalization because in the context of PCA it is usually recommended to normalize the data across samples. One common normalization is to rescale each component of the feature vector to a Gaussian with mean zero and covariance one to ensure no feature dominates the analysis due to it being much larger. However, this kind of normalization only makes sense if the different components of the feature vector are not meaningfully related to each other and their values have different orders of magnitude because they come from inherently different quantities or phenomena. In our case, the feature vectors contain Fourier coefficients, which decay as the harmonic increases. Therefore, we chose to forgo the normalization of different Fourier components across the devices because the values of the Fourier coefficients have a very important meaning: they are related to each other to describe a single surface. For the vacuum field devices in QUASR, the magnetic field of the device is fully specified by the geometry of a single magnetic surface, regardless of how this is parametrized. Additional parameters would be required in the feature vector for devices with plasma pressure, and normalization might be indicated there. This leaves some non-uniqueness in our chosen representation, as different angle choices for the defining surface can lead to different Fourier coefficients defining the same magnetic surface geometry. To remove this indeterminacy, we choose the angles on the surface
$(\varphi , \theta )$
to correspond to Boozer coordinates. Even then, there remains certain non-uniqueness; for example, the same configuration rotated by a half-period about the
$Z$
-axis, or reflected about the
$XY$
plane would correspond to different feature vectors. Thus, to ensure that geometrically similar devices are close to each other in Fourier space, in addition to
-
(i) making the toroidal and poloidal angles on the surface Boozer angles
$(\varphi , \theta )$ ,
we check that the magnetic axis in cylindrical coordinates
$\Gamma (\phi ) = (R(\phi ), \phi , Z(\phi ))$
satisfies
-
(ii)
$Z'(0) \geq 0$ ,
-
(iii)
$R(0) \geq R(\pi /n_{\text{fp}})$
and that the surface
$\Sigma (\varphi , \theta ) = (x(\varphi , \theta ), y(\varphi , \theta ), z(\varphi , \theta ))$
in Boozer coordinates wraps toriodally and poloidally such that
-
(iv)
$x(0, 0) \geq x(0, \pi )$ ,
-
(v)
$(({\partial z})/({\partial \theta }))(0, 0) \leq 0$ ,
-
(vi)
$(({\partial y})/({\partial \varphi }))(0, 0) \geq 0$ .
These conditions are illustrated in figure 5. The second and third conditions break the freedom about the stellarator symmetric points. The final three conditions ensure that the origin of the surface
$(\varphi =0, \theta =0)$
lies on the outboard side of the stellarator, that the parametrization wraps poloidally clockwise if looking in the
$+Y$
direction at the curve given by
$\Sigma (\theta , 0)$
, and that increasing
$\varphi$
means going anticlockwise about the
$Z$
-axis if looking in the
$-Z$
direction. If these conditions are not satisfied, we reflect the device about the
$XY$
plane, rotate it by a half-period about the
$Z$
-axis, or reparametrize the surface so that it wraps in the correct direction. The direction of the inequalities in conditions (ii)–(vi) were arbitrarily chosen.

Figure 5. Conditions that the surface parametrization must satisfy so that the feature vector of Fourier coefficients is unique. Panel (a) illustrates how the surface coordinates satisfy
$x(0, 0) \geq x(0, \pi )$
,
$(({\partial z})/({\partial \theta }))(0, 0) \geq 0$
,
$(({\partial y})/({\partial \varphi }))(0, 0) \geq 0$
, and that the
$Z$
-coordinate of the magnetic axis (in red) satisfies
$Z'(\phi ) \geq 0$
. Panel (b) is a view in the
$-Z$
direction onto the
$XY$
plane, showing that the radius of the magnetic axis satisfies
$R(0) \geq R(\pi /n_{\text{fp}})$
. The grid lines on the magnetic surface correspond to lines of constant toroidal and poloidal Boozer angles.
A more compact feature vector could have been generated by converting the representation in (5.5)–(5.9) to a cylindrical one, thereby reducing the number of Fourier series from three to two. We do not do so here because the coil design workflow relies on the Cartesian representation of the surfaces. Although Cartesian coordinates are used in the coil design workflow, this does not mean any field shape is possible. Note that in Phase I (the near-axis phase of the coil design) it is assumed that the magnetic axis can be represented in cylindrical coordinates. As a result, we will not find devices with more general magnetic axis geometries, e.g. devices where the axis is anywhere vertical, or knotted. Generalizing our implementation is the subject of future work.
In addition, this conversion is not lossless when restricted to a finite dimensional Fourier series, as explained in § 5.2. Other forms of parametrization could also be possible. It is tempting, for example, to represent each configuration by its second-order near-axis model, which constitutes a significantly reduced space. Although this could be interesting, it does not correspond to a one-to-one map, and thus should not be used as a unique representation. This is because, a priori, one could devise an infinity of distinct fields, all of which correspond to the same second-order near-axis construction. Another alternative could be to use a vector of Fourier harmonics associated with the coils and their currents. We did not do this due to the inherent non-uniqueness of coil sets, i.e. multiple, vastly different coil sets can produce similar magnetic fields. That being said, we have not fully explored these directions, and it might be interesting to do this in the future.

Figure 6. The PCA of QH symmetric devices in QUASR, along with the new QH device indicated with red text. At the centre of each dashed square, we take the associated surface DOFs and generate cross-sections to illustrate the geometric diversity of devices. This type of cross-section diagram is common in morphometrics literature (Bonhomme et al. Reference Bonhomme, Picq, Gaucherel and Claude2014). The red dashed lines correspond to projections of one-dimensional PCAs of the left- and right-hand clusters onto the two-dimensional manifold. The colour on the magnetic surfaces corresponds to the local field strength. The quality of QS is calculated by taking the square root of the average of (2.1) on a number of surfaces of the magnetic field.
When presenting QUASR devices in what follows, we will do so alongside quasisymmetric devices available in the literature (see details in Appendix C).
5.2. Quasihelical symmetry (
$n_{\text{fp}}=4, \iota =2.3$
)
To start, we look at
$n_{\text{fp}}=4$
devices with a mean rotational transform of 2.3 in figure 6. We observe that there are two distinct clusters of devices. Within these clusters, the devices form a continuum with varying qualities of QS.
One may question how faithful this two-dimensional plot is to the true high-dimensional data set. A hint that this plot is an accurate representation of the relationships in the data are the sum of the percentages in the
$x$
and
$y$
axis labels is large. This is evidence that including a third dimension is likely unnecessary to understand how devices relate to each other, which can be confirmed via a three-dimensional PCA. Using 10 nearest neighbours in the KNN calculation, the average KNN fraction over all data points is 0.689. This indicates that on average, just under 70 % of each node’s 10 nearest neighbours in the high-dimensional representation are preserved on the lower-dimensional manifold. These projection error measures are comparable to the dimensionality reduction example of data sampled from simple Gaussians, in the introduction of Appendix B.

Figure 7. Cross-sections of a device from cluster 2 in figure 6. The section is computed in the
$RZ$
plane (red) and the plane orthogonal the magnetic axis (blue). The magnetic axis is drawn in black. Both planes pass through the same point on the magnetic axis.
Each point on the landscape plot corresponds to a specific vacuum magnetic field, with associated magnetic surfaces. There are box-shaped regions delineated by dashed lines on the plot. Taking the point at the centre of each box, we compute cross-sections of the associated magnetic surface. The cross-sections are plotted in black, red and green, associated with the cylindrical angle
$\phi =0, \pi /(4n_{\text{fp}}), \pi /(2n_{\text{fp}})$
, respectively. This figure makes it clear how devices differ from each other on the plane, and importantly how moving along the principal components affects the geometry of the device. Moving in the positive direction associated with
$\boldsymbol{d}^*_1$
, the elongation appears to increase, while going in the negative direction the elongation decreases. In the positive
$\boldsymbol{d}^*_2$
direction, the relative positions of the cross-sections move, and in particular the axis excursion increases. These cross-sections do not necessarily correspond to physically relevant devices, as illustrated by the leftmost column, where the cross-sections are self-intersecting. This is not surprising, as the devices lie on a nonlinear manifold, and the PCA slices through it with a linear one. If one strays too far away from the origin of the plot (the sample mean), the approximation errors become noticeable and unphysical devices appear.
Within the space populated by the QUASR devices, we find that converting the highly shaped surfaces, e.g. the device in figure 7, to a cylindrical representation, as is used in the more standard equilibrium solvers VMEC or DESC, requires more than
$m_{\text{pol}},n_{\text{tor}}=20$
Fourier modes. This is significantly larger than the number of modes used to design the stellarator (
$m_{\text{pol}},n_{\text{tor}}=10$
) with the representation in (5.5)–(5.9). Figure 7 illustrates the geometrical origin of this difficulty, where due to the large helical excursion of the magnetic axis, cross-sections in the
$RZ$
plane become highly elongated, leading to a misleading interpretation of the elongation of the device. This occurrence is particularly true for the devices on the right-hand branch, in the bottom quadrant, where standard equilibrium solvers such as VMEC experience difficulties. These problems are probably shared by DESC (Panici et al. Reference Panici, Conlin, Dudt, Unalmis and Kolemen2023) which uses the same cylindrical representation, but could perhaps be alleviated in the Frenet version of the GVEC code (Hindenlang, Plunk & Maj Reference Hindenlang, Plunk and Maj2024).
As a reference for this set, we also provide the location of the NEW_QH device from § 4 and observe again that there are no devices in this subset of QUASR that are close to it. The NEW_QH device appears closest to the low elongation branch, in the vicinity of devices with excellent QS, but is not contained in QUASR.

Figure 8. Continuum of devices in QUASR. Panels (a) and (b) correspond to the left- and right-hand cluster in figure 6, respectively. The numbers below these devices are their corresponding principal component values, and QS errors. The colour on the magnetic surfaces corresponds to the local field strength.
5.2.1. Visualizing the continuum of devices in each cluster
We observe that the two clusters appear to lie on separate one-dimensional linear manifolds. Subselecting all the devices in the first cluster where
$\text{PC1} \lt 0$
, we do a second PCA on the full feature vector with
$D=663$
entries, and retaining only a single principal component. It appears that the devices are well approximated by a one-dimensional linear manifold, because
${\lambda _1}/({\lambda _1} + \cdots + {\lambda _{663}}) = 0.916$
. The same thing can be done for devices in the second cluster, i.e. where
$\text{PC1}\gt 0$
. Again, the devices here are well approximated by a one-dimensional linear manifold because
$\lambda _1 / (\lambda _1 + \cdots + \lambda _{663}) = 0.876$
. After these separate one-dimensional PCAs, each device from the first cluster now has a single coordinate called PC, which lies on the interval
$[\min (\text{PC}), \max (\text{PC})]$
. We can easily visualize the progression of the device geometries along these one-dimensional manifold by sampling devices uniformly according to the principal component label, i.e. devices with a principal component label that is approximately
$\min (\text{PC1}) + i(\max (\text{PC})-\min (\text{PC}))/N$
for
$i = 0, \cdots , N$
. In figure 6, we have projected these one-dimensional manifolds onto the plane, indicated by two red dashed lines. The progression of devices along these lines in both clusters is shown in figure 8, where the transition between the endpoint geometries appears smooth. This is evidence that the devices in each cluster in the full
$D=663$
dimensional feature space lie on simple one-dimensional continua.
The difference between the two clusters can in this way also be made clearer. We said before that
$\boldsymbol{d}^*_1$
appeared to control the elongation of cross-sections, with cluster 2 characterized by an increased value for it. However, from figure 8, there is a key geometric difference between the clusters that explains this observation and which has to do with the difference in twisting of the devices. At the sharpest corners in the
$RZ$
projection, cluster 1 devices have a magnetic axis mainly lying on the plane, while cluster 2 devices have their axes coming off the plane. This difference in bending and twisting is key in controlling rotational transform and other properties, and is worth further exploration in the future.
5.3. Quasihelical symmetry (
$n_{\text{fp}}=4, \iota =1.1$
)
We look at another region of the QH design space that has been explored in the literature (
$n_{\text{fp}}=4$
and
$\iota =1.1$
) in figure 9, where there are two clusters of devices once again, in this case with the largest centred at the origin, the second lying above it. The HSX, WISTELL-A and NIES_AR = 9 devices are given within the main cluster for reference. There is in this case a large disparity between clusters, with the top one being scarcely populated.

Figure 9. Panel (a) contains a PCA of QH symmetric devices in QUASR when
$\iota =1.1$
and
$n_{\text{fp}}=4$
, where devices in QUASR with two dominant PC components are plotted. Panels (b) and (c) show, respectively, the two-term QS error from Helander & Simakov (Reference Helander and Simakov2008) and Landreman & Paul (Reference Landreman and Paul2022) computed using VMEC and relative
$\iota$
constraint violation (
$|\iota -1.1|/1.1$
) on the PC subspace. The black isolines correspond to coordinates where the constraint violation is precisely 5 % and the black dots are devices in panel (a) that have only two dominant principal components. The region on the PC plane where the two-term QS error could not be evaluated or VMEC did not converge, notably on the bottom left-hand side of the panels, is hatched.
To try to make sense of this particular clustering of configurations, we consider constructing and characterizing vacuum magnetic fields for the whole PC plane. Each pair of coordinates in such a plane corresponds to one particular stellarator geometry, whose equilibrium we may then solve using VMEC. This allows us to understand the appearance of the cluster, as well as opening the door to analysing other configuration properties within such a space. We present a measure of QS quality over this plane in figure 9(b) (in particular a plot of the two-term QS residual from (Helander & Simakov Reference Helander and Simakov2008; Landreman & Paul Reference Landreman and Paul2022), also
$f_C$
in Rodriguez et al. Reference Rodriguez, Paul and Bhattacharjee2022a
), and overlay as black dots the QUASR devices in the clusters above with two dominant PC components. As expected, two agglomerations of devices in QUASR lie in local QS minima, providing external validation that the devices in QUASR were properly optimized for QS. However, it fails to properly describe the separation into the two clusters, for which one must consider the additional design constraint that the average rotational transform is
$\iota =1.1$
. Figure 9(c) shows the relative violation of this constraint of rotational transform on the plane; i.e.
$|\iota -1.1|/1.1$
. It appears that there are two separate manifolds where the
$\iota$
design constraint is accurately satisfied. Since the devices in QUASR were designed to satisfy these constraints to within
$0.1\,\%$
accuracy, the configuration clusters lie in the intersection of these delineated regions and figure 9(b). This is not exact, though, especially because in reality devices have more than only two principal components, and are not located precisely on the PC1–PC2 plane.
The topology of the data appears thus to be governed by the interplay of the rotational transform design constraint and the valleys of the QS objective. This perspective on the origin of the clusters also explains why the top cluster is more scarcely populated, as it has less accurate QS than the bottom one and hence corresponds to a shallower optimization well. This phenomenon could also explain the clusters in § 5.2, but VMEC failing in a large portion of the PC1–PC2 plane prevents any conclusion.
5.4. Quasihelical symmetry (
$n_{\text{fp}}=3, \iota =1.1$
)
In figure 10(a), we plot the landscape of devices for a lower rotational transform
$\iota =1.1$
and
$n_{\text{fp}}=3$
, where it appears that little information was lost in the projection. We find the devices in QUASR here are close to an unpublished device from (Kappel et al. Reference Kappel, Landreman and Malhotra2024). The KNN fraction is quite high, 0.778. In contrast to the PCA in § 5.2, both principal components appear to modify the devices’ elongation and the positioning of the cross-sections. We note that here, there does not appear to be two separate clusters of devices and the devices form a V-shaped continuum. Figure 10(b, c) contain the two-term QS error and
$\iota$
constraint violation. The QUASR devices appear to follow the QS well throughout the plane (even suggesting the V-shape), but this clearly clashes with the rotational transform constraint. It does not in the lower portion of the cluster, which explains the large number of field cases found there. But it does in the middle portion.

Figure 10. Panel (a) contains a PCA of QH symmetric devices in QUASR, along with an unpublished device from (Kappel et al. Reference Kappel, Landreman and Malhotra2024), indicated with red text. The colour on the magnetic surfaces corresponds to the local field strength. Panels (b) and (c) show, respectively, the two-term QS error from Helander & Simakov (Reference Helander and Simakov2008) and Landreman & Paul (Reference Landreman and Paul2022) computed using VMEC and relative
$\iota$
constraint violation (
$|\iota -1.1|/1.1$
) on the PC subspace. The black isolines correspond to coordinates where the constraint violation is precisely 4 % and the black dots are devices in panel (a) that have only two dominant principal components. The region on the PC plane where the two-term QS error could not be evaluated or VMEC did not converge is hatched.
Behind this apparent contradiction lies an important observation we emphasized before. Figure 10(b, c) correspond to configurations with only two principal components. But the QUASR devices truly live in a higher dimensional space. In particular, the configurations in the elbow of the V-shape cluster do, which are in fact the devices that do not fit with the interpretation of figure 10(b, c).
We may nevertheless attempt to describe the cluster in this scenario as a one-dimensional continuum. However, a PCA is inappropriate here as clearly the manifold is nonlinear. Instead, we use the isomap algorithm (Tenenbaum et al. Reference Tenenbaum, Silva and Langford2000) to visualize the progression of the device geometries along the manifold. The isomap algorithm works by generating a connectivity graph between data points using the
$k$
-nearest neighbours of each node. Then, pairwise distances between points are computed by traversing the connectivity graph. The multidimensional scaling algorithm (Kruskal Reference Kruskal1964) attempts to find a lower dimensional representation that preserves these pairwise distances. Our aim is to find a one-dimensional labelling of the devices so that the devices can be ordered, similar to § 5.2. Applying the isomap algorithm on the full
$D=663$
dimensional data points, we obtain the embedding shown in figure 11, i.e. each device is given a one-dimensional coordinate. Uniformly sampling this coordinate, the progression of devices in the bottom row of the figure is obtained.

Figure 11. The nonlinear manifold in figure 10 can be parametrized with a single coordinate using the isomap algorithm. The row of devices are uniformly sampled from the isomap embedding, and the numbers below these devices corresponds to the isomap coordinate and the quality of QS. The colour on the magnetic surfaces corresponds to the local field strength.
5.5. Quasihelical symmetry (
$n_{\text{fp}}=5, \iota =2.5$
)

Figure 12. The PCA of QH symmetric devices in QUASR when
$\iota =2.5$
and
$n_{\text{fp}}=5$
. Two clusters are labelled and devices from them are plotted. The colour on the magnetic surfaces corresponds to the local field strength.
In this section, we show the two-dimensional landscape of devices with a rotational transform
$\iota =2.50$
and
$n_{\text{fp}}=5$
in figure 12. The first principal component appears to shift the relative positions of the cross-sections, while the second principal component appears to act on the elongation of the device (similar to the
$n_{\mathrm{fp}}=4$
before). There appears to be two separate clusters of devices, where the second cluster contains the most quasisymmetric devices (again there seems to be a parallel to the
$n_{\mathrm{fp}}=4$
scenario, even on the geometric behaviour within and between clusters).
However, a non-negligible amount of information was lost in this projection, as the sum of the PC ratios is just over 80 %. Similarly, the KNN fraction is just over 0.45. To provide a more accurate representation of the data, we must augment figure 12 with a third principal component in figure 13. The two-dimensional projections of the data on the PC1–PC2, PC1–PC3 and PC2–PC3 planes are also provided to make the three-dimensional nature of the data more easily understood. Our projection error measures immediately improve as the sum of the PC ratios is approximately 90 %, and the KNN fraction increases to 0.678.
The third PC component evidences that the clusters are rather different: the first cluster is a three-dimensional cloud, while the second cluster appears to have very little variation in the third PC. The higher dimensionality of cluster 1 can be quantified by observing the cumulative PC ratio as the number of principal components used to represent cluster 1 and 2 in figure 13(a). It takes three PCs for the cumulative PC ratio of cluster 1 to pass 0.9, while it takes only two for cluster 2.

Figure 13. In panel (a), the cumulative PC ratio with increasing number of principal components used to represent the two clusters from figure 12. In panel (b), PCA of QH symmetric devices in QUASR when
$\iota =2.5$
and
$n_{\text{fp}}=5$
, similar to figure 12 but with a third principal component. The two-dimensional projections of the data on the PC1–PC2, PC1–PC3 and PC2–PC3 planes are also provided in black to make the three-dimensional nature of the data more easily understood.

Figure 14. The PCA of QA devices in QUASR along with devices in the literature, indicated with red text.
5.6. Quasiaxisymmetry
For completeness, in figure 14(a), we complete a PCA on QA devices in QUASR with
$n_{\text{fp}}=2$
and
$\iota =0.4$
. The cumulative PC ratio is just over
$80\,\%$
, revealing that a non-negligible amount of information is lost in this projection, and it might be useful to include a third principal component in the visualization. Given the two principal axes, we project devices outside QUASR onto the figure and label them. This area of the parameter space is well explored. In figure 14, we do a PCA for QA devices, subselected based on their rotational transform (
$\iota = 0.6$
) and number of field periods (
$n_{\text{fp}}=2,3$
). Labelled devices from the literature appear similar to the devices in QUASR. We do not provide cross-sections of hypothetical devices on the linear manifold as we did in previous sections because, visually, the cross-sections do not differ significantly from each other.
The analysis from § 4 reveals that the region of QA devices with favourable QS is small, and, moreover, it shrinks with increasing shaping of the device. This bears out in the PCA analysis here too, as we find that there is not much geometric diversity of the devices at the aspect ratio studied. Notably, the Landreman–Paul QA device (PRECISE_QA) is close on the plane to the device called GIU_QA. This device was obtained by providing the PRECISE_QA device as an initial guess to a direct coil optimization algorithm. The quality of QS in these two devices differs by almost an order of magnitude, despite the geometrical proximity on the plot.
6. Discussion and conclusion
The main goal of this paper was to introduce a new data set of optimized stellarators, and provide tools to navigate the contents of it. We start by highlighting a few devices in QUASR first, and then move to analysing the data-set. First, we frame QUASR within the near-axis framework and its predicted landscape of quasisymmetric configurations. We find close correspondence of devices in QUASR to regions of favourable near-axis QS. However, this approach does not apply to the full set of devices in QUASR given the difference in assumptions that went into their construction, most notably constraints on elongation and simplicity of the shaping. A more model agnostic procedure to analyse the data are then presented in the form of a PCA. On the subsets of the data set that we have explored in this work, we find that the first and second principal components act predominantly on the elongation (and different field twisting), and relative positions of cross-sections of the magnetic surfaces. These two dimensions are able to capture the devices in the data set to a large degree. For some subsets of the data, we observe clear clusters. Within the clusters, we find evidence that the devices form a continuum whose shape follows that of the design constraint manifold.

Figure 15. The cumulative PC ratio of subsets of the data for various combinations of
$n_{\text{fp}}, \iota$
and helicity. Panels (a) and (b) correspond to QA and QH devices, respectively. Subpanels (i), (ii) and (iii) correspond to particular values of
$n_{\text{fp}}$
, and the curves correspond to different values of
$\iota$
. The horizontal dashed line corresponds to a cumulative PC ratio of 0.9.
For the sets presented, two to three principal components appear sufficient for visualization. This is true of even more subsets throughout the QUASR database, as illustrated in figure 15, where we compute a PCA for various combinations of
$n_{\text{fp}}$
and
$\iota$
for QA and QH devices. It is possible to attain a cumulative PC ratio of 0.9 consistently. This illustrates that much of the variation between the devices can be characterized on a low-dimensional, linear manifold. This points to the lower dimensional nature of the subset of quasisymmetric configurations in the larger space of stellarators. This echoes the analysis in Sengupta et al. (Reference Sengupta, Nikulsin, Buller, Madan, Paul, Nies, Kaptanoglu, Hudson and Bhattacharjee2023), where it was revealed that QS imposes a lower-dimensionality on the field strength. We would like to highlight though that this lower-dimensionality does not necessarily extend to the coil design space, which is known to be complex with multiple minima (Wechsung et al. Reference Wechsung, Giuliani, Landreman, Cerfon and Stadler2022a
; Giuliani Reference Giuliani2024). This is because there are many, vastly different coil sets that produce approximately the same magnetic field (Landreman Reference Landreman2017). This highlights that some sort of global search or a problem reformulation is indeed necessary for coil design.
Whether the information lost in the dimensionality reduction is significant depends on the application. For the purposes of visualization in this work, a cumulative PC ratio of 90 % appeared satisfactory. But in other applications, this might not be the case. Some useful extensions of this include using the very simple, linear formulae learned during the PCA to generate warm starts for optimization. Alternatively, one might attempt optimizing in the PC subspace. These directions are not explored in this paper.
Acknowledgements
The authors would like to thank Flatiron’s Scientific Computing Core (SCC). We would also like to thank M. Berger, R. Jorge, M. Landreman, E. Paul, M. Rachh and L. Saul for helpful discussions.
Editor Per Helander thanks the referees for their advice in evaluating this article.
Funding
E.R. was supported by a grant of the Alexander-von-Humboldt-Stiftung, Bonn, Germany, through a postdoctoral research fellowship.
Declaration of interests
The authors report no conflict of interest.
Data availability
The latest version of the QUASR data set can be found at https://zenodo.org/doi/10.5281/zenodo.10050655. It can be visually explored at quasr.flatironinstitute.org.
Appendix A. Fitting the second-order near-axis model to the devices
In this appendix we describe how a near-axis model to second order is constructed for a given field from the QUASR database.
The geometry of the field’s magnetic axis can be directly constructed from the given coils, but an additional three parameters are needed that describe the second-order near-axis model in vacuum:
$B_0$
,
$\overline {\eta }$
and
$B_{2c}$
. In the neighbourhood of the magnetic axis, the field strength
$B = \|\boldsymbol{B}\|$
of the near-axis model satisfies

where stellarator symmetry is assumed,
$\varphi , \theta$
are Boozer angles and
$r$
is the radial distance off-axis. There are a multitude of methods to estimate these near-axis parameters in the field generated by the coils. For instance, one could evaluate the field strength on a three-dimensional tensor-product grid
$(r_i, \varphi _j, \theta _k)$
in the neighbourhood of the magnetic axis and solve for the values of
$B_0, \overline {\eta }, B_{2c}, B_{20}(\varphi _j)$
in a least-squares sense. Another possibility is to fit the predicted cross-sections of the near-axis magnetic surfaces to the true ones (Landreman Reference Landreman2019). These techniques both have the difficulty that one must use information off axis, and the quality of the fit might be sensitive to the choice of how far off-axis one goes. In order to make the fitting procedure independent of this user-defined choice, we solve three minimization problems sequentially,

where
$\phi$
is the standard cylindrical angle. Since the magnetic field is generated by a set of coils, we can compute the magnetic field as well as its first and second derivatives using the Biot–Savart law, which we have direct access to in SIMSOPT. As a result, we can find the near-axis parameters that most closely reproduce the coils’ magnetic field and its derivatives on axis. Note that the near-axis model assumes that there is perfect QS to first order. While a good approximation of QS is attainable in the volume, the devices do not present perfect near-axis QS, and we may not always be able to obtain be a good fit of the coils’ magnetic field.

Figure 16. Poincaré plot (green) and cross-sections of surfaces generated by the near-axis expansion (red) fit to a device in QUASR. Panel (a) shows the first-order near-axis expansion (NAE) surfaces, while (b) shows the second-order NAE surfaces. The outermost surfaces here have aspect ratio
$\sim 9$
. The field lines are generated from starting points along the inboard side with
$Z=0$
. The Biot–Savart Poincaré sections are different in (a) and (b) because the field lines are initialized from different starting points.
In figure 16, we provide a Poincaré plot alongside cross-sections of surfaces generated by a first- and second-order near-axis expansion. The NAE model was fit using the method described in this section, revealing that the second-order NAE is capable of describing the magnetic surfaces in the neighbourhood of the magnetic axis much more faithfully than the first-order one.
Appendix B. The PCA example for Gaussian processes
To gain some intuition on the quality measures of cumulative PC ratios and KNN fraction introduced in § 5, we apply PCA to data generated from three Gaussians in this appendix. These Gaussians are defined to have mean and covariance matrices,

where
$Q_1, Q_2$
are randomly generated rotation matrices. Individually, the Gaussians are well approximated by linear manifolds of dimension one, two and three, respectively. Samples from these distributions are shown in figure 4(a) in the main text, while a two-dimensional projection of the data is shown in figure 4(b).
In two-dimensional, the cumulative PC ratio is quite high 0.986, revealing that the majority of variation in the data are captured by two principal components. Despite this favourable quality measure and such a simple data set, the KNN fraction is approximately 0.4, i.e.
$\approx$
40 % of the 10 nearest neighbours in three dimensions remain so on the two-dimensional projection on average.
In table 2, we provide the cumulative PC ratio as a function of the number of PCs. In order for the PC ratio to attain 0.9, each cluster requires a different number of PCs. This hints that cluster 1 is well approximated on a line, cluster 2 on a plane and cluster 3 on a volume. In fact, since cluster 3 is genuinely three-dimensional, dimensionality reduction is not possible without significant loss of information. The KNN fractions increase with the number of PCs, with a value greater than 0.4 appearing to be a lower bound on a reasonable lower-dimensional representation.
Table 2. The top three rows are the cumulative PC ratio and the bottom three rows are the KNN fraction with respect to the number of PCs for the clusters in figure 4.

Table 3. Labelling of devices from the literature, the figure number on which they appear in this manuscript, along with the value of
$n_{\text{fp}}$
, type of QS targeted, aspect ratio and rotational transform. We quote either the mean rotational transform, or the edge rotational transform. These separate cases are highlighted with an ‘m’ or ‘e’ in parentheses.

Appendix C. Reference QS configurations from the literature
The QS configurations from the literature referenced throughout the paper are summarized in table 3. These devices were all optimized for QS at various aspect ratios, and target rotational transforms. Moreover, some devices were designed to have a target mean rotational transform (Landreman & Paul Reference Landreman and Paul2022; Giuliani et al. Reference Giuliani, Wechsung, Stadler, Cerfon and Landreman2022b ), while others had a target edge rotational transform (Nies et al. Reference Nies, Paul, Panici, Hudson and Bhattacharjee2024).
To make a direct comparison of these to the devices in the QUASR dataset, we must construct their corresponding feature vectors using the following procedure. First, we take the geometry of the device’s outermost surface and use VMEC (Hirshman & Whitson Reference Hirshman and Whitson1983) to determine the vacuum magnetic field on the entire toroidal volume. We take the surface in the device that has aspect ratio closest to 10 for QA and 12 for QH. The Boozer coordinates on this surface are computed using BOOZ_XFORM (Sanchez et al. Reference Sanchez, Hirshman, Ware, Berry and Spong2000). Finally, we compute the Fourier harmonics of this surface parametrized in Boozer coordinates. We may then treat them as we did with QUASR objects, and when necessary project them to the appropriate principal component plots. We directly compare these configurations to the relevant ones in QUASR when they share the value of
$n_{\text{fp}}$
, helicity and mean rotational transform (out to aspect ratio 10 (for QA) and 12 (for QH) to an accuracy of
$\pm 0.05$
).