Whitepaper on predictive modeling in the Large Plasma Device (LAPD)

This whitepaper is rather old; I may no longer agree with the contents. Nevertheless it may be interesting to a wider audience and so is shared below. Thank you to Erik Wessel and Claire Baum for feedback.

- Phil, August 6th, 2025

tl;dr: I propose building a predictive model of plasma phenomena in the Large Plasma Device. This model will be built by predicting a partially physical and partially learned representation forward in time. Time series predictive models may be very useful for plasma science and engineering tasks and provide a first step towards automating fusion science.

Fusion is important and development needs to be faster

Our goal is to develop fusion reactors as soon as possible so that we can sustainably produce power and more quickly propel spacecraft. These devices are difficult to understand. Fundamentally, the real test of understanding is prediction. I propose building a machine learning-based time series prediction model of laboratory plasmas built from simulations and experimental data. This proposed project is a step towards automating, and accelerating, fusion science by building a model that weakly understands plasma dynamics.

Reconciling simulation and experiment

Machine learning may provide a way to reconcile simulations and experiments to provide the best of both worlds. Simulations can provide direct access to all information about the plasma at any given time, but although these simulations may model some aspects of reality, they are generally quite inaccurate. Experiments suffer from opposite issues: data collected are spatiotemporally sparse but better represent reality. Experimental data also contain phenomena that are not yet explained by theory, but learning these phenomena can still be useful for future science and engineering tasks (as explained later).

We need better predictive models

Good predictions are important for operating within safe parameter spaces and designing future experiments. Using current physics models, predicting future plasma behavior with reasonable accuracy, either inter- or intra-shot, is nigh impossible for interesting regimes. Machine learning models, trained on experimental data, simulation data (probably including surrogate models), and analytical models, can likely predict more accurately and quickly than has previously been possible. I would like to construct predictive models applied to control problems.

In my examples throughout this document I will reference tokamaks, but starting out with such a complicated system (or a system with large sources of free energy) with low rep rates may not be wise because of the notorious data inefficiency of high-capacity ML models. A high-rep-rate machine with potentially simpler dynamics (which may also be easier to simulate), such as the Large Plasma Device (LAPD), is a good fit.

One approach to predictive modeling is to predict a set of statistics about a plasma (e.g., confinement time, fluctuation power spectra) from a set of inputs (e.g., heating power, current drive, shaping). An alternative approach is to start from an initial state (e.g., instantaneous magnetics measurements) and predict the time series directly (e.g., magnetics signals), from which statistical quantities can then be inferred (e.g., MHD growth rates). I propose following the latter approach: building a predictive, time-series model of laboratory plasmas created in the LAPD by predicting the time evolution of some learned, latent representation. Some portion of this representation may be constrained to be physically interpretable. One reason for pursuing a time series prediction task is that it will be more useful for control problems.
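
To make the time-series approach concrete, here is a minimal toy sketch of rolling a latent state forward in time, decoding it to a signal, and inferring a statistic from the predicted series. Everything in it is an illustrative assumption rather than a proposed design: the two-component latent state, the hand-written `step` dynamics (a rotation with a fixed growth rate), the `decode` map, and all parameter values are hypothetical stand-ins for what would be learned from data.

```python
import math


def step(z, omega=0.3, gamma=-0.05):
    """Advance a 2-component latent state by one time step.

    Hypothetical dynamics: rotate by `omega` radians and scale the
    amplitude by exp(gamma). In a real model this map would be learned.
    """
    g = math.exp(gamma)
    c, s = math.cos(omega), math.sin(omega)
    return (g * (c * z[0] - s * z[1]), g * (s * z[0] + c * z[1]))


def decode(z):
    """Map a latent state to one synthetic diagnostic signal (assumed)."""
    return z[0]


def rollout(z0, n_steps):
    """Predict the latent trajectory by applying `step` repeatedly."""
    states = [z0]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return states


def growth_rate(states):
    """Least-squares slope of log-amplitude versus step index."""
    ys = [math.log(math.hypot(z[0], z[1])) for z in states]
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


states = rollout((1.0, 0.0), 50)        # predicted latent time series
signal = [decode(z) for z in states]    # predicted diagnostic signal
gamma_est = growth_rate(states)         # statistic inferred afterwards
```

The point of the sketch is the ordering: the time series is predicted first, and quantities such as a growth rate are computed from it afterwards, rather than being predicted directly from machine inputs.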

A predictive model may be useful for, but not limited to, these plasma science and engineering tasks:

Agent-based control and optimization
 From what I understand, model-based control can be much more sample efficient than model-free control in general. Providing an existing (and mutable) model to an agent for a control or optimization task could make learning policies much easier.
Physics discovery
 For discovery and experiment design, the predictive model would necessarily need to be generative, i.e., able to create synthetic examples from a learned distribution. The learned model may capture phenomena that humans would otherwise miss or ignore (e.g., super-H mode was observed long before it was actually “discovered”). Synthetic discharges can be generated at arbitrary points in parameter space (likely with good accuracy in the interpolation regime), enabling close examination of trends or scaling laws, and perhaps easing equation discovery.
Experiment design
 Generating ensembles of surrogate experiments can be done much faster than performing real ones. We can find interesting regimes or phenomena by using a novelty search algorithm that proposes real experiments and compares the real outcome with the predicted one. The model could also perform impossible or dangerous surrogate experiments that could illuminate promising research directions. Decent extrapolative capabilities, which modern NN architectures may lack, are needed for experiment design, but this direction may still be worth investigating.
Increased diagnostic efficiency
 If a sufficiently powerful representation of plasma state can be constructed via a model trained on sparse collections of diagnostics, then every other diagnostic signal at any arbitrary position can be recreated, given that the diagnostic and position were covered in the training set (either in the simulated or real data). In effect, it may be possible to create a spatially dense set of diagnostic signals from just one or two, which would be immensely useful but at the expense of robustness to hardware failures. I am uncertain if this reconstruction can be accomplished with current tools and computing power (and if it is within the scope of a thesis project), but it likely will be possible in the future, and may even be necessary for safe operation of a reactor, tokamak or otherwise.
Robustness to diagnostic hardware failures (given many diagnostics)
 If other diagnostic signals can be constructed from a latent plasma representation (analogous to synthetic diagnostics from a simulation), then malfunctioning diagnostics can be identified. The simplest identification procedure would be to build the representation with a majority of the diagnostics disabled or ignored. Rotating through the diagnostics will yield different representations, of which the most divergent will contain the anomalous diagnostic. Although simple, this approach suffers from a combinatorial explosion in the number of diagnostics. Another promising way to do this anomaly detection would be to use the learned plasma representation (including the misbehaving signal) to reconstruct all diagnostics. If there are many diagnostics, the contribution of the broken diagnostic will likely be significantly reduced as it is averaged or otherwise combined with other diagnostics that have similar information content. When the broken diagnostic signal is reconstructed, the anomaly may be detectable through thresholding or a more powerful method if necessary. Cleverer ways to discover anomalous diagnostics probably exist, but the point is that identification of hardware failures may be easier with some plasma representation rather than a pure anomaly-detection approach. Slow failures will likely always be difficult to detect.
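
The reconstruct-and-threshold idea from the last item can be sketched in a few lines. This is a toy under strong assumptions, not a proposed implementation: the plasma state is a single scalar, each diagnostic is that state times a known gain, the "representation" is a plain average, and the names `encode`, `flag_anomalies`, the gains, and the threshold are all hypothetical.

```python
def encode(signals, gains):
    """Estimate a scalar latent state as the mean of gain-normalised
    signals. A learned encoder would replace this averaging step."""
    return sum(s / g for s, g in zip(signals, gains)) / len(signals)


def flag_anomalies(signals, gains, threshold):
    """Return indices of diagnostics that disagree with the latent state.

    The latent state is built from all signals, including any broken
    one, and each signal is then compared against it in latent units.
    """
    latent = encode(signals, gains)
    return [
        i
        for i, (s, g) in enumerate(zip(signals, gains))
        if abs(s / g - latent) > threshold
    ]


# Eight hypothetical diagnostics viewing the same underlying state.
gains = [1.0, 2.0, 0.5, 1.5, 1.0, 2.0, 0.5, 1.5]
true_state = 2.0
signals = [true_state * g for g in gains]
signals[3] = 0.0  # diagnostic 3 has failed and reads zero

flagged = flag_anomalies(signals, gains, threshold=1.0)
```

Because seven healthy diagnostics dominate the average, the failed one pulls the latent estimate only from 2.0 to 1.75, so its own residual (1.75) stands well clear of the healthy ones (0.25). With only one or two diagnostics the same failure would corrupt the estimate far more, which is the trade-off against diagnostic efficiency noted above.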

Implementation and initial physics problem

There are several approaches to implementing a machine learning-based predictive plasma model. The first portion of this thesis project will be determining a promising angle of attack. As for the first phenomenon to analyze, predicting some type of (perturbative) wave propagation may be a good place to start, but I would greatly appreciate any other ideas in this area.

Resources needed

I do not yet have enough experience with machine learning or plasma simulations to determine the computational requirements of this project. To hazard a guess, I reckon the resources needed are within the current capabilities of our laboratory plus a few thousand dollars.