---
license: mit
pipeline_tag: image-feature-extraction
---

[![SDO-FM_Banner.png](https://cdn-uploads.huggingface.co/production/uploads/66aa4018951180b79f1c6574/k-6Zzqp78Ed_0tu8W2pQJ.png)](https://sdofm.org)
<h2 align="center">SDO-FM: A foundation model for the Sun</h2>

# 1. Introduction
SDO-FM is a foundation model trained on data from NASA’s Solar Dynamics Observatory
(SDO) spacecraft. It integrates three separate instruments to encapsulate the
Sun’s complex physical interactions in a multi-modal embedding space. This
model can be used to streamline scientific investigations involving SDO by making
the enormous datasets more computationally accessible for heliophysics research
and by enabling investigations that require instrument fusion.

The overall process for building SDO-FM is composed of
four stages: (1) data preparation, (2) large foundation model (FM) training, (3) embedding extraction,
and (4) fine-tuning or direct embedding usage for scientific validation cases. The data preparation
stage was completed under SDOML [4], a machine-learning dataset of SDO.


Our models are based upon autoencoders, with training conducted under the objective of image
reconstruction over the period from satellite launch in 2010 to 2023. Once these models
are trained, a compressed representation dataset is created from the embeddings by a full pass through
the encoder. The compressed representations are called direct embeddings and provide a helpful
result as a set of available SDO features at around two-thousandths (0.002) the original size. Lastly,
the direct embeddings, as well as standard model fine-tuning, are used to conduct scientific validation
through a validation harness, which checks our results against past ML-based heliophysics
approaches and compares their computational expense.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/66aa4018951180b79f1c6574/vB4pPe_x_x-9Zkdj7yWD1.png)
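To make the size reduction concrete, the minimal sketch below computes how many values per sample a 0.002 ratio leaves. The input shape (9 channels at 512 × 512 pixels) is an illustrative assumption, not a value taken from the SDO-FM configuration.

```python
import math

def embedding_budget(input_shape, ratio=0.002):
    """Return (input value count, embedding value count) for a given size ratio."""
    n_input = math.prod(input_shape)
    return n_input, round(n_input * ratio)

# Assumed shape: 9 channels, 512 x 512 pixels (illustrative only).
n_input, n_embed = embedding_budget((9, 512, 512))
print(n_input, n_embed)
```

Under these assumed dimensions, roughly 2.4 million pixel values per sample would reduce to a few thousand embedding values.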

# 2. Quick Start
To run inference with any of the provided models, you can pull the Docker image from [Docker Hub](https://hub.docker.com/repository/docker/spaceml/sdo-fm/general), or follow the installation instructions below. To run one of the scientific tasks, select its configuration by name:
```bash
python scripts/main.py --config-name=embeddings_nvae_virtualeve
```

## 2.1 Installation
SDO-FM can be installed locally by installing the package from [this GitHub repository](https://github.com/spaceml-org/SDO-FM). Using the Docker image is advised; however, the dependencies are also listed in the usual [requirements.txt](https://github.com/spaceml-org/SDO-FM/blob/main/requirements.txt).
```bash
pip install -e .
```

## 2.2 Usage

To run any task, we assume execution inside a container built from the image described in the [Dockerfile](https://github.com/spaceml-org/SDO-FM/blob/main/Dockerfile). Tasks are configured with Hydra; the configurations are kept in the [experiments](https://github.com/spaceml-org/SDO-FM/tree/main/experiments) directory. The entry point is [main.py](scripts/main.py), and its arguments select a configuration:
```bash
python scripts/main.py --config-name=default
```
CLI overrides are still possible with this selection, but be aware that in some shells quotes and square brackets must be escaped:
```bash
python scripts/main.py --config-name=default experiment.seed=37
```
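
One way to sidestep shell quoting problems is to build the command in Python and let the standard library's `shlex` module do the quoting. This is only a sketch: the override key `data.wavelengths` is hypothetical and is used purely to show a value containing square brackets.

```python
import shlex

def build_command(config_name, overrides):
    """Build a shell-safe command line for scripts/main.py with Hydra overrides."""
    args = ["python", "scripts/main.py", f"--config-name={config_name}", *overrides]
    # shlex.join quotes any argument that contains shell metacharacters.
    return shlex.join(args)

# `data.wavelengths` is a hypothetical key used only to show bracket quoting.
cmd = build_command("default", ["experiment.seed=37", "data.wavelengths=[171,193]"])
print(cmd)
```

The resulting string can be pasted into a POSIX shell; arguments without special characters are left unquoted.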

## 2.3 Pre-training
```bash
python scripts/main.py --config-name=pretrain_32.2M_samae_HP
```

## 2.4 Notebooks
A series of [notebooks](https://github.com/spaceml-org/SDO-FM/tree/main/notebooks) are available to explore each of the four downstream tasks described later in this document. 

# 3. Method
SDO-FM is composed of a backbone, an optional neck, and a head. We define the backbone
as the model initially trained on the reconstruction task, the neck as the converter between backbone
and head, and the head as the component selected for the downstream application (or validation task). We implement
two model families as backbones, one stemming from a Nouveau Variational Autoencoder (NVAE)
[11], the other from a Masked Autoencoder (MAE) [12]. Both are adapted to better accommodate our scientific dataset
and to allow intermediate export of their latent spaces in the form of “embeddings.” We additionally
evaluated various feature engineering options for handling the solar disk. The most
effective included a simple lookup from Stonyhurst coordinates, a heliographic coordinate system
for a fixed observer on Earth (suitable given the geosynchronous orbit), to pixel space.
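
The backbone, neck, and head composition can be sketched as below. This is a minimal illustration and not the SDO-FM implementation: the class name `ComposedModel` and the toy callables are hypothetical stand-ins for trained networks.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Vector = Sequence[float]

@dataclass
class ComposedModel:
    """Backbone -> optional neck -> head, applied in order."""
    backbone: Callable[[Vector], Vector]   # pretrained on reconstruction; emits embeddings
    head: Callable[[Vector], float]        # task-specific output
    neck: Optional[Callable[[Vector], Vector]] = None  # adapts embeddings for the head

    def __call__(self, x: Vector) -> float:
        z = self.backbone(x)
        if self.neck is not None:
            z = self.neck(z)
        return self.head(z)

# Toy stand-ins (hypothetical): halve the input, rescale, then average.
model = ComposedModel(
    backbone=lambda x: [v / 2 for v in x],
    neck=lambda z: [v * 10 for v in z],
    head=lambda z: sum(z) / len(z),
)
print(model([1.0, 2.0, 3.0]))
```

Keeping the neck optional means a head that already accepts the backbone's embedding shape can be attached directly.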


## 3.1 Model choice
Model selection was initially determined by the ability to capture solar phenomena, guided by
applicability to SDO imagery and the ease of access to the embeddings in the latent space. The
autoencoder architecture was selected for the backbone for ease of embedding construction and
extraction. By design, autoencoders create a lower-dimensional representation during the encoding
process. Other requirements included engineering efficiencies, such as ability to mask the solar limb
for on-disk experiments, and cheaply bias by importance sampling for areas of interest (e.g. active
regions).
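
As an illustration of masking for on-disk experiments, the sketch below builds a circular boolean mask and zeroes everything outside it. It assumes a centred disk with a fixed pixel radius; the real apparent radius varies with the observation, and the function name `disk_mask` is hypothetical.

```python
import numpy as np

def disk_mask(size, radius_px, center=None):
    """Boolean mask that is True for pixels inside a circle of radius_px."""
    cy, cx = center if center is not None else ((size - 1) / 2, (size - 1) / 2)
    yy, xx = np.ogrid[:size, :size]
    return (yy - cy) ** 2 + (xx - cx) ** 2 <= radius_px ** 2

mask = disk_mask(size=64, radius_px=20)
image = np.random.default_rng(0).random((64, 64))  # synthetic stand-in for one AIA channel
on_disk = np.where(mask, image, 0.0)               # zero out the limb and off-disk pixels
```

The same mask can be reused to restrict a loss or metric to on-disk pixels.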

### 3.1.1 Solar-aware Masked Autoencoder
Masked Autoencoders (MAEs) learn to reconstruct images with random components
removed [13]. The approach follows the standard ViT patchification common to transformer computer
vision approaches, splitting the image into patches between which the attention mechanism can learn relationships.
The “powerful expressivity” of this approach is attributed to its “rich hidden representation” [14]. This is
particularly of interest in our scenario, as we seek to learn which components of solar imagery are of
value for our scientific validation cases. This model was expanded to increase suitability for temporal
information for remote sensing tasks [15]. We have continued to iterate, adding “solar-awareness” by
including the ability to process the nine wavelengths of interest to us via the Atmospheric Imaging
Assembly, efficiencies for processing the solar disk, and the ability to optionally bias the model
towards learning active regions of scientific interest.
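
One possible way to bias masking towards regions of interest is sketched below. It is an assumption made for illustration, not the method used in SDO-FM: patches are chosen for masking with probability weighted by their mean brightness, and `bias=0` recovers uniform random masking. The function name and parameters are hypothetical.

```python
import numpy as np

def sample_masked_patches(image, patch=16, mask_ratio=0.75, bias=0.0, seed=0):
    """Return indices of patches to mask.

    bias=0 gives uniform random masking; bias>0 favours brighter patches
    (an assumed stand-in for "regions of interest").
    """
    h, w = image.shape
    gh, gw = h // patch, w // patch
    # Mean intensity per non-overlapping patch, flattened in row-major order.
    patches = image[: gh * patch, : gw * patch].reshape(gh, patch, gw, patch)
    brightness = patches.mean(axis=(1, 3)).ravel()
    weights = (brightness - brightness.min() + 1e-8) ** bias
    probs = weights / weights.sum()
    n_mask = int(round(mask_ratio * gh * gw))
    rng = np.random.default_rng(seed)
    return rng.choice(gh * gw, size=n_mask, replace=False, p=probs)

image = np.random.default_rng(1).random((64, 64))  # synthetic single-channel image
masked = sample_masked_patches(image, patch=16, mask_ratio=0.75, bias=2.0)
print(sorted(masked.tolist()))
```

Because an MAE is trained to reconstruct the masked patches, masking bright patches more often makes them contribute more to training.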

### 3.1.2 Nouveau-VAE
The Nouveau Variational Autoencoder (NVAE) is a deep hierarchical VAE created for image
generation. Like the MAE, it is able to create a rich latent space; it does so using depth-wise separable
convolutions and batch normalization. The NVIDIA team’s codebase was modified to permit access
to the hierarchical structure so that embeddings can be extracted.
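
To show why depth-wise separable convolutions are attractive, the sketch below compares weight counts for a standard convolution and its depth-wise separable counterpart. The layer sizes are illustrative assumptions, not values from the SDO-FM NVAE configuration.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out

# Illustrative layer size (assumed): 64 -> 128 channels with a 5 x 5 kernel.
standard = conv_params(64, 128, 5)
separable = depthwise_separable_params(64, 128, 5)
print(standard, separable)
```

For this assumed layer, the separable form needs roughly one twentieth of the weights.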

# 4. Scientific Validation Cases
**Predict F10.7** F10.7, the solar radio flux at a wavelength of 10.7 cm, is a proxy for solar irradiance that can be measured from the ground,
as this frequency is not absorbed by the atmosphere. Can we achieve good agreement with ground
measurements? There is limited scientific value in predicting a proxy measure such as F10.7;
however, this simple task clearly indicates learned capacity in a single result.

**Virtual EVE** In 2014, an instrument malfunction resulted in the loss of the MEGS-A module
of SDO/EVE. With four years of overlapping data, [16, 17] used a hybrid CNN/linear regression
model to successfully demonstrate the capability of machine learning methods to estimate missing
EUV irradiance measurements from MEGS-A (and the degraded MEGS-B components of the EVE
instrument). This validation task employs the embeddings constructed from AIA to understand the
contributions from solar features on the EUV spectra, as the mapping between instruments exists due
to the narrow-band images (SDO/AIA) and sun-as-a-star spectra (SDO/EVE) observing the same
plasma distribution. A linear model accounts for a large portion of the relationship, while a CNN is
used to correct for outlier events such as solar flares. There are known concerns regarding the model’s
performance post-2020, as AIA instrument performance deviates further from the 2014 baseline.
Some of these issues can be addressed by incorporating other sources of irradiance, such as data from
sounding rockets, for training over longer periods, although these are sparse. Importantly, this
hybrid approach outperforms a physics-based inversion approach [18].
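
The linear part of such a mapping can be sketched as a least-squares fit from embeddings to irradiance values. The example is a minimal illustration on synthetic data: the array sizes, the function names, and the data itself are assumptions, and the nonlinear correction step is not implemented here.

```python
import numpy as np

def fit_linear_head(embeddings, targets):
    """Least-squares fit of targets ~ embeddings @ W + b. Returns (W, b)."""
    X = np.hstack([embeddings, np.ones((embeddings.shape[0], 1))])  # append bias column
    coef, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return coef[:-1], coef[-1]

def predict(embeddings, W, b):
    """Apply the fitted linear head."""
    return embeddings @ W + b

# Synthetic stand-in data: 200 samples, 8-dim embeddings, 3 irradiance lines.
rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 8))
true_W = rng.normal(size=(8, 3))
irradiance = emb @ true_W + 0.5 + 0.01 * rng.normal(size=(200, 3))

W, b = fit_linear_head(emb, irradiance)
residual = irradiance - predict(emb, W, b)  # what a nonlinear corrector would model
print(np.sqrt(np.mean(residual ** 2)))
```

In a hybrid setup, a second model would be trained on `residual` to capture events the linear fit misses.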

**Missing Channel Reconstruction** The reconstruction of missing extreme ultraviolet (EUV) images
from images at other wavelengths is a crucial task given the often low or unusable quality of image data frames
from the Solar Dynamics Observatory (SDO). Currently, there is no effective method to recover these
missing time steps. However, the foundation model is capable of reconstructing individual frames by
leveraging contextual information available in other wavelength channels. This approach allows for
interpolation to provide a best-guess estimate of missing data at any arbitrary time step.
As with the Virtual EVE project and differential emission measure analysis [18], the overlapping
temperature range covered by different SDO/AIA wavelength channels allows the temperature
distribution of the underlying plasma to be reconstructed, which may enable the inference of properties of
different temperature ranges.

The overlap between wavelength channels can be used within a machine learning model to
produce an estimation to replace data that is either missing, corrupted, or otherwise unusable. Our
objective is to develop a more robust model that operates with higher computational efficiency while
producing results comparable to the current SOTA. Special attention is given to the model’s ability to
capture non-linear relationships or rare events, such as intensity values in flaring regions.
There are several uncertainties inherent in this process. Some channels may be more readily recreated
than others, under the physical assumption that channels in the middle of the temperature/wavelength
ranges have the most overlap with other channels, potentially yielding better results. However,
this overlap might not always correspond to the actual missing data in the SDO. Addressing these
uncertainties requires an understanding of the shortfalls to determine the appropriateness of this
reconstruction technique in different scenarios.

**Autocalibration** The SDO/AIA EUV channels exhibit degradation due to exposure to the same
emissions they are intended to measure. This degradation results in apparent dimming over time
across multiple EUV channels with unique characteristics. This poses challenges for long-term
studies, as degradation trends within the dataset need to be corrected. Until 2014, SDO utilized EVE
to correct this degradation. As discussed, a malfunction of SDO/EVE resulted in the loss of the
MEGS-A component, and calibration is currently performed by sounding rocket flights. In response
to this, [19] used a CNN to reconstruct the Atmospheric Imaging Assembly (AIA) multi-channel
degradation curves.

Data requirements for this study include the SDOML data from AIA as well as older correction
tables. The sampling requirement is minimal, with data being required once per day or even less
frequently. Traditional SOTA methods, such as those performed by the Lockheed Martin Solar and
Astrophysics Laboratory (LMSAL), involve calibration using sounding rocket flights. These methods,
while accurate, are expensive and technically demanding. Our goal is to reproduce the results from
[19] with greater efficiency in terms of data required and computational resources. This efficiency
is evaluated through an examination of the resultant images compared to those produced by SOTA
calibration pipelines, alongside intensity histograms, data spike analysis, and other metrics.
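
For intuition, the sketch below treats degradation as a multiplicative dimming factor per channel and corrects it by division. The multiplicative model, the factor values, and the function name are assumptions made for illustration; estimating the factors themselves is the hard part and is not shown.

```python
import numpy as np

def correct_degradation(images, factors):
    """Undo per-channel dimming: images has shape (C, H, W), factors has shape (C,)."""
    factors = np.asarray(factors, dtype=float)
    if np.any(factors <= 0) or np.any(factors > 1):
        raise ValueError("degradation factors are expected in (0, 1]")
    return images / factors[:, None, None]

rng = np.random.default_rng(0)
true = rng.random((3, 4, 4)) + 1.0              # synthetic "undegraded" channels
factors = np.array([0.9, 0.5, 0.25])            # assumed dimming per channel
observed = true * factors[:, None, None]        # simulate degraded observations
restored = correct_degradation(observed, factors)
```

A learned autocalibration model would supply `factors` from the images, where this toy example takes them as known.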

# 5. Results
Overall, our model families were evaluated on their backbone reconstruction task and against our four
scientific validation cases. In all but the autocalibration task, they reached the same level of accuracy as
their classical counterparts, or surpassed it, in a fraction of the required time. In the autocalibration case,
the direct embedding approach was able to match the accuracy but took additional training time.

## 5.1 Reconstruction
Loss for the reconstruction task is measured by pixel RMSE within the solar disk. Solar-aware MAE (SAMAE) results
presented in fig. 5 indicate a clear ability to reconstruct most wavelengths with a small embedding
dimension (128) and within a small number of training epochs (10). Interestingly, this model struggles
to reconstruct 131 Å and 171 Å, which is likely due to a normalization error we are still investigating. The
Nouveau-VAE model on raw pixel intensity performs better, even when including the solar limb.
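
A per-channel RMSE restricted to on-disk pixels can be sketched as follows. The array layout `(C, H, W)`, the toy mask, and the function name `disk_rmse` are assumptions for illustration rather than the project's actual metric code.

```python
import numpy as np

def disk_rmse(pred, target, mask):
    """Per-channel RMSE over pixels where mask is True.

    pred and target have shape (C, H, W); mask has shape (H, W).
    """
    sq_err = (pred - target) ** 2
    return np.sqrt(sq_err[:, mask].mean(axis=1))

target = np.zeros((2, 4, 4))
pred = np.zeros((2, 4, 4))
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True   # toy "disk": the central 2 x 2 pixels
pred[0, mask] = 3.0     # on-disk error in channel 0
pred[1] = 100.0         # large error everywhere in channel 1 ...
pred[1, mask] = 1.0     # ... but small on the disk
print(disk_rmse(pred, target, mask))
```

Off-disk errors in channel 1 do not affect the result, since only masked-in pixels enter the mean.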

## 5.2 Direct Embeddings
Training each scientific validation case on the embeddings directly generally led to much faster
training and matched or surpassed the accuracy of the classical counterparts. An effort was also made to evaluate
the embeddings outside of the scientific cases through embedding-to-embedding comparison.
The common t-SNE approach was applied to a small one-year sample and showed
separation by solar activity. This approach, however, is still fairly opaque, and hence the validation
approaches are considered more appropriate.

# Acknowledgements

This work is the research product of the SDO-FM: A Multi-Modal Foundation Model POC
for SDO. This has been funded and supported by NASA under **Grant award No
80NSSC24K0701**. Any opinions, findings, and conclusions or recommendations expressed
in this material are those of the authors and do not necessarily reflect the views of the
National Aeronautics and Space Administration (NASA). The research and its
outputs have been designed, managed and delivered by Trillium Technologies Inc
(https://trillium.tech). Trillium is a research and development company with a focus on
intelligent systems and collaborative communities for planetary stewardship, space
exploration and human health. Trillium aspires to ensure that the
latest tools and techniques in Artificial Intelligence (AI) and Machine Learning (ML) are
applied to developing open science for all Humankind.

 **Authors**  
 
James Walsh, University of Cambridge  
Daniel Gass, University of Central Lancashire  
Raul Ramos Pollan, Universidad Industrial de Santander  
Richard Galvez, Pure Storage  
Paul Wright, Dublin Institute for Advanced Studies  
Atılım Güneş Baydin, University of Oxford  
Noah Kasmanoff, AE Studio  
Jason Naradowsky, University of Tokyo  

PI: Anne Spalding, Trillium Technologies Inc  
Co-I: James Parr, Trillium Technologies Inc