EOVSA Data Analysis Tutorial 2022
1. Introduction to EOVSA and Datasets
EOVSA (Expanded Owens Valley Solar Array) is a solar-dedicated radio interferometer operated by the New Jersey Institute of Technology. EOVSA observes the full disk of the Sun at all times when the Sun is >10 degrees above the local horizon, which is season dependent and ranges from 7-12 hours duration centered on 20 UT. Like any radio interferometer, the fundamental measurement for imaging is the correlated amplitude and phase between each pair of antennas, which is called a “complex visibility.” EOVSA’s 13 antennas form 78 such visibilities at any frequency and instant of time, i.e. 78 measurements of the spatial Fourier transform of the solar brightness distribution. EOVSA records these visibilities at *451 science frequency channels each second, in four polarization products (XX, YY, XY, YX), as well as additional total flux measurements from each individual antenna. These data are then processed through a pipeline processing system whose data flow is shown in the block diagram (Figure 1). One of the outputs of the pipeline is a visibility database in a widely used open-standard format called a CASA measurement set (or “ms”; CASA is the Common Astronomy Software Applications package used by many modern interferometer arrays).
EOVSA delivers the radio interferometry data on the following three levels:
- Level 0 - Raw visibility data - This includes observations of cosmic sources for phase calibration, and gain and pointing observations required for total power calibration.
- Level 1 - Calibrated visibility data - After applying calibration and other preliminary processing to level 0 data, we create the calibrated visibility data in CASA ms format (second column in Figure 1). These visibility data have all of the required content to produce Level 2 images and spectrogram data in standard FITS format. The following tutorial will guide you through how to make EOVSA images and spectrogram from visibility data. We provide a set of standard ms’s for each day (pink solid boxes in Figure 1) for users who wish to start with visibility data. You can retrieve EOVSA 1-min averaged visibility data in CASA ms format from this page. Full-resolution EOVSA visibility data in CASA ms format will be provided per request. Please contact the *EOVSA team if you wish to have Level 1 visibility data for a specific event.
- Level 2 - Images and spectrogram data in standard FITS format - Most users, however, will prefer to work with spectrogram (frequency-time) and image data, which are also outputs of the pipeline system shown in Figure 1 (orange boxes). Spectrograms are provided as standard FITS tables containing the frequency list, list of times, and data in both total power and a sum of amplitudes over intermediate-length baselines (cross power). Likewise, image data products are in FITS format with standard keywords and are converted into the Helioprojective Cartesian coordinate system compatible with the World Coordinate System (WCS) convention, along with correct registration for the spatial, spectral, and temporal coordinates. Both the spectrogram and image data products are calibrated and have physical radio intensity units (sfu for spectrograms and brightness temperature for radio images).
2. Browsing and Obtaining EOVSA data
EOVSA Browser
The most convenient way for browsing **Level 2** EOVSA data is through the EOVSA Browser. The overview EOVSA dynamic spectra on the top-left panel are from the median of the uncalibrated cross-power visibilities at a few short baselines, which are not (but a good proxy of) the total-power dynamic spectra. The effects of spatial information blended in the cross-power visibilities can be clearly seen as the "U"-shaped features throughout the day, which are due to the movement of the Sun across the sky that effectively changes the length and orientation of the baselines. Flare emission can be seen in the dynamic spectra, which usually appears as vertical bright features across many frequency bands. The pipeline quicklook full-disk images are also displayed for the given frequencies along with SDO AIA full-disk maps. More information can be found on this page. The figure on the right shows an example of the overview EOVSA dynamic spectrum for 2021 May 21.
RHESSI Browser
EOVSA data can also be browsed through RHESSI Browser. Check the "EOVSA Radio Data" box on the data selection area (top-left corner). Then select year/month/date to view the overall EOVSA dynamic spectrum. Note if a time is selected at early UTC hours (e.g., 0-3 UT), it will show the EOVSA dynamic spectrum from the previous day. Also, note that EOVSA data were not commissioned for spectroscopic imaging prior to April 2017.
Once you identify the flare time, you can find the full-resolution (1-s cadence) uncalibrated visibility files (in Miriad format) at this link. Each data file is usually 10 minutes in duration. The name convention is YYYYMMDD (folder name) /IDBYYYYMMDDHHMMSS (file name), where the time in the file name indicates the start time of the visibility data.
Other useful links
- EOVSA User Forum
- EOVSA Flare list
- EOVSA browser
- EOVSA wiki
- EOVSA 1-min averaged CASA ms database
- SunCASA on Github and PyPI
- Flare pipeline?
3. Software requirements and installation
EOVSA visibility data processing: SunCASA A Python wrapper around CASA for synthesis imaging and visualizing solar spectral imaging data. CASA is one of the leading software tools for "supporting the data post-processing needs of the next generation of radio astronomical telescopes such as ALMA and VLA", an international effort led by the National Radio Astronomy Observatory. The current version of CASA uses Python (3.6) interface. More information about CASA can be found on NRAO's CASA website. NOTE: CASA is available ONLY on UNIX-BASED PLATFORMS, and therefore, so is SunCASA.
EOVSA image data processing: GSFIT A IDL-widget(GUI)-based spectral fitting package called gsfit, which provides a user-friendly display of EOVSA image cubes and an interface to fast fitting codes (via platform-dependent shared-object libraries). For this tutorial, we will demonstrate using SunCASA to create EOVSA image and spectrogram from the visibility data observed during a X-class flare on 2022 March 30. There are two approaches in accessing the SunCASA package:
[convenient] Through this notebook, together with Google Colaboratory (colab) which hosts this notebook on free virtual environment that requires no setup and runs entirely in the cloud. If you are into this option, skip the following Installation section and go directly to the Tour Section.
4. Download Sample EOVSA Data for the Tutorial
5. Information on the EOVSA Visibility Data (CASA Measurement Set)
With the pip installation, the CASA modules may be used in a standard Pythonic manner. For example, CASA tasks can be invoked using import, while CASA tools are Python classes that first need to be instantiated to create usable objects. A useful first task to run is listobs, which provides a summary of the measurement set.
6. Self-calibration (optional)
Self-calibration is often needed in bringing out details of the flare at multiple frequencies. An example script, run in SunCASA, for doing self-calibration of the 2017 Aug 21 flare at ~20:20 UT can be found at this tutorial.
7. Cross-Power Dynamic Spectrum
The first module we introduce is dspec. This module allows you to generate a cross-power dynamic spectrum from an MS file, and visualize it. You can select a subset of data by specifying a time range, spectral windows/channels, antenna baseline, or uvrange. The selection syntax follows the CASA convention. More information of CASA selection syntax may be found in the above links or the Measurement Selection Syntax.
from suncasa.utils import dspec as ds import matplotlib.pyplot as plt ## define the visbility data file msfile = 'IDB20220330_concat_slfcaled__backsub.ms' ## define the output filename of the dynamic spectrum specfile = msfile + '.dspec.npz' ## select relatively short baselines within a length (here I use 0.15~0.5km), ## and take a median cross all of them (with the domedian parameter) ## alternatively, you can use the "bl" parameter to select individual baseline(s) uvrange = '0.15~0.5km' ## this step generates a dynamic spectrum and saves it to specfile dspec=ds.get_dspec(vis=msfile, specfile=specfile, uvrange=uvrange, domedian=True) ## dspec is a Python dictionary that contains the resulting dynamic spectrum. ## A copy is saved in "specfile" as a numpy npz file. ## Other optional parameters are available for more selection criteria ## such as frequency range ("spw"), and time range ("timeran") ## Use "ds.get_dspec?" to see more options dspec['spec'].shape ## (1, 1, 451, 1798) One polarization, 451 frequencies and 1798 times dspec.keys() ## ['tim', 'pol', 'uvrange', 'bl', 'timeran', 'freq', 'spec', 'spw'] ## The following command displays the resulting cross-power dynamic spectrum ## Use ds.plt_dspec? to check more plotting options ds.plt_dspec(specfile) #alternatively the next step can also be used ds.plt_dspec(dspec)
Now you should have a popup window showing the dynamic spectrum. Hover your mouse over the dynamic spectrum, you can read the time and frequency information at the bottom of the window.
8. Quick-Look Imaging
Imaging EOVSA data involves image cleaning, as well as solar coordinate transformation and image registration. We bundled a number of these steps ino a module named qlookplot, allowing users to generate an observing summary plot showing the cross power dynamic spectrum, GOES light curves and EOVSA quick-look images. Now let us start with making a summary plot of EOVSA image at the spectral window 5 (5.4 GHz).
## in SunCASA from suncasa.utils import qlookplot as ql msfile = 'IDB20170821201800-202300.4s.slfcaled.ms' ## (Optional) Supply the npz file of the dynamic spectrum from previous step. ## If not provided, the program will generate a new one from the visibility. specfile = msfile + '.dspec.npz' ## set the time interval timerange = '20:21:10~20:21:18' ## select frequency range from 5 GHz to 6 GHz spw = '5~6' ## select stokes XX stokes = 'RR' ## turn off AIA image plotting, default is True plotaia = False ql.qlookplot(vis=msfile, specfile=specfile, timerange=timerange, spw=spw, \ stokes=stokes, plotaia=plotaia)
The empty panels are there as we only selected one polarization for imaging/spectroscopy. Feel free to explore by adjusting the above parameters, e.g. use a different time range ("timerange" parameter) and/or frequency range ("spw" parameter).
With qlookplot, it is easy to engage solar data from SDO/AIA in the summary plot. Setting plotaia=True in the qlookplot command will download SDO/AIA data at the given time to current directory and add it to the summary plot.
## in SunCASA ql.qlookplot(vis=msfile, specfile=specfile, timerange=timerange, spw=spw, \ stokes=stokes, plotaia=True)
The resulting radio image is a 4-D datacube (in solar X-pos, Y-pos, frequency, and polarization), which is, by default, saved as a fits file msfile + '.outim.image.fits' under your working directory. The name of the output fits file can be specified using the "outfits" parameter. In this example, we combine all selected frequencies (specified in keyword "spw") into one image (i.e., multi-frequency synthesis). Therefore the third axis only has one plane. The fourth axis contains polarization. In this example, however, only the first one is populated ("XX"). Here is the relevant information in the header of the resulting FITS file.
NAXIS = 4 NAXIS1 = 512/ Nx NAXIS2 = 512/ Ny NAXIS3 = 1/ number of frequency NAXIS4 = 2/ number of polarization
By default, qlookplot produces a full sun radio image (512x512 with a pixel size of 5"). If you know where the radio source is (e.g., from the previous full-Sun imaging), you can make a partial solar image around the source by specifying the image center ("xycen"), pixel scale ("cell"), and image field of view ("fov"). Here we show an example that images a 8-s interval around 20:21:14 UT using multi-frequency synthesis in 12-14 GHz and a smaller restoring beam. The microwave source is show to bifurcate into two components, which correspond pretty well with the double flare ribbons in SDO/AIA.
## in SunCASA xycen = [375, 45] ## image center for clean in solar X-Y in arcsec cell=['2.0arcsec'] ## pixel scale imsize=[128] ## number of pixels in X and Y. If only one value is provided, NX = NY fov = [100,100] ## field of view of the zoomed-in panels in unit of arcsec ql.qlookplot(vis=msfile, specfile=specfile, timerange='20:21:10~20:21:18', \ spw='12GHz~14GHz', stokes='XX', restoringbeam=['6arcsec'],\ imsize=imsize,cell=cell,xycen=xycen,fov=fov,calpha=1.0)
Next, we will make images for every single spectral window in this data set (from spw 1 to 30, spw 0 is in the visibility data, but is not calibrated for this event).
## in SunCASA xycen = [375, 45] ## image center for clean in solar X-Y in arcsec cell=['2.0arcsec'] ## pixel size imsize=[128] ## x and y image size in pixels. fov = [100,100] ## field of view of the zoomed-in panels in unit of arcsec spw = ['{}'.format(s) for s in range(1,31)] clevels = [0.5, 1.0] ## contour levels to fill in between. calpha=0.35 ## now tune down the alpha restoringbeam=['6arcsec'] ql.qlookplot(vis=msfile, specfile=specfile, timerange=timerange, spw=spw, stokes=stokes, \ restoringbeam=restoringbeam,imsize=imsize,cell=cell, \ xycen=xycen,fov=fov,clevels=clevels,calpha=calpha)
The output FITS file as a 4-d cube is saved to msfile + '.outim.image.fits' under your working directory. The 30 spectral windows used for spectral imaging are combined in the output FITS file. So NAXIS3 (frequency axis) is 30.
NAXIS = 4 / number of array dimensions NAXIS1 = 128 NAXIS2 = 128 NAXIS3 = 30 NAXIS4 = 2
Quick-Look Imaging series
# In SunCASA msfile = 'IDB20170821201800-202300.4s.slfcaled.ms' specfile = msfile + '.dspec.npz' ## set the time interval timerange = '2017/08/21/20:21:10~2017/08/21/20:21:18' ## Bin width for time averaging twidth = 1 ## frequency range spw = ['0~3'] ## image center for clean in solar X-Y in arcsec xycen = [375, 45] ## number of pixels in X and Y. If only one value is provided, NX = NY imsize = [128] ## field of view of the zoomed-in panels in unit of arcsec fov = [100., 100.] ## pixel scale cell = ['2.0arcsec'] ## select stokes XX stokes = 'XX' ## for EOVSA data, set usemsphacenter to False usemsphacenter = False ## set True if make a movie mkmovie = True ## set True if generate compressed fits docompress = True #### ---- Control knobs for AIA plotting ---- #### ## set True if plot AIA images as the background plotaia = True ## provide the path to the directory where the AIA fits files are located. Otherwise, set it to be '' aiadir = 'Directory-where-AIA-fits-files-are-located' ## AIA passband. The options are [171,131,304,335,211,193,94,1600,1700] aiawave = 171 ## numbers of CPU threads for computing ncpu = 2 ql.qlookplot(vis=msfile, specfile=specfile, timerange=timerange, spw=spw, xycen=xycen, imsize=imsize, fov=fov, cell=cell, usemsphacenter=usemsphacenter, plotaia=plotaia, aiadir=aiadir, aiawave=aiawave, mkmovie=mkmovie, twidth=twidth, ncpu=ncpu, docompress=docompress, stokes=stokes)
9. Downloading fits files to your local system with Batch-Mode Imaging
This section is for interested users who wish to generate FITS files with full control on all parameters being used for synthesis imaging. We provide one example SunCASA script for generating 30-band spectral imaging maps, and another for iterating over time to produce a time series of these maps.
Producing a 30-band map cube for a given time
An example script can be found at this Github link. If you are on the AWS server Virgo, it is under /common/data/eovsa_tutorial/imaging_example.py. First, download or copy the script to your own working directory and cd to your directory.
# in SunCASA CASA <##>: cd your_working_directory CASA <##>: !cp /common/data/eovsa_tutorial/imaging_example.py ./
Second (optional), change inputs in the following block in your copy of the "imaging_example.py" script and save the changes. This block has definitions for time range, image center and FOV, antennas used, cell size, number of pixels, etc.
################### USER INPUT GOES IN THIS BLOK ################ vis = 'IDB20170821201800-202300.4s.slfcaled.ms' # input visibility data trange = '2017/08/21/20:21:00~2017/08/21/20:21:30' # time range for imaging xycen = [380., 50.] # center of the output map (in solar X and Y, Unit: arcsec) xran = [340., 420.] # plot range in solar X. Unit: arcsec yran = [10., 90.] # plot range in solar Y. Unit: arcsec antennas = '' # Use all 13 EOVSA 2-m antennas by default npix = 256 # number of pixels in the image cell = '1arcsec' # pixel scale in arcsec pol = 'XX' # polarization to image, use XX for now pbcor = True # correct for primary beam response? outdir = './images' # Specify where to save the output fits files outimgpre = 'EO' # Something to add to the output image name
Then, run the script in SunCASA via
CASA <##>: execfile('imaging_example.py')
The output is a combined 30-band FITS file saved under "outdir". The naming conversion is outimgpre + YYYYMMDDTHHMMSS_ALLBD.fits. In this example, it is "./images/EO20170821T202115.000_ALLBD.fits". Here is the detailed information of the axes of the output FITS file.
NAXIS = 4 / number of array dimensions NAXIS1 = 128 NAXIS2 = 128 NAXIS3 = 30 NAXIS4 = 2
From this point, you can use your favorite language to read the fits files and plot them. The last block of the example script uses SunPy.map to generate the plot shown on the right. For SSWIDL users, please refer to Section 3.3.3.3.
Producing a time series of 30-band maps (a 4-d cube)
An example script can be found at this Github link. If you are on the AWS server Virgo, it is under /common/data/eovsa_tutorial/imaging_timeseries_example.py. The script enables parallel-processing (based on a customized task "ptclean3" -- do not ask me why we have "3" in the task name). To invoke more CPU processes for parallel-processing, change "nthreads" in the preambles from 1 to the number of threads you wish to use.
# in SunCASA CASA <##>: cd your_working_directory CASA <##>: !cp /common/data/eovsa_tutorial/imaging_timeseries_example.py ./
Similar as the previous script, change inputs in the following block of your copy of the "imaging_timeseries_example.py" script and save the changes.
################### USER INPUT GOES IN THIS BLOK #################### vis = 'IDB20170821201800-202300.4s.slfcaled.ms' # input visibility data specfile = vis + '.dspec.npz' ## input dynamic spectrum nthreads = 1 # Number of processing threads to use overwrite = True # whether to overwrite the existed fits files. trange = '' # time range for imaging, default to all times in the data twidth = 1 # make one image out of every 2 time integrations xycen = [380., 50.] # center of the output map in solar X and Y. Unit: arcsec xran = [340., 420.] # plot range in solar X. Unit: arcsec yran = [10., 90.] # plot range in solar Y. Unit: arcsec antennas = '' # default is to use all 13 EOVSA 2-m antennas. npix = 128 # number of pixels in the image cell = '2arcsec' # pixel scale in arcsec pol = 'XX' # polarization to image, use XX for now pbcor = True # correct for primary beam response? grid_spacing = 5. * u.deg # grid spacing in degrees outdir = './image_series/' # Specify where to save the output fits files imresfile = 'imres.npz' # File to write summary of the imaging results outimgpre = 'EO' # Something to add to the image name
Then, run the script in SunCASA by
CASA <##>: execfile('imaging_timeseries_example.py')
The output fits images and a summary file imresfile are saved under "outdir" (in this example "./image_timeseries"). The naming convention of output fits images is the same as the previous section. The summary yields the time, frequency, and path to every image.
CASA <##>: imres = np.load(outdir + imresfile)['imres'].item() CASA <##>: imres.keys() ['FreqGHz', 'ImageName', 'Succeeded', 'BeginTime', 'EndTime'] CASA <##>: ls image_series/*.fits | head -5 image_series/EO20170821T201800.500_ALLBD.fits image_series/EO20170821T201804.500_ALLBD.fits image_series/EO20170821T201808.500_ALLBD.fits image_series/EO20170821T201812.500_ALLBD.fits image_series/EO20170821T201816.500_ALLBD.fits
The last block of the example script uses SunPy.map to generate a series of plots and wraps them as a javascript movie.
Plotting Images in SSWIDL
All the output files are in standard FITS format in the Helioprojective Cartesian coordinate system (that most spacecraft solar image data adopt; FITS header CTYPE is HPLN-TAN and HPLT-TAN). They are fully compatible with the SSWIDL map suite which deals with FITS files. We have prepared SSWIDL routines to convert the single time or time-series FITS files to an array of SSWIDL map structure. The scripts are available in the Github repository of our tutorial. For those working on Virgo, local copies are placed under /common/data/eovsa_tutorial/. Two scripts are relevant to this tutorial:
- casa_readfits.pro: read an array of multi-frequency FITS files into FITS header (index) and data.
- casa_fits2map.pro: convert an array of multi-frequency FITS files into an array of SSWIDL map structure (adapted from P. T. Gallagher's hsi_fits2map.pro)
First, start SSWIDL from command line:
sswidl
Find and convert the fits files:
; cd to your path that contains casa_readfits.pro and casa_fits2map.pro. If on Virgo, just add the path IDL> add_path, '/common/data/eovsa_tutorial/' IDL> fitsdir = '/common/data/eovsa_tutorial/image_series/' IDL> fitsfiles = file_search(fitsdir+'*_ALLBD.fits') ; The resulting fitfiles contains all multi-frequency FITS files under "fitsdir" ; The following command converts the fits files to an array of map structures. ; "casa_fits2map.pro" also work well on a single FITS file. ; (Optional) keyword "calcrms" is to calculate rms and dynamic range (SNR) of the maps in a user-defined "empty" region ; of the map specified by "rmsxran" and "rmsyran" ;;; Here we load 10 time frames as a demonstration IDL> casa_fits2map, fitsfiles[45:54], eomaps [, /calcrms, rmsxran=[260., 320.], rmsyran=[-70., 0.]]
The resulting maps have a shape of [# of frequencies, # of polarizations, # of times]
IDL> help,eomaps EOMAPS STRUCT = -> <Anonymous> Array[30, 1, 10]
They have compatible keywords recognized by, e.g., plot_map.pro. Example of the output map keywords:
IDL> help,eomaps,/str ** Structure <36009f8>, 24 tags, length=131336, data length=131332, refs=1: DATA DOUBLE Array[128, 128] XC DOUBLE 378.99175 YC DOUBLE 49.001472 DX DOUBLE 2.0000000 DY DOUBLE 2.0000000 TIME STRING '21-Aug-2017 20:20:59.500' ID STRING 'EOVSA XX 3.413GHz' DUR DOUBLE 3.9999997 XUNITS STRING 'arcsec' YUNITS STRING 'arcsec' ROLL_ANGLE DOUBLE 0.0000000 ROLL_CENTER DOUBLE Array[2] FREQ DOUBLE 3.4125694 FREQUNIT STRING 'GHz' STOKES STRING 'XX' DATAUNIT STRING 'K' DATATYPE STRING 'Brightness Temperature' BMAJ DOUBLE 0.0097377778 BMIN DOUBLE 0.0097377778 BPA DOUBLE 0.0000000 RSUN DOUBLE 948.03989 L0 FLOAT 0.00000 B0 DOUBLE 6.9299225 COMMENT STRING 'Converted by CASA_FITS2MAP.PRO'
An example for plotting all images at a selected time:
; choose a time pixel IDL> plttime = anytim('2017-08-21T20:21:15') IDL> dt = min(abs(anytim(eomaps[0,0,*].time)-plttime),tidx) ; use maps at this time, first polarization, and all bands IDL> eomap = reform(eomaps[*,0,tidx]) IDL> window,0,xs=1000,ys=800 IDL> !p.multi=[0,6,5] IDL> loadct, 3 IDL> for i=0, 29 do plot_map,eomap[i],grid=5.,$ tit=string(eomap[i].freq,format='(f5.2)')+' GHz', $ chars=2.0,xran=[340.,420.],yran=[10.,90.]