Archival of Mt. Wilson 60-Foot Solar Tower Data at the MDI Science Center

SOI TN 03-146
P. Rose & R.S. Bogart
2003.10.20

Introduction

This document supersedes SOI TN 98-139, which described our initial 1998 plans to archive MWO data.

This document provides a design and interface specification for the archiving and processing of full-disc, high-resolution (1024x1024 pixel array) and low-resolution (512x512 pixel array) solar Doppler data from the Magneto-Optical Filter (MOF) on the Mt. Wilson 60-ft Solar Tower. The aim of the project is to provide an ongoing, publicly accessible archive of original and processed data from the instrument, ultimately covering the full observing series from 1987 to the present.

Data Product Descriptions

Level-0 data

The Level-0 filtergrams are obtained with an automated data acquisition system using the Magneto-Optical Filter (MOF). The MOF selects either a right-circularly polarized component of a very narrow portion of the wavelength being sampled, or a left-circularly polarized component shifted to a slightly lower wavelength. A pair of solar images is taken every minute in these Doppler-shifted wavelengths, producing a red-shifted component and a blue-shifted component of the primary wavelength of the cells being used, in this case either potassium or sodium.


RED-filtergram image taken Sep 25,2000	BLUE-filtergram image taken Sep 25,2000

Observation histories for all 17 years of the Level-0 filtergrams are divided into two categories: the pre-MDI era and the MDI era. The MDI instrument on the SOHO spacecraft became operational in early 1996, so the histories cover the year ranges 1987 through 1995 and 1996 to the present.

Pre-MDI-era Observation History covering the period 1987-1995

MDI-era Observation History covering the period 1996-present

Level-1 data

Level-1 Dopplergrams are computed from each pair of raw filtergrams on a per-minute basis. Each Dopplergram is the difference of the filtergrams in the sense red - blue, divided by the sum of the filtergrams to normalize the image. Likewise, a Total-Intensity image is simply the sum of the red and blue filtergrams. The purpose of the 60-Foot Solar Tower Project is to collect as many raw filtergrams as possible for the computation of Level-1 Dopplergrams, using the 2-cell MOF. Unfortunately, two working filtercells were not always available because of their finite lifetime, and a limited number of single-cell images were taken. By itself, a single cell has the effect of producing total-intensity images. This document is mainly concerned with the computation, conversion, and archiving of Level-1 Dopplergrams; the computation of total-intensity images is not planned at this time.


Dopplergram computed from above two filtergrams (red-blue)/(red+blue)	Total Intensity image computed from above two filtergrams; (red+blue)
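The per-pixel arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not the processing code used at USC: the function name dopplergram, the tiny 2x2 test images, and the choice to set pixels with zero total signal to 0.0 are all assumptions made for the example.

```python
def dopplergram(red, blue):
    """Return (doppler, intensity) images from a red/blue filtergram pair.

    doppler   = (red - blue) / (red + blue), pixel by pixel
    intensity = red + blue, pixel by pixel
    """
    doppler, intensity = [], []
    for red_row, blue_row in zip(red, blue):
        dop_row, int_row = [], []
        for r, b in zip(red_row, blue_row):
            total = r + b
            int_row.append(total)
            # Assumption: a pixel with no signal in either filtergram is
            # set to 0.0 rather than raising a division error.
            dop_row.append((r - b) / total if total else 0.0)
        doppler.append(dop_row)
        intensity.append(int_row)
    return doppler, intensity


# Hypothetical 2x2 filtergrams, for illustration only.
red = [[120, 100], [0, 90]]
blue = [[80, 100], [0, 110]]
doppler, intensity = dopplergram(red, blue)
print(doppler)
print(intensity)
```

A positive value marks a pixel where the red filtergram is brighter than the blue one, following the red - blue sense stated above.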

The Level-1 Dopplergrams consist of FITS files organized in directories containing all images observed within a given day. These data sets will be organized under dataset names prog:mwo,level:lev1,series:vel_01d[day-number], with an epoch for the day number started early enough to assure non-negative values in the archive; i.e. 1987.01.01_00:00:00_TAI. Individual files will be named uyymmdd.hhmm.fits, where yymmdd is the 6-digit year, month and day of the data set, and hhmm is the 4-digit hour and minute, corresponding to the nominal observation time.
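As an illustration of the naming convention above, the following Python sketch builds a dataset name and a file name from an observation time. It rests on two assumptions that are not stated in the text: that the epoch day (1987.01.01) is counted as day 0, and that the time passed in is already the nominal observation time in whatever time zone the archive uses for file names. The function names are hypothetical.

```python
from datetime import date, datetime

EPOCH = date(1987, 1, 1)  # epoch day given in the text


def day_number(obs_date):
    """Days elapsed since the epoch (assumption: the epoch day is day 0)."""
    return (obs_date - EPOCH).days


def dataset_name(obs_date):
    """Dataset name of the form prog:mwo,level:lev1,series:vel_01d[day-number]."""
    return f"prog:mwo,level:lev1,series:vel_01d[{day_number(obs_date)}]"


def file_name(obs_time):
    """File name of the form uyymmdd.hhmm.fits for a nominal observation time."""
    return obs_time.strftime("u%y%m%d.%H%M.fits")


# Hypothetical observation time, for illustration only.
obs_time = datetime(2000, 9, 25, 17, 30)
print(dataset_name(obs_time.date()))
print(file_name(obs_time))
```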

The FITS record attached to each of the images contains the following keywords.

KEYWORD		Definition
SIMPLE		Mandatory. Conversion file type (T)
BITPIX		Bits assigned to each pixel. Defined as 16 bits/pixel
NAXIS		Mandatory. Defined as the number of axes in images (2)
NAXIS1		Mandatory. Defined as the number of pixels in x direction
NAXIS2		Mandatory. Defined as the number of pixels in y direction
DATE		Date the image was taken, PST
TIME		Time the image was taken, PST
KTIME		Obs time in PST, 100ths of seconds that elapsed since Jan 1, 00PST. This is the start time of the Level-0 red-filtergram.
K_MIN		Floating point minute since 00PST of the current day for the image
STATUS		Observation Status, 100ths of seconds elapsed since Jan 1, 00PST. This is the start time of the Level-0 blue-filtergram.
JULIANDA		The Julian Date for the current image
DATASIGN		Sign Convention. Set to -1 for velocity data, which signifies that motion is in the redshift direction
INTERVAL		Image integration time, in seconds (typically, 5.575). Two raw filtergram images (one red-shifted and one blue-shifted) are taken to compute each Dopplergram. The time separation of the start of these two images is 5 seconds. The integration time for each image is roughly 0.575 seconds. The INTERVAL is computed as the time separation of the original two filtergrams plus the integration time of the second image.
IM_SCALE		Nominal pixel scale in arc-secs/pixel, defined as OBS_R0 / (0.5*(FNDLMBMA+FNDLMBMI)). Note that FNDLMBMA and FNDLMBMI are the mean values of the semi-major and semi-minor axes for the day.
MA_SCALE		pixel scale along major axis for given minute, defined as OBS_R0 / ELLMAJOR
MI_SCALE		pixel scale along minor axis for given minute, defined as OBS_R0 / ELLMINOR
FNDLMBMA		Daily Average of major axis measured from Sun's center (pixels)
FNDLMBMI		Daily Average of minor axis measured from Sun's center (pixels)
FNDLMBAN		Daily Average Angle measured CW from the +Y axis to major axis
ELLMAJOR		Major axis measured from Sun's center (pixels) for given minute
ELLMINOR		Minor axis measured from Sun's center (pixels) for given minute
ELLANGLE		Ellipse Angle for given minute, measured CW from the positive Y axis to major axis
AVG_AREA		Average daily area of the solar disk (pixels^2); FNDLMBMA*FNDLMBMI*pi
AREA		Per-image area of the solar disk (pixels^2); ELLMAJOR*ELLMINOR*pi
R_SUN		Semi-major axis length of the apparent solar image (daily mean)
S_MAJOR		Scale for semi-major axis of non-circular images, for given minute
S_MINOR		Scale for semi-minor axis of non-circular images, for given minute
S_ANGLE		Position angle of S_MAJOR axis
ORIENT		The orientation of the Sun's image, indicating the "side" of the Sun that corresponds to the origin and x-axis used for the pointing and scale parameters. Typically, SESW. Because of different cameras and changes in optical configuration throughout the course of this project, the orientation will change.
X_SCALE		CCD pixels per bin. X{Y}_SCALE are the size of image elements in units of CCD pixels where the pixels are each of angular size, IM_SCALE.
Y_SCALE		See X_SCALE above
XCEN		Location of the x center of the image array on the sun measured in arc-seconds from the center of the Sun's disk. Images are registered to the center of the chip array during computation of the Dopplergram.
YCEN		Location of the y center of the image array on the sun measured in arc-seconds from the center of the Sun's disk. Images are registered to the center of the chip array during computation of the Dopplergram
X0		Distance along X axis to the central pixel of the solar disk (pixels). MDI starts counting at '0', so disk center is at 511.0
Y0		Distance along Y axis to the central pixel of the solar disk (pixels). MDI starts counting at '0', so disk center is at 511.0
BUNIT		Physical Units of the data, in "meters/second"
BZERO		Offset applied to true pixel values
BSCALE		Scaling factor to convert integer type pixel elements of the image array to floating point type
CALSLOPE		velocity calibration slope; constant for day. BSCALE/AREA
INTEGDOP		Integrated doppler signal. The sum of all pixel values within the ellipse of the solar disk defined for that minute.
VEL_DOP		Integrated doppler velocity; VEL_DOP = CALSLOPE*INTEGDOP + BZERO
VEL_RES		Velocity residuals; VEL_DOP - OBS_VR
VEL_RMS		Root Mean Square of daily velocity residuals
LAT		Location of the observatory in degrees
LON		Location of the observatory in degrees
ORIGIN		Observatory where the telescope is located. (The Mount Wilson Observatory)
TELESCOP		Name of the telescope used to take the images (60-Foot Solar Tower)
INSTRUME		Type of instrument used; MOF (Magneto-Optical-Filter)
DATE_OBS		Date and time of observation in UT.
SOLAR_P0		Solar Position-angle computed using 'solephem' software program, measured CCW (in degrees) from the +y-axis of the CCD array.
SOLAR_P		Effective Solar Position-angle, measured CCW from +y-axis to N-pole. Angle is in degrees, and points to N-pole, taking into account the orientation of the coelostat mirrors. During the date range from roughly August 19 to April 30 of every year, the SOLAR_P angle makes a noticeable jump of several degrees around 12PST, due to the change in coelostat mirror orientation. This mirror change is necessary as the Sun drops in declination, to avoid a shadow from the second flat pier.
DATAFILE		Name of the FITS image file.
T_OBS		Actual (center of) observation time. Computed as INTERVAL/2 + "start time of first raw Level-0 filtergram"
MININDEX		Number of whole minutes elapsed since 00PST. This value is the integer value of K_MIN
OBS_B0		Heliographic Latitude of the observer's disk center. Computed using 'solephem'
OBS_L0		Heliographic Longitude of the observer's disk center. Computed from 'solephem'
OBS_R0		Apparent semi-diameter of Sun in arc seconds. Computed from 'solephem'
OBS_DIST		Sun-Earth Distance (AU). Computed from 'solephem'
XSCALE		XSCALE = X_SCALE * IM_SCALE
YSCALE		YSCALE = Y_SCALE * IM_SCALE
OBS_VR		Radial velocity of observer in m/s. Positive direction is toward the Sun. This value is the sum of the geocentric + barycentric velocities relative to the observer.
OBS_VRHC		Heliocentric radial velocity in m/s. Positive direction is toward the Sun. This value is the sum of the geocentric + heliocentric velocities relative to the observer.
QUALITY		Status summary. Scale runs from 0 to 10. This value is determined by 'sigma = int(VEL_RES/VEL_RMS)'. If 'sigma' is 3 standard deviations or less, QUALITY = 0 and the image is clean and sharp. If 3sigma < VEL_RES <= 4sigma, QUALITY = 3; if 4sigma < VEL_RES <= 5sigma, QUALITY = 4; if 5sigma < VEL_RES <= 6sigma, QUALITY = 5; and so on. If VEL_RES > 10sigma, QUALITY = 10. Values of 4 and 5 are often still acceptable for spherical harmonic decomposition, but this can vary from day to day and should not be assumed.
END
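Several of the keywords above are defined by simple formulas in terms of other keywords. The Python sketch below collects those formulas in one place as a reading aid. It is not the conversion software itself: the function names and input values are hypothetical, the QUALITY rule is implemented from the stated ranges, and the use of the magnitude of VEL_RES is an assumption, since the table does not say how negative residuals are treated.

```python
import math


def image_scales(obs_r0, fndlmbma, fndlmbmi, ellmajor, ellminor):
    """IM_SCALE, MA_SCALE and MI_SCALE in arc-secs/pixel."""
    return {
        "IM_SCALE": obs_r0 / (0.5 * (fndlmbma + fndlmbmi)),
        "MA_SCALE": obs_r0 / ellmajor,
        "MI_SCALE": obs_r0 / ellminor,
    }


def disk_area(semi_major, semi_minor):
    """AREA (or AVG_AREA with daily means): area of the ellipse in pixels^2."""
    return semi_major * semi_minor * math.pi


def velocity_keywords(integdop, bscale, area, bzero, obs_vr):
    """CALSLOPE, VEL_DOP and VEL_RES as defined in the keyword table."""
    calslope = bscale / area
    vel_dop = calslope * integdop + bzero
    return {"CALSLOPE": calslope, "VEL_DOP": vel_dop, "VEL_RES": vel_dop - obs_vr}


def quality(vel_res, vel_rms):
    """QUALITY on the 0 to 10 scale, following the stated sigma ranges."""
    # Assumption: the magnitude of the residual is what is compared.
    ratio = abs(vel_res) / vel_rms
    if ratio <= 3:
        return 0
    if ratio > 10:
        return 10
    # 3 < ratio <= 4 gives 3, 4 < ratio <= 5 gives 4, and so on.
    return math.ceil(ratio) - 1


# Hypothetical input values, for illustration only.
print(image_scales(960.0, 482.0, 478.0, 481.0, 479.0))
print(disk_area(481.0, 479.0))
print([quality(res, 10.0) for res in (25.0, 35.0, -55.0, 120.0)])
```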

The following keywords will be included in FITS records for all years except 1996:

KEYWORD		Definition
CADENCE		interval between measurements of computed Dopplergrams (60 secs)
CENTER_X		Center of solar disk along x-axis assuming origin starts at a count of 1; defined as X0 + 1
CENTER_Y		Center of solar disk along y-axis assuming origin starts at a count of 1; defined as Y0 + 1
OBS_TYPE		Cell operation status. Two types are possible: 'DOPPL ' when observing in MOF mode (two cells), and 'INTEN ' when observing in INT mode (one cell).
OBS_VN		northward velocity of Sun with respect to earth
OBS_VW		westward velocity of Sun
FILTER		filter selection status; sodium (Na) or potassium (K)
CAMERA		Camera selection status. A total of four different cameras have been used in the course of this project, 3 hi-res and 1 low-res. 'JPL ': 1024x1024 native resolution, designed and built at JPL; in use from 1987 through 1991. 'JPL-TALK ': 1024x1024 native resolution with board set designed by Doug Triman at Talktronics Inc. and assembled at JPL; in use from mid-1993 to present, with a break in early 1994 for testing. 'TALKTRONICS': 1024x1024 native resolution, built by Talktronics Inc.; secondary camera used in late 2002 for testing purposes. 'PANASONIC ': 492x512 native resolution with rows padded to a 512x512 array, 30Hz analog acquisition; in use from 1992 through 1994 as a temporary replacement for the "JPL" camera.

FITS conversion notes

The FITS conversion process is verified for correctness by generating data tables of the keyword values for each minute of the day. Plots are generated from these tables and archived at USC, but they are not being made available at Stanford or on the USC 60-Ft Solar Tower web site. Plots can be requested from USC by email at erhodes@usc.edu. Samples of the available plots are linked below:

Ephemeris data: including P-angle, B-angle, and Carrington Longitude
Heliographic coordinates: including major-axis, minor-axis, ellipse angle, etc.
Image Scaling: including major, minor, and average image scales, and time checks
Velocity and Calibration: including doppler signal, scaled velocity, velocity residuals, and quality of the images
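The per-minute keyword tables used for verification could be generated along the lines of the Python sketch below. It is a minimal illustration and not the tool used at USC. It relies only on the standard FITS header layout (80-character cards, keyword in columns 1-8, "= " in columns 9-10, header terminated by END); it does not handle doubled quotes inside strings or continued values, and it will raise an error on a string value with no closing quote. The helper names and the synthetic header values are hypothetical.

```python
CARD = 80  # each FITS header card is 80 characters long


def _convert(text):
    """Convert an unquoted FITS value to bool, int, or float where possible."""
    if text in ("T", "F"):
        return text == "T"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def parse_header(raw):
    """Parse a string of 80-character cards into a dict, stopping at END."""
    header = {}
    for start in range(0, len(raw), CARD):
        card = raw[start:start + CARD]
        key = card[:8].strip()
        if key == "END":
            break
        if card[8:10] != "= ":
            continue  # comment, history, or blank card
        field = card[10:].strip()
        if field.startswith("'"):
            # Quoted string: keep the text up to the closing quote.
            value = field[1:field.index("'", 1)].rstrip()
        else:
            # Drop any trailing "/ comment" before converting.
            value = _convert(field.split("/", 1)[0].strip())
        header[key] = value
    return header


def keyword_table(headers, keywords):
    """One row per image (minute), one column per requested keyword."""
    return [[header.get(key) for key in keywords] for header in headers]


def make_card(key, value):
    """Build one synthetic card, for demonstration only."""
    return f"{key:<8}= {value:>20}".ljust(CARD)


def make_header(minindex, vel_res, quality):
    """Build a short synthetic header string, for demonstration only."""
    cards = [
        make_card("SIMPLE", "T"),
        make_card("MININDEX", minindex),
        make_card("VEL_RES", vel_res),
        make_card("QUALITY", quality),
        "END".ljust(CARD),
    ]
    return "".join(cards)


headers = [parse_header(make_header(600, 12.5, 0)),
           parse_header(make_header(601, -48.0, 4))]
table = keyword_table(headers, ["MININDEX", "VEL_RES", "QUALITY"])
for row in table:
    print(*row, sep="\t")
```

In practice the header string would be read from the start of each FITS file rather than built in memory; a table like this, with one row per minute, is what the plots described above would be drawn from.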

Data Flow

The overall data flow is summarized in the following steps:

Collection of Level-0 data: raw data has been collected on 8mm Exabyte tapes in original (sunio) format; data collection will continue to be carried out by the Mt. Wilson, 60-Foot Tower staff, with the tapes permanently archived at USC.
Creation of Level-1 data: raw data has been processed into Dopplergrams and stored to either 8mm tape or CD-R in the original sunio format. Future processed Dopplergrams will be stored only on CD-R.
Calibration: the Level-1 Dopplergrams will be re-calibrated to the Sun-Earth radial velocity component at a higher resolution than previously used, which allows for computation of velocity errors and of the quality of each image.
Conversion: the calibration information and additional metadata from the raw-to-Doppler processing step will be used to create a FITS record for each image in the Doppler time series. The procedure was designed at USC and uses tools obtained from Stanford.
Transfer: the Level-1 FITS images are transferred to Stanford and stored on the tape-archive system at the MDI Science Center.
Helioseismology processing: the Level 1 data will be processed through the same pipeline modules in use at Stanford for processing MDI data to such products as spherical-harmonic amplitudes, mode frequencies, ring-diagram and time-distance data sets, and archived at Stanford.

Data flow notes

The processing of Mt. Wilson data from native format takes about 4 hours per tape, each of which contains one day's worth of data. This is primarily due to tape read time; apart from observation, it is the principal limitation on overall data throughput.
Tests conducted on 8 Oct., 1998 suggested that under good conditions the transfer rate from USC to Stanford was on the order of 13.5 files of 2 MB each per minute, or roughly 27 MB per minute. More recent tests conducted on 9 Oct., 2003 revealed a faster transfer rate of about 60 MB per minute. At this rate, about 30 files per minute can be transferred to Stanford, about twice the 1998 rate. It should thus require about 15 to 20 minutes to transfer one day's worth of data. The data will be pulled from USC to Stanford by ftp. A file will be generated upon completion of the transfer of a full day of data. This file will include a total count of images received, with a tabulation of the number of images of correct file length and the number of images of incorrect file length. Images of incorrect file length will be re-sent.
Level-1 Doppler data located on temporary disks at Stanford will be "ingested" into the MDI archive via a cron job developed by Rick Bogart. The data will be immediately accessible through the MDI data archive located at: http://soi.stanford.edu/sssc/progs/mwo
The Level 1 data are expected to be functionally equivalent to MDI Level 1.5 data, i.e. capable of being processed directly by the level 2 pipeline modules; this requires that they have all required ancillary data described under appropriate keywords, and that they be organized in sets of equal time steps, with gaps noted in the descriptor file.
It is our aim and expectation that minimal changes will be required to the post-Level-1 processing modules: the only modifications that should be needed are fixes to bugs that may be uncovered in the processing of Doppler data with problems not encountered or successfully solved previously.
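The transfer check and the transfer-time estimate described in the notes above can be sketched as follows. This is a hedged illustration only: the function names, the file names, and the expected file length of 2108160 bytes are hypothetical, and the real report file format is not specified in this document.

```python
def tally_transfer(received_sizes, expected_size):
    """Summarize one day's transfer.

    received_sizes maps file name -> received length in bytes.
    Files whose length differs from expected_size are listed for re-sending.
    """
    resend = sorted(name for name, size in received_sizes.items()
                    if size != expected_size)
    return {
        "total": len(received_sizes),
        "correct": len(received_sizes) - len(resend),
        "incorrect": len(resend),
        "resend": resend,
    }


def transfer_minutes(volume_mb, rate_mb_per_min):
    """Minutes needed to move volume_mb at a steady rate."""
    return volume_mb / rate_mb_per_min


# Hypothetical expected file length and received files, for illustration only.
EXPECTED_SIZE = 2_108_160
received = {
    "u000925.1730.fits": 2_108_160,
    "u000925.1731.fits": 1_048_576,  # truncated in transit
    "u000925.1732.fits": 2_108_160,
}
report = tally_transfer(received, EXPECTED_SIZE)
print(report)

# One day of Level-1 data is about 1.0 GB (1000 MB) according to the text.
print(round(transfer_minutes(1000, 60), 1), "minutes at the 2003 rate")
print(round(transfer_minutes(1000, 27), 1), "minutes at the 1998 rate")
```

At 60 MB per minute a 1000 MB day takes a little under 17 minutes, which is consistent with the 15 to 20 minute figure quoted above.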

Estimated data volume

The Mt. Wilson observations began in August, 1987 and have been continuing for over 16 years, with an average of 200+ days of observations per year. Daily observations typically last from 7 to 12 hours and produce 60 filtergram pairs per hour. The filtergrams (and processed Dopplergrams) are 16 bits deep. The overall data volume is roughly broken down as:

Level 0: 250 MB / hr (dataset); 2 GB / day; 0.4 TB / yr; 6+ TB to date
Level 1: 125 MB / hr (dataset); 1.0 GB / day; 0.2 TB / yr; 3+ TB to date
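The figures above can be checked with a short calculation, sketched here in Python. The image size, bit depth, cadence, and days per year come from the text; the value of 8 observing hours per day is an assumption chosen from within the stated 7 to 12 hour range, and header overhead is ignored.

```python
BYTES_PER_IMAGE = 1024 * 1024 * 2  # 1024x1024 pixels, 16 bits deep
PAIRS_PER_HOUR = 60                # one filtergram pair per minute
HOURS_PER_DAY = 8                  # assumption: within the 7 to 12 hour range
DAYS_PER_YEAR = 200
YEARS = 16


def volume(images_per_minute):
    """Return (MB per hour, GB per day, TB per year, TB to date)."""
    mb_per_hour = images_per_minute * PAIRS_PER_HOUR * BYTES_PER_IMAGE / 1e6
    gb_per_day = mb_per_hour * HOURS_PER_DAY / 1e3
    tb_per_year = gb_per_day * DAYS_PER_YEAR / 1e3
    return mb_per_hour, gb_per_day, tb_per_year, tb_per_year * YEARS


level0 = volume(2)  # two filtergrams per minute
level1 = volume(1)  # one Dopplergram per minute
for name, (mb_hr, gb_day, tb_yr, tb_total) in (("Level 0", level0),
                                               ("Level 1", level1)):
    print(f"{name}: {mb_hr:.0f} MB/hr; {gb_day:.1f} GB/day; "
          f"{tb_yr:.1f} TB/yr; {tb_total:.1f} TB to date")
```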

More precisely, the Mt. Wilson observations can be broken down into a total of 8 possible data types, from combinations of two cameras (one 1024x1024 hi-res digital camera and one 512x512 low-res, video-rate camera), two possible observing wavelengths (Na at 589.6 nm and K at 769.9 nm), and two cell operations (single-cell (INT), intensity; and 2-cell (MOF), Doppler). The MOF requires the use of two working filtercells for the collection of raw Level-0 data that can be converted to Dopplergrams. A single working filtercell allows for the collection of total-intensity data only. A summary of the camera-wavelength-cell combinations follows, from most common to least common:

1024-Na-MOF: accounts for more than 90% of accumulated data
512-Na-MOF: most common in 1992 and 1993, when the original JPL-built CCD camera failed.
1024-Na-INT: collected when there was only one working cell
1024-K-MOF: most common in latter half of 1993, when Na-cells were unavailable.
512-Na-INT: rare
512-K-INT: rare
1024-K-INT: extremely rare
512-K-MOF: extremely rare

Below is a summary of the days observed from 1987 to the current date. The data products are broken down into hi-res data and low-res data, and may contain any of the wavelength and filter types outlined above.

Below is a graphical representation of the amount of Level-1 Doppler data that has been processed each year and is available for FITS conversion and archiving at Stanford. The lower panel shows the amount of data that has already been archived at Stanford. At the time of this writing, there remain 126 days of pre-MDI Level-1 data to be processed, and 153 days of Level-0 filtergrams from 2002 that require a spatial correction and further processing to Level-1 Dopplergrams.

Data product availability at the University of Southern California is broken down here. For purposes of comparison and standardizing the way helioseismic data is counted and blocked, MWO data is broken down into GONG months. Named after the Global Oscillation Network Group, operated by the National Solar Observatory, a GONG month is a series of 36 calendar days, and the first GONG month started on May 7, 1995. The plot below shows the amount of data product processed for each GONG month at USC and The Mt. Wilson Observatory for the MDI-era. The MDI-era is defined as starting, here, on May 1, 1996, or at the beginning of GONG month 11.

Prior to the beginning of the GONG project, the 60-Foot Solar Tower data was also broken down into 36-day periods to keep in sync with the GONG months. We named the periods prior to the GONG project MWO-months. These are numbered sequentially, starting from June 12, 1988 and running through May 6, 1995, covering a total of 70 MWO-months. Note: data from 1987 has also been collected, but it was not originally counted in the creation of the MWO-months. Ultimately, the MWO-months will be renumbered to accommodate the 1987 data, but that renumbering will occur later.

The bar-graph plot below represents the data collected and processed for the pre-MDI era. MWO months 71 through 80 are equivalent to GONG months 1 through 10, and are plotted this way for convenience only. The bars with open lines represent the amount of original raw data collected and the potential amount of doppler (or intensity) data that can be processed for that month, but not yet completed. The filled in dark bars are representative of the amount of complete doppler data processed for that month.
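The GONG-month and MWO-month numbering described above reduces to counting 36-day blocks from a start date. The Python sketch below illustrates this using the start dates given in the text; the function names are hypothetical, and dates before a start date simply produce zero or negative numbers, which matches no numbering defined in this document.

```python
from datetime import date

GONG_START = date(1995, 5, 7)   # first day of GONG month 1
MWO_START = date(1988, 6, 12)   # first day of MWO-month 1
MONTH_DAYS = 36                 # length of a GONG month or MWO-month


def period_number(day, start):
    """1-based index of the 36-day period containing the given day."""
    return (day - start).days // MONTH_DAYS + 1


def gong_month(day):
    return period_number(day, GONG_START)


def mwo_month(day):
    return period_number(day, MWO_START)


print(gong_month(date(1996, 5, 1)))   # start of the MDI era as defined above
print(mwo_month(date(1995, 5, 6)))    # last day before GONG month 1
print(mwo_month(date(1995, 5, 7)))    # same day as the start of GONG month 1
```

With these start dates, May 1, 1996 falls on the first day of GONG month 11, and MWO-month 71 begins on the same day as GONG month 1, in agreement with the text.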

Estimated data transfer time

The remaining time to complete the level 1 processing on the unprocessed level 0 images is currently estimated to total 299 days. Given our current staffing levels, we expect that this processing will be completed in roughly ten months, or by mid-August, 2004.

The number of days of Dopplergram observations that will be processed, converted into FITS format, and transferred to Stanford by June 1, 2004 is estimated at 310.

At our current staffing levels, we estimate that the number of days of Dopplergrams which will be converted and transferred to Stanford between June 1, 2004, and May 31, 2005, will be 1560. Combined with the 176 days previously transferred to Stanford, and combined with the 310 additional days which we expect to have transferred while we are completing the level 0 processing of our pre-MDI filtergrams, we anticipate that the total number of days of Dopplergrams which will be transferred to Stanford prior to May 31, 2005, will equal 2046.

If we are forced to reduce the number of our data analysts from two to one in the middle of 2004, then we will not be able to transfer as many Dopplergrams to Stanford by May 31, 2005. In that case, we estimate that the number of days of Dopplergrams which will be transferred to Stanford by that date will only equal 310+780+176 or 1266.

Subtracting the 176 days of data which have already been transferred to Stanford from the total number of observing days of 3270, we get a total of 3094 days of observations to be processed and transferred. If we also subtract the total of 101 of these days during which we only had total intensity images rather than Dopplergrams available for processing, we arrive at a total of 2993 days of Dopplergrams which remain to be processed and transferred by the end of the third year of our current LWS grant. With the higher estimate of 2046 days of Dopplergrams which we think might be transferred to Stanford prior to May 31, 2005, we would be left with an additional total of 947 days of Dopplergrams, which would have to be transferred to Stanford prior to the May 31, 2006, termination of our current LWS grant. That number of Dopplergrams which would have to be transferred during the third year of our LWS grant is a reasonable number assuming that we can continue to obtain the services of all of our existing data analysts.

Using the reduced estimate of the number of days of Dopplergrams which will be transferred to Stanford by May 31, 2005, we will only have processed and transferred a total of 1266 days of Dopplergrams prior to that date. This reduced level of processing would leave a total of 1727 days of observations to be processed and transferred during the third year of our current grant. This total is substantially more than one data analyst working alone would be able to process and transfer during that year.
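The day counts in this section can be reproduced with the short Python calculation below, which follows the arithmetic exactly as stated in the text. The variable names are ours. Note that, as written, the totals of 2046 and 1266 days include the 176 days already transferred, and these totals are then subtracted from a remainder that also excludes those 176 days.

```python
TOTAL_OBSERVING_DAYS = 3270
ALREADY_TRANSFERRED = 176
INTENSITY_ONLY_DAYS = 101
BY_JUNE_2004 = 310             # expected by June 1, 2004
YEAR_TWO_TWO_ANALYSTS = 1560   # June 1, 2004 to May 31, 2005, current staffing
YEAR_TWO_ONE_ANALYST = 780     # same period with one analyst

# Days of Dopplergrams still to be processed and transferred.
remaining = TOTAL_OBSERVING_DAYS - ALREADY_TRANSFERRED - INTENSITY_ONLY_DAYS

# Totals expected at Stanford by May 31, 2005 under each staffing level.
by_2005_full = BY_JUNE_2004 + YEAR_TWO_TWO_ANALYSTS + ALREADY_TRANSFERRED
by_2005_reduced = BY_JUNE_2004 + YEAR_TWO_ONE_ANALYST + ALREADY_TRANSFERRED

# Days left for the third grant year, as computed in the text.
left_full = remaining - by_2005_full
left_reduced = remaining - by_2005_reduced

print(remaining, by_2005_full, by_2005_reduced, left_full, left_reduced)
# prints: 2993 2046 1266 947 1727
```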