SOHO Archive of MDI Data

SOI TN 00-143
Jim Aloise
2000.06.12

Introduction

In early 1998 it was agreed that at least most of the MDI "Campaign" data should be available through the main SOHO archive at GSFC. The intent was, at a minimum, to develop a method to place data descriptions for MDI data into the SOHO catalog. The data repository would remain at Stanford, but identification of useful data and export requests could be accomplished remotely by users of the SOHO main catalog.

This document describes a draft plan to accomplish that goal. It was generated in June 2000 in consultation with Luis Sanchez Duarte (lsanchez@esa.nascom.nasa.gov) and Isabelle Scholl (scholl@medoc-ias.u-psud.fr) of the Institut d'Astrophysique Spatiale.

Points of Implementation

1.) MDI provides "per observation" info for lev1.5 data. An "observation" is either an MDI dataset (e.g. fd_V_01h, hr_M_01h, etc.; lev1.4 for loi data) or an image taken from an MDI dataset that we will make available as a per-minute observation. Initially our observations will be full datasets. Later we will decide which data to make available as per-minute observations.

2.) Each observation that MDI makes in a day will have a line of keyword=value pairs. At the end of the day there will be a file called, for example, soho_catalog_rcd.2000.06.12, that contains all the lines of keyword pairs. Each line is to be considered self-contained, and no information is implied by the sequence of lines. After midnight each day the file will appear in a directory available for an ftp request from the EOF at their discretion. We actually keep a separate file of keyword pairs for each dataset (ds). These files are created by a new module called sainfo (SOHO Archive info), which is run after cpinfo for all lev1.5 and loi lev1.4 processing. All the separate files are concatenated at the end of the day, so one file per day appears for all data processed on that day; the data processed does not have to represent that day. These files will accumulate and remain permanently online.
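
As a concrete illustration only (the real sainfo module is not reproduced here, and the per-dataset file naming below is an assumption), the end-of-day concatenation step amounts to something like:

      # Hypothetical sketch of the end-of-day concatenation; the "sainfo.*"
      # per-dataset file naming is an assumption, not the real pipeline output.
      import glob
      import time

      def build_daily_catalog(workdir, outdir):
          """Concatenate the per-dataset keyword files written during the day
          into a single soho_catalog_rcd.yyyy.mm.dd file."""
          day = time.strftime("%Y.%m.%d")
          outname = "%s/soho_catalog_rcd.%s" % (outdir, day)
          with open(outname, "w") as out:
              for fname in sorted(glob.glob("%s/sainfo.*" % workdir)):
                  with open(fname) as f:
                      out.write(f.read())
          return outname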

3.) The following shows a single line in a sample soho_catalog_rcd.yyyy.mm.dd file (wrapped here for readability). Note that there are no spaces within a line; the keyword=value pairs are delimited by "|".

FILENAME=prog:mdi,level:lev1.5,series:fd_V_01h[64895]|
TELESCOP=SOHO|
INSTRUME=MDI|
DETECTOR=MDI|
ORIGIN=SOI_Science_Support_Center|
DATE_MOD=2000.06.08_10:26:05|
LEV_NAME=lev1.5|
LEV_NUM=2|
BYTES=1.26969e+08|
OBS_MODE=fd|
DATATYPE=Dopplergram|
SER_NAME=fd_V_01h|
T_START=2000.05.27_23:00:00_TAI|
T_STOP=2000.05.28_00:00:00_TAI|
COORDSYS=Solar_Cropped|
XCEN=2.218792675781E-01|
YCEN=1.448794357910E+00|
NAXIS1=1024|
NAXIS2=1024|
P_ANGLE=0.000000000000E+00|
IXWIDTH=1898.73|
IYWIDTH=1898.73|
XUNT_PIX=arcsec|
YUNT_PIX=arcsec|
CDELT1=1.977840000000E+00|
CDELT2=1.977840000000E+00|
PCT_LOSS=0.0

Where:

Keyword   Source                                             Definition                        Comment
________  _________________________________________________  ________________________________  ______________________________________________
FILENAME  overview.{sn}.rdb DSNAME                           physical file name                filename = dataset name
TELESCOP  Constant                                           the observatory                   'SOHO'
INSTRUME  Constant                                           the instrument                    'MDI'
DETECTOR  Constant                                           its detector                      'MDI'
ORIGIN    Constant                                           where the file is processed       'SOI_Science_Support_Center'
DATE_MOD  local time this line was created                   time of the file's last update    yyyy.mm.dd_hh:mm:ss
LEV_NAME  overview.{sn}.rdb DSNAME level_name                level of the data                 e.g. lev1.4 or lev1.5
LEV_NUM   overview.{sn}.rdb DSNAME level_number              level number of the dataset
BYTES     overview.{sn}.rdb BYTES                            total bytes of the dataset        or number of bytes for an image
OBS_MODE  lookup table (convert.rdb) on series_name          observation mode
DATATYPE  lookup table (convert.rdb) on series_name          observation type
SER_NAME  overview.{sn}.rdb DSNAME series_name               observation name
T_START   overview.{sn}.rdb T_START (_TAI)                   observation begin date            will be DATE_OBS (_UT) for an individual image
T_STOP    overview.{sn}.rdb T_STOP (_TAI)                    observation end date              will be DATE_OBS (_UT) for an individual image
COORDSYS  Constant                                           coordinate system used            'Solar_Disc' or 'Solar_Cropped'
XCEN      {sn}.record.rdb XCEN, first valid rcd              x position of the field of view
YCEN      {sn}.record.rdb YCEN, first valid rcd              y position of the field of view
NAXIS1    lookup NAXIS1 on series_name (convert.rdb)         x size of the image
NAXIS2    lookup NAXIS2 on series_name (convert.rdb)         y size of the image
P_ANGLE   {sn}.record.rdb P_ANGLE, first valid rcd           angle of the field of view
IXWIDTH   Solar_Disc: {sn}.record.rdb CDELT1 * NAXIS1;       x width of the field of view
          Solar_Cropped: crop radius from DPC * 2 * CDELT1,
          unless WIDTH != 0 in convert.rdb
IYWIDTH   Solar_Disc: {sn}.record.rdb CDELT2 * NAXIS2;       y width of the field of view
          Solar_Cropped: same as IXWIDTH
XUNT_PIX  Constant                                           x spatial unit per pixel          'arcsec'
YUNT_PIX  Constant                                           y spatial unit per pixel          'arcsec'
CDELT1    {sn}.record.rdb CDELT1, first valid rcd            x spatial resolution value
CDELT2    {sn}.record.rdb CDELT2, first valid rcd            y spatial resolution value
PCT_LOSS  100 * (1 - total_datavals/max_datavals)            percentage of lost data
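
For reference, each catalog line can be split on "|" into keyword=value pairs. The helper below is not part of the MDI pipeline; it is only a minimal parsing sketch:

      # Minimal sketch: parse one catalog line into a keyword/value dictionary.
      def parse_catalog_line(line):
          record = {}
          for field in line.strip().split("|"):
              if not field:
                  continue
              key, _, value = field.partition("=")
              record[key] = value
          return record

      # e.g. record["SER_NAME"] -> "fd_V_01h", record["T_START"] -> "2000.05.27_23:00:00_TAI"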

4.) The lookup table (convert.rdb) referred to above is:

SERIES_NAME NAXIS1 NAXIS2 OBS_TYPE OBS_MODE WIDTH COORD_SYS PER_IMAGE
____________ _______ _______ _________ __________ _________ ____________ ____________
fd_I0_01h 1024 1024 Filtergram fd 0 Solar_Cropped N
fd_I0_1024x500_01h 1024 500 Filtergram fd_1024x500 0 Solar_Disc N
hr_I0_01h 1024 1024 Filtergram hr 0 Solar_Disc Y
hr_I0_1024x500_01h 1024 500 Filtergram hr_1024x500 0 Solar_Disc N
hr_I0_1024x600_01h 1024 600 Filtergram hr_1024x600 0 Solar_Disc N
hr_I0_1024x640_01h 1024 640 Filtergram hr_1024x640 0 Solar_Disc N
hr_I0_1024x800_01h 1024 800 Filtergram hr_1024x800 0 Solar_Disc N
hr_I0_1024x850_01h 1024 850 Filtergram hr_1024x850 0 Solar_Disc N
hr_I0_500x1024_01h 500 1024 Filtergram hr_500x1024 0 Solar_Disc N
hr_I0_700x700_01h 700 700 Filtergram hr_700x700 0 Solar_Disc N
fd_Ic_01h 1024 1024 Continuum fd 0 Solar_Cropped N
fd_Ic_30s_01h 1024 1024 Continuum fd 0 Solar_Cropped N
fd_Ic_6h_01d 1024 1024 Continuum fd 0 Solar_Cropped Y
fd_Ic_bin2x2_30s_01h 512 512 Continuum fd_bin2x2_30s 0 Solar_Cropped N
hr_Ic_1024x600_01h 1024 600 Continuum hr_1024x600 0 Solar_Disc N
hr_Ic_1024x800_01h 1024 800 Continuum hr_1024x800 0 Solar_Disc N
hr_Ic_700x700_01h 700 700 Continuum hr_700x700 0 Solar_Disc N
loi_Ic_01d 180 1 Continuum loi 1964.16 Solar_Cropped N
rwbin_Ic_01d 128 128 Continuum rwbin 0 Solar_Disc N
vw_Ic_06h 200 200 Continuum vw !!TBD Solar_Cropped N
fd_Ld_01h 1024 1024 LineDepth fd 0 Solar_Cropped N
hr_Ld_1024x700_01h 1024 700 LineDepth hr_1024x700 0 Solar_Disc N
hr_Ld_1024x750_01h 1024 750 LineDepth hr_1024x750 0 Solar_Disc N
hr_Ld_700x700_01h 700 700 LineDepth hr_700x700 0 Solar_Disc N
hr_Ld_bin2x2_01h 512 512 LineDepth hr_bin2x2 0 Solar_Disc N
loi_Ld_01d 180 1 LineDepth loi 1964.16 Solar_Cropped N
rwbin_Ld_01d 128 128 LineDepth rwbin 0 Solar_Disc N
vw_Ld_01d 200 200 LineDepth vw !!TBD Solar_Cropped N
vw_Ld_06h 200 200 LineDepth vw !!TBD Solar_Cropped N
fd_M_01h 1024 1024 Magnetogram fd 0 Solar_Cropped Y
fd_M_96m_01d 1024 1024 Magnetogram fd 0 Solar_Cropped Y
fd_M_1024x500_01h 1024 500 Magnetogram fd_1024x500 0 Solar_Disc Y
fd_M_1024x750_01h 1024 750 Magnetogram fd_1024x750 0 Solar_Disc Y
hr_M_01h 1024 1024 Magnetogram hr 0 Solar_Disc Y
hr_M_1024x500_01h 1024 500 Magnetogram hr_1024x500 0 Solar_Disc Y
hr_M_1024x640_01h 1024 640 Magnetogram hr_1024x640 0 Solar_Disc Y
hr_M_1024x850_01h 1024 850 Magnetogram hr_1024x850 0 Solar_Disc Y
hr_M_500x1024_01h 500 1024 Magnetogram hr_500x1024 0 Solar_Disc Y
hr_M_700x700_01h 700 700 Magnetogram hr_700x700 0 Solar_Disc Y
hr_M_bin2x2_01h 512 512 Magnetogram hr_bin2x2 0 Solar_Disc Y
fd_V_01h 1024 1024 Dopplergram fd 0 Solar_Cropped N
fd_V_30s_01h 1024 1024 Dopplergram fd 0 Solar_Cropped N
fd_V_6h_01d 1024 1024 Dopplergram fd 0 Solar_Cropped Y
fd_V_1024x750_01h 1024 750 Dopplergram fd_1024x750 0 Solar_Disc N
fd_V_bin2x2_01h 512 512 Dopplergram fd_bin2x2 0 Solar_Cropped N
fd_V_bin2x2_30s_01h 512 512 Dopplergram fd_bin2x2 0 Solar_Cropped N
hr_V_01h 1024 1024 Dopplergram hr 0 Solar_Disc N
hr_V_12s_01h 1024 1024 Dopplergram hr 0 Solar_Disc N
hr_V_1024x600_01h 1024 600 Dopplergram hr_1024x600 0 Solar_Disc N
hr_V_1024x640_01h 1024 640 Dopplergram hr_1024x640 0 Solar_Disc N
hr_V_1024x700_01h 1024 700 Dopplergram hr_1024x700 0 Solar_Disc N
hr_V_1024x750_01h 1024 750 Dopplergram hr_1024x750 0 Solar_Disc N
hr_V_700x700_01h 700 700 Dopplergram hr_700x700 0 Solar_Disc N
hr_V_bin2x2_01h 512 512 Dopplergram hr_bin2x2 0 Solar_Disc N
loi_V_01d 180 1 Dopplergram loi 1964.16 Solar_Cropped N
vw_V_01d 200 200 Dopplergram vw !!TBD Solar_Cropped N
vw_V_06h 200 200 Dopplergram vw !!TBD Solar_Cropped N
fd_Vc_01h 1024 1024 Dopplergram fd 0 Solar_Cropped N
fd_Vm_01h 1024 1024 Dopplergram fd 0 Solar_Cropped N
fd_Vm_1024x500_01h 1024 500 Dopplergram fd_1024x500 0 Solar_Disc N
fd_Vm_1024x750_01h 1024 750 Dopplergram fd_1024x750 0 Solar_Disc N
fd_Vm_bin2x2_01h 512 512 Dopplergram fd_bin2x2 0 Solar_Cropped N
hr_Vm_01h 1024 1024 Dopplergram hr 0 Solar_Disc N
hr_Vm_1024x500_01h 1024 500 Dopplergram hr_1024x500 0 Solar_Disc N
hr_Vm_1024x600_01h 1024 600 Dopplergram hr_1024x600 0 Solar_Disc N
hr_Vm_1024x640_01h 1024 640 Dopplergram hr_1024x640 0 Solar_Disc N
hr_Vm_500x1024_01h 500 1024 Dopplergram hr_500x1024 0 Solar_Disc N
hr_Vm_700x700_01h 700 700 Dopplergram hr_700x700 0 Solar_Disc N
hr_Vm_bin2x2_01h 512 512 Dopplergram hr_bin2x2 0 Solar_Disc N
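
As a sketch of how this lookup feeds the IXWIDTH/IYWIDTH rules above (the dictionary holds only two hand-copied rows, and the function reflects one reading of the "unless width!=0" rule), the width calculation might look like:

      # Sketch of the convert.rdb lookup; only two series are copied here by hand.
      CONVERT = {
          # series_name: (naxis1, naxis2, obs_type, obs_mode, width, coord_sys, per_image)
          "fd_V_01h": (1024, 1024, "Dopplergram", "fd", 0, "Solar_Cropped", "N"),
          "hr_M_01h": (1024, 1024, "Magnetogram", "hr", 0, "Solar_Disc", "Y"),
      }

      def ixwidth(series, cdelt1, crop_radius=None):
          """x width of the field of view, following the table rules above.
          crop_radius is the crop radius from the DPC, needed only for
          Solar_Cropped series whose WIDTH column is 0."""
          naxis1 = CONVERT[series][0]
          width = CONVERT[series][4]
          coord_sys = CONVERT[series][5]
          if coord_sys == "Solar_Disc":
              return cdelt1 * naxis1
          if width != 0:                       # WIDTH column overrides the computation
              return width
          return crop_radius * 2 * cdelt1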

5.) When the SOHO catalog user has finally selected observations to export, an export request will be issued as follows:

      ssh dummy@tarax.Stanford.EDU lpr -PXFA < export_file

      where export_file looks like, for example:

      prog:mdi,level:lev1.5,series:hr_M_01h[37809]
      prog:mdi,level:lev1.5,series:hr_M_01h[37810]
      prog:mdi,level:lev1.5,series:hr_M_01h[37811]
      prog:mdi,level:lev1.5,series:hr_M_01h[37812],sel:[5]
      prog:mdi,level:lev1.5,series:hr_M_01h[37828]
      email=jim@shoom.Stanford.EDU
      kbytes=62123
      export=ftpa
      dir=/ftp/ftp/data
      source=soho
where tarax has a special "dummy" user that can only perform the lpr command. The "dummy" .shosts file has a pre-approved host or host/user list, by agreement with each archive site.
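
A sketch of how an archive site could assemble and submit such a request (the host, printer name, and field names are copied from the example above; the helper itself is hypothetical):

      # Hypothetical sketch: build an export_file and submit it through the
      # restricted "dummy" account on tarax, as described above.
      import subprocess

      def submit_export(dataset_names, email, kbytes, export="ftpa",
                        directory="/ftp/ftp/data", source="soho"):
          lines = list(dataset_names)
          lines += ["email=%s" % email,
                    "kbytes=%d" % kbytes,
                    "export=%s" % export,
                    "dir=%s" % directory,
                    "source=%s" % source]
          export_file = "\n".join(lines) + "\n"
          # Equivalent of: ssh dummy@tarax.Stanford.EDU lpr -PXFA < export_file
          subprocess.run(["ssh", "dummy@tarax.Stanford.EDU", "lpr", "-PXFA"],
                         input=export_file.encode(), check=True)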

6.) When we receive an export request for a single record we will at first export the full dataset, but at some time in the future we will export datasets extracted down to the individual images requested. At that time we will also have code in place to sort and combine multiple requests, so that sets of images from the same source dataset will be sent as full datasets if some filling-factor threshold is passed.
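
A sketch of the planned combining step, with the request format and the 0.5 threshold chosen here purely for illustration:

      # Hypothetical sketch of combining per-image requests by source dataset.
      from collections import defaultdict

      def combine_requests(image_requests, images_per_dataset, threshold=0.5):
          """image_requests: list of (dataset_name, image_index) pairs.
          Returns (dataset_name, indices_or_None) pairs; None means 'export the
          full dataset' because the filling factor passed the threshold."""
          by_dataset = defaultdict(set)
          for dataset, index in image_requests:
              by_dataset[dataset].add(index)
          plan = []
          for dataset, indices in by_dataset.items():
              filling_factor = len(indices) / float(images_per_dataset[dataset])
              if filling_factor >= threshold:
                  plan.append((dataset, None))             # send the full dataset
              else:
                  plan.append((dataset, sorted(indices)))  # send individual images
          return plan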

7.) We will need to generate some new keywords from calculations using our present header information. A list of the target keywords will be made in collaboration with the EOF and MEDOC SOHO archives.

8.) MDI will sometimes supply duplicate records, but in each case the filename field can be used to eliminate the duplicates: if the filename is the same as one already in the database, the record with the more recent create time should be used and the other should be deleted, overwritten, or simply ignored.
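
A sketch of that rule as it might be applied on the catalog side; it relies on the fact that DATE_MOD's yyyy.mm.dd_hh:mm:ss format sorts chronologically as plain text:

      # Sketch: keep only the most recently created record for each FILENAME.
      def deduplicate(records):
          newest = {}
          for record in records:   # each record is a dict, e.g. from parse_catalog_line()
              name = record["FILENAME"]
              if name not in newest or record["DATE_MOD"] > newest[name]["DATE_MOD"]:
                  newest[name] = record
          return list(newest.values())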

9.) There should be some method to mark entries as obsolete, perhaps a record with filename, create date, and instrument=OBSOLETE so that a search would never match.

10.) If shorter names are desired for the user interface, the Java code could use just the part of the filename field that follows "series:", i.e. the series name, series number, and possibly the select and record number fields of the DSNAME. If this is chosen we should make the rule that the instrument field is derived from our "prog" name. Then mdi_eof data would show up as INSTRUMENT=mdi_eof and not be confused with definitive MDI data.
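
A sketch of the proposed shortening; the helper is hypothetical, but it uses only the FILENAME structure shown above:

      # Sketch of point 10: derive a short display name and an instrument field
      # from a FILENAME value such as "prog:mdi,level:lev1.5,series:fd_V_01h[64895]".
      def short_name(filename):
          parts = dict(p.split(":", 1) for p in filename.split(",") if ":" in p)
          instrument = parts.get("prog", "")        # e.g. "mdi" or "mdi_eof"
          series = filename.split("series:", 1)[1]  # series name, number, and any sel/record fields
          return instrument, series

      # short_name("prog:mdi,level:lev1.5,series:fd_V_01h[64895]") -> ("mdi", "fd_V_01h[64895]")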