SOI TN 01-145
R S Bogart
2001.08.09
The data files and programs referred to in this note are currently scattered in a variety of personal and scratch-space directories, principally ~rick/ton and /surge2/ton/. A considerable amount of raw data and Level-0 data have already been ingested into the DSDS under prog:ton as described in TN 98-140.
140.114.80.234 = astro4.phys.nthu.edu.tw under the directories pub/ton-header/XX, where XX is one of the four standard TON site codes: bb, hr, tf, and ub. The most recent update to these header files at the ftp site was Dec. 11, 2000. The entire contents of these directories were copied to /surge2/ton/tape_lists on Dec. 19, 2000.
The general outline of the required data processing steps is:
}> tonexpect file= /surge2/ton/tape_lists/bb/bb008.dat
summary of /surge2/ton/tape_lists/bb/bb008.dat:
1994.07.10 14:43:56 ( 883) - 23:59:56 (1439) 339
1994.07.11 00:00:56 ( 0) - 23:59:56 (1439) 690
1994.07.12 00:00:56 ( 0) - 23:59:56 (1439) 702
1994.07.13 00:00:56 ( 0) - 01:53:56 ( 113) 115
1846 images total
(Note that depending on the longitude of the site, the roughly contiguous data from daylight hours may span two UT days; we have decided to organize all the data by UT day corresponding approximately to SOI mission day. A typical Big Bear data set will thus contain 8-12 hours of data in two separate blocks, one at the beginning of the day and the other at the end.)
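The summary output above can be checked mechanically. The following is a minimal sketch, not part of tonexpect: the line pattern is inferred only from the sample output shown in this note, and the names ROW, minute_of_day, and parse_summary are hypothetical. It confirms that the number in parentheses is the minute of the UT day and that the per-day counts add up to the reported total.

```python
import re

# Pattern inferred from the sample output in this note; other layouts may exist.
ROW = re.compile(
    r"^(?P<date>\d{4}\.\d{2}\.\d{2})\s+"
    r"(?P<t0>\d{2}:\d{2}:\d{2})\s+\(\s*(?P<m0>\d+)\)\s+-\s+"
    r"(?P<t1>\d{2}:\d{2}:\d{2})\s+\(\s*(?P<m1>\d+)\)\s+(?P<n>\d+)$"
)

def minute_of_day(hms):
    """Minute number (0-1439) of an HH:MM:SS time string."""
    hours, minutes, _seconds = (int(part) for part in hms.split(":"))
    return 60 * hours + minutes

def parse_summary(text):
    """Return (date, first_minute, last_minute, image_count) per summary row."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append((match["date"], int(match["m0"]),
                         int(match["m1"]), int(match["n"])))
    return rows

SAMPLE = """\
summary of /surge2/ton/tape_lists/bb/bb008.dat:
1994.07.10 14:43:56 ( 883) - 23:59:56 (1439) 339
1994.07.11 00:00:56 ( 0) - 23:59:56 (1439) 690
1994.07.12 00:00:56 ( 0) - 23:59:56 (1439) 702
1994.07.13 00:00:56 ( 0) - 01:53:56 ( 113) 115
1846 images total
"""

rows = parse_summary(SAMPLE)
total_images = sum(count for _date, _first, _last, count in rows)
```

For the bb008 sample, total_images agrees with the "1846 images total" line, and minute_of_day("14:43:56") gives the 883 shown in parentheses.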
The format of the headers is not uniform; there are several different variants, described under the section on Level-0 processing below, and the module needs to be trained to recognize the different variants as they come up. There may still be cases of unknown formats, especially among the earlier data.
The images are normally time-ordered in the tape header files, although occasionally headers from groups of days are replicated. The program detects this of course. For example:
}> tonexpect file= /surge2/ton/tape_lists/tf/tf129.dat
summary of /surge2/ton/tape_lists/tf/tf129.dat:
1995.05.05 16:15:56 ( 975) - 19:25:56 (1165) 191
1995.05.06 06:53:56 ( 413) - 16:26:56 ( 986) 574
1995.05.07 06:28:56 ( 388) - 19:32:56 (1172) 785
1995.05.08 07:34:56 ( 454) - 19:25:56 (1165) 558
1995.05.06 06:53:56 ( 413) - 16:26:56 ( 986) 574
1995.05.07 06:28:56 ( 388) - 19:32:56 (1172) 785
1995.05.08 07:34:56 ( 454) - 13:39:56 ( 819) 367
3834 images total
(There are also cases in which the entries are clearly spurious: for example, ub031.dat starts with 12 headers dated 1970.01.01 before the next ones dated 1996.11.24.) The files can be suitably edited to remove the duplicate entries; I have noted about a dozen such cases so far among the 690 verification files on line. Of more concern is the fact that there are various errors in the files that cause the interpreting program to fail:
}> tonexpect file= /surge2/ton/tape_lists/hr/hr135.dat
summary of /surge2/ton/tape_lists/hr/hr135.dat:
1996.07.03 23:33:56 (1413) - 23:59:56 (1439) 28
1996.07.04 00:00:56 ( 0) - 23:59:56 (1439) 518
1996.07.06 00:00:56 ( 0) -
unexpected string @ line 2103:
; O(551.2,734.6) B=18552 R=493.8 E= 9 Sx=0.01217 Sy=0.01627 Av=17650.2 No=1052
08:23:56 ( 503) 505
1051+ images total
The typical problem is that the newline character was dropped from the header, often with some or all of the digits of the preceding sequence number (which can be inferred). The files can be edited and the interpreting program rerun until all such errors are removed. I have placed copies of the files to be edited on /surge2/ton/fixed_lists. Thus,
}> tonexpect file= /surge2/ton/tape_lists/ub/ub014.dat
summary of /surge2/ton/tape_lists/ub/ub014.dat:
1996.08.10 02:24:56 ( 144) - 13:23:56 ( 803) 660
1996.08.11 02:36:56 ( 156) - 13:23:56 ( 803) 648
1996.08.12 02:21:56 ( 141) - 13:21:56 ( 801) 656
1996.08.13 01:55:56 ( 115) -
unexpected string @ line 4945:
; O(540.0,540.0) B= 0 R= 0.0 E= 2 Sx=0.00687 Sy=0.00818 Av= 980.4 No=2474
10:21:56 ( 621) 508
2472+ images total
}> tonexpect file= /surge2/ton/fixed_lists/ub/ub014.dat
summary of /surge2/ton/fixed_lists/ub/ub014.dat:
1996.08.10 02:24:56 ( 144) - 13:23:56 ( 803) 660
1996.08.11 02:36:56 ( 156) - 13:23:56 ( 803) 648
1996.08.12 02:21:56 ( 141) - 13:21:56 ( 801) 656
1996.08.13 01:55:56 ( 115) - 10:43:56 ( 643) 530
2494 images total
There is a log of the edits made on ~rick/ton/fixlog, and summary output for all verification files of each site XX on ~rick/ton/sum.XX; the summaries are made by the script mksum. A run of this script for a particular site will produce a list of the files that still need to be edited and a summary of the number of `good' and `bad' files:
}> mksum hr
hr079.dat
hr093.dat
hr : 76 good, 2 bad
Once the files are `fixed', the tonexpect module could be modified to write out more detailed information per expected image (to a file) if that is useful for processing.
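To illustrate the `good'/`bad' distinction, here is a minimal sketch of how a summary could be classified from its text alone. It is not the mksum script, whose contents are not reproduced in this note. The function names are hypothetical, and the two failure markers (an "unexpected string" line and a "+" after the image total) are taken only from the sample outputs shown above.

```python
import re

def summary_is_good(summary_text):
    """True if the summary shows no parse failure (markers inferred from samples)."""
    lines = [ln.strip() for ln in summary_text.splitlines() if ln.strip()]
    if not lines:
        return False
    if any(ln.startswith("unexpected string") for ln in lines):
        return False
    # In the failed samples the last line reads e.g. "2472+ images total".
    return re.fullmatch(r"\d+ images total", lines[-1]) is not None

def tally(site, summaries):
    """summaries maps file name -> summary text; returns (bad_files, report_line)."""
    bad = sorted(name for name, text in summaries.items()
                 if not summary_is_good(text))
    good_count = len(summaries) - len(bad)
    return bad, f"{site} : {good_count} good, {len(bad)} bad"
```

Applied to the two ub014 summaries shown earlier, the tape_lists version would be counted as bad and the fixed_lists version as good.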
Another problem occurs in certain verification files in which the headers for images from a given day are repeated without intervening days. Examples of such files are ub010 and ub081:
summary of /surge2/ton/fixed_lists/ub/ub010.dat:
1996.07.30 02:55:56 ( 175) - 11:19:56 ( 679) 480
1996.07.30 02:55:56 ( 175) - 11:18:56 ( 678) 480
960 images total
summary of /surge2/ton/fixed_lists/ub/ub081.dat:
1998.04.16 03:41:56 ( 221) - 12:10:56 ( 730) 502
1998.04.16 03:42:56 ( 222) - 12:50:56 ( 770) 541
1998.04.17 02:51:56 ( 171) - 09:16:56 ( 556) 386
1998.04.18 04:59:56 ( 299) - 10:40:56 ( 640) 339
1998.04.20 06:59:56 ( 419) - 11:58:56 ( 718) 300
1998.04.23 10:50:56 ( 650) - 12:24:56 ( 744) 95
1998.04.27 03:19:56 ( 199) - 13:01:56 ( 781) 583
1998.04.28 08:18:56 ( 498) - 11:27:56 ( 687) 191
2937 images total
In these cases the sequence numbering in the headers continues (normally it restarts at 1 on a new day), so that the headers for images with the same time are not exactly identical. It is not clear whether this is significant. There could in fact be duplicated images in these cases, but there probably aren't. We need to look at a case for which we have a corresponding tar tape.
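Both kinds of repetition (replicated groups of days, as in tf129, and a single day repeated, as in ub010 and ub081) show up in the summary as a date that is not later than one already seen. The following is a minimal sketch of such a check on the summary text; it is not how tonexpect itself detects repeats, and the names DATE and repeated_days are hypothetical.

```python
import re

# A summary row is assumed to start with a YYYY.MM.DD date followed by a space.
DATE = re.compile(r"^(\d{4})\.(\d{2})\.(\d{2})\s")

def repeated_days(summary_text):
    """Return dates of rows that do not advance past the latest day seen so far."""
    flagged = []
    latest = None
    for line in summary_text.splitlines():
        stripped = line.strip()
        match = DATE.match(stripped)
        if match is None:
            continue  # header, error, or total line
        day = tuple(int(group) for group in match.groups())
        if latest is not None and day <= latest:
            flagged.append(stripped[:10])
        else:
            latest = day
    return flagged
```

On the ub010 summary this flags the second 1996.07.30 row; on the tf129 summary it flags the second occurrences of 1995.05.06, 1995.05.07, and 1995.05.08.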
Occasionally individual image header times are repeated even though the images are apparently distinct based on the other parameters in the header. For example, there are two entries with reference time 1996/07/21_00:11:56020 in hr136.dat. There is only one actual image in the corresponding tar directory, and it corresponds to the second of the two headers in the verification file.
A few of the header files do not bear the names of the corresponding tapes: hreclipse.dat contains the headers from tape HR 163, for example, and hr148b.dat those from HR 159. ub009a.dat appears to contain only a subset of the contents of ub009.dat.
}> tonexpect file= /surge2/ton/fixed_lists/hr/hr135.dat log= /surge2/ton/verify/hr
would produce (or append to) files named 1279, 1280, 1281, 1282, 1283, 1288, and 1292 in the directory /surge2/ton/verify/hr.
}> cat /surge2/ton/verify/hr/1279
HR135 960703.1414
HR135 960703.1415
HR135 960703.1415
HR135 960703.1416
...
HR135 960703.1438
HR135 960703.1439
HR135 960703.1440
(Note the repeated entries corresponding to a pair of distinct image headers with the same reference time in the verification file!)
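The relation between a log entry and the name of the file it lands in can be sketched as follows. This is an illustration under stated assumptions, not the module's code: the epoch 1993.01.01 is assumed because it reproduces the file name 1279 for 1996.07.03 in the example above, the two-digit-year pivot is my own choice, and the names MISSION_EPOCH, mission_day, and parse_log_entry are hypothetical.

```python
from datetime import date

# Assumed day-number epoch; it reproduces 1279 for 1996.07.03 as in the example.
MISSION_EPOCH = date(1993, 1, 1)

def mission_day(day):
    """Day number counted from the assumed epoch."""
    return (day - MISSION_EPOCH).days

def parse_log_entry(entry):
    """Split an entry such as 'HR135 960703.1414' into (tape, date, minute)."""
    tape, stamp = entry.split()
    yymmdd, minute = stamp.split(".")
    two_digit_year = int(yymmdd[:2])
    # Hypothetical pivot: years 70-99 are taken as 19xx, 00-69 as 20xx.
    year = 1900 + two_digit_year if two_digit_year >= 70 else 2000 + two_digit_year
    day = date(year, int(yymmdd[2:4]), int(yymmdd[4:6]))
    return tape, day, int(minute)

tape, day, minute = parse_log_entry("HR135 960703.1414")
log_file_name = str(mission_day(day))
```

With these assumptions the entry "HR135 960703.1414" maps to the log file named 1279.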
260_016:40:00W_028:18:00N
N No. 1, 617 GMT1996/08/01_18:43:55670 O(538,542) M( 2, -2) B= 4736 R=496 T=226 E=0 size=1080,1080 expousre=1500 ms
; O(537.6,541.7) B= 4736 R=496.0 E= 0 Sx=0.01011 Sy=0.01301 Av= 5197.9 No= 1
One problem with the existing module is that the serial numbers in the level-0 data set are based on the minute number of the reference time, which is generally, maybe always, one less than the minute number of the observing time, since the exposures are set to be centered on the minute tick, and the clock is started about 4 seconds earlier.
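The offset can be made concrete with a small sketch. It assumes that the observing minute is obtained by rounding the reference time to the nearest minute tick, and that the result is counted within the same UT day as the reference time (so that 23:59:56 gives 1440, as in the log listing above). Both are my reading of the description, and the function name minute_numbers is hypothetical.

```python
def minute_numbers(reference_hms):
    """Return (reference_minute, observing_minute) for an HH:MM:SS reference time.

    Assumes the exposure is centered on the nearest minute tick, so a
    reference time a few seconds before the tick belongs to the next minute.
    """
    hours, minutes, seconds = (int(part) for part in reference_hms.split(":"))
    reference_minute = 60 * hours + minutes
    observing_minute = reference_minute + (1 if seconds >= 30 else 0)
    return reference_minute, observing_minute
```

For the first hr135 image, minute_numbers("23:33:56") gives 1413 for the reference time and 1414 for the observing time, matching the summary and log entries shown earlier.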
I1_ij = (I0_ij - DC_ij) / FF_ij