GATHER_INFO(1)
NAME
gather_info
SYNOPSIS
gather_info [-qNXf]
[in={datacollection} (NOT SPECIFIED for default)]
[keylist={datacollection} (NOT SPECIFIED for default)]
out={datacollection}
[D_VARS= (1 by default)]
[MASKS= (NOT SPECIFIED for default)]
[TIME_KEY= (T_OBS by default)]
[TIME_MATCH= (T_REC by default)]
[TIME_KIND= (SOI by default)]
[ALT_KEY= (NOT SPECIFIED for default)]
[CONFORMS= (TS_EQ by default)]
[PROTOCOL= (RDB.FITS by default)]
[T_EPOCH= (SOI_BASE_EPOCH by default)]
[T_BLOCK= (1h by default)]
[EXT_INFO= (1s by default)]
[DATAKIND= (NOT SPECIFIED for default)]
[T_START= (1601.01.01_00:00:00_UT by default)]
[T_STOP= (2499.12.31_23:59:59_UT by default)]
[T_FIRST= (NOT SPECIFIED for default)]
[T_LAST= (NOT SPECIFIED for default)]
DESCRIPTION
gather_info is a module to gather per-record info into a
conforming dataset.
gather_info is given a target collection of datasets, a
collection of input datasets, and a list of keywords that
identify the information to be collected from the input
datasets and included in the output datasets.
The output datasets must be SOI "conforming" datasets.
If the output datasets do not yet exist, they will be
created and the overview file will be generated. The
structure of the dataset is set when it is first created.
The program logic is as follows:
{
Get and validate the list of keywords to be gathered into the
output datasets. If the keyword list is not explicitly specified,
all keywords and data are included in the output.
Get info about target datasets including conforms, t_epoch,
t_start, t_stop, t_block, t_step if needed. First look for
"out_nsets" to see how many datasets are to be updated. Then look
for them in order; when one is found use the overview file to
derive the target global information. If none of the overview
files are present, use the command line arguments.
The command line arguments "T_FIRST" and "T_LAST" may be used to
limit the range of output series numbers generated. In the case of
TS_EQ and TS_BLOCKED datasets, only whole datasets are selected, so
if T_FIRST and T_LAST specify a range smaller than a single dataset
from a dataseries, that whole dataset will be gathered. In the case
of TS or MISC datasets, T_FIRST and T_LAST are used as the
inclusive limits of the gathered data.
The target times for the output dataset depend on the conform type
and on the input datasets. Only in the case of TS_EQ are the output
times derived from the conform info. In the cases of TS,
TS_BLOCKED, and MISC the output times are copies of the input times
found, limited by the blocking in the case of TS_BLOCKED and by
T_FIRST and T_LAST. If the output dataset already exists, the
target times will be taken from a time keyword already in the
output dataset. The time keyword used is specified in the
"TIME_MATCH" argument, default "T_REC".
Open all the input datasets and sort records by time, alternate key
value, quality, and input dataset index. If the "-q" flag is set
and if the matching MDI log file is found, the quality info is
taken from the MDI log file; otherwise it is set to the same
default value for all records. In order to use the quality info the
log file must have the TIME_KEY time parameter. Presently
gather_qual produces the log file and includes both T_REF and
T_OBS.
Now, for each output dataset:
{
Compute t_start and t_stop.
If the overview file is not present, make one.
(should: If one is present, check for consistency.)
Build up a list of output variables. Currently only one variable is
supported.
Open the output record info files. If not found, create empty AT
tables.
Build up a list of input datasets that will be needed from their
t_start, etc.
Open the keyword description file. If not present, then finish by
closing the output dataset. Maybe only the overview file was
wanted.
For each keyword do:
Decide which variables are to be updated, or if this is a global
value for the overview file.
If global, look for the keyword in each input dataset until it is
found. If never found, take the default.
If record keywords, for each t_obs in the target dataset, choose
the input data record which best matches in quality, closeness in
time, closeness in alternate key value, and dataset index. The
record must be within +/- half of the output dataset t_step from
t_obs. The record quality must also test successfully against the
reject masks which may be specified in the parameter list.
Take the keyword value and add it to the appropriate AT table.
KEYWORD LIST FORMAT for gather_info program.
The list of keywords will be passed to the program as a dataset
name using the parameter "keylist". The list will be stored as an
".rdb" file. The list will contain a list of keywords to be added
to the global (overview) file and to the per-record information
files. For each keyword listed there are five fields of information
in the rdb file.
Scope:     A comma delimited list of variable numbers in the target
           dataset, or one of the words "ALL", "GLOBAL", "DATA", or
           "DEFAULT". ALL means all variables. GLOBAL means the
           attribute will be added to the overview file rather than
           record files. DATA means to take the data record rather
           than a keyword. DEFAULT is like GLOBAL but SourceVar will
           be ignored.
Keyword:   Contains the attribute name in the target output dataset.
           A single "*" means take all keywords from the source
           variable.
Default:   Contains an optional default value to use if the input
           keyword is not present in the input dataset, or if there
           is no input dataset.
SourceVar: Contains the input variable number or name.
SourceKey: Contains the input attribute keyword name. Must be absent
           if Keyword contains an "*". If the SourceKey begins with
           a "$" character the value will be generated by
           gather_info. If the SourceKey name begins with a "#"
           character the value will be looked for in the "quality"
           dataset if the -q flag is set; otherwise it is an error.
close the output dataset.
}
close the input datasets.
done.
}
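The record-selection step in the logic above can be sketched as
follows. This is only an illustration in Python, not the module's
code. The Rec fields and the pick_best helper are hypothetical, and
quality is ranked here by plain numeric value (smaller is better) as
a stand-in for the mask-based comparison described under QUALITY
NOTES.

```python
from dataclasses import dataclass


@dataclass
class Rec:
    """Hypothetical summary of one input record."""
    t_obs: float          # observation time, in seconds
    quality: int          # unsigned quality word
    alt_key: float = 0.0  # alternate key value
    ds_index: int = 0     # index of the input dataset


def pick_best(target_t, t_step, recs, reject_masks=(), alt_target=0.0):
    """Return the preferred record for one target time, or None."""
    half = t_step / 2.0
    candidates = [
        r for r in recs
        # Must lie within +/- half a t_step of the target time ...
        if abs(r.t_obs - target_t) <= half
        # ... and must not hit any reject mask.
        and not any(r.quality & m for m in reject_masks)
    ]
    if not candidates:
        return None
    # Rank by quality, then closeness in time, then closeness in
    # alternate key value, then input dataset index.
    return min(
        candidates,
        key=lambda r: (
            r.quality,  # ASSUMPTION: smaller numeric quality is better
            abs(r.t_obs - target_t),
            abs(r.alt_key - alt_target),
            r.ds_index,
        ),
    )
```

A target time with no candidate inside the window yields None, which
a caller could treat as "use the Default value".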
gather_info must be called for each input dataseries that
will be used to contribute global or per-record info to the
output datasets.
The default datasets are TS_EQ, starting at the default
SOI_BASE_EPOCH, which is 1993.01.01_00:00:00_TAI. The default
is hourly datasets with one-second records.
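As a rough illustration of these defaults, the sketch below maps a
time to a dataset series number and a record slot for a TS_EQ
layout. The formula (series number = whole t_block intervals since
the epoch) and the function names are assumptions made for the
example, not taken from the module. Times are plain seconds since
the epoch, so no TAI/UT conversion is attempted.

```python
T_BLOCK = 3600.0  # default T_BLOCK: hourly datasets, in seconds
T_STEP = 1.0      # default record spacing: one second


def locate(t_since_epoch, t_block=T_BLOCK, t_step=T_STEP):
    """Return (series, slot) for a time given in seconds since T_EPOCH.

    ASSUMPTION: series numbers count whole t_block intervals from the
    epoch, and slots count whole t_step intervals within the block.
    """
    series = int(t_since_epoch // t_block)
    slot = int((t_since_epoch - series * t_block) // t_step)
    return series, slot


def records_per_dataset(t_block=T_BLOCK, t_step=T_STEP):
    """Number of record slots in one dataset."""
    return int(round(t_block / t_step))
```

With the defaults, each dataset holds 3600 record slots.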
If there is a pre-existing dataset but the existing structure
is to be replaced, the "-N" flag must be set. Otherwise the
record time structure will be taken from the existing dataset.
Also, alt_key sorts are only allowed when creating the output
dataset from scratch.
If no data records are found in the input datasets, the program
exits with GATHER_INFO_NO_INPUT_DATA unless the "-X" flag is
set.
If the "f=1" or "-f" flag is set, any DATA records gathered
will be forced to floating point.
There are several virtual SourceKeys that may be used as the
source of keyword values to "gather" from for record info.
These are:
$DSNAME    The 4-part dataset name, prog,lev,series[#series], of
           the current source with the record spec added as a
           "sel:[recnumber]".
$RUNTIME   The UT time at which this run of gather_info was made.
These may be placed in the SourceKey position and placed in
the global or record info as desired.
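A minimal sketch of how SourceKey values might be dispatched on
their first character is given below. The function names are
hypothetical. The exact text produced for $DSNAME is a guess at the
layout described above, and the $RUNTIME string simply follows the
time format used for T_START in the SYNOPSIS.

```python
from datetime import datetime, timezone


def classify_source_key(source_key):
    """Classify a SourceKey by its leading character."""
    if source_key.startswith("$"):
        return "virtual"   # value generated by gather_info
    if source_key.startswith("#"):
        return "quality"   # value looked up in the quality dataset (-q)
    return "keyword"       # ordinary input attribute keyword


def virtual_value(source_key, dsname, recnumber, now=None):
    """Produce the value of a virtual SourceKey (illustrative only)."""
    if source_key == "$DSNAME":
        # ASSUMPTION: record spec appended after a comma.
        return f"{dsname},sel:[{recnumber}]"
    if source_key == "$RUNTIME":
        now = now or datetime.now(timezone.utc)
        return now.strftime("%Y.%m.%d_%H:%M:%S_UT")
    raise KeyError(source_key)
```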
If the target SourceKey is not found, the Default value is
used. If the Default value is the reserved value "$REQUIRED",
gather_info exits with the REQUIRED_KEYVALUE_MISSING error
code set.
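The fallback rule can be sketched as below. The resolve function and
the exception class are hypothetical. The exception merely stands in
for exiting with the REQUIRED_KEYVALUE_MISSING error code.

```python
REQUIRED = "$REQUIRED"  # reserved Default value


class RequiredKeyValueMissing(Exception):
    """Stand-in for the REQUIRED_KEYVALUE_MISSING error code."""


def resolve(keywords, source_key, default):
    """Return the keyword value, the default, or fail if required."""
    if source_key in keywords:
        return keywords[source_key]
    if default == REQUIRED:
        # No value and no usable default: report the missing key.
        raise RequiredKeyValueMissing(source_key)
    return default
```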
QUALITY NOTES
Rejections and quality comparisons are based on a .rdb file
whose name can be specified in the argument list with the
keyword MASKS. If a filename is not explicitly specified in
the arguments, a configured file corresponding to the output
series name is used. The file should have the columns
"COMPARE" and "REJECT". The entries should be strings which
can be converted to unsigned ints by strtoul.
Example:
COMPARE REJECT
------- ------
0x01    0x80
0x7C
No more than MAX_MASKS rows of the table will be used. The
number of entries in each column may differ. The order of
the COMPARE entries is significant.
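Reading such a table might look like the sketch below. It assumes a
tab-separated layout with a header line followed by a line of
dashes. The parse_masks name and the max_masks value of 8 are
placeholders, since the real MAX_MASKS value is not given here.
Python's int(s, 0) is used as a stand-in for strtoul and does not
accept C-style octal such as "010".

```python
def parse_masks(text, max_masks=8):
    """Return (compare_masks, reject_masks) from rdb-style table text.

    ASSUMPTION: columns are tab-separated, and the second non-blank
    line is the row of dashes under the header.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = [name.strip() for name in lines[0].split("\t")]
    # Skip the header and dashes lines; use at most max_masks rows.
    rows = [ln.split("\t") for ln in lines[2:2 + max_masks]]

    def column(name):
        i = header.index(name)
        # Columns may have different lengths, so skip empty cells.
        return [int(r[i], 0) for r in rows if i < len(r) and r[i].strip()]

    return column("COMPARE"), column("REJECT")
```

The order of the returned COMPARE list follows the file, which
matters because the masks are applied in sequence.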
The function compare_quality compares two unsigned int
qualities by testing them successively against the comparison
masks. The two qualities are bitwise ANDed with a comparison
mask and the two results are compared NUMERICALLY. Qualities
are compared against the masks in order until the results are
unequal or until all masks have been used. The return value
from compare_quality is the result of the first non-zero
NUMERICAL comparison or 0 if all comparisons were equal.
The function reject_quality "rejects" an unsigned int quality
if the bitwise AND of the quality with any of the reject masks
is non-zero.
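The two functions can be sketched in Python as below. This mirrors
the description rather than the C source. The sign convention of the
return value (negative when the first quality has the smaller masked
value) is an assumption.

```python
def compare_quality(q1, q2, compare_masks):
    """Compare two quality words against the masks, in order.

    Returns -1, 0, or 1. ASSUMPTION: -1 means q1 has the numerically
    smaller masked value at the first mask where the two differ.
    """
    for mask in compare_masks:
        a, b = q1 & mask, q2 & mask
        if a != b:
            return -1 if a < b else 1
    return 0  # equal under every mask


def reject_quality(q, reject_masks):
    """True if the quality shares any set bit with any reject mask."""
    return any(q & mask for mask in reject_masks)
```

With the compare masks 0x01 and 0x7C, the qualities 0x04 and 0x08
tie on the first mask and are separated by the second.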
For example, if the mask table above is used, any quality
without bit 0 (MDI_TELEM_ERRS) set is better than any quality
with bit 0 set. And if two qualities have the same bit 0, the
quality with the smallest NUMERICAL bitwise AND with bits 2-6
(MDI_MISSVALS) is better. Qualities with bit 7
(MDI_ONBRD_PROC_ERR) set are rejected. Note that the numerical
comparison makes it possible to test sets of bits.
If the masks table cannot be accessed or contains errors,
default masks which are equivalent to the previous version
of gather_info are used. These are set by the function
get_masks to compare as in the example above but nothing is
rejected.
FILES
/home/soi/CM/include/soi_at.h
/home/soi/CM/include/data_qual.h
/home/soi/CM/include/soi_error.h
/home/soi/CM/tables/observables/masks/{series}_masks.rdb
/soidata/info/mdi_log/{level}/{series}/{#%06d#series}.record.rdb
SEE ALSO
vds(3)
DIAGNOSTICS
Writes informational messages to the history log.
Writes warning and error messages to the error log.
Returns non-zero error code on failure.
BUGS