Eva and YAML files
Eva uses YAML-formatted files to direct data loading, data transformation, and plot generation. YAML is a human-readable data-serialization language. YAML files use a .yaml (preferred) or .yml file extension. As in Python, indentation is significant in YAML.
Eva contains examples of working YAML input files at eva/src/eva/tests/config.
YAML general notes
In YAML, the # character precedes comments.
Keys are identified by a trailing colon (mykey:) and are followed by their corresponding value, which may be a single item, other keys, a list, etc.
List members are denoted by a leading dash and space (- mylist), with one list member per line. Lists can also be written inline in square brackets, with entries separated by commas: [item1, item2].
Indentation is critical in YAML, as it defines relationships between elements.
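The notes above can be seen together in one short fragment. This is only an illustrative sketch: every key and value name in it is hypothetical and has no meaning to Eva.

```yaml
# A comment runs from the '#' to the end of the line.
mykey: myvalue                # a key followed by a single value
myparent:                     # a key whose value is other keys
  child_one: 1
  child_two: 2
mylist:                       # block-style list, one member per line
  - item1
  - item2
inline_list: [item1, item2]   # the same list written in bracket form
```

Because child_one and child_two are indented under myparent, they belong to it; removing that indentation would turn them into separate top-level keys.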
Eva’s requirements for input
Eva expects to find two top-level keys, datasets: and graphics:, in an input YAML file. (Top level here means the line has no indentation.) A third top-level key, transforms:, may optionally be specified. The following example shows a datasets: section containing a single dataset:
datasets:
  - name: experiment
    type: IodaObsSpace
    filenames:
      - ${data_input_path}/ioda_obs_space.aircraft.hofx.2020-12-14T210000Z.nc4
    groups:
      - name: ObsValue
        variables: &variables [airTemperature, windEastward]
The value of the type: key above must reference a specific data ingest class. Note that the YAML file specifies the type in camel case (CamelCase), while the corresponding data ingest class is in snake case (snake_case). See https://github.com/JCSDA-internal/eva-docs/tree/develop/doc/eva_user_guide/data_ingest for the available ingest classes.
Anchors and aliases
Anchors and aliases are YAML constructs that reduce repeated syntax and allow existing data nodes to be reused and extended. An anchor (&) can be placed on a YAML component to mark a list or a multi-line section. An alias (*) can then refer to that anchor later in the document to reuse the anchored contents.
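As a minimal plain-YAML illustration, an anchor defined on one list can be reused through an alias elsewhere in the same file. The key names below are hypothetical and are not part of Eva's input format.

```yaml
# Hypothetical keys, for illustration only.
first_section:
  channels: &channel_list [1, 2, 3]   # the anchor marks this list
second_section:
  channels: *channel_list             # the alias reuses the anchored list
```

A YAML parser reads the channels value under second_section as the same list, [1, 2, 3], so the list only has to be maintained in one place.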
Eva extends this construct. Consider this example transform:
transforms:
  # Generate omb for GSI
  - transform: arithmetic
    new name: experiment::ObsValueMinusGsiHofXBc::${variable}
    equals: experiment::ObsValue::${variable}-experiment::GsiHofXBc::${variable}
    for:
      variable: *variables
Here variable is specified as the alias *variables. This means that all variables specified by the preceding variables anchor,
variables: &variables [airTemperature, windEastward]
will be applied here. The transform steps through the variables, plugging them in one at a time wherever ${variable} appears.
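To make the substitution concrete, the sketch below writes out by hand what the looped transform produces for the two anchored variables. It is only an illustration of the expansion: whether Eva accepts transforms written this way, without the for: block, is an assumption and is not confirmed by this guide.

```yaml
# Hand-expanded illustration of the ${variable} substitution (not Eva output).
transforms:
  # ${variable} replaced by airTemperature
  - transform: arithmetic
    new name: experiment::ObsValueMinusGsiHofXBc::airTemperature
    equals: experiment::ObsValue::airTemperature-experiment::GsiHofXBc::airTemperature
  # ${variable} replaced by windEastward
  - transform: arithmetic
    new name: experiment::ObsValueMinusGsiHofXBc::windEastward
    equals: experiment::ObsValue::windEastward-experiment::GsiHofXBc::windEastward
```

Using the alias keeps the transform in step with the dataset: adding a variable to the anchored list in datasets: extends the loop with no further edits.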
This same mechanism may be applied to the plots in graphics: as well.