Python API
yamlprocessor.dataprocess
Process includes and variable substitutions in a YAML file.
For each value {"INCLUDE": "filename.yaml"}, load content from the include file and substitute the value with the content of the include file.
For each string value with $NAME or ${NAME} syntax, substitute with the value of the corresponding (environment) variable.
For each string value with $YP_TIME_* or ${YP_TIME_*} syntax, substitute with the value of the corresponding date-time string.
Validate against the specified JSON schema if the root file starts with either a #!<SCHEMA-URI> line or a # yaml-language-server: $schema=<SCHEMA-URI> line.
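The include step described above can be pictured as a recursive walk over the loaded data. The following is a minimal sketch, not the library's implementation: `resolve_includes`, `load`, and `FILES` are hypothetical names, and it assumes an include is a mapping whose only key is `INCLUDE`. Files are represented by already-parsed Python objects so that the example needs no YAML parser.

```python
def resolve_includes(data, load):
    """Replace every {"INCLUDE": name} mapping with the data returned by load(name)."""
    if isinstance(data, dict):
        if set(data) == {"INCLUDE"}:
            # Included content may itself contain further includes.
            return resolve_includes(load(data["INCLUDE"]), load)
        return {key: resolve_includes(value, load) for key, value in data.items()}
    if isinstance(data, list):
        return [resolve_includes(item, load) for item in data]
    # Scalars (strings, numbers, None, ...) are returned unchanged.
    return data


# Hypothetical stand-in for files on disk, already parsed into Python objects.
FILES = {
    "greet.yaml": {"hello": {"INCLUDE": "names.yaml"}},
    "names.yaml": ["earth", "mars"],
}

result = resolve_includes({"INCLUDE": "greet.yaml"}, FILES.__getitem__)
print(result)  # {'hello': ['earth', 'mars']}
```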
CLI usage allows multiple positional arguments.
In usage 1, the final positional argument is the output file name, and the other arguments are input file names.
In usage 2, with the --output=FILENAME (-o FILENAME) option, all positional arguments are input file names.
In either case, all input files will be concatenated together (as text), before being parsed as a combined YAML file.
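The two usage forms can be illustrated with `argparse`. This is a sketch under stated assumptions rather than the tool's actual parser: `split_arguments` is a hypothetical helper, and it assumes that usage 1 always has at least one input name before the output name.

```python
import argparse


def split_arguments(argv):
    """Return (input_names, output_name) for the two usage forms."""
    parser = argparse.ArgumentParser()
    parser.add_argument("filenames", nargs="+")
    parser.add_argument("--output", "-o", metavar="FILENAME", default=None)
    args = parser.parse_args(argv)
    if args.output is not None:
        # Usage 2: every positional argument is an input file name.
        return args.filenames, args.output
    # Usage 1: the final positional argument is the output file name.
    return args.filenames[:-1], args.filenames[-1]


print(split_arguments(["a.yaml", "b.yaml", "out.yaml"]))
print(split_arguments(["--output=out.yaml", "a.yaml", "b.yaml"]))
```

Both calls yield the same split: `a.yaml` and `b.yaml` as inputs, `out.yaml` as output.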
- class yamlprocessor.dataprocess.DataProcessor
Process YAML and compatible data structure.
Import sub-data-structure from include files. Process variable substitution in string values. Process date-time substitution in string values. Validate against JSON schema.
- .include_paths: list
Locations for searching include files. Default is the value of the YP_INCLUDE_PATH environment variable split into a list.
- .schema_prefix: str = os.getenv("YP_SCHEMA_PREFIX")
Prefix for JSON schema specified as non-existing relative paths. See also YP_SCHEMA_PREFIX.
- .time_formats: dict = {'': '%FT%T%z'}
Default and named time formats. See also YP_TIME_FORMAT and YP_TIME_FORMAT_.
- .time_now: datetime.datetime
Date-time at instance initialisation.
- .time_ref: datetime.datetime
Reference date-time. Default is the value of the YP_TIME_REF_VALUE environment variable as datetime.datetime, or time_now if the environment variable is not defined.
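The default entry in `.time_formats`, `'%FT%T%z'`, uses the C shorthands `%F` (for `%Y-%m-%d`) and `%T` (for `%H:%M:%S`). Whether Python's `strftime` accepts these shorthands depends on the platform's C library, so the sketch below spells the format out in full. The timestamp is a made-up fixed value chosen only to make the result reproducible.

```python
from datetime import datetime, timezone

# Portable spelling of the default '%FT%T%z' format:
# %F is shorthand for %Y-%m-%d and %T for %H:%M:%S.
DEFAULT_FORMAT = "%Y-%m-%dT%H:%M:%S%z"

# A fixed, hypothetical timestamp in UTC.
moment = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
formatted = moment.strftime(DEFAULT_FORMAT)
print(formatted)  # 2024-01-02T03:04:05+0000
```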
- get_filename(filename: str, parent_filenames: list) → str
Return absolute path of filename.
If filename is a relative path, look for the file in the directories containing the parent files, then in the current working directory, then in each path in .include_paths.
- Parameters:
filename – File name to expand or return.
parent_filenames – Stack of parent file names.
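The search order described for get_filename can be sketched with `os.path`. This is an illustration only: `find_file` is a hypothetical function, the order in which parent directories are tried is assumed to be the order given, and raising `FileNotFoundError` for a missing file is an assumption, since the behaviour in that case is not stated here.

```python
import os


def find_file(filename, parent_filenames, include_paths):
    """Return the absolute path of filename using the documented search order."""
    if os.path.isabs(filename):
        return filename
    # Directories containing the parent files come first
    # (assumption: tried in the order given) ...
    roots = [os.path.dirname(os.path.abspath(name)) for name in parent_filenames]
    # ... then the current working directory, then each include path.
    roots.append(os.getcwd())
    roots.extend(include_paths)
    for root in roots:
        candidate = os.path.join(root, filename)
        if os.path.exists(candidate):
            return os.path.abspath(candidate)
    # Assumption: the documented behaviour for a missing file is not given.
    raise FileNotFoundError(filename)
```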
- static load_file(filename: str | IO) → object
Load content of (YAML) file into a data structure.
- Parameters:
filename – file (name) to load content.
- Returns:
the loaded data structure.
- static load_file_schema(filename: str | IO) → object
Load schema location from the schema association line of file.
- Parameters:
filename – name of file to load schema location.
- Returns:
a string containing the location of the schema or None.
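The two schema association forms, `#!<SCHEMA-URI>` and `# yaml-language-server: $schema=<SCHEMA-URI>`, can be recognised with a single regular expression. The sketch below is not the library's code: `SCHEMA_LINE` and `schema_location` are hypothetical names, it works on text rather than on a file, and it assumes the association line is the first line and that the URI contains no whitespace.

```python
import re

# Matches either '#!<SCHEMA-URI>' or '# yaml-language-server: $schema=<SCHEMA-URI>'.
SCHEMA_LINE = re.compile(
    r"^#(?:!|\s*yaml-language-server:\s*\$schema=)\s*(?P<uri>\S+)\s*$"
)


def schema_location(text):
    """Return the schema location named on the first line of text, or None."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    match = SCHEMA_LINE.match(first_line)
    return match.group("uri") if match else None
```

An ordinary comment or a data line at the top of the file yields `None`, which mirrors the "location of the schema or None" return value.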
- load_include_file(value: object, parent_filenames: list, variable_map: dict) → tuple
Load data if value indicates an include file.
- Parameters:
value – Value that may contain file name to load.
parent_filenames – Stack of parent file names.
variable_map – Variable map in the local scope, which may contain additional variables.
- log_settings()
Log (info) current settings of the processor.
- process_data(in_filenames: str | Iterable[str], out_filename: str) → None
Concatenate input files and load resulting data.
Dump results in output file.
- Parameters:
in_filenames – input file name str or input file names list.
out_filename – output file name.
- process_variable(item: object, variable_map: dict = None) → object
Substitute (environment) variables into a string value.
Return item as-is if not .is_process_variable or if item is not a string.
For each $NAME and ${NAME} in item, substitute with the value of the environment variable NAME.
If NAME is not defined in the .variable_map and .unbound_placeholder is None, raise an UnboundVariableError.
If NAME is not defined in the .variable_map and .unbound_placeholder equals the value of DataProcessor.UNBOUND_ORIGINAL, leave the original syntax unchanged.
If NAME is not defined in the .variable_map and .unbound_placeholder is any other value that is not None, substitute NAME with the value of .unbound_placeholder.
- Parameters:
item – Item to process. Do nothing if not a str.
variable_map – Variable map in the local scope, which may contain additional variables.
- Returns:
Processed item on success.
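The three outcomes for an unbound NAME can be sketched with a regular expression and a replacement callback. This is a simplified stand-in, not the method itself: `substitute`, `UnboundVariable`, and `KEEP_ORIGINAL` are hypothetical names (the last standing in for DataProcessor.UNBOUND_ORIGINAL, whose actual value is not given here), the characters allowed in NAME are assumed, and any escaping rules are ignored.

```python
import re


class UnboundVariable(KeyError):
    """Stand-in for UnboundVariableError in this sketch."""


# Hypothetical sentinel standing in for DataProcessor.UNBOUND_ORIGINAL.
KEEP_ORIGINAL = object()

# $NAME or ${NAME}; the set of characters allowed in NAME is an assumption.
VARIABLE = re.compile(r"\$(?:(?P<plain>[A-Za-z_]\w*)|\{(?P<braced>[A-Za-z_]\w*)\})")


def substitute(item, variable_map, unbound_placeholder=None):
    """Substitute $NAME and ${NAME} in item, handling unbound names as described."""
    if not isinstance(item, str):
        return item  # non-string values are returned as-is

    def replace(match):
        name = match.group("plain") or match.group("braced")
        if name in variable_map:
            return str(variable_map[name])
        if unbound_placeholder is None:
            raise UnboundVariable(name)
        if unbound_placeholder is KEEP_ORIGINAL:
            return match.group(0)  # leave the original syntax unchanged
        return unbound_placeholder

    return VARIABLE.sub(replace, item)


print(substitute("hello $WHO and ${WHO}!", {"WHO": "world"}))  # hello world and world!
```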
- exception yamlprocessor.dataprocess.UnboundVariableError
An error raised on attempt to substitute an unbound variable.
- yamlprocessor.dataprocess.configure_basic_logging(level=20)
Configure basic logging, suitable for most CLI applications.
Basic no-frill format. Stream handler prints message on STDERR.
Normal usage:
>>> from yamlprocessor.dataprocess import DataProcessor
>>> processor = DataProcessor()
>>> # ... Customise the `DataProcessor` instance as necessary ..., then:
>>> processor.process_data(in_file_name, out_file_name)
- yamlprocessor.dataprocess.construct_yaml_timestamp(constructor, node)
Return a method to add to the YAML constructor to parse datetime.
yamlprocessor.schemaprocess
Modularise a JSON schema
Modularise a JSON schema and allow it to accept a data structure that can be composed of include files.
Two positional arguments are expected:
The file name of the JSON schema file.
The file name of a configuration file in JSON format.
The configuration file expects a mapping, where the keys are the file names (paths relative to the current working directory) of the output sub-schema files, and the values are the sub-schema break point locations (expressed in JMESPath format) in the input JSON schema document.
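As an illustration of the shape of such a configuration, the sketch below builds the mapping in Python and serialises it as JSON. The file names and JMESPath expressions are made up for this example and do not refer to any real schema.

```python
import json

# Hypothetical configuration: keys are output sub-schema file names
# (relative to the current working directory); values are JMESPath
# expressions locating the break points in the input JSON schema.
config = {
    "schema/location.schema.json": "properties.location",
    "schema/target.schema.json": "properties.targets.items",
}

# Text that could be saved as the configuration file.
config_text = json.dumps(config, indent=4)
print(config_text)
```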