metobs_toolkit.Dataset.update_gaps_and_missing_from_outliers
- Dataset.update_gaps_and_missing_from_outliers(obstype='temp', n_gapsize=None)
Interpret the outliers as missing observations.
If a station has a sequence of consecutive outliers that is longer than n_gapsize, then that sequence is interpreted as a gap.
The outliers are not removed.
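The distinction between a missing observation and a gap can be illustrated with a minimal, library-independent sketch. The function `classify_outlier_runs` is hypothetical and not part of metobs_toolkit. It assumes that a run of at least n_gapsize consecutive outliers counts as a gap, following the parameter description below; the exact boundary condition used by the toolkit may differ.

```python
from itertools import groupby


def classify_outlier_runs(is_outlier, n_gapsize):
    """Label each run of consecutive outliers as 'gap' or 'missing'.

    Hypothetical helper for illustration only.
    Assumption: a run of at least n_gapsize records is treated as a gap.
    Returns a list of (start_index, run_length, label) tuples.
    """
    runs = []
    position = 0
    for flagged, group in groupby(is_outlier):
        length = len(list(group))
        if flagged:
            label = "gap" if length >= n_gapsize else "missing"
            runs.append((position, length, label))
        position += length
    return runs


# One isolated outlier, followed later by three consecutive outliers.
flags = [False, True, False, True, True, True, False]
print(classify_outlier_runs(flags, n_gapsize=3))
```

With n_gapsize=3, the isolated outlier is labeled as a missing observation and the run of three is labeled as a gap.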
- Parameters:
obstype (str, optional) – Use the outliers on this observation type to update the gaps and missing timestamps. The default is ‘temp’.
n_gapsize (int, optional) – The minimum number of consecutive missing observations to define as a gap. If None, n_gapsize is taken from the gap definition in the settings. The default is None.
- Return type:
None.
Note
Gaps and missing observations that result from outliers on a specific obstype are assumed to be gaps and missing observations for all obstypes.
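As a rough illustration of this assumption, the sketch below applies the timestamps flagged on one observation type to every observation type. The helper `propagate_missing` and the plain-dict layout are hypothetical and do not reflect how the toolkit stores its data.

```python
def propagate_missing(obstypes, flagged_timestamps):
    """Assign the same set of missing timestamps to every obstype.

    Hypothetical helper for illustration only; the timestamps come from
    outliers found on a single obstype (for example 'temp').
    """
    return {obstype: set(flagged_timestamps) for obstype in obstypes}


obstypes = ["temp", "humidity", "wind_speed", "wind_direction"]
temp_outlier_timestamps = {"2022-09-01 03:00", "2022-09-01 04:00"}

missing = propagate_missing(obstypes, temp_outlier_timestamps)
print(sorted(missing["humidity"]))
```

Here the two timestamps flagged on 'temp' are also treated as missing for 'humidity', 'wind_speed' and 'wind_direction'.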
Note
Be aware that n_gapsize is applied at the current time resolution of the Dataset. If the dataset has been coarsened, this differs from the gap check that was applied to the imported data.
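To see why the resolution matters, the following sketch converts a record count into an approximate time span. The function `gap_duration_threshold` and the value of n_gapsize are illustrative assumptions, not toolkit defaults.

```python
from datetime import timedelta


def gap_duration_threshold(n_gapsize, resolution):
    """Approximate time span represented by n_gapsize consecutive records.

    Hypothetical helper for illustration only.
    """
    return n_gapsize * resolution


n_gapsize = 10  # illustrative value, not a toolkit default

# Same record count, different time spans depending on the resolution.
at_import = gap_duration_threshold(n_gapsize, timedelta(minutes=5))
after_coarsening = gap_duration_threshold(n_gapsize, timedelta(hours=1))
print(at_import, after_coarsening)
```

The same n_gapsize therefore corresponds to a much longer period once the dataset has been coarsened from 5-minute to hourly records.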
Examples
>>> import metobs_toolkit
>>>
>>> # Import data into a Dataset
>>> dataset = metobs_toolkit.Dataset()
>>> dataset.update_settings(
...     input_data_file=metobs_toolkit.demo_datafile,
...     input_metadata_file=metobs_toolkit.demo_metadatafile,
...     template_file=metobs_toolkit.demo_template,
... )
>>> dataset.import_data_from_file()
>>> dataset.coarsen_time_resolution(freq='1h')
>>>
>>> # Apply quality control on the temperature observations
>>> dataset.apply_quality_control(obstype='temp')  # Using the default QC settings
>>> dataset
Dataset instance containing:
    *28 stations
    *['temp', 'humidity', 'wind_speed', 'wind_direction'] observation types
    *10080 observation records
    *1676 records labeled as outliers
    *0 gaps
    *3 missing observations
    *records range: 2022-09-01 00:00:00+00:00 --> 2022-09-15 23:00:00+00:00 (total duration: 14 days 23:00:00)
    *time zone of the records: UTC
    *Coordinates are available for all stations.
>>> # Interpret the outliers as missing/gaps
>>> dataset.update_gaps_and_missing_from_outliers(obstype='temp')
>>> dataset
Dataset instance containing:
    *28 stations
    *['temp', 'humidity', 'wind_speed', 'wind_direction'] observation types
    *10080 observation records
    *0 records labeled as outliers
    *2 gaps
    *1473 missing observations
    *records range: 2022-09-01 00:00:00+00:00 --> 2022-09-15 23:00:00+00:00 (total duration: 14 days 23:00:00)
    *time zone of the records: UTC
    *Coordinates are available for all stations.