metobs_toolkit.Dataset.get_qc_stats

Dataset.get_qc_stats(obstype='temp', stationname=None, make_plot=True)

Get quality control statistics.

Compute frequency statistics of the QC labels for an observation type. The output is a dataframe containing the label frequencies, expressed as percentages.

These frequencies can also be plotted as a set of pie charts, one per check.

Use stationname to subset the data to one or more stations.
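
To make the idea of "label frequencies as percentages" concrete, here is a minimal sketch in plain Python. It is not the toolkit's implementation: the `records` list, the `outlier` label, the station names, and the `label_percentages` helper are all hypothetical and only illustrate counting labels, optionally for a single station, and converting the counts to percentages.

```python
from collections import Counter

# Hypothetical (station name, QC label) pairs; not real toolkit output.
records = [
    ("station_A", "ok"),
    ("station_A", "ok"),
    ("station_A", "ok"),
    ("station_A", "outlier"),
    ("station_B", "ok"),
    ("station_B", "ok"),
    ("station_B", "outlier"),
    ("station_B", "outlier"),
]


def label_percentages(records, stationname=None):
    """Return {label: percentage of records}, optionally for one station.

    Hypothetical helper that mimics the idea behind the statistics,
    not the actual get_qc_stats code.
    """
    # Keep all labels, or only those of the requested station.
    labels = [
        label
        for name, label in records
        if stationname is None or name == stationname
    ]
    if not labels:
        return {}  # unknown station or empty input
    counts = Counter(labels)
    total = len(labels)
    # Convert absolute counts to percentages of the selected records.
    return {label: 100.0 * n / total for label, n in counts.items()}


all_stats = label_percentages(records)
station_stats = label_percentages(records, stationname="station_A")
```

With this toy data, `all_stats` covers all eight records, while `station_stats` is computed from the four records of `station_A` only, so the percentages in each result sum to 100 over the records that were selected.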

Parameters:
  • obstype (str, optional) – Observation type to analyse the QC labels on. The default is ‘temp’.

  • stationname (str, optional) – Name of the station to subset the QC labels on. If None, all stations are used. The default is None.

  • make_plot (bool, optional) – If True, a plot with pie charts is generated. The default is True.

Returns:

dataset_qc_stats – A table containing the label frequencies per check presented as percentages.

Return type:

pandas.DataFrame

Examples

>>> import metobs_toolkit
>>>
>>> # Import data into a Dataset
>>> dataset = metobs_toolkit.Dataset()
>>> dataset.update_settings(
...                         input_data_file=metobs_toolkit.demo_datafile,
...                         input_metadata_file=metobs_toolkit.demo_metadatafile,
...                         template_file=metobs_toolkit.demo_template,
...                         )
>>> dataset.import_data_from_file()
>>> dataset.coarsen_time_resolution(freq='1h')
>>>
>>> # Apply quality control on the temperature observations
>>> dataset.apply_quality_control(obstype='temp')  # Using the default QC settings
>>> dataset
Dataset instance containing:
     *28 stations
     *['temp', 'humidity', 'wind_speed', 'wind_direction'] observation types
     *10080 observation records
     *1676 records labeled as outliers
     *0 gaps
     *3 missing observations
     *records range: 2022-09-01 00:00:00+00:00 --> 2022-09-15 23:00:00+00:00 (total duration:  14 days 23:00:00)
     *time zone of the records: UTC
     *Coordinates are available for all stations.
>>>
>>> # Get quality control statistics
>>> stats = dataset.get_qc_stats(make_plot=False)
>>> stats
({'ok': 83.37301587301587, 'QC outliers': 16.6269841269...