metobs_toolkit.Dataset.get_qc_stats
- Dataset.get_qc_stats(obstype='temp', stationname=None, make_plot=True)
Get quality control statistics.
Compute frequency statistics on the QC labels for an observation type. The output is a dataframe containing the label frequencies as percentages.
These frequencies can also be plotted as a collection of pie charts, one per check.
With stationname you can subset the data to one or more stations.
- Parameters:
obstype (str, optional) – Observation type to analyse the QC labels on. The default is ‘temp’.
stationname (str, optional) – Station name to subset the QC labels on. If None, all stations are used. The default is None.
make_plot (bool, optional) – If True, a plot with pie charts is generated. The default is True.
- Returns:
dataset_qc_stats – A table containing the label frequencies per check presented as percentages.
- Return type:
pandas.DataFrame
Examples
>>> import metobs_toolkit
>>>
>>> # Import data into a Dataset
>>> dataset = metobs_toolkit.Dataset()
>>> dataset.update_settings(
...     input_data_file=metobs_toolkit.demo_datafile,
...     input_metadata_file=metobs_toolkit.demo_metadatafile,
...     template_file=metobs_toolkit.demo_template,
... )
>>> dataset.import_data_from_file()
>>> dataset.coarsen_time_resolution(freq='1h')
>>>
>>> # Apply quality control on the temperature observations
>>> dataset.apply_quality_control(obstype='temp')  # Using the default QC settings
>>> dataset
Dataset instance containing:
     *28 stations
     *['temp', 'humidity', 'wind_speed', 'wind_direction'] observation types
     *10080 observation records
     *1676 records labeled as outliers
     *0 gaps
     *3 missing observations
     *records range: 2022-09-01 00:00:00+00:00 --> 2022-09-15 23:00:00+00:00 (total duration: 14 days 23:00:00)
     *time zone of the records: UTC
     *Coordinates are available for all stations.
>>>
>>> # Get quality control statistics
>>> stats = dataset.get_qc_stats(make_plot=False)
>>> stats
({'ok': 83.37301587301587, 'QC outliers': 16.6269841269...
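To make the phrase "label frequencies presented as percentages" concrete, the following is a minimal plain-Python sketch of that computation. It is not the toolkit's implementation: the helper `label_percentages` and the list `example_labels` are hypothetical names introduced for illustration, and the labels are made up rather than taken from the demo dataset.

```python
from collections import Counter


def label_percentages(labels):
    """Return a {label: percentage} mapping for an iterable of QC labels.

    Hypothetical helper for illustration only; it is not part of metobs_toolkit.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        # No records: there is nothing to express as a percentage.
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


# Made-up labels for eight records (assumption, not demo data).
example_labels = ["ok"] * 6 + ["QC outliers"] * 2
percentages = label_percentages(example_labels)
print(percentages)  # {'ok': 75.0, 'QC outliers': 25.0}
```

The percentages for one set of labels sum to 100, which is the same property that makes them suitable for a pie chart per check.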