metobs_toolkit.Dataset.fill_gaps_linear

Dataset.fill_gaps_linear(obstype='temp', overwrite_fill=False)

Fill the gaps using linear interpolation.

The gapsfilldf attribute of the Dataset instance is updated if the gaps have not been filled yet or if overwrite_fill is set to True.

Parameters:
  • obstype (string, optional) – Observation type whose gaps are filled. The default is ‘temp’.

  • overwrite_fill (bool, optional) – If False, a gap that already contains fill values is skipped. If True, the existing gapfill values and info are overwritten. The default is False.

Returns:

gapfilldf – A dataframe containing all the filled records.

Return type:

pandas.DataFrame
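
Judging from the example output further down, the returned frame is indexed by name and datetime and holds the filled values plus a label column. The following is a minimal sketch, not toolkit code: it rebuilds two rows of that output by hand with plain pandas and selects one station. The variable names are illustrative only.

```python
import pandas as pd

# Hand-built stand-in for the returned frame, using two rows copied from
# the example output (index levels: name, datetime).
index = pd.MultiIndex.from_product(
    [
        ["vlinder05"],
        pd.to_datetime(["2022-09-06 21:00", "2022-09-06 22:00"], utc=True),
    ],
    names=["name", "datetime"],
)
gapfilldf = pd.DataFrame(
    {"temp": [21.378710, 21.357419], "temp_final_label": "gap_interpolation"},
    index=index,
)

# Select the filled records of a single station; the result is indexed
# by datetime only.
station_fill = gapfilldf.xs("vlinder05", level="name")
```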

Notes

A schematic description of the linear gap fill:

  1. Iterate over all gaps.

  2. The gap is converted into a set of missing records (depending on the time resolution of the observations).

  3. Find a leading record (the last observation before the gap) and a trailing record (the first observation after the gap).

  4. Interpolate linearly between the leading and trailing records to fill the missing records. A maximum consecutive fill threshold is applied; if it is exceeded, the fill values are NaN.

  5. The gap is updated with the interpolated values (metobs_toolkit.Gap.gapfill_df)
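
Steps 2 to 4 can be illustrated with plain pandas. This is a minimal sketch under stated assumptions, not the toolkit's implementation: the series values, the variable names and the cap of 2 are made up for illustration. pandas' `limit` keeps the first interpolated records and leaves the rest of the gap as NaN, which may differ from how the toolkit applies its own threshold.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly temperatures for one station; NaN marks the missing
# records of a single gap (step 2).
times = pd.date_range("2022-09-06 20:00", periods=6, freq=pd.Timedelta(hours=1))
temp = pd.Series([21.4, np.nan, np.nan, np.nan, np.nan, 20.4], index=times)

# Step 3: leading = last observation before the gap,
# trailing = first observation after the gap.
gap_times = temp.index[temp.isna()]
leading = temp.loc[: gap_times[0]].dropna().iloc[-1]
trailing = temp.loc[gap_times[-1] :].dropna().iloc[0]

# Step 4: time-weighted linear interpolation between the two anchors.
gapfill = temp.interpolate(method="time", limit_area="inside").loc[gap_times]

# A cap on consecutive fills (assumed value); pandas keeps the first two
# interpolated records and leaves the rest of the gap as NaN.
limited = temp.interpolate(method="time", limit=2, limit_area="inside")
```

Here the four missing records are filled with values spaced evenly between the leading and trailing observations, while the capped variant fills only the first two.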

Examples

>>> import metobs_toolkit
>>>
>>> # Import data into a Dataset
>>> dataset = metobs_toolkit.Dataset()
>>> dataset.update_settings(
...                         input_data_file=metobs_toolkit.demo_datafile,
...                         input_metadata_file=metobs_toolkit.demo_metadatafile,
...                         template_file=metobs_toolkit.demo_template,
...                         )
>>> dataset.import_data_from_file()
>>> dataset.coarsen_time_resolution(freq='1h')
>>>
>>> # Apply quality control on the temperature observations
>>> dataset.apply_quality_control(obstype='temp')  # Using the default QC settings
>>>
>>> # Interpret the outliers as missing/gaps
>>> dataset.update_gaps_and_missing_from_outliers(obstype='temp')
>>> dataset
Dataset instance containing:
      *28 stations
      *['temp', 'humidity', 'wind_speed', 'wind_direction'] observation types
      *10080 observation records
      *0 records labeled as outliers
      *2 gaps
      *1473 missing observations
      *records range: 2022-09-01 00:00:00+00:00 --> 2022-09-15 23:00:00+00:00 (total duration:  14 days 23:00:00)
      *time zone of the records: UTC
      *Coordinates are available for all stations.
>>>
>>> # Update the gapfill settings (otherwise the defaults are used)
>>> dataset.update_gap_and_missing_fill_settings(gap_interpolation_max_consec_fill=35)
>>>
>>> # Fill the gaps
>>> dataset.fill_gaps_linear(obstype='temp')
                                          temp   temp_final_label
name      datetime
vlinder05 2022-09-06 21:00:00+00:00  21.378710  gap_interpolation
          2022-09-06 22:00:00+00:00  21.357419  gap_interpolation
          2022-09-06 23:00:00+00:00  21.336129  gap_interpolation
          2022-09-07 00:00:00+00:00  21.314839  gap_interpolation
          2022-09-07 01:00:00+00:00  21.293548  gap_interpolation
          2022-09-07 02:00:00+00:00  21.272258  gap_interpolation
          2022-09-07 03:00:00+00:00  21.250968  gap_interpolation
          2022-09-07 04:00:00+00:00  21.229677  gap_interpolation
          2022-09-07 05:00:00+00:00  21.208387  gap_interpolation
          2022-09-07 06:00:00+00:00  21.187097  gap_interpolation
          2022-09-07 07:00:00+00:00  21.165806  gap_interpolation
          2022-09-07 08:00:00+00:00  21.144516  gap_interpolation
          2022-09-07 09:00:00+00:00  21.123226  gap_interpolation
          2022-09-07 10:00:00+00:00  21.101935  gap_interpolation
          2022-09-07 11:00:00+00:00  21.080645  gap_interpolation
          2022-09-07 12:00:00+00:00  21.059355  gap_interpolation
          2022-09-07 13:00:00+00:00  21.038065  gap_interpolation
          2022-09-07 14:00:00+00:00  21.016774  gap_interpolation
          2022-09-07 15:00:00+00:00  20.995484  gap_interpolation
          2022-09-07 16:00:00+00:00  20.974194  gap_interpolation
          2022-09-07 17:00:00+00:00  20.952903  gap_interpolation
          2022-09-07 18:00:00+00:00  20.931613  gap_interpolation
          2022-09-07 19:00:00+00:00  20.910323  gap_interpolation
          2022-09-07 20:00:00+00:00  20.889032  gap_interpolation
          2022-09-07 21:00:00+00:00  20.867742  gap_interpolation
          2022-09-07 22:00:00+00:00  20.846452  gap_interpolation
          2022-09-07 23:00:00+00:00  20.825161  gap_interpolation
          2022-09-08 00:00:00+00:00  20.803871  gap_interpolation
          2022-09-08 01:00:00+00:00  20.782581  gap_interpolation
          2022-09-08 02:00:00+00:00  20.761290  gap_interpolation
          2022-09-08 03:00:00+00:00  20.740000  gap_interpolation
          2022-09-08 04:00:00+00:00  20.718710  gap_interpolation
          2022-09-08 05:00:00+00:00  20.697419  gap_interpolation
          2022-09-08 06:00:00+00:00  20.676129  gap_interpolation
          2022-09-08 07:00:00+00:00  20.654839  gap_interpolation
>>> dataset.get_gaps_info()
Gap for vlinder05 with:...