sensortoolkit.datetime_utils._time_averaging.interval_averaging
- interval_averaging(df, freq='H', interval_count=60, thres=0.75)
Average DataFrame to the specified sampling frequency (‘freq’).
Numeric columns are averaged for each interval. A completeness threshold (default 75%) must be met; otherwise the averages are null. Columns of type ‘object’ (i.e., text) are aggregated within each interval by the mode of unique object values.
- Parameters
df (pandas DataFrame or pandas Series) – Dataframe or Series for which averages will be computed.
freq (str) – The frequency (averaging interval) to which the DataFrame will be averaged. Defaults to 'H'. Pandas refers to these as ‘offset aliases’, and a list is found here (https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases).
interval_count (int) – The number of data points expected within the passed DataFrame for the specified averaging interval (‘freq’). Defaults to 60 for 1-hour averages. E.g., if computing 1-hour averages (freq='H') and the passed DataFrame is for a sensor that recorded measurements at 1-minute sampling frequency, interval_count will equal 60 (expect 60 non-null data points per averaging interval).
thres (float) – Threshold (ranging from 0 to 1) for ratio of the number of data points recorded within a given averaging interval vs. the number of expected data points. Defaults to 0.75 (i.e., 75%).
- Returns
Dataframe averaged to datetimeindex interval specified by ‘freq’.
- Return type
avg_df (pandas DataFrame)
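- Example
The completeness rule described above can be illustrated with plain pandas. This is a minimal sketch under stated assumptions, not the sensortoolkit source: `interval_average_sketch` is a hypothetical helper, the `pm25` column and the timestamps are made-up sample data, and only numeric columns are handled (the mode aggregation of text columns is omitted). It uses the lowercase alias 'h' because recent pandas releases deprecate 'H'.

```python
import numpy as np
import pandas as pd


def interval_average_sketch(df, freq="h", interval_count=60, thres=0.75):
    """Hypothetical helper: average numeric columns per interval and
    null out intervals that fall below the completeness threshold."""
    numeric = df.select_dtypes(include="number")
    grouped = numeric.resample(freq)
    means = grouped.mean()
    # Ratio of non-null points recorded to points expected per interval.
    completeness = grouped.count() / interval_count
    # Keep an average only where the ratio meets the threshold.
    return means.where(completeness >= thres)


# Two hours of 1-minute sample data (made up for illustration).
idx = pd.date_range("2021-01-01 00:00", periods=120, freq="min")
values = np.arange(120, dtype=float)
values[60:90] = np.nan  # second hour: only 30 of 60 points are valid
df = pd.DataFrame({"pm25": values}, index=idx)

avg_df = interval_average_sketch(df, freq="h", interval_count=60, thres=0.75)
print(avg_df)
```

With these inputs the first hour is 100% complete and keeps its mean of 29.5, while the second hour is 50% complete (30 / 60 < 0.75) and is returned as null.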