sensortoolkit.calculate._regression_stats.regression_stats

regression_stats(sensor_df_obj, ref_df_obj, deploy_dict, param, serials, verbose=True)

Compute OLS regression statistics.

This function computes the following regressions:

  • Sensor vs. FRM/FEM

  • Sensor vs. Inter-sensor average

In each case, the sensor data (hourly or daily) are the dependent variable and the reference data (hourly or daily) are the independent variable. Note that ref_df_obj can be either a DataFrame containing FRM/FEM concentrations or a DataFrame containing intersensor averages, depending on the use case. The ‘ref’ label reflects the fact that the dataset serves as the independent variable in the regression.
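The variable assignment described above can be illustrated with a minimal sketch. It assumes synthetic hourly data and uses numpy.polyfit for the least-squares fit; the series names and values are made up for illustration, and this is not the library's internal implementation.

```python
import numpy as np
import pandas as pd

# Synthetic hourly data (illustrative values only).
idx = pd.date_range("2021-01-01", periods=5, freq=pd.Timedelta(hours=1))
ref = pd.Series([2.0, 4.0, 6.0, 8.0, 10.0], index=idx, name="ref")

# Sensor constructed as an exact linear function of the reference so the
# fitted slope and intercept are easy to check by eye.
sensor = (1.5 * ref + 1.0).rename("sensor")

# Pair the two series on their shared timestamps and drop incomplete rows.
paired = pd.concat([sensor, ref], axis=1).dropna()

# Reference is the independent variable (x); sensor is the dependent variable (y).
slope, intercept = np.polyfit(paired["ref"], paired["sensor"], deg=1)
r_squared = paired["sensor"].corr(paired["ref"]) ** 2
```

Swapping the two arguments to the fit would regress the reference on the sensor instead, which generally gives a different slope and intercept on noisy data.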

Note

The DataFrames passed to the sensor_df_obj and ref_df_obj arguments should contain data reported at the same sampling frequency. For example, if a sensor DataFrame containing 1-hour averaged data is passed to sensor_df_obj, the reference DataFrame passed to ref_df_obj must also contain 1-hour averaged data.
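One way to satisfy the note above is to average the higher-frequency dataset down to the reference interval before pairing the two. The sketch below uses plain pandas resampling on synthetic data; the column name PM25_Value is an assumption made for illustration, and the library may provide its own averaging utilities that should be preferred where available.

```python
import pandas as pd

# Synthetic 15-minute sensor data covering two hours (illustrative values).
sensor_idx = pd.date_range("2021-01-01 00:00", periods=8,
                           freq=pd.Timedelta(minutes=15))
sensor_df = pd.DataFrame(
    {"PM25_Value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]},
    index=sensor_idx,
)

# Synthetic reference data already reported at 1-hour intervals.
ref_idx = pd.date_range("2021-01-01 00:00", periods=2,
                        freq=pd.Timedelta(hours=1))
ref_df = pd.DataFrame({"PM25_Value": [2.4, 6.6]}, index=ref_idx)

# Average the sensor data to 1-hour intervals so both share one frequency.
sensor_hourly = sensor_df.resample(pd.Timedelta(hours=1)).mean()

# Check that the timestamps line up before running any regression.
same_frequency = sensor_hourly.index.equals(ref_df.index)
```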

Parameters
  • sensor_df_obj (pandas DataFrame or list of pandas DataFrames) – Either a DataFrame or a list of DataFrames containing sensor parameter measurements. Data corresponding to the passed parameter name are used as the dependent variable.

  • ref_df_obj (pandas DataFrame) – Reference DataFrame (either FRM/FEM or inter-sensor averages). Data corresponding to the passed parameter name are used as the independent variable.

  • deploy_dict (dict) – A dictionary containing descriptive statistics and textual information about the deployment (testing agency, site, time period, etc.), sensors tested, and site conditions during the evaluation.

  • param (str) – Parameter name for which to compute regression statistics.

  • serials (dict) – A dictionary of sensor serial identifiers for each unit in a testing group.

  • verbose (bool) – If True, print statements are displayed in the console output. Defaults to True.

Returns

Statistics DataFrame for either sensor vs. FRM/FEM or sensor vs. intersensor mean OLS regression.

Return type

stats_df (pandas DataFrame)
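To give a rough picture of how a list of sensor DataFrames, a serials dictionary, and a reference DataFrame can combine into a single statistics table, the following sketch fits one regression per sensor and collects the results. Everything here is hypothetical: the serial identifiers, the PM25_Value column naming, and the output column names are assumptions for illustration and do not describe the actual schema of stats_df or the library's internals.

```python
import numpy as np
import pandas as pd

param = "PM25"                        # illustrative parameter name
col = param + "_Value"                # assumed column naming, not confirmed here
serials = {"1": "SN01", "2": "SN02"}  # hypothetical serial identifiers

idx = pd.date_range("2021-01-01", periods=4, freq=pd.Timedelta(hours=1))
ref_df = pd.DataFrame({col: [1.0, 2.0, 3.0, 4.0]}, index=idx)
sensor_dfs = [
    pd.DataFrame({col: [2.0, 4.0, 6.0, 8.0]}, index=idx),  # reads 2x the reference
    pd.DataFrame({col: [1.5, 2.5, 3.5, 4.5]}, index=idx),  # reads reference + 0.5
]

rows = []
for serial, sensor_df in zip(serials.values(), sensor_dfs):
    # Sensor is the dependent variable (y); reference is the independent (x).
    paired = pd.concat(
        [sensor_df[col].rename("y"), ref_df[col].rename("x")], axis=1
    ).dropna()
    slope, intercept = np.polyfit(paired["x"], paired["y"], deg=1)
    rows.append({"serial": serial, "slope": slope,
                 "intercept": intercept, "n": len(paired)})

# One row per sensor; the column names are made up for this sketch.
stats_sketch = pd.DataFrame(rows)
```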