core.stats package

Submodules

core.stats.analysis module

core.stats.analysis.analysis(dataset, threshold, save_to=None, print_out=False)[source]

Run the full analysis on the given core.data.dataset.Dataset.

Parameters
  • dataset (core.data.dataset.Dataset) – the Dataset object to evaluate

  • threshold (int) – the maximum time difference, in seconds, between two consecutive timestamps that is not marked as a gap

  • save_to (str) – the name of the file that receives the analysis result. If None, the result is not written to a file; otherwise it is written to save_to

  • print_out (bool) – whether the analysis result should also be printed to stdout

Return type

dict(str, result)

Returns

The analysis result in human-readable format
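
The handling of save_to and print_out described above can be sketched as follows. This is only an illustration under stated assumptions: emit_result is a hypothetical helper, not part of core.stats, and the "key: value" text layout is assumed because the actual output format is not documented here.

```python
def emit_result(result, save_to=None, print_out=False):
    """Hypothetical helper mirroring the documented save_to / print_out options."""
    # Assumed layout: one "key: value" line per entry of the result dict.
    text = "\n".join(f"{key}: {value}" for key, value in result.items())

    # Write to a file only when a file name is given.
    if save_to is not None:
        with open(save_to, "w", encoding="utf-8") as handle:
            handle.write(text + "\n")

    # Print to stdout only when requested.
    if print_out:
        print(text)

    return text
```

With save_to=None and print_out=False the helper only returns the formatted text, which matches the documented defaults of writing nothing and printing nothing.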

core.stats.dropout_rate module

core.stats.dropout_rate.dropout_rate(dataset, dataset_level=False)[source]

Compute the dropout rate for a given dataset. The dropout rate is the percentage of rows that are invalid.

Parameters
  • dataset (core.data.dataset.Dataset) – the Dataset object whose dropout rate is computed. The dropout rate is the percentage of data points missing in the Dataset

  • dataset_level (bool) – whether the result is reported separately for each room in room_list or combined for the whole dataset

Return type

str or dict(str, str)

Returns

the room name with its corresponding dropout rate
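
A minimal sketch of the computation, not the library's implementation. It assumes that a row is invalid when any of its values is None or NaN, that dataset_level=True selects the combined result, and that rates are formatted as percentage strings. The names dropout_rate_sketch, format_rate and is_invalid are hypothetical, and a plain dict of rows stands in for the Dataset.

```python
import math


def is_invalid(row):
    """Assumption: a row is invalid if any value is None or NaN."""
    return any(
        value is None or (isinstance(value, float) and math.isnan(value))
        for value in row
    )


def format_rate(rows):
    """Return the share of invalid rows as a percentage string."""
    if not rows:
        return "0.00%"
    invalid = sum(1 for row in rows if is_invalid(row))
    return f"{100.0 * invalid / len(rows):.2f}%"


def dropout_rate_sketch(rooms, dataset_level=False):
    """rooms maps a room name to a list of rows (each row a list of values)."""
    if dataset_level:
        # Combine every room into one pool of rows and report a single rate.
        all_rows = [row for rows in rooms.values() for row in rows]
        return format_rate(all_rows)
    # Otherwise report one rate per room.
    return {name: format_rate(rows) for name, rows in rooms.items()}
```

The two return shapes correspond to the documented return type: a single str for the whole dataset, or a dict(str, str) keyed by room name.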

core.stats.frequency module

core.stats.frequency.frequency(dataset, dataset_level=True)[source]

Compute the average sampling frequency based on the given dataset.

Parameters
  • dataset (core.data.dataset.Dataset) – the Dataset object whose average frequency is computed. The average frequency is the average number of seconds between consecutive timestamps

  • dataset_level (bool) – whether the result is reported separately for each room in room_list or combined for the whole dataset

Return type

str or dict(str, str)

Returns

the room name with its corresponding average sampling frequency
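
The definition above, the average number of seconds between consecutive timestamps, can be illustrated with a short sketch. It assumes timestamps are given as numbers of seconds. average_interval is a hypothetical name and the function returns a number rather than the str that the library documents.

```python
def average_interval(timestamps):
    """Average gap, in seconds, between consecutive timestamps."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        # With fewer than two samples there is no interval to average.
        return None
    diffs = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return sum(diffs) / len(diffs)
```

For example, samples taken at 0, 60 and 180 seconds have intervals of 60 and 120 seconds, so the average is 90 seconds.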

core.stats.gap_detect module

core.stats.gap_detect.gap_detect(dataset, threshold, sensor_level=False)[source]

Compute the gaps in the given dataset. A gap is a time span between two consecutive rows whose timestamps differ by more than threshold.

Parameters
  • dataset (core.data.dataset.Dataset) – the Dataset object in which to find gaps

  • threshold (int) – the maximum time difference, in seconds, between two consecutive timestamps that is not marked as a gap

  • sensor_level (bool) – whether the result is reported separately for each sensor in feature_list or combined for the whole dataset

Return type

dict(str, list(str)) or dict(str, dict(str, list(str)))

Returns

the room name mapped to its gaps, or to each sensor name with its corresponding gaps
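
The gap rule can be sketched in a few lines. This is an illustration only: find_gaps is a hypothetical name, timestamps are assumed to be numbers of seconds, and gaps are returned as (start, end) tuples rather than the strings the library documents.

```python
def find_gaps(timestamps, threshold):
    """Return (start, end) pairs whose difference exceeds threshold seconds."""
    ordered = sorted(timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        # A difference equal to the threshold is still allowed, per the
        # "maximum time difference" wording of the threshold parameter.
        if later - earlier > threshold
    ]
```

With samples at 0, 60, 120, 600 and 660 seconds and a threshold of 300, the only gap is the span from 120 to 600.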

core.stats.occupancy_evaluation module

core.stats.occupancy_evaluation.occupancy_distribution_evaluation(dataset, dataset_level=True)[source]

Compute the distribution of occupancy levels in the given Dataset.

Parameters
  • dataset (core.data.dataset.Dataset) – the Dataset object whose occupancy distribution is computed

  • dataset_level (bool) – whether the result is reported separately for each room in room_list or combined for the whole dataset

Return type

dict(int, str) or dict(str, dict(int, str))

Returns

the room name with each of its possible occupancy levels and the corresponding distribution
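
A sketch of how such a distribution could be computed for a single room. It assumes occupancy levels are integers and that each share is formatted as a percentage string, matching the documented dict(int, str) shape. occupancy_distribution is a hypothetical name.

```python
from collections import Counter


def occupancy_distribution(levels):
    """Map each occupancy level to its share of all observations."""
    total = len(levels)
    if total == 0:
        return {}
    counts = Counter(levels)
    return {
        level: f"{100.0 * count / total:.2f}%"
        for level, count in sorted(counts.items())
    }
```

Applying the same function to each room separately would give the nested dict(str, dict(int, str)) form.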

core.stats.uptime module

core.stats.uptime.uptime(dataset, threshold, gaps=None)[source]

Compute the uptime in the given dataset. Uptime is the length of time for which a sensor reported values.

Parameters
  • dataset (core.data.dataset.Dataset) – the Dataset object whose uptime is computed

  • threshold (int) – the maximum time difference, in seconds, between two consecutive timestamps that is not marked as a gap

  • gaps (dict(str, list(str)) or dict(str, dict(str, list(str)))) – a dictionary returned by core.stats.gap_detect

Return type

dict(str, tuple(str)) or dict(str, dict(str, tuple(str)))

Returns

the room name mapped to each sensor name with its corresponding uptime
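
One plausible reading of uptime is the observed time span minus the time spent in gaps. The sketch below follows that reading. It is an assumption, not the library's method: uptime_seconds is a hypothetical name, timestamps are numbers of seconds, gaps are (start, end) tuples, and gaps are detected on the fly when none are supplied.

```python
def uptime_seconds(timestamps, threshold, gaps=None):
    """Return (uptime, total_span) in seconds for one series of timestamps."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        return 0, 0

    if gaps is None:
        # Detect gaps here when the caller did not supply them.
        gaps = [
            (earlier, later)
            for earlier, later in zip(ordered, ordered[1:])
            if later - earlier > threshold
        ]

    total_span = ordered[-1] - ordered[0]
    downtime = sum(end - start for start, end in gaps)
    return total_span - downtime, total_span
```

Accepting precomputed gaps avoids detecting them a second time when they are already available, which is presumably the purpose of the gaps parameter.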

Module contents