core.stats package
Submodules
core.stats.analysis module
- core.stats.analysis.analysis(dataset, threshold, save_to=None, print_out=False)
Run the full analysis for the given core.data.dataset.Dataset
- Parameters
dataset (core.data.dataset.Dataset) – the Dataset object to evaluate
threshold (int) – the maximum time difference, in seconds, between two consecutive timestamps for them not to be marked as a gap
save_to (str) – the name of the file to write the analysis result to. If None, the result is not written to a file
print_out (bool) – whether to print the analysis result to stdout
- Return type
- Returns
the analysis result in human-readable format
core.stats.dropout_rate module
- core.stats.dropout_rate.dropout_rate(dataset, dataset_level=False)
Compute the dropout rate for a given dataset. The dropout rate is the percentage of rows that are invalid
- Parameters
dataset (core.data.dataset.Dataset) – the Dataset object for which to compute the dropout rate. The dropout rate is the percentage of data points missing in the Dataset
dataset_level (bool) – whether the result is reported separately for each room in room_list or combined for the whole dataset
- Return type
- Returns
the room name with its corresponding dropout rate
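As a rough illustration of the definition above, the sketch below computes the percentage of invalid rows per room. It does not use the Dataset class: the dict-of-lists layout, the rule that a row is invalid when it contains None or NaN, and the name dropout_rate_sketch are all assumptions made for this example.

```python
import math


def dropout_rate_sketch(rows):
    """Return the percentage of rows that contain a missing value."""
    if not rows:
        return 0.0

    def is_invalid(row):
        # Assumption: a row is invalid if any value is None or NaN.
        return any(
            value is None or (isinstance(value, float) and math.isnan(value))
            for value in row
        )

    invalid = sum(1 for row in rows if is_invalid(row))
    return 100.0 * invalid / len(rows)


# Hypothetical per-room data: each inner list is one row of sensor values.
rooms = {
    "room_a": [[1.0, 2.0], [None, 3.0], [4.0, float("nan")], [5.0, 6.0]],
    "room_b": [[1.0], [2.0]],
}
per_room = {name: dropout_rate_sketch(rows) for name, rows in rooms.items()}
```

Here room_a has two invalid rows out of four and room_b has none, so the per-room result maps each room name to its own rate.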
core.stats.frequency module
- core.stats.frequency.frequency(dataset, dataset_level=True)
Compute the average sampling frequency based on the given dataset
- Parameters
dataset (core.data.dataset.Dataset) – the Dataset object for which to compute the average frequency. The average frequency is the average number of seconds between consecutive timestamps
dataset_level (bool) – whether the result is reported separately for each room in room_list or combined for the whole dataset
- Return type
- Returns
the room name with its corresponding average sampling frequency
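The quantity described above, the average number of seconds between consecutive timestamps, can be sketched as below. This is a minimal illustration under assumptions: timestamps are plain numbers in seconds, and average_interval_sketch is a hypothetical name, not part of the package.

```python
def average_interval_sketch(timestamps):
    """Return the mean number of seconds between consecutive timestamps."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        # Fewer than two samples: no interval can be computed.
        return None
    # Pair each timestamp with its successor and take the differences.
    diffs = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return sum(diffs) / len(diffs)


# Usage: four intervals of 10, 10, 10 and 20 seconds.
average_seconds = average_interval_sketch([0, 10, 20, 30, 50])
```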
core.stats.gap_detect module
- core.stats.gap_detect.gap_detect(dataset, threshold, sensor_level=False)
Compute the gaps in the given dataset. A gap is a time span in which two consecutive rows have a timestamp difference greater than threshold
- Parameters
dataset (core.data.dataset.Dataset) – the Dataset object in which to find gaps
threshold (int) – the maximum time difference, in seconds, between two consecutive timestamps for them not to be marked as a gap
sensor_level (bool) – whether the result is reported separately for each sensor in feature_list or combined for the whole dataset
- Return type
- Returns
the room name and, where applicable, the sensor name, with its corresponding gaps
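The gap rule above (a difference between consecutive timestamps strictly greater than threshold) can be sketched as follows. The example assumes numeric timestamps in seconds and returns gaps as (start, end) tuples; the package itself may represent gaps differently, and gap_detect_sketch is a hypothetical name.

```python
def gap_detect_sketch(timestamps, threshold):
    """Return (start, end) pairs whose difference exceeds threshold seconds."""
    ordered = sorted(timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > threshold  # strictly greater, per the definition
    ]


# Usage: with a 60-second threshold, two of the five intervals are gaps.
sample_timestamps = [0, 5, 10, 100, 105, 300]
gaps_found = gap_detect_sketch(sample_timestamps, threshold=60)
```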
core.stats.occupancy_evaluation module
- core.stats.occupancy_evaluation.occupancy_distribution_evaluation(dataset, dataset_level=True)
Compute the distribution of occupancy levels in the given Dataset
- Parameters
dataset (core.data.dataset.Dataset) – the Dataset object for which to compute the occupancy distribution
dataset_level (bool) – whether the result is reported separately for each room in room_list or combined for the whole dataset
- Return type
- Returns
the room name with the distribution over each possible occupancy level
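One plausible reading of an occupancy distribution is the fraction of rows observed at each occupancy level. The sketch below illustrates that reading only: the list-of-integers input and the name occupancy_distribution_sketch are assumptions, and the package may report counts or percentages instead.

```python
from collections import Counter


def occupancy_distribution_sketch(levels):
    """Return the fraction of observations at each occupancy level."""
    total = len(levels)
    if total == 0:
        return {}
    counts = Counter(levels)
    # Sort the levels so the result is stable and easy to read.
    return {level: counts[level] / total for level in sorted(counts)}


# Usage: eight observations with occupancy levels 0, 1 and 2.
distribution = occupancy_distribution_sketch([0, 0, 1, 2, 0, 1, 0, 0])
```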
core.stats.uptime module
- core.stats.uptime.uptime(dataset, threshold, gaps=None)
Compute the uptime in the given dataset. Uptime is the length of time for which a sensor reported values
- Parameters
dataset (core.data.dataset.Dataset) – the Dataset object for which to compute the uptime
threshold (int) – the maximum time difference, in seconds, between two consecutive timestamps for them not to be marked as a gap
gaps (dict(str, list(str)) or dict(str, dict(str, list(str)))) – a dictionary returned by core.stats.gap_detect
- Return type
- Returns
the room name and the sensor name, with the corresponding uptime
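One way to relate uptime to gaps is to add up only those intervals between consecutive timestamps that do not exceed threshold. The sketch below takes that approach as an assumption; it is not confirmed to be how the package computes uptime, and uptime_sketch is a hypothetical name operating on plain numeric timestamps.

```python
def uptime_sketch(timestamps, threshold):
    """Return the total seconds covered by intervals that are not gaps."""
    ordered = sorted(timestamps)
    return sum(
        later - earlier
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier <= threshold  # intervals above threshold are gaps
    )


# Usage: three 5-second intervals count; the 90- and 195-second gaps do not.
uptime_seconds = uptime_sketch([0, 5, 10, 100, 105, 300], threshold=60)
```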