core.data package
This package contains several functions and one class for reading and writing data sets.
core.data.dataset module
-
class core.data.dataset.Dataset
The Dataset class is the standard data structure for data sets.
Sample code for loading built-in data into a core.data.dataset.Dataset object:
>>> import core
>>> dataset_built_in = core.data.load_sample("lbl-all")
>>> type(dataset_built_in)
<class 'core.data.dataset.Dataset'>
Sample code for creating a new core.data.dataset.Dataset object:
>>> import core
>>> dataset_new = core.data.Dataset()
>>> type(dataset_new)
<class 'core.data.dataset.Dataset'>
Sample code for importing data into a new core.data.dataset.Dataset object:
>>> import core
>>> dataset_import = core.data.import_data("Your-data-file.csv", 0, room_name="Your-room")
>>> type(dataset_import)
<class 'core.data.dataset.Dataset'>
-
add_room(data, occupancy=None, room_name=None, header=True)
The add_room() method adds a single room to a core.data.dataset.Dataset object. If no occupancy data is given, core assigns numpy.NaN values to the room. If room_name is not given, core names the room with the first available natural number. If header is not provided, core assumes the column names are a sequence of positive integers:
>>> import numpy
>>> data = numpy.array([[1, 2], [3, 4]])
>>> occ = numpy.array([[0], [2]])
>>> dataset_new.add_room(data, occupancy=occ, room_name="test", header=["name 1", "name 2"])
>>> dataset_new.room_list
['test']
>>> dataset_new.data
array([[1., 2.],
       [3., 4.]])
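The occupancy default described above can be pictured with a small sketch. It assumes the placeholder has one row per data point and a single column, like occ in the example; placeholder_occupancy is a hypothetical helper written for illustration, not part of core.

```python
import numpy


def placeholder_occupancy(data):
    """Return one NaN per data point, shaped (n_rows, 1) like `occ` in the example.

    Hypothetical helper: the shape is an assumption, not taken from core.
    """
    # numpy.nan is used here because the numpy.NaN alias was removed in NumPy 2.0.
    return numpy.full((data.shape[0], 1), numpy.nan)


data = numpy.array([[1, 2], [3, 4]])
occupancy = placeholder_occupancy(data)
print(occupancy.shape)
```

Passing such an array explicitly should be equivalent to omitting occupancy, if core follows the behaviour described in the text.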
-
change_feature_mapping(feature_mapping)
The change_feature_mapping() method replaces the header completely. The new header must be a bidirectional dictionary, mapping each column index to its feature name and each feature name back to its index:
>>> dataset_new.feature_list
['name 1', 'name 2']
>>> dataset_new.change_feature_mapping({0: "New 1", 1: "New 2", "New 1": 0, "New 2": 1})
>>> dataset_new.feature_list
['New 1', 'New 2']
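Writing a bidirectional dictionary by hand is error-prone once there are more than a few features. The sketch below builds one from a plain list of names; make_bidirectional is a hypothetical helper written for illustration and is not part of core.

```python
def make_bidirectional(names):
    """Map each index to its name and each name back to its index.

    Hypothetical helper, not part of core.
    """
    if len(set(names)) != len(names):
        # Duplicate names would overwrite each other in the name -> index half.
        raise ValueError("names must be unique")
    mapping = {}
    for index, name in enumerate(names):
        mapping[index] = name
        mapping[name] = index
    return mapping


feature_mapping = make_bidirectional(["New 1", "New 2"])
print(feature_mapping)
# {0: 'New 1', 'New 1': 0, 1: 'New 2', 'New 2': 1}
```

The result holds the same key-value pairs as the dictionary literal in the example above, so it could be passed to change_feature_mapping() in its place.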
-
change_feature_name(old, new)
The change_feature_name() method renames one feature in the header list:
>>> dataset_new.feature_list
['New 1', 'New 2']
>>> dataset_new.change_feature_name("New 1", "Old 1")
>>> dataset_new.feature_list
['Old 1', 'New 2']
-
set_feature_name(feature_list)
The set_feature_name() method renames every feature at once from a plain list, without using a bidirectional dictionary:
>>> dataset_new.feature_list
['Old 1', 'New 2']
>>> dataset_new.set_feature_name(["New 1", "Old 2"])
>>> dataset_new.feature_list
['New 1', 'Old 2']
-
change_occupancy(occupancy)
The change_occupancy() method replaces all occupancy states:
>>> dataset_new.occupancy
array([[0.],
       [2.]])
>>> dataset_new.change_occupancy(numpy.array([[8],[1]]))
>>> dataset_new.occupancy
array([[8],
       [1]])
-
change_room_mapping(room)
The change_room_mapping() method updates the room mapping. The new mapping must be a bidirectional dictionary:
>>> dataset_new.room_list
['test']
>>> dataset_new.change_room_mapping({0: "Room 1", "Room 1": 0})
>>> dataset_new.room_list
['Room 1']
-
change_values(data)
The change_values() method replaces all sensor values in the data set:
>>> dataset_new.data
array([[1., 2.],
       [3., 4.]])
>>> dataset_new.change_values(numpy.array([[3, 4], [1, 2]]))
>>> dataset_new.data
array([[3, 4],
       [1, 2]])
-
remove_feature(features, error=True)
The remove_feature() method removes one feature or a list of features from the whole data set. With error=True (the default), a feature that is not in the data set raises a KeyError; with error=False, missing features are skipped:
>>> dataset.feature_list
['time', 'indoor-temp', 'indoor-humidity', 'indoor-flow', 'indoor-radiant', 'indoor-lumen', 'indoor-co2', 'outdoor-temp', 'outdoor-humidity', 'ourdoor-flow']
>>> dataset.remove_feature("indoor-temp")
>>> dataset.feature_list
['time', 'indoor-humidity', 'indoor-flow', 'indoor-radiant', 'indoor-lumen', 'indoor-co2', 'outdoor-temp', 'outdoor-humidity', 'ourdoor-flow']
>>> dataset.remove_feature(["indoor-humidity", "indoor-flow"])
>>> dataset.feature_list
['time', 'indoor-radiant', 'indoor-lumen', 'indoor-co2', 'outdoor-temp', 'outdoor-humidity', 'ourdoor-flow']
>>> dataset.remove_feature(["indoor-humidity", "indoor-flow"])
Traceback (most recent call last):
  File "<input>", line 1, in remove_feature
    raise KeyError("The feature {} does not exist in the dataset!".format(feature))
KeyError: 'The feature indoor-humidity does not exist in the dataset!'
>>> dataset.remove_feature(["indoor-humidity", "indoor-radiant"], error=False)
>>> dataset.feature_list
['time', 'indoor-lumen', 'indoor-co2', 'outdoor-temp', 'outdoor-humidity', 'ourdoor-flow']
-
select_feature(features, error=True)
The select_feature() method selects one feature or a list of features in the data set, similar to remove_feature().
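The effect of the error flag can be illustrated on a plain list of feature names. This is only a sketch of the behaviour shown in the remove_feature() example, not the code inside core: remove_features and select_features are hypothetical stand-ins. It is an assumption that selection keeps the original column order and that nothing is removed before the error is raised.

```python
def _as_list(features):
    # Accept a single name or a list of names, as the methods above do.
    return [features] if isinstance(features, str) else list(features)


def remove_features(feature_list, features, error=True):
    """Return feature_list without the given features (hypothetical stand-in)."""
    features = _as_list(features)
    for feature in features:
        if feature not in feature_list and error:
            raise KeyError("The feature {} does not exist in the dataset!".format(feature))
    return [name for name in feature_list if name not in features]


def select_features(feature_list, features, error=True):
    """Return only the given features, in their original order (assumed)."""
    features = _as_list(features)
    for feature in features:
        if feature not in feature_list and error:
            raise KeyError("The feature {} does not exist in the dataset!".format(feature))
    return [name for name in feature_list if name in features]


names = ["time", "indoor-temp", "indoor-humidity"]
print(remove_features(names, "indoor-temp"))
print(remove_features(names, ["missing", "time"], error=False))
print(select_features(names, ["indoor-humidity", "time"]))
```

With error=False, the unknown name "missing" is skipped and only "time" is removed, which mirrors the last call in the remove_feature() example.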
-
split(percentage)
The split() method separates the data set into two data sets at the split point given by percentage. A room that straddles the split point appears in both results with the prefix "Partially":
>>> dataset.room_list
['data_1', 'data_2', 'data_3', 'data_4', 'data_5', 'data_6', 'data_7', 'data_8', 'data_9', 'data_10', 'data_11', 'data_12', 'data_13', 'data_14', 'data_15', 'data_16', 'data_17', 'data_18', 'data_19', 'data_20', 'data_21', 'data_22', 'data_23', 'data_24']
>>> dataset1, dataset2 = dataset.split(0.5)
>>> dataset1.room_list
['data_1', 'data_2', 'data_3', 'data_4', 'data_5', 'data_6', 'data_7', 'data_8', 'data_9', 'data_10', 'data_11', 'data_12', 'Partially data_13']
>>> dataset2.room_list
['Partially data_13', 'data_14', 'data_15', 'data_16', 'data_17', 'data_18', 'data_19', 'data_20', 'data_21', 'data_22', 'data_23', 'data_24']
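One way to read the output above is that the split point is counted in rows across all rooms, so a room whose rows fall on both sides ends up in both halves. The sketch below reproduces that bookkeeping on room names and row counts only. It is an assumption about how split() behaves, and split_rooms is a hypothetical function, not the implementation in core.

```python
def split_rooms(room_lengths, percentage):
    """Assign rooms to two halves given (name, n_rows) pairs in storage order.

    Hypothetical illustration: assumes the split point is a row index
    computed as int(total_rows * percentage).
    """
    total_rows = sum(n_rows for _, n_rows in room_lengths)
    cut = int(total_rows * percentage)
    first, second = [], []
    start = 0
    for name, n_rows in room_lengths:
        end = start + n_rows
        if end <= cut:
            first.append(name)          # room lies entirely before the cut
        elif start >= cut:
            second.append(name)         # room lies entirely after the cut
        else:
            partial = "Partially " + name
            first.append(partial)       # room straddles the cut
            second.append(partial)
        start = end
    return first, second


rooms = [("data_1", 10), ("data_2", 30), ("data_3", 10)]
first_half, second_half = split_rooms(rooms, 0.5)
print(first_half)   # ['data_1', 'Partially data_2']
print(second_half)  # ['Partially data_2', 'data_3']
```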
core.data.import_data module
-
core.data.import_data.import_data(file_name, time_column_index=None, mode='csv', header=True, room_name=None, tz=0)
The import_data() function reads a raw data set. Each row is assumed to represent a data point and each column a feature. Features must be numerical values:
>>> import core
>>> dataset = core.data.import_data("Your-data-file.csv", 0, room_name="Your-room")
>>> type(dataset)
<class 'core.data.dataset.Dataset'>
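Because features must be numerical, it can help to check a CSV file before importing it. The following sketch uses only the standard library and does not call core. non_numeric_cells is a hypothetical helper, and the timestamp format in the sample is illustrative only, since the formats accepted by import_data() are not stated here.

```python
import csv
import io


def non_numeric_cells(csv_text, time_column_index=None, header=True):
    """List (data_row, column, value) for every feature cell float() rejects.

    Hypothetical pre-check, not part of core. The time column is skipped
    because it is not expected to parse as a float.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if header:
        rows = rows[1:]  # drop the row of column names
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for column_index, value in enumerate(row):
            if column_index == time_column_index:
                continue
            try:
                float(value)
            except ValueError:
                problems.append((row_number, column_index, value))
    return problems


sample = (
    "time,indoor-temp,indoor-co2\n"
    "2020-01-01 00:00:00,21.5,400\n"
    "2020-01-01 00:05:00,21.7,n/a\n"
)
print(non_numeric_cells(sample, time_column_index=0))
# [(2, 2, 'n/a')]
```

An empty result suggests that every feature column holds values that parse as numbers.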
core.data.io module
-
core.data.io.read_dataset(file_name)
-
core.data.io.save_dataset(dataset, file_name)
The save_dataset() function dumps a core.data.dataset.Dataset object to a binary file on disk, and the read_dataset() function loads it back:
>>> import core
>>> dataset = core.data.load_sample("lbl-all")
>>> core.data.save_dataset(dataset, "temp")
>>> dataset_2 = core.data.read_dataset("temp")
core.data.load_sample module
-
core.data.load_sample.load_sample(sample_name)
The load_sample() function loads a binary file from core/data/binary_dataset/. Use '-' in place of the directory separator: for example, to load the data stored at core/data/binary_dataset/sdu/all, set sample_name to 'sdu-all':
>>> import core
>>> dataset = core.data.load_sample("lbl-all")
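The naming rule can be made concrete with a short sketch that turns a sample name into the path it refers to. sample_path is a hypothetical helper for illustration. How load_sample() resolves names internally is not documented here, so the sketch only mirrors the rule stated above.

```python
from pathlib import PurePosixPath

# Directory named in the text above.
SAMPLE_ROOT = PurePosixPath("core/data/binary_dataset")


def sample_path(sample_name):
    """Translate 'sdu-all' into core/data/binary_dataset/sdu/all.

    Hypothetical helper: each '-' stands for one directory separator.
    """
    return SAMPLE_ROOT.joinpath(*sample_name.split("-"))


print(sample_path("sdu-all"))  # core/data/binary_dataset/sdu/all
print(sample_path("lbl-all"))  # core/data/binary_dataset/lbl/all
```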