Feature and Targets

A module for creating advanced time series features and targets, including sliding windows and polynomial transformations.

FeatureAndTargetGenerator

 FeatureAndTargetGenerator (context_len:int=10, target_len:int=10,
                            poly_degree:int=1)

Transforms time series data into feature matrices suitable for machine learning models. It creates lagged features using a sliding window and can optionally generate polynomial features to capture non-linear relationships between variables. It also creates a target matrix with one column for each future timestep to predict.

Parameter    Type  Default  Details
context_len  int   10       number of previous timesteps to use as features
target_len   int   10       number of timesteps to predict
poly_degree  int   1        degree of polynomial features
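
The sliding-window idea behind context_len and target_len can be illustrated with plain pandas. This is a minimal sketch, not the module's implementation: the helper make_windows is hypothetical, and it assumes that the features are the current value plus the preceding context_len - 1 values, that targets are the following target_len values, and that rows with an incomplete window are dropped.

```python
import pandas as pd


def make_windows(series: pd.Series, context_len: int, target_len: int):
    """Hypothetical helper: build lagged features and future targets."""
    # Features: the current value ("t") plus the previous context_len - 1 values.
    feats = pd.concat(
        {(f"t-{k}" if k else "t"): series.shift(k) for k in range(context_len)},
        axis=1,
    )
    # Targets: the next target_len values, one column per future step.
    targs = pd.concat(
        {f"t+{k}": series.shift(-k) for k in range(1, target_len + 1)},
        axis=1,
    )
    # Keep only rows where every feature and every target is available.
    valid = feats.notna().all(axis=1) & targs.notna().all(axis=1)
    return feats[valid], targs[valid]


s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X, y = make_windows(s, context_len=2, target_len=2)
print(X)
print(y)
```

In this toy series, the first row and the last two rows are dropped because either a lag or a future value is missing, so both outputs share the same three-row index.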

First, load the example data:

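from pathlib import Path

import pandas as pd

DATA_PATH = Path("../testing_data")

# Read only the columns used below and convert 'time' to timestamps for the index
data = pd.read_csv(
    DATA_PATH / "hydro_example.csv",
    usecols=["time", "smoothed_rain", "Q_mgb", "Q_obs"],
    index_col="time",
    converters={"time": pd.to_datetime},
)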

Next, create an instance of the generator, setting the context length, the target length, and the polynomial degree.

generator = FeatureAndTargetGenerator(context_len=1, target_len=2, poly_degree=2)

The feature and target matrices can then be generated with the generate method:


FeatureAndTargetGenerator.generate

 FeatureAndTargetGenerator.generate (df:pandas.core.frame.DataFrame,
                                     x_col:list[str], y_col:list[str])

Generates a feature matrix and target vector from the input data.

x_col, y_col = ['smoothed_rain','Q_mgb'], ['Q_obs']
data_x, data_y = generator.generate(data, x_col=x_col, y_col=y_col)

The generated data look as follows:

data_x.head(3)

              0         1      2         3         4         5
time
2012-01-01  1.0  0.299868  15.22  0.089921  4.563995  231.6484
2012-01-02  1.0  0.299767  14.84  0.089860  4.448548  220.2256
2012-01-03  1.0  0.265321  14.48  0.070395  3.841846  209.6704

data_y.head(3)

                  t+1        t+2
time
2012-01-01  67.500000  67.349998
2012-01-02  67.349998  66.800003
2012-01-03  66.800003  66.739998
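
The six columns of data_x are consistent with a degree-2 polynomial expansion of the two inputs: a constant term, the two raw values (smoothed_rain and Q_mgb), and their three degree-2 products. The sketch below reproduces the first row under that assumption. The helper poly_terms is hypothetical and not part of the module, and the column ordering is inferred from the output shown above rather than from the source code.

```python
from itertools import combinations_with_replacement
from math import prod


def poly_terms(values, degree):
    """Hypothetical helper: all monomials of the inputs up to `degree`."""
    terms = []
    for d in range(degree + 1):
        # d = 0 yields a single empty combination, i.e. the constant 1.0.
        for combo in combinations_with_replacement(values, d):
            terms.append(prod(combo, start=1.0))
    return terms


# smoothed_rain and Q_mgb as displayed in the first row (rounded values)
row = poly_terms([0.299868, 15.22], degree=2)
print([round(v, 4) for v in row])
```

Because the inputs here are the rounded values from the display, the products may differ from the table in the last printed digits.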