cybench.util package
Submodules
cybench.util.data module
- cybench.util.data.data_to_pandas(data_items, data_cols=None)
Convert a list of data items (dicts) to a pandas DataFrame.
- Parameters:
data_items – list of data items, each of which is a dict
data_cols – list of keys to include as columns
- Returns:
pd.DataFrame
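As an illustration of the conversion described above, here is a minimal sketch. It is not the library's implementation: the function name `data_to_pandas_sketch`, the keys `adm_id`, `year` and `yield`, and the behaviour when `data_cols` is `None` are all assumptions made for the example.

```python
import pandas as pd


def data_to_pandas_sketch(data_items, data_cols=None):
    """Hypothetical stand-in for cybench.util.data.data_to_pandas."""
    if data_cols is None:
        # Assumption: with no column list, keep every key found in the items.
        return pd.DataFrame(data_items)
    # Keep only the requested keys, in the requested order.
    rows = [{col: item[col] for col in data_cols} for item in data_items]
    return pd.DataFrame(rows, columns=data_cols)


# Hypothetical data items; the keys are made up for this example.
items = [
    {"adm_id": "NL11", "year": 2010, "yield": 8.1},
    {"adm_id": "NL12", "year": 2010, "yield": 7.4},
]
df = data_to_pandas_sketch(items, data_cols=["adm_id", "yield"])
```

Passing `data_cols` restricts the resulting DataFrame to those columns; here `year` is dropped.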
- cybench.util.data.trim_time_series_data(sample: dict, num_time_steps: int, time_series_keys: list)
Trim time series data to the given number of time steps.
- Parameters:
sample (dict) – key is name, value is np.ndarray
num_time_steps (int) – number of time steps to keep
time_series_keys (list) – keys for time series data
- Returns:
the same sample with modified data
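The trimming step can be sketched as follows. This is a sketch under stated assumptions, not the library's code: it assumes time is the first axis of each array and that the earliest `num_time_steps` steps are the ones kept. The function name and the sample keys are hypothetical.

```python
import numpy as np


def trim_time_series_sketch(sample, num_time_steps, time_series_keys):
    """Hypothetical stand-in for cybench.util.data.trim_time_series_data."""
    for key in time_series_keys:
        # Assumption: time is the first axis and the earliest steps are kept.
        sample[key] = sample[key][:num_time_steps]
    # The same dict object is returned, modified in place.
    return sample


sample = {
    "tavg": np.arange(10, dtype=float),  # hypothetical time series key
    "prec": np.ones(10),                 # hypothetical time series key
    "awc": np.array([0.2]),              # hypothetical static value, left untouched
}
trimmed = trim_time_series_sketch(
    sample, num_time_steps=4, time_series_keys=["tavg", "prec"]
)
```

Only the keys listed in `time_series_keys` are shortened; any other entry in the sample keeps its original shape.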
cybench.util.features module
- cybench.util.features._add_period(df: DataFrame, period_length: str)
Add a period column.
- Parameters:
df – pd.DataFrame
period_length – string, which can be “month”, “fortnight” or “dekad”
- Returns:
pd.DataFrame
- cybench.util.features._aggregate_by_period(df: DataFrame, index_cols: list, period_col: str, aggrs: dict, ft_cols: dict)
Aggregate data into features by period.
- Parameters:
df – pd.DataFrame
index_cols – list of indices, which are location and year
period_col – string, column added by _add_period()
aggrs – dict containing columns to aggregate (keys) and corresponding aggregation function (values)
ft_cols – dict for renaming columns to feature columns
- Returns:
pd.DataFrame with features
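A possible reading of this aggregation, using a plain pandas group-by, is sketched below. The function name, column names and data are hypothetical. Whether the library additionally reshapes periods into separate columns is not stated here, so this sketch simply keeps one row per location, year and period.

```python
import pandas as pd


def aggregate_by_period_sketch(df, index_cols, period_col, aggrs, ft_cols):
    """Hypothetical stand-in for cybench.util.features._aggregate_by_period."""
    # Group by location, year and period, then apply one aggregation per column.
    ft_df = df.groupby(index_cols + [period_col]).agg(aggrs).reset_index()
    # Rename the aggregated indicator columns to feature names.
    return ft_df.rename(columns=ft_cols)


# Hypothetical daily data for one location and year, already labelled by period.
df = pd.DataFrame(
    {
        "adm_id": ["A", "A", "A", "A"],
        "year": [2020, 2020, 2020, 2020],
        "period": [1, 1, 2, 2],
        "tavg": [10.0, 12.0, 14.0, 16.0],
        "prec": [1.0, 2.0, 0.0, 3.0],
    }
)
features = aggregate_by_period_sketch(
    df,
    index_cols=["adm_id", "year"],
    period_col="period",
    aggrs={"tavg": "mean", "prec": "sum"},
    ft_cols={"tavg": "mean_tavg", "prec": "sum_prec"},
)
```

Here `aggrs` selects the mean temperature and the precipitation total per period, and `ft_cols` gives the results feature-style names.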
- cybench.util.features._count_threshold(df: DataFrame, index_cols: list, period_col: str, indicator: str, threshold_exceed: bool = True, threshold: float = 0.0, ft_name: str = None)
Aggregate data into features by period, counting the values of an indicator that cross a threshold.
- Parameters:
df – pd.DataFrame
index_cols – list of indices, which are location and year
period_col – string, column added by _add_period()
indicator – string, indicator column to aggregate
threshold_exceed – boolean, direction of the comparison against threshold
threshold – float, value the indicator is compared against
ft_name – string name for aggregated indicator
- Returns:
pd.DataFrame with features
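The parameters above suggest a count of observations per period that cross a threshold. The sketch below shows one way this could work. It assumes that `threshold_exceed=True` counts values strictly above the threshold and `False` counts values strictly below it; the function name, the default feature name and the data are hypothetical.

```python
import pandas as pd


def count_threshold_sketch(
    df,
    index_cols,
    period_col,
    indicator,
    threshold_exceed=True,
    threshold=0.0,
    ft_name=None,
):
    """Hypothetical stand-in for cybench.util.features._count_threshold."""
    # Assumption: True counts values above the threshold, False values below it.
    if threshold_exceed:
        meets = df[indicator] > threshold
    else:
        meets = df[indicator] < threshold
    # Assumption: fall back to a made-up default name when ft_name is None.
    name = ft_name if ft_name is not None else indicator + "_count"
    return (
        df.assign(**{name: meets.astype(int)})
        .groupby(index_cols + [period_col])[name]
        .sum()
        .reset_index()
    )


# Hypothetical daily maximum temperatures, already labelled by period.
df = pd.DataFrame(
    {
        "adm_id": ["A"] * 5,
        "year": [2020] * 5,
        "period": [1, 1, 1, 2, 2],
        "tmax": [31.0, 36.0, 29.0, 38.0, 40.0],
    }
)
counts = count_threshold_sketch(
    df, ["adm_id", "year"], "period", "tmax", threshold=35.0, ft_name="hot_days"
)
```

With a threshold of 35.0, period 1 contains one qualifying value and period 2 contains two.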
- cybench.util.features.dekad_from_date(dt: datetime)
Get the dekad number from date.
- Parameters:
dt – date
- Returns:
Dekad number, e.g. “YYYY0101” to “YYYY0110” -> 1,
“YYYY0111” to “YYYY0120” -> 2, “YYYY0121” to “YYYY0131” -> 3
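The mapping for January given above can be reproduced with a short sketch. That numbering continues through the year, with three dekads per month and the third dekad absorbing any days after the 30th, is an assumption of this example; the function name is hypothetical.

```python
from datetime import datetime


def dekad_from_date_sketch(dt):
    """Hypothetical stand-in for cybench.util.features.dekad_from_date."""
    # Days 1-10 -> 1, days 11-20 -> 2, day 21 to end of month -> 3.
    dekad_in_month = min((dt.day - 1) // 10, 2) + 1
    # Assumption: dekads are numbered through the year, three per month.
    return (dt.month - 1) * 3 + dekad_in_month


first = dekad_from_date_sketch(datetime(2020, 1, 5))    # within "YYYY0101" to "YYYY0110"
second = dekad_from_date_sketch(datetime(2020, 1, 15))  # within "YYYY0111" to "YYYY0120"
third = dekad_from_date_sketch(datetime(2020, 1, 31))   # within "YYYY0121" to "YYYY0131"
```

The `min(..., 2)` guard keeps day 31 in the third dekad rather than opening a fourth.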
- cybench.util.features.design_features(crop: str, input_dfs: dict)
Design features based on domain expertise.
- Parameters:
crop (str) – crop name, e.g. maize
input_dfs (dict) – keys are input names, values are pd.DataFrames
- Returns:
pd.DataFrame of features
- cybench.util.features.fortnight_from_date(dt: datetime)
Get the fortnight number from date.
- Parameters:
dt – date
- Returns:
Fortnight number, e.g. “YYYY0101” to “YYYY0115” -> 1.
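A comparable sketch for fortnights follows. Only the first fortnight of January is documented above; that each month is split into two fortnights (days 1 to 15, then day 16 to the end of the month) and that numbering continues through the year are assumptions of this example, and the function name is hypothetical.

```python
from datetime import datetime


def fortnight_from_date_sketch(dt):
    """Hypothetical stand-in for cybench.util.features.fortnight_from_date."""
    # Days 1-15 fall in the first fortnight of the month.
    # Assumption: the rest of the month forms the second fortnight.
    fortnight_in_month = 1 if dt.day <= 15 else 2
    # Assumption: fortnights are numbered through the year, two per month.
    return (dt.month - 1) * 2 + fortnight_in_month


first = fortnight_from_date_sketch(datetime(2020, 1, 15))  # within "YYYY0101" to "YYYY0115"
```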
- cybench.util.features.growing_degree_days(df: DataFrame, tbase: float)
- cybench.util.features.unpack_time_series(df: DataFrame, indicators: list)
Unpack time series data to rows per date.
- Parameters:
df – pd.DataFrame
indicators – list of indicators to unpack
- Returns:
pd.DataFrame
cybench.util.torch module
- cybench.util.torch.batch_tensors(*ts)