cybench.runs package
Submodules
cybench.runs.agml_workshop module
- class cybench.runs.agml_workshop.LSTMModel(time_series_have_same_length=False, num_rnn_layers=1, rnn_hidden_size=64, num_outputs=1, *args, **kwargs)
Bases: BaseModel, Module
- _abc_impl = <_abc._abc_data object>
- _get_validation_splits(all_years, num_folds=1, num_valid_years=5)
- _optimize_hyperparameters(train_dataset, param_space, loss, batch_size, epochs, save_model_path)
- _train_epoch(train_loader, loss, optimizer)
- fit(train_dataset, optimize_hyperparameters=False, epochs=10, **fit_params)
Fit or train the model.
- Parameters:
train_dataset – Dataset
**fit_params – Additional parameters.
- Returns:
A tuple containing the fitted model and a dict with additional information.
- forward(X_ts, X_rest)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- classmethod load(model_name)
Deserialize a saved model.
- Parameters:
model_name – Filename that was used to save the model.
- Returns:
The deserialized model.
- predict(test_dataset)
Run fitted model on data.
- Parameters:
test_dataset – Dataset
**predict_params – Additional parameters.
- Returns:
A tuple containing a np.ndarray and a dict with additional information.
- predict_items(X: list, device: str = 'cpu', **predict_params)
Run fitted model on a list of data items.
- Parameters:
X (list) – a list of data items, each of which is a dict
device (str) – the device to use
**predict_params – Additional parameters
- Returns:
A tuple containing a np.ndarray and a dict with additional information.
- save(model_name)
Save model, e.g. using pickle.
- Parameters:
model_name – Filename that will be used to save the model.
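The save/load pair described above can be sketched as a pickle round trip. This is only an illustration under stated assumptions: `TinyModel` and its `weights` attribute are hypothetical stand-ins, not part of cybench, and the actual LSTMModel may serialize its state differently.

```python
import os
import pickle
import tempfile


class TinyModel:
    """Hypothetical stand-in for a fitted model; not a cybench class."""

    def __init__(self, weights):
        self.weights = weights

    def save(self, model_name):
        # Write the model state to the given filename using pickle.
        with open(model_name, "wb") as f:
            pickle.dump({"weights": self.weights}, f)

    @classmethod
    def load(cls, model_name):
        # Read the state back and rebuild the model from it.
        with open(model_name, "rb") as f:
            state = pickle.load(f)
        return cls(**state)


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tiny_model.pkl")
    TinyModel(weights=[0.5, -1.25]).save(path)
    restored = TinyModel.load(path)
```

Pickling a plain state dict rather than the object itself keeps the file loadable even if the class is later moved to another module.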
- cybench.runs.agml_workshop.date_from_dekad(dekad, year)
Reconstruct date string from dekad and year. NOTE: Don’t use this with CY-Bench data aligned to crop season. For aligned data, KEY_YEAR and the year in “date” can differ, so it’s incorrect to infer the date from dekad and year.
- Parameters:
dekad (int) – a number from 1-36 indicating ~10-day periods
year (int) – year in YYYY format
- Returns:
date string in YYYYmmdd format
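As a rough illustration of the dekad-to-date mapping, the sketch below assumes three dekads per month starting on days 1, 11 and 21, and returns the first day of the dekad. Both the helper name `dekad_to_date_string` and the choice of the first day are assumptions; the source does not say which day of the dekad date_from_dekad returns.

```python
from datetime import date


def dekad_to_date_string(dekad: int, year: int) -> str:
    """Hypothetical sketch: first day of the given dekad as YYYYmmdd."""
    if not 1 <= dekad <= 36:
        raise ValueError("dekad must be between 1 and 36")
    month = (dekad - 1) // 3 + 1      # three dekads per month
    day = ((dekad - 1) % 3) * 10 + 1  # dekads start on day 1, 11 or 21
    return date(year, month, day).strftime("%Y%m%d")


first = dekad_to_date_string(1, 2020)
fifth = dekad_to_date_string(5, 2020)
last = dekad_to_date_string(36, 2020)
```

As the NOTE above warns, a mapping like this only makes sense when the year of the observation and the year stored with the record are the same.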
- cybench.runs.agml_workshop.get_cybench_data()
Reproduce results from AgML 2024 for LSTM models using CY-Bench data. Compare the workshop LSTM implementation and benchmark LSTM implementation to validate their performance on the same data. NRMSE must be around 25%. These results were produced with:
- inputs:
static: [“awc”]
time series: [“tmin”, “tmax”, “tavg”, “prec”, “cwb”, “rad”] + [“fpar”]
- NOTE: These should match the definitions of STATIC_PREDICTORS and TIME_SERIES_PREDICTORS.
- NOTE: All time series inputs are at the same (dekadal) resolution. This means BaselineLSTM does not need to aggregate time series data.
- epochs=10, lr=0.0001, weight_decay=0.0001. Since BaselineLSTM uses weight_decay=0.00001, the same value is now used for the workshop LSTMModel implementation above.
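For readers checking the "around 25%" NRMSE target, the following is a minimal sketch of a normalized RMSE. It assumes normalization by the mean of the observed values, which the text above does not state; the helper name `nrmse_percent` is hypothetical and the metric implemented in cybench may differ.

```python
import math


def nrmse_percent(y_true, y_pred):
    """RMSE divided by the mean of the observed values, in percent.

    Assumption: mean normalization; other conventions divide by the
    range or the standard deviation instead.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    mean_true = sum(y_true) / len(y_true)
    return 100.0 * math.sqrt(mse) / mean_true


# Observed yields of 4 and 6 with a constant prediction of 5:
# RMSE is 1 and the observed mean is 5, so NRMSE is 20%.
example_nrmse = nrmse_percent([4.0, 6.0], [5.0, 5.0])
```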
- cybench.runs.agml_workshop.get_cybench_data_aligned_to_crop_season()
- cybench.runs.agml_workshop.get_workshop_data()
Reproduce results from AgML 2024 for LSTM models. Compare the workshop LSTM implementation and benchmark LSTM implementation to validate their performance on the same data. NRMSE must be around 25%. These results were produced with:
- inputs:
static: [“awc”]
time series: [“tmin”, “tmax”, “tavg”, “prec”, “cwb”, “rad”] + [“fpar”]
- NOTE: These should match the definitions of STATIC_PREDICTORS and TIME_SERIES_PREDICTORS.
- NOTE: All time series inputs are at the same (dekadal) resolution. This means BaselineLSTM does not need to aggregate time series data.
- epochs=10, lr=0.0001, weight_decay=0.0001. Since BaselineLSTM uses weight_decay=0.00001, the same value is now used for the workshop LSTMModel implementation above.
- cybench.runs.agml_workshop.validate_agml_workshop_results(df_y, dfs_x, time_series_have_same_length=False)
cybench.runs.process_results module
- cybench.runs.process_results.df_to_markdown(df, formatted_df)
- cybench.runs.process_results.format_row(row, metric)
- cybench.runs.process_results.results_to_metrics()
- cybench.runs.process_results.results_to_residuals(model_names)
- cybench.runs.process_results.write_results_to_table()
cybench.runs.results_plots module
- cybench.runs.results_plots.box_plots_metrics(data, crop, countries, metric, metric_label, subplots_per_row=4)
- cybench.runs.results_plots.box_plots_residuals(data, crop, countries, residual_cols, residual_labels, ymin, ymax, subplots_per_row=4)
- cybench.runs.results_plots.plot_bars(df, metric, metric_label, title_label, file_name)
- cybench.runs.results_plots.plot_graph(df, x_col, hue_col, x_label, metric, metric_label, title, file_name, rotation=45)
- cybench.runs.results_plots.plot_metrics(df: DataFrame, metric: str | None = None)
- cybench.runs.results_plots.plot_yearly_metrics(data, crop, country, metric, metric_label)
- cybench.runs.results_plots.plot_yearly_residuals(data, crop, country, residual_cols, residual_labels)
cybench.runs.run_benchmark module
- cybench.runs.run_benchmark.compute_metrics(run_name: str, model_names: list | None = None) DataFrame
Compute evaluation metrics on saved predictions.
- Parameters:
run_name (str) – The name of the run. Will be used to store log files and model results
model_names (list) – names of models
- Returns:
a pd.DataFrame containing evaluation metrics
- cybench.runs.run_benchmark.get_prediction_residuals(run_name: str, model_names: dict) DataFrame
Get prediction residuals (i.e., model predictions - labels).
- Parameters:
run_name (str) – The name of the run. Will be used to store log files and model results
model_names (dict) – A mapping of model name (key) to a shorter name (value)
- Returns:
a pd.DataFrame containing prediction residuals
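The residual definition (model prediction minus label) and the role of the model_names mapping can be sketched in plain Python. Everything named here is hypothetical: the helper `add_residuals`, the column keys `"yield"`, `"model_a"` and `"model_b"`, and the `"_res"` suffix are illustrative choices, not the columns that get_prediction_residuals actually produces.

```python
def add_residuals(rows, model_names, label_key="yield"):
    """Add one residual column per model: prediction minus label.

    rows: list of dicts holding the label and one prediction per model.
    model_names: mapping of model name (key) to a shorter name (value).
    """
    result = []
    for row in rows:
        new_row = dict(row)
        for model, short_name in model_names.items():
            # Positive residual: the model over-predicted the label.
            new_row[short_name + "_res"] = row[model] - row[label_key]
        result.append(new_row)
    return result


rows = [{"year": 2018, "yield": 4.0, "model_a": 5.0, "model_b": 3.5}]
with_residuals = add_residuals(rows, {"model_a": "A", "model_b": "B"})
```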
- cybench.runs.run_benchmark.load_results(run_name: str) DataFrame
Load saved results for analysis or visualization.
- Parameters:
run_name (str) – The name of the run. Will be used to store log files and model results
- Returns:
a pd.DataFrame containing the predictions of benchmark models
- cybench.runs.run_benchmark.run_benchmark(run_name: str, model_name: str | None = None, model_constructor: callable | None = None, model_init_kwargs: dict | None = None, model_fit_kwargs: dict | None = None, baseline_models: list | None = None, dataset_name: str = 'maize_NL', sel_years: list | None = None, nn_models_epochs: int | None = None) dict
Run CY-Bench.
- Parameters:
run_name (str) – The name of the run. Will be used to store log files and model results
model_name (str) – The name of the model. Will be used to store log files and model results
model_constructor (Callable) – The constructor of the model. Will be used to construct the model
model_init_kwargs (dict) – The kwargs used when constructing the model.
model_fit_kwargs (dict) – The kwargs used to fit the model.
baseline_models (list) – A list of names of baseline models to run next to the provided model. If unspecified, a default list of baseline models will be used.
dataset_name (str) – The name of the dataset to load
sel_years (list) – A list of years on which to run leave-one-year-out evaluation (for tests)
nn_models_epochs (int) – Number of epochs to run for nn-models (for tests)
- Returns:
a dictionary containing the results of the benchmark
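The leave-one-year-out scheme mentioned for sel_years can be illustrated with a small generator. This is a sketch of the general technique only; `leave_one_year_out` is a hypothetical helper, and how run_benchmark builds its folds internally is not described in this reference.

```python
def leave_one_year_out(years):
    """Yield (train_years, test_year) pairs, holding out each year once."""
    unique_years = sorted(set(years))
    for test_year in unique_years:
        # Train on every year except the one held out for testing.
        train_years = [y for y in unique_years if y != test_year]
        yield train_years, test_year


splits = list(leave_one_year_out([2018, 2019, 2020]))
```

Each year appears as the test year exactly once, so the number of folds equals the number of distinct years.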
- cybench.runs.run_benchmark.run_benchmark_on_all_data()
cybench.runs.validate_model module
- cybench.runs.validate_model.validate_single_model(run_name: str, model_name: str, model_constructor: callable, model_init_kwargs: dict | None = None, model_fit_kwargs: dict | None = None, baseline_models: list | None = None, dataset_name: str = 'test_maize_us', test_years_to_leave_out: list | None = None) dict
Run a single model on a single outer fold and return validation results. The test data is left out completely and is not used for training or validation. This function is not meant for benchmarking; use run_benchmark instead. Hyperparameters should be optimized in each outer fold in the benchmark. This function should only be used for exploration of initial hyperparameter settings.
- Parameters:
run_name (str) – The name of the run. Will be used to store log files and model results
model_name (str) – The name of the model. Will be used to store log files and model results
model_constructor (Callable) – The constructor of the model. Will be used to construct the model
model_init_kwargs (dict) – The kwargs used when constructing the model.
model_fit_kwargs (dict) – The kwargs used to fit the model.
baseline_models (list) – A list of names of baseline models to run next to the provided model. If unspecified, a default list of baseline models will be used.
dataset_name (str) – The name of the dataset to load
- Returns:
a dictionary containing the results of the benchmark
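A single outer fold with the test years left out completely might be split as in the sketch below. The helper `split_years` is hypothetical, and taking the most recent remaining years for validation is an assumption; the source does not say how validate_single_model chooses its validation years.

```python
def split_years(all_years, test_years, num_valid_years=1):
    """Split years into train, validation and test sets for one fold.

    Test years are removed first, so they never reach training or
    validation. Assumption: the latest remaining years form the
    validation set.
    """
    test = sorted(set(test_years))
    remaining = sorted(set(all_years) - set(test))
    if not 0 < num_valid_years < len(remaining):
        raise ValueError("num_valid_years must leave at least one training year")
    train = remaining[:-num_valid_years]
    valid = remaining[-num_valid_years:]
    return train, valid, test


train, valid, test = split_years(range(2015, 2021), [2020], num_valid_years=2)
```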