cybench.runs package

Submodules

cybench.runs.agml_workshop module

class cybench.runs.agml_workshop.LSTMModel(time_series_have_same_length=False, num_rnn_layers=1, rnn_hidden_size=64, num_outputs=1, *args, **kwargs)

Bases: BaseModel, Module

_abc_impl = <_abc._abc_data object>
_get_validation_splits(all_years, num_folds=1, num_valid_years=5)
_optimize_hyperparameters(train_dataset, param_space, loss, batch_size, epochs, save_model_path)
_train_epoch(train_loader, loss, optimizer)
fit(train_dataset, optimize_hyperparameters=False, epochs=10, **fit_params)

Fit or train the model.

Parameters:
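  • train_dataset – Dataset

  • optimize_hyperparameters – If True, optimize hyperparameters (default: False).

  • epochs – Number of training epochs (default: 10).

  • **fit_params – Additional parameters.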

Returns:

A tuple containing the fitted model and a dict with additional information.

forward(X_ts, X_rest)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod load(model_name)

Deserialize a saved model.

Parameters:

model_name – Filename that was used to save the model.

Returns:

The deserialized model.

predict(test_dataset)

Run fitted model on data.

Parameters:
  • test_dataset – Dataset

  • **predict_params – Additional parameters.

Returns:

A tuple containing a np.ndarray and a dict with additional information.

predict_items(X: list, device: str = 'cpu', **predict_params)

Run fitted model on a list of data items.

Parameters:
  • X (list) – A list of data items, each of which is a dict. NOTE: All items in X are expected to have time series of the same length.

  • device (str) – The device to use.

  • **predict_params – Additional parameters

Returns:

A tuple containing a np.ndarray and a dict with additional information.

save(model_name)

Save model, e.g. using pickle.

Parameters:

model_name – Filename that will be used to save the model.
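The save and load entries above only say that a model is written to and read from a filename, "e.g. using pickle". The following is a minimal sketch of that round trip, not the LSTMModel implementation: TinyModel, its weights attribute, and the roundtrip helper are hypothetical names, and the sketch assumes that pickling the model's state is sufficient.

```python
import os
import pickle
import tempfile


class TinyModel:
    """Hypothetical stand-in for a fitted model; not a CY-Bench class."""

    def __init__(self, weights):
        self.weights = weights

    def save(self, model_name):
        # Serialize the model state to the given filename.
        with open(model_name, "wb") as f:
            pickle.dump({"weights": self.weights}, f)

    @classmethod
    def load(cls, model_name):
        # Rebuild a model from the state stored under the same filename.
        with open(model_name, "rb") as f:
            state = pickle.load(f)
        return cls(**state)


def roundtrip(weights):
    """Save a TinyModel to a temporary file and load it back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "tiny_model.pkl")
        TinyModel(weights).save(path)
        return TinyModel.load(path).weights
```

Because load is a classmethod, a caller needs only the filename to restore a model, which matches the signatures documented above.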

cybench.runs.agml_workshop.date_from_dekad(dekad, year)

Reconstruct a date string from a dekad and a year. NOTE: Do not use this with CY-Bench data aligned to the crop season. For aligned data, KEY_YEAR and the year in “date” can differ, so it is incorrect to infer the date from dekad and year.

Parameters:
  • dekad (int) – a number from 1-36 indicating ~10-day periods

  • year (int) – year in YYYY format

Returns:

date string in YYYYmmdd format
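One way to map a dekad to a date is sketched below. This is an illustration, not the cybench implementation: the helper name dekad_to_date_string is hypothetical, and the sketch assumes that each month holds three dekads and that the returned date is the first day of the dekad (day 1, 11, or 21). The actual function may pick a different day.

```python
from datetime import date


def dekad_to_date_string(dekad: int, year: int) -> str:
    """Return the first day of a dekad as a YYYYmmdd string (hypothetical helper)."""
    if not 1 <= dekad <= 36:
        raise ValueError("dekad must be in 1..36")
    month = (dekad - 1) // 3 + 1       # three dekads per month
    day = ((dekad - 1) % 3) * 10 + 1   # assumed start days: 1, 11, and 21
    return date(year, month, day).strftime("%Y%m%d")
```

Under these assumptions, dekad 5 of 2020 is the second dekad of February.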

cybench.runs.agml_workshop.get_cybench_data()

Reproduce results from AgML 2024 for LSTM models using CY-Bench data. Compare the workshop LSTM implementation and the benchmark LSTM implementation to validate their performance on the same data. NRMSE is expected to be around 25%. These results were produced with the following settings.

Inputs:
  • static: [“awc”]

  • time series: [“tmin”, “tmax”, “tavg”, “prec”, “cwb”, “rad”] + [“fpar”]

NOTE: These should match the definitions of STATIC_PREDICTORS and TIME_SERIES_PREDICTORS.

NOTE: All time series inputs are at the same (dekadal) resolution. This means BaselineLSTM does not need to aggregate time series data.

Training settings: epochs=10, lr=0.0001, weight_decay=0.0001. Since BaselineLSTM uses weight_decay=0.00001, that value (0.00001) is now also used for the workshop LSTMModel implementation above.
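The NRMSE target mentioned above can be checked with a few lines of code. The sketch below assumes that NRMSE is the root mean squared error divided by the mean of the observed values, expressed as a percentage. The function name nrmse_percent is hypothetical, and the normalization used by CY-Bench may differ.

```python
import math


def nrmse_percent(y_true, y_pred):
    """RMSE normalized by the mean of y_true, in percent (assumed definition)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    # Mean squared error over all label/prediction pairs.
    mse = sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    mean_true = sum(y_true) / len(y_true)
    return 100.0 * math.sqrt(mse) / mean_true
```

For example, labels [4.0, 6.0] and predictions [5.0, 5.0] give an RMSE of 1.0 against a mean of 5.0, so the NRMSE is 20%.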

cybench.runs.agml_workshop.get_cybench_data_aligned_to_crop_season()
cybench.runs.agml_workshop.get_workshop_data()

Reproduce results from AgML 2024 for LSTM models. Compare the workshop LSTM implementation and the benchmark LSTM implementation to validate their performance on the same data. NRMSE is expected to be around 25%. These results were produced with the following settings.

Inputs:
  • static: [“awc”]

  • time series: [“tmin”, “tmax”, “tavg”, “prec”, “cwb”, “rad”] + [“fpar”]

NOTE: These should match the definitions of STATIC_PREDICTORS and TIME_SERIES_PREDICTORS.

NOTE: All time series inputs are at the same (dekadal) resolution. This means BaselineLSTM does not need to aggregate time series data.

Training settings: epochs=10, lr=0.0001, weight_decay=0.0001. Since BaselineLSTM uses weight_decay=0.00001, that value (0.00001) is now also used for the workshop LSTMModel implementation above.

cybench.runs.agml_workshop.validate_agml_workshop_results(df_y, dfs_x, time_series_have_same_length=False)

cybench.runs.benchmark_summary module

cybench.runs.benchmark_summary.dataset_summary(dataset_name: str = 'maize_NL', min_year: int = 2003, max_year: int = 2024) → dict

Output a summary of the dataset.

Parameters:
  • dataset_name (str) – The name of the dataset to load

  • min_year (int) – minimum year (soil moisture data starts from 2003)

  • max_year (int) – maximum year (some regions in AR have data points for 2024)

Returns:

pd.DataFrame with a summary for given dataset

cybench.runs.benchmark_summary.run_benchmark_summary(output_file: str = None)

cybench.runs.process_results module

cybench.runs.process_results.df_to_markdown(df, formatted_df)
cybench.runs.process_results.format_row(row, metric)
cybench.runs.process_results.results_to_metrics()
cybench.runs.process_results.results_to_residuals(model_names)
cybench.runs.process_results.write_results_to_table(output_file: str)

cybench.runs.results_plots module

cybench.runs.results_plots.box_plots_metrics(data, crop, countries, metric, metric_label, subplots_per_row=4)
cybench.runs.results_plots.box_plots_residuals(data, crop, countries, residual_cols, residual_labels, ymin, ymax, subplots_per_row=4)
cybench.runs.results_plots.plot_bars(df, metric, metric_label, title_label, file_name)
cybench.runs.results_plots.plot_graph(df, x_col, hue_col, x_label, metric, metric_label, title, file_name, rotation=45)
cybench.runs.results_plots.plot_metrics(df: DataFrame, metric: str = None)
cybench.runs.results_plots.plot_yearly_metrics(data, crop, country, metric, metric_label)
cybench.runs.results_plots.plot_yearly_residuals(data, crop, country, residual_cols, residual_labels)

cybench.runs.run_benchmark module

cybench.runs.run_benchmark.compute_metrics(run_name: str, model_names: list = None) → DataFrame

Compute evaluation metrics on saved predictions.

Parameters:
  • run_name (str) – The name of the run. Will be used to store log files and model results

  • model_names (list) – names of models

Returns:

a pd.DataFrame containing evaluation metrics

cybench.runs.run_benchmark.get_prediction_residuals(run_name: str, model_names: dict) → DataFrame

Get prediction residuals (i.e., model predictions - labels).

Parameters:
  • run_name (str) – The name of the run. Will be used to store log files and model results

  • model_names (dict) – A mapping of model name (key) to a shorter name (value)

Returns:

a pd.DataFrame containing prediction residuals
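The residual definition above (model predictions minus labels) can be illustrated without any CY-Bench code. In the sketch below, the function name prediction_residuals and the dict-based layout keyed by (region, year) are assumptions made for illustration; the documented function returns a pd.DataFrame instead.

```python
def prediction_residuals(labels, predictions):
    """Compute residuals (prediction - label) per model and data point.

    labels: dict mapping a (region, year) key to the observed value.
    predictions: dict mapping a model name to a dict with the same keys.
    Both layouts are hypothetical and used only for illustration.
    """
    residuals = {}
    for model_name, model_preds in predictions.items():
        # Keep only the data points for which a label is available.
        residuals[model_name] = {
            key: pred - labels[key]
            for key, pred in model_preds.items()
            if key in labels
        }
    return residuals
```

With this sign convention, a positive residual means the model overestimated the label and a negative residual means it underestimated.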

cybench.runs.run_benchmark.load_results(run_name: str) → DataFrame

Load saved results for analysis or visualization.

Parameters:
  • run_name (str) – The name of the run. Will be used to store log files and model results

Returns:

a pd.DataFrame containing the predictions of benchmark models

cybench.runs.run_benchmark.run_benchmark(run_name: str, model_name: str = None, model_constructor: callable = None, model_init_kwargs: dict = None, model_fit_kwargs: dict = None, baseline_models: list = None, dataset_name: str = 'maize_NL', sel_years: list = None, nn_models_epochs: int = None) → dict

Run CY-Bench.

Parameters:
  • run_name (str) – The name of the run. Will be used to store log files and model results

  • model_name (str) – The name of the model. Will be used to store log files and model results

  • model_constructor (Callable) – The constructor of the model. Will be used to construct the model

  • model_init_kwargs (dict) – The kwargs used when constructing the model.

  • model_fit_kwargs (dict) – The kwargs used to fit the model.

  • baseline_models (list) – A list of names of baseline models to run alongside the provided model. If unspecified, a default list of baseline models will be used.

  • dataset_name (str) – The name of the dataset to load

  • sel_years (list) – A list of years for which to run leave-one-year-out evaluation (for tests)

  • nn_models_epochs (int) – Number of epochs to run for nn-models (for tests)

Returns:

a dictionary containing the results of the benchmark
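The sel_years parameter refers to leave-one-year-out evaluation, in which each selected year is held out as the test year while the remaining years are used for training. The sketch below shows that splitting scheme only. The generator name leave_one_year_out is hypothetical, and how run_benchmark builds its splits internally is not specified here.

```python
def leave_one_year_out(all_years, sel_years=None):
    """Yield (train_years, test_year) pairs, one per held-out year.

    If sel_years is given, only those years are held out; otherwise
    every year in all_years is held out once.
    """
    unique_years = sorted(set(all_years))
    test_years = unique_years if sel_years is None else sorted(sel_years)
    for test_year in test_years:
        # Train on every available year except the held-out one.
        train_years = [year for year in unique_years if year != test_year]
        yield train_years, test_year
```

Restricting sel_years to one or two years keeps the number of training runs small, which is consistent with the "(for tests)" remark in the parameter description.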

cybench.runs.run_benchmark.run_benchmark_on_all_data()

Module contents