save_run

gaitmap_challenges.results.save_run(challenge: BaseChallenge, entry_name: Union[str, Tuple[str, ...]], *, custom_metadata: Dict[str, Any], path: Optional[Union[str, Path]] = None, debug_run: Optional[bool] = None, stored_filenames_relative_to: Optional[Union[str, Path]] = None, use_git: bool = True, git_dirty_ignore: Sequence[str] = (), debug_folder_prefix: str = '_', _force_local_debug_run_setting: bool = False, _caller_file_path_stack_offset: int = 0)

Save the results of a challenge run.

This is expected to be called with a challenge instance on which run was already executed as the first argument.

This method stores the results exported by the save_core_results method of the challenge class and a large amount of metadata in a folder structure. The folder name is generated as follows: {challenge_module}.{challenge_class}/{challenge_version}/{entry_name}/{timestamp}. In case of debug runs, the final folder name is prefixed with the specified debug_folder_prefix.
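The folder layout described above can be sketched as follows. This is a minimal illustration and not the library's implementation: build_run_folder is a hypothetical helper, the timestamp is passed in as a ready-made string because its format is not specified here, and mapping a tuple entry_name to nested folders is an assumption.

```python
from pathlib import Path
from typing import Tuple, Union


def build_run_folder(
    base_path: Path,
    challenge_module: str,
    challenge_class: str,
    challenge_version: str,
    entry_name: Union[str, Tuple[str, ...]],
    timestamp: str,
    debug_run: bool = False,
    debug_folder_prefix: str = "_",
) -> Path:
    """Hypothetical helper that mirrors the documented folder pattern."""
    # Assumption: a tuple entry name becomes one nested folder per element.
    entry_parts = (entry_name,) if isinstance(entry_name, str) else tuple(entry_name)
    # Only the final (timestamp) folder receives the debug prefix.
    final_folder = f"{debug_folder_prefix}{timestamp}" if debug_run else timestamp
    return base_path.joinpath(
        f"{challenge_module}.{challenge_class}",
        challenge_version,
        *entry_parts,
        final_folder,
    )
```

Because only the last path component is prefixed, a pattern that matches the prefix (for example in a gitignore file) can separate debug runs from final runs without touching the rest of the tree.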

The metadata stored with the results includes:

The challenge name
The challenge version
Whether this is a debug run (see Notes section)
A string representation of the challenge, the dataset, and the pipeline
Some metadata about the dataset
Start and end time of each run
Runtime
System Info including the Python version, the OS, the CPU and installed packages
The global config used for the run
The file path of the caller of this method
The git status of the repo (if available). This includes the current commit hash and whether the repo is dirty (i.e. has uncommitted changes).

Parameters:

challenge: The instance of the challenge with the results to save.
entry_name: The name identifying the entry. This should be a unique name in the context of the specific challenge.
custom_metadata: Custom metadata to save with the run. Must be json serializable.
path: The path to the folder where the run results should be stored. If not given, the global config is used. See the config module to learn more about that.
debug_run: Whether the run should be considered a debug run. This setting will be overwritten by the global config, if specified there and _force_local_debug_run_setting is True. See the Notes section to understand when a run is actually considered a debug run.
stored_filenames_relative_to: The path relative to which the filenames in the results are stored. All paths in the results are converted to be relative to this path. This is useful to avoid storing personal information from absolute paths in the results.
use_git: Whether to store the status of the current git repo with the results (e.g. commit hash, dirty status, …). Note that this checks your current git repo, not the git repo of the challenge. The goal of saving this additional metadata is to make runs reproducible. This only works if all relevant code is actually tracked in your current git repo and not imported via relative paths from outside the repo.
git_dirty_ignore: A list of files and folders to ignore when checking if the repo is dirty.
debug_folder_prefix: When a run result is considered a debug run, this prefix is added to the folder name. This can be used to easily distinguish debug runs from normal runs (e.g. in a gitignore file).
_force_local_debug_run_setting: If True, the debug_run parameter provided to this function will be overwritten by the global config if provided. This is helpful to control the debug run setting via ENV variables, without modifying the code.
_caller_file_path_stack_offset: The path of the file that calls save_run is stored with the results. If you wrap save_run in your own function, this information becomes useless, because the stored path points to the file containing your wrapper. In this case, set this parameter to the number of stack frames between your function and save_run, so that the file calling your function is stored instead.
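The idea behind _caller_file_path_stack_offset can be illustrated with the standard library's inspect module. This is only a sketch of the general technique: fake_save_run, my_wrapper and user_script are hypothetical stand-ins, and the exact frame arithmetic inside save_run may differ.

```python
import inspect


def fake_save_run(stack_offset: int = 0) -> inspect.FrameInfo:
    """Stand-in for save_run that only reports which frame would be recorded."""
    # Index 0 is this function itself, index 1 is its direct caller.
    # A positive offset skips that many additional wrapper frames.
    return inspect.stack(context=0)[1 + stack_offset]


def my_wrapper() -> inspect.FrameInfo:
    # One extra frame (this wrapper) sits between the user code and
    # fake_save_run, so an offset of 1 is used in this sketch.
    return fake_save_run(stack_offset=1)


def user_script() -> inspect.FrameInfo:
    # Represents the code whose file path should end up in the metadata.
    return my_wrapper()
```

In this sketch, user_script().function is "user_script" rather than "my_wrapper", and the filename attribute of the returned frame gives the path of the file that defines the calling code.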

Notes

Debug runs are runs that are not considered final results. A run is considered a debug run if the user explicitly marks it as one, either via debug_run=True or via the global config. In addition, a run is flagged as debug (independent of the user setting) if the git repo is dirty, as runs produced with a dirty git repo cannot be reproduced and should not be considered proper benchmark results. This means that, to produce actual runs, you need to explicitly set debug_run=False (or the equivalent in the global config) and make sure that you don't have any uncommitted changes in your git repo. If you want to exclude some files from the git dirty check, use the git_dirty_ignore parameter. This can be helpful to exclude, for example, the docs or results folder, which do not influence the reproducibility of your results.
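The decision described in these notes can be sketched as below. Both functions are hypothetical and not part of the library. In particular, matching ignored entries by exact path or parent folder is an assumption about how git_dirty_ignore is applied, and requested_debug stands for the user setting after debug_run and the global config have been resolved.

```python
from pathlib import PurePosixPath
from typing import Iterable, Sequence


def is_dirty(changed_files: Iterable[str], git_dirty_ignore: Sequence[str] = ()) -> bool:
    """Return True if any changed file is not covered by the ignore list."""
    ignored = [PurePosixPath(p) for p in git_dirty_ignore]
    for name in changed_files:
        path = PurePosixPath(name)
        # Assumption: an entry ignores the file itself or everything below a folder.
        if not any(path == ign or ign in path.parents for ign in ignored):
            return True
    return False


def is_debug_run(requested_debug: bool, git_dirty: bool) -> bool:
    """A dirty repo forces a debug run, independent of the user setting."""
    return requested_debug or git_dirty
```

With these definitions, a run requested with debug_run=False still counts as a debug run when a tracked source file has uncommitted changes, but not when the only changes are inside an ignored docs or results folder.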
