Integrated Reproducibility with Self-describing Machine Learning Models


Researchers and data scientists frequently want to collaborate on machine learning models. However, in the presence of sharing and simultaneous experimentation, it is challenging both to determine if two models were trained identically and to reproduce precisely someone else’s training process. We demonstrate how provenance collection that is tightly integrated into a machine learning library facilitates reproducibility. We present MERIT, a reproducibility system that leverages a robust configuration system and extensive provenance collection to exactly reproduce models, given only a model object. We integrate MERIT with Tribuo, an open-source Java-based machine learning library. Key features of this integrated reproducibility framework include controlling for sources of non- determinism in a multi-threaded environment and exposing the training differences between two models in a human-readable form. Our system allows simple reproduction of deployed Tribuo models without any additional information, ensuring data science research is reproducible. Our framework is open-source and available under an Apache 2.0 license.

2023 ACM Conference on Reproducibility and Replicability