Joseph Wonsil

PhD Student of Computer Science

University of British Columbia


I am a computer science PhD student with the Systopia Lab at The University of British Columbia. My research interests include data provenance, research reproducibility, and systems broadly construed. I am currently exploring provenance collection for large-scale data processing pipelines and the use of data provenance and containers to increase the reproducibility of data analyses. I have previously worked on provenance-based debuggers, nano-satellite computer systems, geospatial public health analyses, and bringing technology to theater.


  • Data Provenance
  • Reproducible Research
  • Systems Broadly Construed


  • PhD in Computer Science, Current

    The University of British Columbia

  • MSc in Computer Science, 2021

    The University of British Columbia

  • BA in Computer Science, Geographic Information Science, and Environmental Science, 2019

    Carthage College

Recent Publications

(2023). Integrated Reproducibility with Self-describing Machine Learning Models. 2023 ACM Conference on Reproducibility and Replicability.

(2023). Reproducibility as a Service. Software: Practice and Experience.

(2023). Making Provenance Work for You. The R Journal.



MERIT: A Machine-Learning Reproducibility System

A system to facilitate reproducibility for Tribuo-trained machine-learning models

Reproducibility as a Service (RaaS)

A system to facilitate reproducibility for computational experiments that are missing a computational environment.

Provenance-Based Debugging

Multilingual provenance-based post-mortem debuggers

CaNOP CubeSat

Multispectral imaging nano-satellite with the Wisconsin Space Grant Consortium



Research Assistant

Oracle Labs

May 2023 – Aug 2023 Massachusetts, US
I worked on designing and implementing provenance collection for a large-scale data-processing pipeline.

Collaborative Research Assistant

UBC in collaboration with Oracle Labs

Jun 2021 – Oct 2021 Vancouver, Canada
I contributed to the open-source machine-learning library Tribuo. I added a reproducibility framework, MERIT, which can automatically reproduce any ML model that Tribuo trains with one of its own implementations.

Research Assistant

Harvard University

May 2018 – Aug 2018 Massachusetts, US
I worked on creating tools that help data scientists make use of data provenance collected from the R language. These included R packages that parse provenance and provide useful debugging information to users.

Research Assistant

Wisconsin Space Grant Consortium

Jun 2016 – Sep 2017 Wisconsin, US
I worked on the CaNOP CubeSat for two summers, first as a member and then as team leader of the Command & Data Handling team.