Enhancing CPS Data Reproducibility
This interactive tutorial walks participants through taking an existing validation set and publishing it on the Cyber-Physical Systems Virtual Organization, with both static and active ways for others to explore it. The tutorial also covers issues of reproducibility and guides participants through approaches for maximizing the impact of data gathered in the course of research in cyber-physical systems.
The tutorial is divided into (1) a morning session that empowers data owners to publish their data and validation algorithms, with their workshop peers running the validation algorithms on their own computers, and (2) an afternoon session where participants learn how to take publicly available data and create validation paths that execute online, or use Docker and other technologies to reproduce the validation results. Participants can take part in the morning session, the afternoon session, or both. A full agenda will be released at a later date.
At the conclusion of the tutorial, researchers will better understand how to make their results easier for others to access and replicate.
The Cyber-Physical Systems Virtual Organization (CPS-VO, https://cps-vo.org/) is a broad community of interest for CPS researchers and developers. The CPS-VO includes institutions from academia and industry, and people who work in a wide range of related disciplines with different approaches, methods, tools, and experimental platforms. Through this tutorial, the CPS-VO will empower researchers to more effectively disseminate the data that drive their discoveries and the software that reproduces their research results.
The CPS community needs more access to data repositories as convergence research continues to accelerate discovery. Data repositories in the computer vision and pattern recognition communities have served as reliable benchmarks for new and updated algorithms, but those communities tend to share a common set of problems whose solutions can be measured with the same data. Three motivations explain the CPS community's need for data:
- Reproducibility in model-based and data-driven CPS research: CPS architectures increasingly incorporate Learning-Enabled Components (LECs). Consequently, the models, the datasets used to train them, and the models of their training processes are all essential for the validation, evaluation, and exploitation of research results.
- Translatability is key: Members of the CPS community bridge application domains, so the results of validation experiments must be disseminated in a form that experts in other application domains can explore. Examples from other CPS-relevant application domains are equally important, because they show how the data from those validation experiments provide evidence of success. As more researchers depend on data to train models for their CPS applications, datasets must provide some kind of ground truth.
- Ease of publication will facilitate dissemination: A key goal of the CPS-VO data architecture is to support several use cases for researchers. This tutorial builds on a previous tutorial, “Publishing Your Validation Data to the CPS-VO”, offered at CPSWeek 2022, and supports researchers who want to build interactive data exploration tools on the VO.