Data Workspaces (DWS) is an open-source framework for maintaining the state of a (data) science project. It originated at the Max Planck Institute for Software Systems in Kaiserslautern, Germany. DWS was motivated by pains we experienced on various (data) science projects: difficulty tracking all the different parts of a project, inability to explain why results changed from day to day, challenges in collaboration, etc.
The goals of the project are to:
- Capture all the parts of a project in one place, including data sets, intermediate data, results, and code.
- Support collaboration by sharing code and data through synchronization with any Git-based service (GitHub, GitLab, Bitbucket, etc.).
- Automatically track experimental results and the lineage (history) of data used in experiments.
- Provide an integrated set of reporting and analysis tools for working with (data) science projects.
For more information, see the Overview.