A step-by-step overview of how to use the repository.
Overview
Time estimation: 40 minutes
Version: main
Last update: 2025-12-10
Questions:
- How do I start looking for a Data Management solution?
- What are the benefits of a cross-domain repository?
- How can a Data Management solution be extended using APIs and other platforms?

Objectives:
- Understand the purpose and scope of the Data Management Platforms repository.
- Learn how to navigate and interpret the listed platforms.
- Contribute new or updated information to the repository.
- Know the repository's limits and what to do about them.
Table of Contents

In this tutorial we will deal with:
- Scope
- Prerequisites
- Setting-up a Data Management Solution, a short overview
- Introduction to the Data Management Platform Repository
- Hands-on: Using the Repository
- Step 1: Access the Repository
- Step 2: Understand the Main Goals
- Step 3: Using the Repository
- Hands On: Find a commercial DM platform for a domain
- Answer
- Hands On: Find details about a specific entry
- Answer
- details
- Hands On: Find details about a specific entry
- Answer
- Step 4: Generalists vs Specialists
- Hands On: Find one generalist platform and a specialist one
- Answer
- Hands On: How could you combine 2 platforms
- Answer
- Step 5: Considering Set-up Options
- Step 6: Be Aware of What’s NOT Included
- warning
- Step 7: Next Technical Steps
- Step 8: Contribute to the Repository
- Conclusion
This tutorial is not a comprehensive guide to Data Management, but a “been-there-done-that” introduction, followed by a walkthrough of the DeKCD registry of FAIR data management platforms for the Cloud. It focuses mostly on the technical side of Data Management.
To follow this tutorial:
For a good use of the repository:
The repository is primarily a technical overview, and this section is therefore scoped to the technical aspects.
Several online resources help with writing a Data Management Plan, making it FAIR, and related aspects, such as RDM Kit (in Life Sciences), FairCookBook, Data Stewardship Wizard, and FairWizard.
The choice of a Data Management Solution is deeply dependent on the given goals and purposes. A solution for supporting a one-year work period could use a simple web application installed using Docker on a local server, while a solution meant for 10 years and several projects might benefit from being on the cloud, using several connected web applications.
Unfortunately there is no silver bullet: the best solution will depend on your specific conditions, such as whether you are part of a consortium proposing some infrastructure, whether your institution offers solutions, and whether you have the means to employ a technical person for the duration of the project.
Some Data Management platforms are easy to set up, easy to update, and easy to use. Some are not, but they might address your needs better. In that case, it is probably good to consider the biggest point of friction (what will cost the most in the long run) as well as the critical points (what cannot be accepted).
One critical point is the inability to apply updates. If your Data Management platform is online, it needs to be updated for security reasons. This may be less critical if the application sits behind a VPN, but that can change and become a bigger problem later.
Points of friction include, from most to least important: ease of use, ease of updating, and ease of setup. A difficult-to-use application might simply be avoided, and users might resort to their own solutions, making future data management really difficult. You still need to be able to set the platform up and update it, though, as a difficult setup might turn an emergency into a long downtime.
Similarly, a combination can be desirable: for instance, one application close to the lab communicating with a sharing platform. But the API communication needs to be secure, robust, and well documented (including on your side of the setup).
Versioning is always a good idea for all your customisations: APIs, templates, parameters, and so on. When a customisation is not easily versionable but is available as text, it is good practice to work on a versioned copy. This works particularly well with a test setup, which is also always good to have: apply changes to the test instance first, and apply them to production only once they are committed to the Version Control System.
Constraints might be funding constraints, data usage constraints due to data privacy policies, but also non-commercial-use-only licences on software or data…
They might be considered as a chain: your Data Management platform needs to pass every element of the chain.
Some constraints might be extremely costly, like adapting a GPL-licensed platform in a commercial environment: the licence can force you to share all changes, which could be a no-go for a private entity. Non-compliance with data privacy rules could result in a fine, and a leak of personal data could be devastating.
Constraints also have strong connections with your means. You should keep some leeway to manage them: the setup should not be so complex that the person(s) managing it can only focus on the necessary technical aspects.
Some decisions related to constraints must also be made before setting up the platform. For instance, if you work on patient-related data and need encrypted storage, you need to choose a platform supporting it.
The Data Management Platforms (DMP) repository is a curated collection of major platforms used in research data management.
It emphasizes the FAIR principles (Findable, Accessible, Interoperable, Reusable) and evaluates how platforms integrate with cloud infrastructure.
This tutorial will guide you through using the repository, understanding its scope, and contributing to it.
The repository highlights platforms with:
Note: While most platforms align with FAIR principles, not all do completely.
The menus are on the top, the left, and the right:
The top menu will typically be used once or twice, to check the About page.

The left menu will be used to select in which category to look for:

The right menu will be used to go to a specific topic and most of the time to a specific domain. Some topics have subtopics that become visible in the menu when in the parent topic.

Hands On: Find a commercial DM platform for a domain
From the Home Page, navigate to the Commercials subtopic of Geomatics in the list of Major Data Management platforms.
Answer
First click on the Major Data Management Platforms link in the left menu.
Then on Geomatics on the right menu, and finally on Commercials in the sub-menu.
Not all entries are identical: some include extra information. But all entries should share a common base, and most entries stick to it:
Hands On: Find details about a specific entry
In the Major Data Management Platforms page, find the details about pyiron, which is in Materials Science. Check whether it is cross-domain.
Answer
Click on Materials Science in the right menu, then scroll down if necessary to find the details about pyiron, including whether it is cross-domain.
The repository provides a search function, powered by the document system Quarto, which is accessed by clicking on the magnifying-glass icon at the top right of the top menu.
In the pop-up that appears, typing should automatically start a search.

Clicking on one result should lead to the paragraph containing the result. It is then possible to use the browser’s “Find in Page” function to reach the actual entry.

details
Depending on the size of the page, there may be two magnifying-glass icons. Both have exactly the same function.
Hands On: Find details about a specific entry
Using the search, look for CIViC.
Answer
Click on the magnifying-glass icon.
Then type CIV in the search bar.
Finally click on the search result.
The result should roughly be in the middle of the page.
The repository also lists, in a simplified format:



As with the Data Management Platforms, these lists do not claim to be exhaustive, and all contributions are welcome.
One main consideration when choosing a Data Management platform is the choice between generalist platforms (i.e. not dedicated to a domain) and specialist platforms, dedicated and adapted to a domain. A specialist platform might still be used in another domain, possibly with some caveats. The last point of each entry explains how cross-domain the entry is.
While choosing a generalist platform might allow you to start working and storing data quickly, a specialised platform might offer some clear benefits: pre-existing metadata and/or ontologies/taxonomies, structured storage of data making reuse and sharing easier, …
Hands On: Find one generalist platform and a specialist one
Using the right menu, look for Generalists. Find a Generalist platform of your choice.
Then select a specific domain to find a Specialist platform.
Answer
Click on Generalists in the right menu.
The section lists the Generalist Data Management Platforms. The next section lists the Authority control platforms, which are generally online and also generalist, such as ORCID. The last list is for platforms that are not exactly Data Management platforms but can be used as one, or be used by another platform; at the time of writing it lists only NextCloud.
NextCloud can be used as a flexible Data Management platform, with the limitation that it is difficult to structure the stored data, or as a storage solution for another platform.
Hands On: How could you combine 2 platforms
Your institution is already using NextCloud as a generic distributed storage facility. Inside a shared folder, you store many large microscopy images. You would like to have an online tool to visualise and work on these images.
Answer
OMERO is an online imaging platform for microscopy outputs, and can be connected to other platforms using APIs. A central login is also possible using a central identity service (but this information is not part of the current registry).
Using both the NextCloud and OMERO APIs, it is possible to bridge the two, for instance with a small script fetching the images from NextCloud and pushing them to OMERO. OMERO also offers specific import scenarios.
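A minimal sketch of such a bridge, assuming NextCloud's standard WebDAV endpoint (`remote.php/dav`) and an installed, logged-in `omero` CLI; the host, user, and file paths are hypothetical:

```python
# Sketch only: bridge NextCloud and OMERO with a small script.
# Assumes NextCloud's standard WebDAV endpoint and the `omero import` CLI.
# All host names, users, and paths below are hypothetical examples.
import subprocess
import urllib.parse
import urllib.request


def nextcloud_webdav_url(host, user, remote_path):
    """Build the WebDAV download URL for a file in a user's NextCloud files."""
    quoted = urllib.parse.quote(remote_path.lstrip("/"))
    return f"https://{host}/remote.php/dav/files/{user}/{quoted}"


def fetch_and_import(host, user, password, remote_path, local_path):
    """Download one image from NextCloud, then hand it to `omero import`."""
    url = nextcloud_webdav_url(host, user, remote_path)
    mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    mgr.add_password(None, url, user, password)
    opener = urllib.request.build_opener(urllib.request.HTTPBasicAuthHandler(mgr))
    with opener.open(url) as resp, open(local_path, "wb") as out:
        out.write(resp.read())
    # The OMERO CLI handles format detection and the server-side import.
    subprocess.run(["omero", "import", local_path], check=True)


# Example call (requires live NextCloud and OMERO instances):
# fetch_and_import("cloud.example.org", "alice", "app-password",
#                  "/Microscopy/image01.tiff", "image01.tiff")
```

In practice you would schedule such a script (or trigger it on upload) and add error handling; the point is that two platforms with documented APIs can be glued together with very little code.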
While an entry does not give a detailed explanation of how to set up a platform, it will state whether a containerized setup exists, as well as whether a Kubernetes integration is available.
Containerized platforms are generally easier to set up and should be easier to maintain and update. But it is important to check thoroughly how good this support is: whether the image is updated regularly, and whether an update process exists.
An integration with Kubernetes might allow easier continuous operation, with Kubernetes taking care of the lifecycle of the platform. But the setup will generally be more complex as well.
In both cases, it is important to know where the data will be stored: both approaches are based on images, so the data will be stored externally (most probably in volumes in both cases). The data should be securely stored and backed up regularly.
Some platforms will come with a Docker Compose file or a Helm chart, in which case the storage might be set up as part of the configuration, though generally not the backup.
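For illustration only (the service name, image, and paths are made up; real compose files vary per platform), a named volume is typically where the data ends up:

```yaml
# Illustrative compose sketch for a hypothetical platform: the application
# data lives in the named volume "dm-data", which you still need to back up.
services:
  dm-platform:
    image: example/dm-platform:1.0      # hypothetical image
    ports:
      - "8080:8080"
    volumes:
      - dm-data:/var/lib/dm-platform    # the platform's data directory
volumes:
  dm-data:
```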
A lack of a containerized setup does not mean that the platform will be difficult to set up. But with a standard setup, it will always be necessary to consult the installation documentation and follow the given procedure.
Finally, each entry tries to state whether API access exists, enabling interoperability and data extraction.
Adding API access is not a simple task, so the need for one should be clarified before choosing a platform. This access can be via an HTTP REST API (generally easy to use), a CLI (which will generally be non-standard), or a language API (Python, Java, …). Interfacing using the same language will be easy, probably easier than through a REST API (or very similar in the case of Python), but interfacing from a different language might be difficult.
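As a hedged illustration, a REST call usually amounts to building an authenticated HTTP request; the base URL, endpoint, and token scheme below are invented for the example, so consult your platform's API documentation for the real ones:

```python
# Sketch only: a generic authenticated GET against a platform's REST API.
# The URL, endpoint, and bearer-token scheme are hypothetical examples.
import urllib.request


def build_request(base_url, endpoint, token):
    """Prepare an authenticated GET request expecting a JSON response."""
    url = f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


req = build_request("https://dm.example.org/api/v1", "/datasets", "MY_TOKEN")
# urllib.request.urlopen(req) would perform the call; the response body can
# then be decoded with json.loads(response.read()).
```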
These technical aspects can also pose a security risk: a platform that is hard to update, an API that is too open, or a setup so complex that it leaves openings. In some cases simpler is better, making it possible to run the platform with a reasonable level of security. If IT support is available, it is also advisable to involve them early.
The repository does not provide:
warning
Pitfall: modifying open-source software may block future updates. It is advisable to prefer configuring over modifying.
A Quickstart guide for setting up a Data Management platform is planned:
You can contribute via GitHub; the link is also provided in the top header with the GitHub logo:
A typical entry should include:
* [Name, quick description](https:URL of main web page)
+ Description and/or link to a potential Docker image/Docker Compose file/Kubernetes manifest/…
+ Link to API(s), with a quick note on how well documented they are.
+ Interoperability **NONE/LOW/MEDIUM/HIGH** / No interoperability: explanation on why.
+ **Not/Partly/Mostly/Fully** cross-domain: explanation on why.
You now know how to use and navigate the Data Management Platforms repository.
It is a living resource focused on FAIR principles and cloud compatibility, and it depends on community contributions to grow and remain current.
For questions or contributions, contact:
Alain Becam – Alain.Becam@bioquant.uni-heidelberg.de
Project site: https://datenkompetenz.cloud/
Key Points
The repository lists cross-domain Data Management platforms with a FAIR and cloud focus.
It does not replace a full Data Management Plan.
Platforms can be generalist or specialist, depending on needs.
Contributions are welcome via GitHub issues and pull requests.
Contributions
Author(s): Alain Becam
Editor(s): AB
Supported by: