Workshop on FAIR for Physical Objects – Summary and Conclusions
Author: Esther Plomp
On 1 March 2023, the FAIR workshop for physical objects was organised by the Thematic Digital Competence Centre for the Natural & Engineering Sciences (TDCC NES). Thanks to Jacquelijn Ringersma, the current Network Manager for TDCC NES, Mira Stanic (Community Coordinator) and Zita Bernhoeft (Project Support Officer), a group of 30 people gathered to discuss anything related to physical object management.
After introductions by all the attendees, the workshop kicked off with a presentation on ISRIC by Stephan Mantel and Luís Duque Moreira de Sous. ISRIC is a custodian of global soil information. For example, 75% of their soil materials are reference samples that are well documented and can be used for research and education purposes. Once soil profiles are collected, they are photographed and tagged with a radio-frequency identification (RFID) chip. All profiles have a unique identifier following ISO standards. ISRIC also houses thin section collections from other institutes that can no longer maintain them.
Some of the challenges identified by ISRIC include:
- Gaps caused by incomplete acquisition procedures
- Information sources for parts of the collection are partial and not integrated
- Accountability/traceability is limited
- Currently not possible to include attributions in the metadata
ISRIC also maintains a catalogue of spatial datasets (implementing the ISO 19139 standard for spatial metadata, alongside Dublin Core). For each metadata record a DOI is created (for example, the museum monolith collection is part of this database). In addition to addressing the challenges described above, ISRIC is currently working on selecting a suitable metadata ontology, adopting a semantic web approach and setting up a new URL policy.
Next, Wouter Addink introduced the Distributed System of Scientific Collections (DiSSCo). DiSSCo’s goal is to digitally unify the fragmented landscape of scientific collections into one collection with common curation, access policies and practices, following the FAIR principles (Findable, Accessible, Interoperable and Reusable).
DiSSCo places the digital specimen at the centre, focusing on accurate information about the digital specimen and any information derived from the specimen. They want to adopt Digital Object Architecture (in connection with the FAIR Digital Objects Forum). All objects should be FAIR: they should have a persistent identifier and metadata. Information should be retrievable and traceable, and should also retain the annotation history (interpretations are part of the scientific and historical record). They have worked on minimum information levels. DiSSCo uses the Handle System internally, but they also use DOIs because these are better known and more widely accepted.
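As a rough illustration of what "a persistent identifier, metadata and a retained annotation history" could look like in practice, here is a minimal Python sketch. It is not DiSSCo's data model: the class names, fields and the example identifier are all hypothetical, and the only idea it tries to capture is that annotations are appended to a record rather than overwriting earlier interpretations.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Annotation:
    """One interpretation of a specimen; frozen so it cannot be edited later."""
    author: str
    text: str
    created: datetime


@dataclass
class DigitalSpecimen:
    """Hypothetical record: a persistent identifier, metadata and annotations."""
    pid: str                      # persistent identifier (made-up example below)
    metadata: dict
    annotations: list = field(default_factory=list)

    def annotate(self, author: str, text: str) -> Annotation:
        # Append-only: earlier interpretations stay in the record.
        note = Annotation(author, text, datetime.now(timezone.utc))
        self.annotations.append(note)
        return note

    def history(self) -> list:
        # Annotations in the order they were added.
        return [(a.author, a.text) for a in self.annotations]


specimen = DigitalSpecimen(
    pid="20.5000.9999/EXAMPLE-001",   # hypothetical Handle-style identifier
    metadata={"name": "example specimen", "collection": "example collection"},
)
specimen.annotate("curator-a", "Initial identification.")
specimen.annotate("curator-b", "Revised identification after re-examination.")
print(specimen.history())
```

Keeping both annotations, instead of replacing the first with the second, is what makes the interpretation history traceable.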
Digitally linked specimens enable experts to work collaboratively on the collections and to build more easily on each other’s efforts. They also facilitate data citation and attribution, and enable AI services that can speed up digitisation.
Then, Simone Kortekaas from the National Library of the Netherlands (KB) provided a library perspective on FAIR for physical objects. She highlighted the work by WorldCat, a global catalogue for library materials. In terms of persistent identifiers, they use barcodes/RFID tags for the physical objects, and they employ international standards such as ISBN, ISSN and ISNI. The shared cataloguing means that metadata can be reused.
The National Library (KB) has a legal task to preserve materials published in and about the Netherlands. They preserve all these materials in a collection that needs to be protected, and information about restoration needs to be added to the metadata. Books may not only be read, but may also be the subject of other types of research, which requires a preserved physical item.
Next, Esther Plomp from TU Delft gave a presentation about the work of the Research Data Alliance (RDA) on physical samples. This work is primarily done by the Interest Group on Physical Samples and Collections in the Research Data Ecosystem, which recognises the need for FAIR samples so that physical samples are reusable and reproducible, and so that work around physical samples can receive recognition. Some of the objectives of this interest group are to identify existing systems and challenges to linking physical samples with digital data, align different stakeholders, and facilitate international cooperation on the topic of physical samples.
The interest group has worked on multiple topics in the past couple of years, of which Esther highlighted four specifically:
1. IGSN (International Generic Sample Number), which partnered with DataCite so that IGSNs can be registered via DataCite.
2. Collaboration with Earth Science Information Partners (ESIP) on sample citation recommendations (as part of their Physical sample cluster).
3. RDA & ESIP Physical Samples Webinar Series (2021) on electronic lab notebooks, persistent identifiers in the biomedical literature and interdisciplinary metadata.
4. 23 Things for Physical Samples: an introductory output for anyone who wants to get started with physical sample management.
After a short break, Madeleine de Smaele presented on the work done by 4TU.ResearchData, an international data repository for science, engineering and design. 4TU.ResearchData was founded in 2010, and it is a trusted repository with CoreTrustSeal certification. The repository is about to move from a commercial provider (Figshare) to an open source solution developed in-house, with an open data model based on RDF.
There is also the 4TU.ResearchData community, with several working groups focusing on more specific data topics.
4TU.ResearchData normally assigns DOIs to datasets and is now looking at how to assign IGSNs via DataCite. A benefit of using IGSNs is that it is possible to add a RelatedIdentifier that describes the relationships between the physical sample and related sources (subsamples, publications, etc.). An important point to consider, however, is that the DataCite DOI policy requires you to only register DOIs for content that you’re responsible for. That would mean that the repository can only register DOIs for samples within the institute.
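To make the RelatedIdentifier idea more concrete, the following Python sketch builds a metadata record for a physical sample that points to a subsample and to a publication. It is only an illustration, not the 4TU.ResearchData implementation: the property names are loosely modelled on the DataCite Metadata Schema (relatedIdentifier, relatedIdentifierType, relationType, resourceTypeGeneral), the top-level "identifier" key and the helper function are hypothetical, and every identifier and name in the example is made up.

```python
import json


def build_sample_record(identifier, title, publisher, year, related):
    """Return a DataCite-style metadata dict for a physical sample.

    `related` is a list of (identifier, identifier type, relation type) tuples.
    """
    return {
        "identifier": identifier,  # hypothetical key for the sample's own PID
        "titles": [{"title": title}],
        "publisher": publisher,
        "publicationYear": year,
        "types": {"resourceTypeGeneral": "PhysicalObject"},
        "relatedIdentifiers": [
            {
                "relatedIdentifier": rel_id,
                "relatedIdentifierType": id_type,
                "relationType": relation,
            }
            for rel_id, id_type, relation in related
        ],
    }


record = build_sample_record(
    identifier="10.99999/example-sample-001",        # made-up identifier
    title="Example rock sample",
    publisher="Example Institute",
    year=2023,
    related=[
        # The sample has a subsample that carries its own identifier.
        ("10.99999/example-sample-001a", "DOI", "HasPart"),
        # The sample is cited by a publication.
        ("10.99999/example-article", "DOI", "IsCitedBy"),
    ],
)
print(json.dumps(record, indent=2))
```

Because the relationships live in the sample's own metadata, anyone resolving the sample identifier can follow the links to its subsamples and to the publications that used it.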
After the presentations the group split into two discussion groups. Some of the topics that were brought up included:
- When is it enough? Is it necessary to apply FAIR to physical objects? Who should decide this?
- It is difficult to get community agreement on what needs to be preserved, what information needs to accompany it and what needs to be shared.
- FAIR for physical samples is especially important when destructive analysis methods are applied and objects are transformed (for example, restoration of paintings).
The conclusions of the discussion pointed in two opposite directions:
- On one side, it was felt that we should really focus on the physical object and develop a standardised method for making these objects Findable and Accessible, which would be a good step towards Reusable. We agreed that Interoperable would be hard to achieve. Focusing on the physical object itself implies that less focus should be placed on the derived Digital Object (or Twin).
- On the other side, the key to making physical objects FAIR was seen as applying the FAIR principles to their digital surrogates. This followed from a consensus that a 1:1 relationship between the object and the digital specimen is necessary. A need was identified for shared guidelines on how the objects are described in the digitisation process, for example the type of object (exhaustible samples, reproducible objects, etc.) or its lifecycle state (active, archived, split, etc.). At the same time, a strong focus on the quality and longevity of metadata, as well as mapping between ontologies, was discussed as key towards FAIRness of digital specimens.
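The second direction can be sketched in a few lines of Python: a digital surrogate records the type and lifecycle state of the object it describes, and a simple check flags physical objects that have more than one surrogate (a break of the 1:1 relationship). This is only a sketch under assumptions. The enum values come from the examples mentioned in the discussion, while the class, field and function names and the example identifiers are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ObjectType(Enum):
    EXHAUSTIBLE_SAMPLE = "exhaustible sample"
    REPRODUCIBLE_OBJECT = "reproducible object"


class LifecycleState(Enum):
    ACTIVE = "active"
    ARCHIVED = "archived"
    SPLIT = "split"


@dataclass(frozen=True)
class DigitalSurrogate:
    """Hypothetical digital description of one physical object."""
    surrogate_id: str
    object_id: str            # identifier of the physical object described
    object_type: ObjectType
    state: LifecycleState


def one_to_one_violations(surrogates):
    """Return ids of physical objects described by more than one surrogate."""
    counts = Counter(s.object_id for s in surrogates)
    return sorted(oid for oid, n in counts.items() if n > 1)


surrogates = [
    DigitalSurrogate("ds-1", "rock-7", ObjectType.EXHAUSTIBLE_SAMPLE, LifecycleState.SPLIT),
    DigitalSurrogate("ds-2", "rock-7", ObjectType.EXHAUSTIBLE_SAMPLE, LifecycleState.ACTIVE),
    DigitalSurrogate("ds-3", "cast-2", ObjectType.REPRODUCIBLE_OBJECT, LifecycleState.ARCHIVED),
]
print(one_to_one_violations(surrogates))
```

In this made-up example, "rock-7" is reported because two surrogates point to it, which is the kind of ambiguity shared description guidelines would aim to prevent.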