LSH FAIR fellow: Sören Wacker

Bio

Sören Wacker is a Senior Research Engineer at Delft University of Technology, where he develops research platforms and supports FAIR data practices for projects including CropXR. He holds a PhD in Computational Biophysics, with a background spanning molecular simulations, machine learning, and large-scale data infrastructure. At the University of Calgary, he led a data science team managing one of the world's largest microbial multi-omics datasets: 100 TB of genomics, proteomics, and metabolomics data across 35,000 bacterial samples. There he built automated pipelines and developed open-source tools for large-scale data processing and analysis. His technical expertise includes Python, high-performance computing, platform development, and data science. His work centers on streamlining processes through automation and standardization, enabling researchers and teams to focus on scientific discovery rather than data wrangling.

Use Case Title

Harmonizing ISA and MIAPPE for AI-Ready Plant Phenotyping Data.

Use Case Description

ISA (Investigation-Study-Assay) and MIAPPE (Minimum Information About Plant Phenotyping Experiments) are both widely adopted standards in plant science, yet they model experiments differently. ISA is process-centric, tracking material transformations through protocols. MIAPPE is entity-centric, describing biological materials, observation units, and measured variables. This structural discrepancy forces researchers either to choose one framework or to maintain translation layers between them. This project will analyze the gap between the two standards and develop a unified data model in which ISA and MIAPPE work together without translation.

What are the biggest challenges you anticipate facing in your use case over the next months?

The core challenge is structural and conceptual incompatibility. ISA models experiments as material transformations through protocols (process-centric). MIAPPE models them as observations on biological materials within spatial hierarchies such as field, block, plot, and plant (entity-centric). These are fundamentally different ways of organizing the same experimental reality. A second challenge is community alignment. Both standards have active maintainer communities with different priorities, so any proposed harmonization must be acceptable to both and requires engagement with each. Finally, there is ecosystem compatibility. A unified model should work with existing tools (ISA-tools, BrAPI) and validate against both the ISA-JSON schema and the MIAPPE checklist requirements.
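The structural mismatch described above can be illustrated with a minimal Python sketch. All class names, fields, and the `units_to_processes` helper are hypothetical and chosen only for illustration; they are not the ISA-JSON schema, the MIAPPE checklist, or the project's unified model.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


# Process-centric view (ISA-like, simplified assumption): a process applies
# a protocol to input materials and yields outputs.
@dataclass
class Process:
    protocol: str
    inputs: list[str]
    outputs: list[str]


# Entity-centric view (MIAPPE-like, simplified assumption): an observation
# unit sits in a spatial hierarchy and carries observed variable values.
@dataclass
class ObservationUnit:
    unit_id: str
    level: str                      # e.g. "field", "block", "plot", "plant"
    parent_id: Optional[str]
    biological_material: str
    observations: dict[str, float] = field(default_factory=dict)


def units_to_processes(units, protocol="phenotyping"):
    """Re-express each observation on a unit as one hypothetical process step."""
    processes = []
    for unit in units:
        for variable in unit.observations:
            processes.append(
                Process(
                    protocol=protocol,
                    inputs=[unit.unit_id],
                    outputs=[f"{unit.unit_id}/{variable}"],
                )
            )
    return processes


plot = ObservationUnit("plot-1", "plot", "block-1", "accession-A")
plant = ObservationUnit(
    "plant-7", "plant", "plot-1", "accession-A", {"height_cm": 42.0}
)
steps = units_to_processes([plot, plant])
```

In this toy translation, the spatial hierarchy (`level`, `parent_id`) and the biological material have no place in the resulting `Process` objects. That kind of information loss is what a translation layer has to work around and what a unified model would aim to avoid.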

What specific skills or knowledge do you hope to gain through the fellowship programme?

This fellowship offers the opportunity to learn how to extend general metadata frameworks to support domain-specific standards while maintaining interoperability, and how to design schemas that serve both human annotation and machine processing. Other domains, particularly clinical and biodiversity informatics, have addressed similar cross-standard harmonization challenges, and their approaches can inform this work. The fellowship also provides insight into how to build a data platform around a unified data model. Finally, it enables connections to the ISA, MIAPPE, and Dutch data stewardship communities for ongoing collaboration.

What motivated you to apply for this TDCC LSH fellowship?

ResilienceHub is actively being built, so architectural decisions made now will shape the platform for years. The fellowship offers structured learning and access to expertise outside the ResilienceHub consortium that would inform these decisions. Understanding how clinical informatics handles data quality or how biodiversity informatics addresses heterogeneous data would directly apply to our federated context.

In one compelling sentence, why does your project matter?

Developing climate-resilient crops requires AI-driven research on phenotyping data, which is only possible when the underlying metadata standards work together.

Want to connect with Sören? Use LinkedIn or view the research profile on ORCID.