Job Title: Data Mesh Architect / Lead – Centralized Entity Registry & Federated Governance
Role Overview
We are seeking a visionary and technically adept Data Mesh Architect to lead the design and implementation of a federated data architecture. This role will focus on establishing a centralized entity registry of Core Data Entities (CDEs), enabling domain-level flexibility while enforcing cross-domain data contracts and governance. The ideal candidate will have deep experience in data mesh principles, cloud-native data platforms, and user-centric design, especially for scientific users such as biologists.
Key Responsibilities
1. Centralized Entity Registry & CDEs
* Architect and implement a centralized registry that acts as a metadata catalogue and semantic layer.
* Define and enforce Core Data Entities (CDEs) that standardize identity and lineage across domains.
* Enable domain-specific data products to register via CDE contracts while maintaining autonomy.
2. Data Product Ownership & Federation
* Collaborate with domain teams to identify initial data domains and guide them in creating compliant data products.
* Promote federated ownership and stewardship models across scientific and technical domains.
* Support the transition from monolithic views to modular, domain-driven data products.
3. Governance Framework
* Establish federated governance policies for data quality, lineage tracking, access control, and compliance.
* Facilitate steering committee discussions to align CDE definitions across multi-domain products.
* Implement scalable governance workflows that support AI-readiness and research acceleration.
4. Technical Integration
* Integrate with existing platforms including AWS S3, Snowflake, and Palantir Foundry.
* Design interoperability between the centralized registry and domain-specific catalogues and systems.
* Ensure compatibility with orchestration frameworks and metadata layers (e.g., AWS Glue, DataZone).
5. Business Outcomes
* Drive faster research cycles through scalable data governance and discoverability.
* Enable AI-ready data environments that support analytics, modelling, and scientific workflows.
* Improve cross-domain collaboration and data reuse through standardized metadata and lineage.
6. User Experience & Adoption
* Address user acceptance challenges, especially among biologists using GraphPad Prism.
* Design workflows that allow seamless file-based access and integration with scientific tools.
* Advocate for intuitive interfaces and minimal disruption to existing research practices.
Required Skills & Experience
* 8+ years in data architecture, data engineering, or data governance roles.
* Proven experience with data mesh, metadata catalogues, and semantic modelling.
* Hands-on expertise in AWS (S3, Glue, Athena), Snowflake, and Palantir Foundry.
* Familiarity with FAIR principles, data lineage, and domain-driven design.
* Experience working with scientific data modalities and tools like GraphPad Prism.
* Strong communication and stakeholder management skills.
Preferred Qualifications
* Experience in pharma/biotech or scientific informatics domains.
* Exposure to tools like Denodo, Databricks, Airflow, and Azure Synapse.
* Understanding of user-centric design for scientific workflows.