Data mesh in structure-based drug design: A transformative approach

23 May 2025

Unlocking the power of decentralised data for drug discovery success

by Dr. Neil Taylor

In this article, we take a look at how the adoption of data mesh principles is reshaping structure-based drug design (SBDD), helping pharmaceutical companies streamline discovery, empower scientists, and accelerate innovation.

Data mesh architecture represents a paradigm shift in how pharmaceutical companies manage and leverage their data for structure-based drug design (SBDD). It is a decentralised approach to data management that addresses many of the challenges faced by traditional centralised systems, particularly in the complex, data-intensive field of drug discovery. DesertSci was an early adopter of this decentralised, domain-oriented data strategy, and it has been at the core of our software architecture for more than 20 years.

Structure-based drug design relies on understanding the three-dimensional structures of biological targets and how potential drug molecules might interact with them. This process generates enormous volumes of heterogeneous data from multiple sources, including crystallography, cryo-EM, molecular dynamics simulations, virtual screening campaigns, and experimental assays. Traditional, centralised data architectures struggle to handle this complexity efficiently, particularly when it comes to exploiting high-value 3D protein structure data.

Proasis, DesertSci’s flagship product, offers significant advantages for managing this complicated data landscape by applying four fundamental data mesh principles:

  1. Enabling domain-oriented ownership
  2. Treating data as a product
  3. Providing a self-service data platform
  4. Implementing federated governance

This type of data mesh architecture transforms SBDD workflows. It aligns perfectly with the multidisciplinary nature of drug discovery, where computational chemists, structural biologists, medicinal chemists, and pharmacologists must collaborate effectively as data producers and data consumers across various domains.

Key benefits of data mesh in structure-based drug design

1. Empowering domain experts as data producers

The data mesh approach enables domain experts, who have an in-depth understanding of their own data, to manage and curate their datasets. For example, structural biologists maintain protein structure data; computational chemists oversee docking results; medicinal chemists manage SAR data. Crucially, this decentralised ownership ensures higher data quality, appropriate context, and easier discoverability for all data consumers across the organisation.
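
As a rough illustration of domain ownership and discoverability, the following minimal Python sketch models a shared catalogue in which each data product records the domain team that curates it. The names `DataProduct` and `Catalogue`, along with the example entries, are hypothetical and chosen only for illustration; they do not describe Proasis or any DesertSci API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProduct:
    """A dataset published and curated by one domain team (hypothetical)."""
    name: str
    owner_domain: str
    description: str
    tags: frozenset = field(default_factory=frozenset)


class Catalogue:
    """In-memory registry that makes published products discoverable."""

    def __init__(self):
        self._products = {}

    def publish(self, product):
        # Each product name is registered once, by its owning domain.
        if product.name in self._products:
            raise ValueError(f"{product.name!r} is already published")
        self._products[product.name] = product

    def find_by_tag(self, tag):
        # Consumers search across all domains without knowing the owner.
        matches = [p for p in self._products.values() if tag in p.tags]
        return sorted(matches, key=lambda p: p.name)


# Illustrative entries mirroring the three domains mentioned in the text.
catalogue = Catalogue()
catalogue.publish(DataProduct(
    name="protein-structures",
    owner_domain="structural-biology",
    description="Curated 3D structures of targets",
    tags=frozenset({"structure", "target"}),
))
catalogue.publish(DataProduct(
    name="docking-results",
    owner_domain="computational-chemistry",
    description="Docking poses and scores",
    tags=frozenset({"structure", "screening"}),
))
catalogue.publish(DataProduct(
    name="sar-tables",
    owner_domain="medicinal-chemistry",
    description="Structure-activity relationship data",
    tags=frozenset({"assay"}),
))

for product in catalogue.find_by_tag("structure"):
    print(product.name, "owned by", product.owner_domain)
```

In this sketch the catalogue is shared, but each entry stays tied to the team that understands it, which is the essence of decentralised ownership with central discoverability.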

2. Accelerating decision-making

By removing the bottleneck of relying on centralised data engineering teams, drug discovery teams can access the data they need via standardised interfaces and APIs within the data pipeline. This immediacy speeds up the iterative SBDD cycle – from structure determination and compound design to synthesis and testing – improving project timelines and outcomes.

3. Scaling effectively with growth

As organisations grow their compound libraries, expand screening capacities, or adopt new computational methods, data mesh prevents the bottlenecks typical in centralised systems and allows the data pipeline to scale naturally and flexibly as demand increases.

4. Enhancing governance while maintaining agility

Lastly, federated governance within a data mesh ensures that data standards and compliance requirements are met, while also allowing domain teams to innovate and evolve their data products to meet project-specific needs.
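
One way to picture federated governance is as a small set of organisation-wide checks combined with rules that each domain defines for itself. The Python sketch below assumes a hypothetical metadata dictionary per data product; the required field names and the docking-specific rule are invented for illustration and are not taken from any real standard or product.

```python
# Organisation-wide rule (assumed): every data product carries these keys.
GLOBAL_REQUIRED_FIELDS = {"owner_domain", "licence", "version"}


def global_check(metadata):
    """Central governance: report any globally required field that is missing."""
    missing = sorted(GLOBAL_REQUIRED_FIELDS - set(metadata))
    return [f"missing required field: {name}" for name in missing]


def docking_domain_check(metadata):
    """Domain rule (hypothetical) owned by the computational chemistry team."""
    if "scoring_function" not in metadata:
        return ["docking products must name a scoring function"]
    return []


def validate(metadata, domain_checks):
    """Apply the global rules first, then each domain-specific rule."""
    problems = global_check(metadata)
    for check in domain_checks:
        problems.extend(check(metadata))
    return problems


# Example: a docking product missing one global and one domain field.
metadata = {"owner_domain": "computational-chemistry", "version": "1.0"}
problems = validate(metadata, [docking_domain_check])
for problem in problems:
    print(problem)
```

The global rules stay small and stable, while each domain can add or change its own checks without waiting on a central team.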

Transforming SBDD workflows across the drug discovery pipeline

Throughout the drug discovery process, data mesh principles provide significant improvements:

  • Target identification and validation: Structural biology teams create data products combining protein sequences, structures, binding site analyses, and druggability assessments. These are readily accessible to other teams, accelerating the path from target identification to lead discovery.
  • Virtual screening campaigns: Computational chemistry groups manage extensive docking results as data products, with complete metadata regarding methods, scoring functions, and confidence metrics. Medicinal chemists, acting as data consumers, can access these results effortlessly to inform synthesis decisions.
  • Lead optimisation: Integrated data products draw together structural insights, computational predictions, and experimental data, enabling domain teams to maintain ownership while supporting seamless, cross-functional collaboration through the data pipeline.
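
To illustrate the virtual screening example above, the following Python sketch shows a docking data product that carries its method and scoring-function metadata alongside the results, so that a consumer can shortlist compounds by confidence. All class names, field names, identifiers, and numeric values are made up for illustration, including the assumptions that lower scores are better and that confidence runs from 0 to 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DockingResult:
    compound_id: str
    score: float       # assumption: lower (more negative) is better
    confidence: float  # assumption: scale from 0.0 to 1.0


@dataclass(frozen=True)
class DockingDataProduct:
    """Docking results published together with their provenance metadata."""
    target: str
    method: str
    scoring_function: str
    results: tuple

    def shortlist(self, min_confidence, top_n):
        # Keep only results the producer marked as sufficiently reliable,
        # then rank the remainder by score.
        trusted = [r for r in self.results if r.confidence >= min_confidence]
        return sorted(trusted, key=lambda r: r.score)[:top_n]


# Invented example data; not real screening results.
product = DockingDataProduct(
    target="EXAMPLE-TARGET",
    method="example-docking-protocol",
    scoring_function="example-score-v1",
    results=(
        DockingResult("CMPD-001", -9.1, 0.90),
        DockingResult("CMPD-002", -10.4, 0.40),
        DockingResult("CMPD-003", -8.2, 0.80),
        DockingResult("CMPD-004", -7.5, 0.95),
    ),
)

for result in product.shortlist(min_confidence=0.7, top_n=2):
    print(result.compound_id, result.score)
```

Because the metadata travels with the results, a medicinal chemist consuming this product can see which method and scoring function produced a ranking before acting on it.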

Beyond enhancing current workflows, data mesh addresses one of the pharmaceutical industry’s biggest challenges: effectively leveraging historical data for new projects. Instead of being hidden away in siloed archives, past screening results, SAR data, and structural analyses become well-documented, easily discoverable data products, maximising the value of accumulated institutional knowledge.

As pharmaceutical companies increasingly turn to AI and machine learning to drive drug discovery, having a solid data mesh foundation ensures that clean, contextual, accessible data flows through the data pipeline – precisely the environment needed for successful AI applications.

While the transition to a data mesh approach requires strategic foresight, the rewards are substantial: faster discovery cycles, better decision-making, and stronger collaboration across disciplines – all giving pharmaceutical companies a vital competitive edge.

At DesertSci, we have long recognised the transformative power of decentralised data strategies in drug discovery. By embedding data mesh principles into solutions such as Proasis, we enable pharmaceutical organisations to streamline their structure-based drug design (SBDD) workflows, accelerate innovation, and fully leverage the expertise of their cross-functional teams. With over two decades of experience pioneering domain-oriented data platforms, DesertSci continues to support our partners in building the flexible, scalable data pipelines needed for the next generation of breakthroughs.

Dr. Neil Taylor, founder of DesertSci, is a leading expert in applying data mesh architecture to structure-based drug design – connect with him on LinkedIn to learn more.
