As health research increasingly depends on vast amounts of data, organisations face mounting pressure to manage this data efficiently, securely, and sustainably. Professor Cathie Sudlow’s recent independent review of the UK’s health data landscape, “Uniting the UK’s Health Data: A Huge Opportunity for Society”, underscores the critical challenges and opportunities within this space. Among the most significant barriers identified in the report are the underfunding and undervaluation of data management and sustainability, alongside inefficiencies in accessing and curating health data.
These challenges present an urgent call for innovation—and Secure Data Environments (SDEs) are emerging as a leading solution. By addressing the inefficiencies of traditional data management models, SDEs offer a cost-effective and scalable approach that unlocks the potential of health data while significantly reducing the total cost of ownership (TCO). This blog explores how SDEs deliver cost savings compared to traditional Full-Time Equivalent (FTE) staffing and DIY research cloud or High-Performance Computing (HPC) systems, highlighting the critical insights from Sudlow’s report and their implications for researchers, funders, and policymakers.
Professor Sudlow’s report makes a compelling case for greater investment in health data infrastructure, pointing out that research projects and clinical activities often fail to adequately budget for the critical components of data management and sustainability. This oversight leads to delays, inefficiencies, and missed opportunities for meaningful research. For example:
Adding to these challenges is the need for sustainable funding mechanisms that recognise data management as a core component of research. A model that mirrors open access publishing fees could be established, where data management costs are treated as a per-project or per-participant expense, explicitly included in research budgets. Such costs could ensure the secure storage, curation, and governance of data, much like approved costs for animal or cellular research are routinely funded. This approach would enable a proactive and equitable allocation of resources for SDEs across all projects.
Secure Data Environments (SDEs) provide a transformative approach to addressing these challenges. These platforms enable researchers to access and analyse de-identified health data in a secure, controlled environment, ensuring compliance with data privacy standards while fostering innovation. More importantly, they offer significant cost advantages over traditional methods.
Traditional data management systems often depend on large teams of data stewards, analysts, and IT staff to manually curate, clean, and govern data. This labour-intensive approach is not only slow but also expensive. By contrast, SDEs streamline these workflows through automation and built-in governance mechanisms, significantly reducing the need for manual intervention.
High-Performance Computing (HPC) systems, while powerful, carry substantial upfront capital investment and ongoing maintenance costs, whether you build and maintain your own HPC or use public or private facilities. HPC systems are traditionally designed for non-sensitive research workloads, such as astronomy simulations or materials chemistry modelling, where strict data governance and privacy controls are not required.
For sensitive health data, HPC environments often fall short in key areas:
These limitations can expose health data to security and compliance risks that outweigh the benefits of raw compute power.
For comparison, Secure Data Environments (SDEs) are designed specifically for sensitive data, with built-in capabilities for:
SDEs, by addressing these challenges, provide a flexible, scalable, and compliant alternative to traditional HPC systems while ensuring the highest standards of data security and governance.
HPC capacity is also easy to over-provision. Many research projects overestimate their computational needs, resulting in underutilised HPC resources. SDEs offer a more flexible, on-demand model, allowing researchers to scale computing power as needed and avoid unnecessary expenditure.
When considering the TCO of data management solutions, SDEs consistently outperform traditional approaches. Factors contributing to their cost-effectiveness include:
A case study within the Sudlow report emphasises the time and cost savings achieved when researchers could securely access linked datasets in real time—an efficiency that is far from standard practice in traditional systems. We’ve previously blogged on this issue in detail (The Economics of Building and Running a Trusted Research Environment; Building Versus Buying a TRE).
In addition to their cost-effectiveness, SDEs are powerful enablers of federated research. Federation allows data stored in different SDEs to remain in place while still being securely accessed and analysed across multiple sites. This approach eliminates the need for data duplication, enhances collaboration, and significantly reduces data transfer costs.
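To make the idea of data "remaining in place" concrete, here is a minimal sketch of a federated calculation, assuming each site is willing to release only aggregate summaries. It is not Aridhia's federated node API: the names `SiteSummary`, `local_summary`, and `pooled_mean` are hypothetical, and the values are synthetic.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate released by one site; no row-level data leaves the site."""
    n: int
    total: float


def local_summary(values: List[float]) -> SiteSummary:
    # Runs inside each site's own environment (hypothetical helper).
    return SiteSummary(n=len(values), total=sum(values))


def pooled_mean(summaries: List[SiteSummary]) -> float:
    # Runs at the coordinating site, which only ever sees counts and sums.
    n = sum(s.n for s in summaries)
    if n == 0:
        raise ValueError("no observations reported by any site")
    return sum(s.total for s in summaries) / n


# Synthetic measurements held separately at three illustrative sites.
site_data: Dict[str, List[float]] = {
    "site_a": [5.0, 7.0],
    "site_b": [6.0],
    "site_c": [4.0, 8.0, 6.0],
}

summaries = [local_summary(values) for values in site_data.values()]
print(pooled_mean(summaries))  # 6.0
```

Only two numbers per site cross a boundary here, which is why federation can reduce both duplication and transfer costs. A real deployment would add governance steps this sketch omits, such as approval of each query and checks on small counts.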
Aridhia has taken a leadership role in advancing federation by open-sourcing our federated node technology. This initiative reflects our commitment to open science and empowering health and research communities to conduct secure, collaborative analytics on a global scale. By enabling seamless connections between SDEs, we are helping researchers unlock new insights while maintaining the highest standards of data privacy and security.
Consider a mid-sized research project involving 50-100 TB of data distributed across 3-5 sites. Analysis is performed by a 5-10 person research team over 12 months.
Assumptions: Data stewards, analysts, and IT staff (2-3 FTEs in this scenario) are needed to curate, clean, and govern the datasets manually, amounting to roughly 12 months of effort. Based on typical UK salaries, the fully loaded cost per FTE (including benefits and overheads) starts at around £60,000 per year.
Building or upgrading in-house HPC infrastructure requires high upfront capital investment, estimated at £400,000 or more. Operational expenses, including power, cooling, IT support, and hardware upgrades, typically cost an additional £50,000 to £100,000 annually.
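The traditional figures can be added up in a few lines. This is a rough sketch using only the numbers quoted in this post. The 2-3 FTE team size is taken from the SDE comparison, and because "£60,000+" and "£400,000 or more" are floors, the totals are lower bounds. The constant and function names are ours, for illustration.

```python
from typing import Tuple

# Figures quoted in this post (GBP). The FTE cost and HPC capex are floors.
FTE_COST_PER_YEAR = 60_000
TRADITIONAL_FTES = (2.0, 3.0)          # low and high team size
HPC_CAPEX = 400_000                    # one-off build or upgrade
HPC_OPEX_PER_YEAR = (50_000, 100_000)  # power, cooling, support, upgrades


def traditional_year_one_cost() -> Tuple[float, float]:
    """Return (low, high) year-one cost: staffing + HPC capex + HPC opex."""
    low = TRADITIONAL_FTES[0] * FTE_COST_PER_YEAR + HPC_CAPEX + HPC_OPEX_PER_YEAR[0]
    high = TRADITIONAL_FTES[1] * FTE_COST_PER_YEAR + HPC_CAPEX + HPC_OPEX_PER_YEAR[1]
    return low, high


low, high = traditional_year_one_cost()
print(f"Traditional year-one cost: GBP {low:,.0f} to {high:,.0f}")
# Traditional year-one cost: GBP 570,000 to 680,000
```

On these assumptions, the traditional route costs at least £570,000 to £680,000 over the 12-month project, with HPC capital spend making up most of the total.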
Assumptions: The SDE automates significant portions of the data curation and governance process. This reduces the need for dedicated data stewards, cutting FTE costs by up to 50% (e.g., requiring 1-1.5 FTEs instead of 2-3). Streamlined workflows also allow projects to progress faster, reducing overall labour requirements.
SDEs, leveraging Azure, provide compute resources on demand. Researchers pay only for the resources they use, avoiding upfront infrastructure investments. By enabling data to remain in place while being accessed securely, SDEs reduce the need for expensive data transfer and storage duplication.
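The staffing side of the comparison can be worked through directly. This post does not state a price for SDE platform or compute charges, so the sketch below does not invent one. Instead it computes the staffing saving and a year-one break-even figure: the largest SDE platform spend that would still undercut the traditional route. It uses only the figures quoted above, and the function names are illustrative.

```python
from typing import Tuple

# Figures quoted in this post (GBP). The FTE cost and HPC capex are floors.
FTE_COST_PER_YEAR = 60_000
TRADITIONAL_FTES = (2.0, 3.0)
SDE_FTES = (1.0, 1.5)
HPC_CAPEX = 400_000
HPC_OPEX_PER_YEAR = (50_000, 100_000)


def staffing_cost(ftes: Tuple[float, float]) -> Tuple[float, float]:
    """Return (low, high) annual staffing cost for a range of FTEs."""
    return ftes[0] * FTE_COST_PER_YEAR, ftes[1] * FTE_COST_PER_YEAR


def staffing_saving() -> Tuple[float, float]:
    """Saving from halving the team, pairing low with low and high with high."""
    trad, sde = staffing_cost(TRADITIONAL_FTES), staffing_cost(SDE_FTES)
    return trad[0] - sde[0], trad[1] - sde[1]


def break_even_platform_budget() -> float:
    """Conservative pairing: cheapest traditional total vs. costliest SDE staffing."""
    traditional_low = staffing_cost(TRADITIONAL_FTES)[0] + HPC_CAPEX + HPC_OPEX_PER_YEAR[0]
    sde_staffing_high = staffing_cost(SDE_FTES)[1]
    return traditional_low - sde_staffing_high


print(staffing_cost(SDE_FTES))       # (60000.0, 90000.0)
print(staffing_saving())             # (60000.0, 90000.0)
print(break_even_platform_budget())  # 480000.0
```

Under these figures, staffing alone saves £60,000 to £90,000 a year. The SDE stays cheaper in year one as long as its platform and compute charges come in below roughly £480,000. This is a single-year view that counts all HPC capital spend in year one, so a multi-year comparison would need to spread that cost.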
By adopting an SDE:
As noted above, our previous blog posts give a more detailed cost comparison (The Economics of Building and Running a Trusted Research Environment; Building Versus Buying a TRE).
To fully unlock the potential of health data and realise the cost efficiencies of SDEs, we need coordinated action from stakeholders across the ecosystem. Here are three key recommendations:
Funders must recognise the critical role of data management in research success and allocate dedicated budgets for infrastructure. This aligns with Sudlow’s call for long-term investment strategies rather than short-term fixes.
SDE providers can help lead the way by developing transparent, tiered pricing models that reflect the true costs of data management.
However, sustainable funding for data access and management also requires advocacy from researchers and policymakers. A broader approach would incorporate per-project or per-participant funding models, where these costs are explicitly recognised and budgeted in research funding applications. This mirrors the approach already taken for open access publishing fees or approved unit costs in other research areas, such as animal and cellular studies. Early adoption incentives could drive widespread uptake while ensuring equitable support for underfunded research teams.
A UK-wide accreditation system, as recommended in the Sudlow report, will establish trust and ensure consistency across SDE platforms. Aridhia, as a leader in this space, can play a pivotal role in shaping these standards, and we’ve blogged previously on how our DRE meets and exceeds the Five Safes and SATRE standards.
The challenges outlined in the Sudlow report present a unique opportunity for SDE providers like Aridhia to demonstrate leadership. By addressing cost barriers and showcasing the value of SDEs through case studies and transparent cost analyses, we can build a compelling case for their widespread adoption. We note that these challenges are not restricted to the UK; they are relevant worldwide across the research landscape.
Federation between SDEs further enhances the value proposition by enabling collaborative research without compromising data security or privacy. Aridhia’s open-source federated node technology exemplifies our commitment to innovation and open science, providing researchers with the tools they need to tackle global health challenges.
One of the most valuable advantages of SDEs is their robust approach to data security and governance, which is crucial when working with sensitive health data. SDEs provide built-in compliance mechanisms that ensure adherence to stringent privacy regulations, such as GDPR, while offering a secure environment for managing and analysing de-identified data.
By contrast, other platforms or DIY solutions often lack these integrated safeguards, leaving organisations exposed to significant risks, including data breaches, unauthorised access, and non-compliance with legal requirements. The financial and reputational damages from such incidents can be catastrophic, particularly in healthcare, where trust is paramount.
SDEs also enhance auditability and transparency, enabling organisations to track data usage, access, and sharing. This level of oversight not only mitigates risks but also fosters greater collaboration and trust among stakeholders.
In an era where data breaches are increasingly common, investing in SDEs is not merely a cost consideration but a strategic necessity for any organisation handling sensitive health data.
December 18, 2024
Dr. Kim Carter is the Chief Data Officer at Aridhia, driving data strategy and empowering customers to maximize the potential of Aridhia's SaaS/PaaS platform. Before Aridhia, Kim was the Senior Manager of Data and Insights at Minderoo Foundation, where he led the global federated cancer data platform and a large D&I team.
Previously, Kim was a Senior Managing Consultant at Data Analysis Australia. He also served as Manager of Bioinformatics at the Telethon Kids Institute (10 years), where he developed a successful in-house data analytics platform. His career spans academic, commercial and philanthropic domains with a focus on innovation and delivering impactful data-driven solutions.