The Economics of Building and Running a Trusted Research Environment

In a previous blog post, I wrote about whether organisations should build their own Trusted Research Environment (TRE), or subscribe to a managed service from a commercial partner such as Aridhia. I argued that, for most organisations, subscribing to a TRE would be quicker and more cost-effective than trying to build their own, allowing institutions to focus their time on the research they want to encourage, rather than on a large software development project followed by an ongoing maintenance, deployment, and support budget.

In this blog I would like to dig deeper into one of the key considerations in this debate, which is the economics of building and running your own TRE (let’s call it the DIY approach). Specifically, I would like to detail what the representative costs are and how these are likely to develop over time, allowing us to calculate a Total Cost of Ownership (TCO) for the DIY approach.

Typical scenarios

Let us take two typical scenarios and calculate a 5-year TCO for each.

Scenario 1 is a UK-based organisation that already conducts research but wants to expand its project portfolio and user base, and to be more widely recognised as a leader in its particular field. It has recognised that a TRE is needed to achieve these goals and has decided to follow a DIY approach. This institution:

• Has data assets it wants to make available to internal researchers and its user community.
• Wants to enable a significant number of research projects (e.g. 30, with each project requiring its own ‘workspace’).
• Wants to collaborate with external partners both nationally and internationally, to improve the scale, quality, and impact of its research projects.
• Plans to re-charge some of those external partners to recover costs and help to fund the investment in the TRE.

The organisation has an internal IT team with some software development and cloud engineering capability. However, the team is allocated to existing commitments, so additional hiring will be needed. A decision has been made that the TRE must be built on one of the main cloud providers, with data residing in the UK. It is hoped that a minimum viable product (MVP) can be built during the first 12 months and then go into production, albeit with a limited set of features that will be enhanced over time.

Scenario 2 is for an international organisation with a larger research portfolio, spanning hundreds of projects. It wants to establish a more complex multi-region network of TREs, where data is held in several different countries.

TRE scope

Let’s first consider what the scope of the TRE being built would be. The functionality of a TRE can vary widely. However, a framework called SATRE (Standardised Architecture for Trusted Research Environments) has been developed by several partners to introduce a level of consistency across TRE implementations. SATRE provides a reference architecture, covering topics such as information governance, computing technology, and data management. For this exercise we will assume that, in both our example scenarios, the scope of the TRE being developed aligns with the SATRE framework.

Cost breakdown

A DIY TRE build is a major software development activity, so let us next look at the costs it is likely to incur. Based on our experience at Aridhia, I have grouped costs into the following five categories (albeit the numbers are scaled back from Aridhia’s larger cost profile). An organisation does not incur these costs directly when buying from a vendor, but it does incur them in a DIY scenario.

Development team

A development team needs a Product Owner to drive requirements, a Scrum Master to manage the Agile process, a Development Manager to manage the technical team, and a Software/Cloud Architect to lead the technical design. While some of those roles may be combined, that is at least 3 people before a line of code is written. The team will then need software developers, testers, and operations/infrastructure engineers. I have assumed a total team size of 12 throughout the 5-year period, given the range of skills needed and the scope of what needs to be built. This team will build and maintain the TRE over the period and will also be responsible for deployment and 2nd/3rd line support activities.

In scenario 1 I have used average UK industry salaries plus 20% for staff benefits and overheads.

In scenario 2 I have used average US industry salaries which are significantly higher and, given the more complex scope of the TRE, a larger team size of 16.

Information Security and certification

A key requirement in any TRE is a strong security framework. This will need dedicated security expertise, annual penetration tests, cyber security tools, and gaining certifications which demonstrate standards compliance.

I have assumed that in both scenarios the team works towards ISO 27001 in year 1 and adds ISO 27701 for GDPR compliance in year 2.

In scenario 2 I have assumed that HITRUST for HIPAA compliance is added in year 3, in addition to the ISO standards.

Dev and test cloud environments

Separate environments will be needed to develop and test, prior to running in production.

I have excluded the cost of the production cloud environment, given that this will depend on several factors including required storage volumes, compute, and user numbers. This cost can easily run into a six-figure sum per annum.

Software subscriptions and tools

A suite of software tools with license/subscription costs will be needed, covering source code control, an IDE, software project management, deployments, automated testing, vulnerability scanning, and so on.

Support team

This covers the Service Desk, Training, Knowledge Base, etc. Staff will be needed to deal with 1st line end user support once the TRE goes live and to develop training materials (e.g. knowledge base, videos). Projects are diverse, and users of the TRE will need onboarding support as well as help with the configuration of specialist pipelines and code. I have assumed these costs start in year 2 as the team moves towards making the MVP available.

Note that I have excluded certain costs from this exercise, such as finding, hiring, and training your development team, and running particular software tools within the TRE which may have a license cost (e.g. SAS). I have also excluded costs which will be similar irrespective of whether a TRE is built as a DIY exercise or bought as a product/service from a commercial vendor. These include legal costs (e.g. to negotiate data sharing agreements), costs to prepare/curate data prior to ingestion into the TRE, and annual inflationary increases.

Calculating the Total Cost of Ownership

Based on the above categories, I estimate the following annual costs for a DIY build in each scenario:

• Scenario 1, medium-sized TRE, UK based: $1.6M cost per annum (excluding cloud production environment costs).
• Scenario 2, large multi-national TRE: $3.5M cost per annum (excluding cloud production environment costs).

The costs approximately break down as:

• The development team will account for circa 65-75% of the total budget
• Information security and certification for 10-15%
• Cloud dev/test environments circa 5-10%
• Software subscriptions and tooling circa 5%
• Support team circa 5-10%.
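
As a rough illustration of how these figures combine, the Python sketch below multiplies each per-annum estimate over the 5-year period and converts the percentage ranges above into annual dollar ranges. It assumes the per-annum figure is a flat average across all five years (in practice the support costs only start in year 2) and treats software subscriptions as a fixed 5% share. The function and variable names are illustrative only.

```python
# Illustrative sketch only. Figures are the per-annum estimates quoted above,
# excluding production cloud costs. All names here are hypothetical.
YEARS = 5

ANNUAL_COST_USD = {
    "Scenario 1 (medium-sized, UK)": 1_600_000,
    "Scenario 2 (large, multi-national)": 3_500_000,
}

# (low %, high %) share of the annual budget for each cost category.
CATEGORY_SHARE_PCT = {
    "Development team": (65, 75),
    "Information security and certification": (10, 15),
    "Cloud dev/test environments": (5, 10),
    "Software subscriptions and tooling": (5, 5),
    "Support team": (5, 10),
}


def five_year_tco(annual_cost: int, years: int = YEARS) -> int:
    """Total cost over the period, assuming a flat annual cost (an assumption)."""
    return annual_cost * years


def category_ranges(annual_cost: int) -> dict[str, tuple[int, int]]:
    """Annual low/high spend per category, using integer arithmetic."""
    return {
        name: (annual_cost * low // 100, annual_cost * high // 100)
        for name, (low, high) in CATEGORY_SHARE_PCT.items()
    }


if __name__ == "__main__":
    for scenario, annual in ANNUAL_COST_USD.items():
        print(f"{scenario}: ${five_year_tco(annual):,} over {YEARS} years")
        for name, (low, high) in category_ranges(annual).items():
            print(f"  {name}: ${low:,} to ${high:,} per annum")
```

On these assumptions, the 5-year totals come to $8.0M for scenario 1 and $17.5M for scenario 2, before any production cloud costs are added.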

Observations

The development team will be by far the largest cost, and you may think that 12+ staff is larger than you will need. However, I would be very cautious before reducing this part of the budget, bearing in mind that:

• Once you have a minimum viable product available for users, you may consider that it is “largely built.” Our experience of TREs over many years is that this is not the case: that is when the hard work really begins (i.e. adding new features, dealing with technical debt, integrating with new data sharing partners, and improving security), all while maintaining a production-level service.
• Your users are essentially conducting research & development within the TRE – there are always new use cases, new paradigms, and new tools appearing for your team to deal with.
• It is not just the build of end user features; it is also the DevOps engineering needed to enable zero downtime deployments, back-ups, monitoring, alerting, disaster recovery plans, and so on.

To put my cost estimates into perspective, I note that the MRC is providing grants of up to $4.75M per project over several years to UK institutions which will partially fund projects on “Enhancing biomedical and health-related data and digital platform resources.” This funding seems to be aimed at building specific tools and features for TREs (data ingress, export and FAIR are mentioned), as opposed to building an entire end-to-end TRE. However, even with this more limited scope, it is recognised that a significant 7-figure investment may be needed.

How does this compare with buying?

When buying a product/service from a vendor, the cost profile and commercial terms should allow the customer to pay based on demand. For example, you may wish to start with a smaller proof-of-concept (PoC), say a handful of representative projects. It should then be possible to access an advanced level of features and capabilities for a relatively small investment.

Following a successful PoC, the vendor’s pricing scheme should then allow the customer to scale up/down through various pricing tiers as needed, without being locked into an all-or-nothing annual payment.

In comparison, an internal DIY project needs a large upfront investment in people, tools, certifications, etc. to get started, reach the MVP, and then be maintained. Irrespective of whether the organisation is fulfilling 10 projects or hundreds, this baseline investment is unavoidable, and it will be more difficult to flex your internal cost based on demand than with the subscription-style pricing scheme that you should be able to get from a vendor.
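
To illustrate why a fixed DIY baseline is hard to flex, the Python sketch below spreads the scenario 1 per-annum estimate ($1.6M) across different numbers of active projects. It assumes the baseline stays flat regardless of demand and ignores production cloud costs; the function name is illustrative only.

```python
# Illustrative sketch only: effective DIY cost per project when the annual
# baseline is fixed. The $1.6M figure is the scenario 1 estimate from this post.
SCENARIO_1_ANNUAL_USD = 1_600_000


def diy_cost_per_project(annual_baseline: float, active_projects: int) -> float:
    """Spread a fixed annual baseline evenly across the active projects."""
    if active_projects <= 0:
        raise ValueError("active_projects must be a positive integer")
    return annual_baseline / active_projects


if __name__ == "__main__":
    for projects in (10, 30, 100):
        cost = diy_cost_per_project(SCENARIO_1_ANNUAL_USD, projects)
        print(f"{projects:>3} projects: ${cost:,.0f} per project per annum")
```

With only 10 active projects, the baseline works out at $160,000 per project per annum, falling to roughly $53,000 at the 30 projects assumed in scenario 1. The total spend is the same in both cases, which is the point: the DIY cost does not shrink when demand does.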

So, should you build your own?

There are many components which can easily be missed when trying to calculate the true TCO of a significant software development exercise, which is what building a TRE is. In this blog I have tried to produce representative TCO figures of a DIY build for two typical scenarios.

These TCO estimates can be compared against commercial offerings, and I would be very surprised if a customer cannot achieve significant savings over a DIY build.

In fact, the overall TCO saving will be higher, given that a commercial vendor should be able to deploy a working TRE in a matter of weeks, as opposed to an organisation taking 12 months or so (but still with all the resultant up-front costs) to get a minimum viable product ready before it can start to onboard projects.

As I have previously argued, I believe the DIY approach only stacks up for a small number of well-funded organisations, which already have a significant software development capability in place and extensive experience in this particular field which they want to retain.

For the majority, however, the TCO of a commercial product, combined with the much quicker time to value, should present a compelling case.