Every 14 minutes, someone is diagnosed with blood cancer. With a stem cell transplant from a matching donor, their life can be saved. Anthony Nolan, the first stem cell donor register, facilitates around 1300 life-saving transplants a year - and there are over 100 other such registers worldwide. But what does it take to be a match?

In this case study, I will walk you through the technical journey of building Atlas, the first open-source, cloud-hosted genetic matching algorithm, which compares the genetics of patients and donors and returns a list of donors who could be a match for transplantation.

We’ll take a brief tour of the history of matching stem cell donors, and walk through the algorithm that determines the best genetic match between a patient with blood cancer and a stem cell donor. Then we’ll analyse the technical challenges that come with taking that to a global scale:

  • How much memory is enough to store nonillions (yes, this is a real number!) of possible genotypes?
  • What are the scaling needs of an algorithm where the number of required computations can differ by multiple orders of magnitude?
  • What cloud services can we use to run billions of genetic comparisons within minutes?

Whether or not lives are on the line when developing software, domain expertise is essential. Without it, you can never be sure whether your next best step is actually a technical optimisation, or whether you weren’t asking the right question all along.