Podcast: Meet the doctor of the future, AI
[Editor's note: This podcast originally aired on Nov. 28, 2017. Find more episodes on the STACK That podcast page.]
Web apps and data center technologies have enabled us to innovate, and now disrupt, what healthcare looks like and what it will be in the future. Othman Laraki from Color Genomics, a company that is making major strides in mapping the human genome and helping the medical world discover things like predispositions to disease, joins Byron Reese and Florian Leibert to discuss how Color and the data center are disrupting healthcare.
Byron Reese: Hi, everyone. Welcome back to STACK That, brought to you by Hewlett Packard Enterprise. I am your host, Byron Reese, of Gigaom, and today we're going to be talking about healthcare and wellness disruption. I'm here today with my co-host, Florian Leibert, who is the co-founder and CEO of Mesosphere, which makes DC/OS, which is the most flexible platform for containerized data-intensive applications.
When we first sequenced the genome, it was hoped we would find a gene for smartness, and a gene for alcoholism, and a gene for cancer, and so forth. But it didn't turn out that way. You have 23 pairs of chromosomes, and perhaps 20,000 different protein-coding genes, and they're made up of 3 billion base pairs, and that is your DNA. Almost every cell in your body, not counting your red blood cells, has a complete copy of it. Each, if it were stretched out, would be just a few molecules wide but six feet long, all crammed into your cell. Stitched together, that would stretch all the way past Pluto.
But if you start to think about genomics as a data science, and not a biology, you realize that you can think of each of your cells containing 625 megabytes of data. And studying that data and how it varies between people, that is an ideal use case for machine learning. One of the people working on this, at the forefront of it, is Othman Laraki, who is the CEO of Color Genomics. Welcome to the show, Othman.
Othman Laraki: Thanks for having me.
Florian Leibert: Awesome to have you, Othman. It's great to reconnect after a couple of years of not being in touch so closely, but we used to actually work together at Twitter, where we both worked on user growth.
Laraki: Likewise, it's always great to hear your voice. It's reminding me of all the fun times working together.
Leibert: Likewise. So, you worked at web-scale companies, like Google and Twitter. How did that lead you to really jump into a totally different area and start Color Genomics?
Othman Laraki: Yeah, in some ways it was a combination of two threads, really. On one side, there's a personal dimension to my story on why I'm kind of working on what I'm doing, which is that my grandmother passed away from breast cancer. My mother survived two of them and at the time discovered that she is a carrier of a BRCA2 mutation, which is one of the genes that is associated with dramatically increasing people's risk of cancers. That includes breast cancer. In fact, I was at Google at the time, and that had given me a lot of interest in that connection between health and genetics, but it was really much more on a personal and curiosity level.
The other thread, which really completed that combination, is that a few years ago it started becoming very apparent that genetics, and I think actually even healthcare in general, is increasingly becoming a software and data problem. The first human genome was sequenced in 1999/2000, and that cost about $3 billion. And since then, the cost of sequencing has been dropping at this exponential rate. Actually, all the biologists love to say that it's faster than Moore's Law, to the point where now you can do a whole genome for less than $1,000.
And that kind of staggering change in cost structure I think really changed what is hard about genetics. Whereas before, there was a dataset, but it was so hard to access that the amount of data relative to effort and investment that was coming out was very small. Whereas now, we've crossed this threshold where getting the data out is in some ways the easy part. What is hard is that there is a massive amount of data, and biology is kind of this obviously incredibly complex world.
And so, now, there's both this challenge and opportunity to really bring to bear a lot of the tools that we all worked on building in the Internet world, where we built tooling to be able to manage and deal with massive amounts of data and information at scale. Now we can really start using some of these tools in a way that's really valuable and constructive, in part due to the possibility of saying, OK, now we can go and sequence every human being, and we can correlate the genomic information with health data and behavior data, etc.
So, that gets really exciting in terms of managing public health, in terms of discovering new therapies, etc. In many ways, it was the combination of those two things: my personal connection to it, but also that it came to the point where the work that we had all done on the tech side is, I think, now very, very relevant.
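Editor's note: As a back-of-the-envelope check of the "faster than Moore's Law" comparison Laraki cites, the sketch below uses only the two figures from the conversation (about $3 billion around 2000, under $1,000 at the time of recording in 2017). The 17-year span and the two-year doubling period for Moore's Law are assumptions made for this illustration, not figures from the podcast.

```python
import math

# Figures quoted in the conversation.
first_genome_cost = 3_000_000_000   # dollars, circa 2000
recent_genome_cost = 1_000          # dollars, upper bound quoted in 2017

# Assumptions for this sketch (not from the podcast).
years_elapsed = 17                  # 2000 -> 2017
moore_doubling_years = 2.0          # a common statement of Moore's Law

# How many times the cost fell, and how many halvings that represents.
cost_reduction = first_genome_cost / recent_genome_cost
halvings = math.log2(cost_reduction)
sequencing_halving_years = years_elapsed / halvings

# Improvement that a two-year doubling alone would give over the same span.
moore_improvement = 2 ** (years_elapsed / moore_doubling_years)

print(f"Cost reduction: {cost_reduction:,.0f}x")
print(f"Sequencing cost halved roughly every {sequencing_halving_years:.2f} years")
print(f"Moore's Law over {years_elapsed} years: about {moore_improvement:,.0f}x")
```

Under these assumptions, the quoted figures imply that sequencing cost halved in less than a year on average, compared with the two-year doubling assumed for Moore's Law.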
The lessons of building massive data centers
Leibert: So that sounds really interesting, Othman. Some of the work that you did at Twitter—and some of that probably was actually around open source as well, because you were in the engineering product group—has that had any impact on the space?
Laraki: The impact, I think, is not so much that the software we built at companies like Twitter and Google had a direct effect. Some of it did, but in many ways, I think the learnings from that experience have an even bigger impact. So, on the one hand, a lot of infrastructure software (even things like MapReduce and Hadoop, etc., even Mesos) are pieces of infrastructure that are very valuable in the genomics world.
But I think what has an even bigger impact is, for example, the experience of building and running massive data centers that are automating very large systems and able to run with a very low level of human interaction or human intervention. I think in many ways it's that kind of learning that has had an even bigger impact, where that's what's fundamentally enabling us to take something that still today is being sold clinically for $2,000 to $4,000 and drop that cost by 10 to 20 times.
So, literally, we took these genetic tests that normally are still being billed for thousands of dollars, and we created the clinical-grade version but were able to commercialize that for $250. And that's entirely, I think, built on that infrastructure work on the data and systems side. I think that has really enabled a lot of that cost transition and optimization, if you will.
Reese: So, tell us a little bit about your mission over at Color and how you were using machine learning to accomplish it.
Laraki: Yeah, so our general mission, our actual mission as a company, is to help every person have the healthiest life that science and medicine allows. And the breadth of that statement is very deliberate in the sense that for us, I think, the interesting thing is when you're able to take a technology like genomics and really bring it to scale, one of the interesting transitions I think that starts to happen is that you stop thinking of the product like genomics as being genomics, but rather you start thinking about it as a service around healthcare. So for us, we think about our product as prevention.
And really what Color is about, it's not about just delivering genetic tests to people, but rather if you have that information, you can use it to change the course of people's management of key disease areas like cancer and cardiovascular health to dramatically impact outcomes. And I'll give you a concrete example here.
People can be the carriers of the mutation in a gene called BRCA1 that dramatically increases risk of breast and ovarian cancers. So, if a woman has a mutation in one of these genes, she will go from a 12 percent lifetime risk of breast cancer to over 80 percent. That was the example of Angelina Jolie a few years ago that got a lot of attention. If a woman knows about it, there's a lot that she can do from the prevention, early-detection standpoint that dramatically changes her risk of dying from it. The biggest problem today is that the majority of women who are carriers just don't know about it.
And you have the same example, whether it's in colon cancer, a number of areas in cardiovascular risk, like high cholesterol, etc., where there are clear management guidelines, but the big gap is that most people don't know about their risk. And so, what we've done is essentially we've kind of taken this approach to really scale it to make it available to a broad population, but around that building a service that helps them follow on that information, get updates, etc.
So in an interesting way, aside from the infrastructure dimension of the value we got from companies like Twitter and Google, etc., there's also a consumerization that is potentially equally valuable, where these tests and information used to be very difficult for people to access. And you get these results in a fax or PDF with a bunch of information that is really hard to understand. There's no follow-up. And what we've done is we've turned this into a service that sends people reminders, that sends people updates, where they can connect with their entire family tree and their family members, so really taking this information and building a dynamic service out of it, which I think has as big an impact as giving people the information in the first place.
'A mini-doctor sitting in your Apple Watch'
Reese: This is fascinating. Othman, aside from genomics, as you've already mentioned, what do you think personalized medicine gets to be in the future? Can you paint us a story of what's going to be possible?
Laraki: Sure. A few thoughts there. I think one of the interesting trends actually, given the audience and given that it's a technically oriented podcast, I think this analogy will resonate. The way we think about the operating unit in healthcare today, the core widget with which we do things in health today, is the doctor. We came from a world where health information was very hard to communicate and access, so the way we deal with it is we train a subset of the population and load up all that information into their heads, and we call them doctors. We put them in this central location so that they can access all their tools. We call those hospitals. And all of healthcare is designed around coordinating around this scarce resource.
Literally, I think a lot of the constraints and challenges in healthcare fall out of that. Now, I think what's becoming interesting ... A few things that are super-key are changing on the precision side. One is that it's become much cheaper and much easier to extract data out of people's bodies, including the genome, etc. Even things like Fitbits and Apple Watches I count in that category of biological data that used to be really hard to collect and now is becoming trivial to collect and aggregate.
The second piece is a lot of this data, you know, historically, the most important health data about you, was sitting in the file cabinet at a doctor's office. And then they turned that file cabinet into a computer, and that's called an electronic medical record. But for all intents and purposes, it's almost as inaccessible as it was in the filing cabinet. But now, with genomics, with all the health data, a lot of your health data, or the majority of it, is actually in your control. So you decide whether you get sequenced. You're deciding how you collect data around heart rate and around activity, and I think that is causing a big shift in power dynamics.
And then the third thing that's also changing is all the telemedicine dimension, where a lot of healthcare now can be delivered over the phone or online. And so I think the result of that over a few years is we're going to refactor the function of a doctor, right, so there's this function, or this class, that was this doctor that was extremely versatile, that was doing a million and one things. And now, I think, we have these primitives that allow us to really start pulling out a lot of this functionality and treating those as extremely available.
Whereas, historically, your heart rate would get measured once a year, because you had to go to your doctor's office and they put a stethoscope on your back and listened to your heart rate, now you have a mini-doctor sitting in your Apple Watch collecting it 24/7. And so all these things that used to be very scarce now are becoming very available. And I think that allows a whole set of these functions to be, I think, pulled out. And so there's a whole theme around that that's going to be fascinating over the next few years.
Reese: So, before we continue, I want to do a big shout-out to Hewlett Packard Enterprise. They're the people who bring you STACK That. HPE, of course, is the leading provider of the next-generation services and solutions that help enterprises and small businesses navigate the rapidly changing technological landscape that we discuss on STACK That. They have the industry's most comprehensive portfolio spanning the cloud, to the data center, to the Intelligent Edge. HPE helps customers around the world make their operations more efficient, more productive, and more secure. So stay up to date in the latest in hybrid IT, Intelligent Edge, memory-driven computing, and more by visiting HPE.com.
I want to take what you're saying one step further. The way you describe it all, with the hospital and the gatekeepers and all your data behind a wall, I mean, it all sounds almost intractable. So, tell me how you think that is going to shift, and what will a doctor be in the future? And what will a hospital be in the future? And where will my data be in the future? How do you think that's all changing?
Building out application ecosystems over time
Laraki: I think it might be similar in healthcare, where I don't think it's going to be, you know…healthcare is the biggest sector of our economy, so it's this massive machinery and system that's extremely complex. So, I don't think it's going to magically get changed all of a sudden. But I think what ends up happening is that you have cases, or new applications, that are enabled by new technology that are sufficiently valuable that you can take them to massive populations. When you do that, you have a lot of people that are now on board to a "new technology." And then applications that fall out of that start making sense.
So, for example, with genomics, I think the way millions of people are going to get sequenced is not going to be 100 million people getting their whole genome just for curiosity's sake. I think, for example, in genomics, there are a few applications that make it worthwhile to use that information for every person. These applications are the straightforward ones. You know, they're cancer-risk prevention. They're cardiovascular health. And when these applications become valuable enough, which I think they are today, those are the applications that bootstrap genomics at the population level.
And I think you'll get similar things, for example, with your Apple Watch. When your Apple Watch gets to the point where it has a material impact on your health, all of a sudden, the adoption, I think, explodes, and then you can start growing the types of applications that you build around wearables.
And so I think we are in that transition phase in a number of those technologies, and so my guess is that it's not going to be just one big change; it's more that there are going to be these applications that are used on a very large scale and around them ecosystems get built. And over time, it's like in 10 or 20 years, we'll look back and be like, wow, healthcare looks very different today because of these new technologies that got integrated.
Leibert: So, talking about ecosystems and technology, and actually given the fact that our podcast here is called STACK That, what does your technology stack at Color look like? And is it changing a lot? In today's world, we have all these open source technologies available like Hadoop, Spark, TensorFlow, Flink. You probably know a lot of them. How have you made the choices which technologies to employ, and which have you actually employed?
A mix of traditional web infrastructure and open source bioinformatics software
Laraki: Yeah, so, stack ... I mean, the way Color works, we actually run most things on Amazon. In terms of the specific technologies that we use, it's actually the ... when you run something like Color, where there are several pretty big components, first of all there's a lab side, where we're running one of the largest genetics labs with robotics and integration there. So, that world, in some ways, looks like a very scaled-up web application that happens to talk to robots. It's essentially a workflow system.
Another part of the stack is what we call bioinformatics, and that is basically pulling data from sequencers and turning that into insights that can then be used for people. From a pattern-matching standpoint, that looks in some ways very much like a search pipeline. So, there, the choices that we've made as we built it end up being a combination of a lot of the technologies that you would see at a lot of the web companies around.
Actually, let me take a step back. One thing that's kind of interesting is that it's a combination of traditional web infrastructure, whether it's some use of Hadoop, etc., on the one side, but then combining that with open source software from the bioinformatics world. So there are a lot of tools that have been built over time in life sciences, but they have not been integrated with large-scale systems. So, I think that's one interesting piece there.
And then there's a whole web component to what we do: essentially, a large Django application that, from a web technology standpoint, looks just like a large web app.
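Editor's note: To make the "search pipeline" analogy concrete, here is a toy sketch of the general shape Laraki describes: raw sequence comes in, differences from a reference are found, and those differences are looked up in an index of known annotations. Everything in it (the sequences, the `KNOWN_VARIANTS` index, the function names) is invented for illustration. It is not Color's pipeline, and real variant calling is far more involved.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    position: int   # 0-based offset into the reference
    ref_base: str   # base in the reference sequence
    alt_base: str   # base observed in the sample


# Hypothetical data, made up for this sketch.
REFERENCE = "ACGTACGTAC"
KNOWN_VARIANTS = {
    Variant(3, "T", "G"): "example annotation: flagged for follow-up",
}


def find_variants(reference: str, sample: str) -> list[Variant]:
    """Compare equal-length sequences and report single-base differences."""
    if len(reference) != len(sample):
        raise ValueError("this toy example only handles equal-length sequences")
    return [
        Variant(i, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


def annotate(variants: list[Variant]) -> dict[Variant, str]:
    """Look each variant up in the index, much like running a search query."""
    return {v: KNOWN_VARIANTS.get(v, "no annotation in index") for v in variants}


sample = "ACGGACGTAA"
report = annotate(find_variants(REFERENCE, sample))
for variant, note in report.items():
    print(variant, "->", note)
```

The two stages mirror the analogy: `find_variants` plays the role of extracting terms from raw input, and `annotate` plays the role of querying an index.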
Reese: All right, well, that is a good place, I think, to leave it. It must be really exciting to be at kind of the dawn of something. Because you started off by saying 20 years ago, you didn't have a sequenced genome. Getting the data out was so hard. You didn't have the kind of machine learning tools we have now, so you've got to feel like you're at the very beginning of some great transformation, or am I projecting that on you?
Laraki: No, I think you're absolutely right. I think this transformation obviously is built on decades of amazing work, and so, I think now what's exciting about this window in time is that a lot of this has shifted from science into engineering and products. What that means is that now is a time when we can take this to a very large number of people, which is very exciting.
Reese: All right, well, I want to thank you so much, Othman, for being on the show.
Laraki: Thank you for having me.
Leibert: Thank you, Othman. Great chatting with you.