How Families Are Solving Genetic Mysteries With Big Data



  • For the first time, a new generation of parents is raising families through a lens of genetic knowledge that has never been available before
  • Emerging genotyping technology has implications for big-picture societal shifts as well as for the security of extremely private information

While the potential for medical breakthroughs is great, genetic testing raises new data storage and cybersecurity challenges for enterprises and parents-to-be

For months, Cai Maver's five-year-old son was experiencing health complications that doctors suspected were due to cystic fibrosis. But Maver was skeptical. Before the child showed any symptoms, genomic sequencing outfit 23andMe had scanned Maver's genome (not his son's) and reported that, in all likelihood, Maver did not carry the altered gene that causes the disorder. For cystic fibrosis to develop, both parents must pass copies of the altered gene to their child.

Maver and his wife put their son through a battery of tests, but they were confident the results would be negative despite the doctor's leanings. They were right. "Having my own results was helpful in relieving anxiety," Maver says.

Companies like 23andMe have made headlines by unlocking what makes every person unique: their DNA. With an at-home test kit and a saliva sample, 23andMe provides customers with reports on their genetic makeup, including whether they carry genetic conditions (say, cystic fibrosis) and other traits (like a fondness for sweet foods) that, to varying degrees, are encoded in our genomes.

A decade after its founding, 23andMe, along with similar services, is shifting into a complex new phase. More than one million adults today have 23andMe gene profiles, which means that a growing number of adults like Maver who are raising kids, or perhaps deciding whether to have kids at all, are doing so with more actionable data at their fingertips. That data can influence decisions not only about family planning but also about lifestyle, education and even financial planning. If a parent learns she's at high risk for a certain condition by age 40, she might make very different decisions about long-term spending, saving or even about where to live.

Crowdsourcing genetic meaning

What will make genetic data truly actionable in those scenarios? Many believe it comes down to a new kind of personal crowdsourcing. After all, we can only understand our own genomes if we understand what's going on in other people's. That context is critical to understanding whether someone has a certain genetic mutation—and the frequency and potential effects of such mutations.

"When you get back your genome it's just a bunch of variance," says Mark Kaganovich, CEO of SolveBio, a genomic reporting service for researchers, who also holds a PhD in genomics from Stanford. "The whole purpose is to understand whether you have something that will cause a disease."

Small wonder, then, that discussion forums are overflowing with people chatting about their test results. Crowdsourced discussions, in addition to information from health databases and other sources on the Web, let patients ask better questions of physicians and empower them to monitor their own care. Sometimes this causes disagreements. In other cases, patients catch something their physicians miss. "Physicians are in a complicated situation right now," says Kaganovich, adding that it's virtually impossible for the medical community to track genetic information.

Scaling up on storage and security

While crowdsourcing of genetic information is evolving to meet the demands of consumer genetic data, the enterprise storage and security requirements are also starting to emerge. Companies from Google to startups such as SolveBio and DNAStack are stepping in to help organize and store health data. The enterprise technology challenge is complex because genetics research occurs at countless institutions around the world. If medical science is going to start using this information to help people, data integration is imperative.

SolveBio wants to index structured data (such as basic genetic information or health metrics) and unstructured data (such as independent observations or images). That information won't necessarily generate an answer immediately, but that's not the point. Even if the information seems inconclusive now, Kaganovich says there could be better data in a year and it's important to put that new information at professionals' and patients' fingertips.

Marc Fiume, CEO of DNAStack, wants to integrate all these information repositories securely and says the world's genetic information should be organized "a lot more like the Internet," as a federated structure.

Toward privacy-smart APIs

To stay in compliance with government regulations and to protect privacy when collaborating, organizations strip out personally identifiable information from patients' health data. But anonymizing information to protect privacy gets tricky: A genetic sequence is both health data and personally identifiable information. Today, when one organization queries another, it simply asks, "Do you have someone with X condition?" The queried institution replies yes or no, and then must follow through in compliance with government restrictions. But Fiume envisions a system where institutions deploy "privacy-aware APIs" to find people with very specific genetic sequences to understand patients.
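The yes-or-no exchange described above can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the Institution class, the has_condition method, the federated_query helper and the sample records are hypothetical, and it does not represent DNAStack's software or any real system. It only shows the idea that each institution keeps its records to itself and answers a single question with true or false.

```python
from dataclasses import dataclass, field


@dataclass
class Institution:
    """Hypothetical data holder that answers only yes/no presence queries."""
    name: str
    # Illustrative private records: patient id -> set of condition labels.
    # These are never returned to the caller, only summarized as a boolean.
    records: dict = field(default_factory=dict, repr=False)

    def has_condition(self, condition: str) -> bool:
        """Answer "Do you have someone with X condition?" with yes or no."""
        return any(condition in labels for labels in self.records.values())


def federated_query(institutions, condition):
    """Ask every institution the same question and collect the answers by name."""
    return {inst.name: inst.has_condition(condition) for inst in institutions}


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    clinic_a = Institution("clinic_a", {"p1": {"cystic fibrosis"}, "p2": {"asthma"}})
    clinic_b = Institution("clinic_b", {"p3": {"asthma"}})
    print(federated_query([clinic_a, clinic_b], "cystic fibrosis"))
```

In this sketch the asking party learns only which institutions answered yes; any follow-up to reach an actual patient would still have to go through each institution's own compliance process, as the article describes.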

Using genetics to do their targeting, doctors may soon be able to find the very best donor for a bone marrow transplant. If a patient has a rare genetic disorder, doctors could query for the other closest patients with such a disorder. But Fiume says such matchmaking algorithms are still in the future. Like most digital trends today, the system relies on people trusting institutions with sensitive information. As testing becomes easier, and results become clearer, for-profit companies and hackers will have an interest in getting hold of people's genomic information.

"There was somewhat of a leap of faith," Maver says of using 23andMe. But in the end, "I trust 23andMe for the same reason that I trust any data company. Their business model depends on being good stewards of the data we entrust to them."

