Data science and the search for MH370
In the 1890s, fingerprints were a cutting-edge tool for solving crime. A century later, DNA sequencing changed the face of forensics. Today, a new crime-fighting technology is on the rise: the vast information processing power of the cloud.
The power of this technology was on particularly compelling display in the hunt for Malaysia Airlines Flight MH370, the Boeing 777 that disappeared on March 8, 2014, with 239 passengers and crew while over the South China Sea. As officials set about their investigation, they relied purely on numbers and math, with nary a hair of physical evidence to guide their way. Data science was their only tool.
A mysterious disappearance
The mystery began at 1:20 a.m. local time, 40 minutes after the plane took off from Kuala Lumpur bound for Beijing. Six seconds after the plane crossed from Malaysian to Vietnamese airspace, all its electronic communications systems went dark and it vanished from air traffic control screens.
At first, that was all investigators knew. Then, days later, Malaysian sources revealed that they had tracked the plane on radar as it pulled a 180-degree turn and flew back across the Malay Peninsula and up the Malacca Strait, after which it slipped beyond the country’s radar coverage zone and disappeared.
A week later, another bombshell dropped: Satellite communications provider Inmarsat announced it had found recorded signals in its archives that MH370 had sent for another six hours after it disappeared. The plane had been aloft and flying for that whole time—but where had it gone?
As Inmarsat scientists examined the signals, they saw that what they had was not data such as text messages or location information. Rather, the signals contained metadata: information about the signal itself. This was recorded as the satellite automatically contacted the plane’s communications system every hour to see if it was still logged on. Bafflingly, whoever had taken the plane hadn’t used the satcom system to communicate with the outside world, but had switched it off and then on again, leaving it able to exchange hourly “pings” with the satellite.
Some of the metadata related to extremely subtle variations in the frequency of the signal. “We’re talking about changes as big as one part in a billion,” says Inmarsat scientist Chris Ashton.
Nobody had used this kind of data to locate an airplane before. At first, Ashton’s team didn’t know if the attempt would work. But painstakingly, over the course of weeks, the team figured out how the movement of the plane, the orbital wobble of the satellite, and the electronics within the satcom system all interacted to create the data values that had been received. “We had to create the model from scratch,” Ashton says.
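To give a sense of the scale Ashton describes, the basic relationship between relative motion and frequency shift can be sketched in a few lines. This is only the first-order Doppler relation, not Inmarsat's model, which also had to account for the satellite's motion and the satcom electronics. The function names and the 1.6 GHz carrier frequency are assumptions made for illustration.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0

# Assumption for illustration only: an L-band carrier of roughly 1.6 GHz.
ASSUMED_CARRIER_HZ = 1.6e9


def fractional_doppler_shift(radial_velocity_m_s: float) -> float:
    """First-order Doppler relation: delta_f / f = v_radial / c.

    Positive radial velocity means transmitter and receiver are closing.
    """
    return radial_velocity_m_s / SPEED_OF_LIGHT_M_S


def doppler_shift_hz(radial_velocity_m_s: float, carrier_hz: float) -> float:
    """Absolute frequency shift in hertz for a given carrier frequency."""
    return carrier_hz * fractional_doppler_shift(radial_velocity_m_s)


# Radial speed that produces a shift of one part in a billion.
speed_for_one_ppb = 1e-9 * SPEED_OF_LIGHT_M_S
print(speed_for_one_ppb)
print(doppler_shift_hz(speed_for_one_ppb, ASSUMED_CARRIER_HZ))
```

Under these assumptions, a shift of one part in a billion corresponds to a change in closing speed of about 0.3 meters per second, which suggests why every source of motion in the system had to be modeled.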
Their work revealed that the plane had flown into the remote southern Indian Ocean. They didn’t know where exactly. But since there are no islands in that part of the world, it was impossible that anyone could have survived. For the first time in history, hundreds of people were declared legally dead based on mathematics alone.
Death by the numbers
Investigators still didn’t know who took the plane, or why. To figure that out, they would need to find the data recorded by the plane’s black boxes during the final hours of its flight. An international flotilla of ships and planes scoured the southern ocean for months, hoping to spot a crucial piece of floating debris. They found nothing but random bits of flotsam and jetsam. The Australian government, which had been put in charge of the search, decided that the only hope of finding the wreckage would be to dispatch ships to scan the seabed.
But where to send them? The task of narrowing down the search area was entrusted to the Defence Science and Technology Group (DSTG), an Australian government agency similar to DARPA in the United States.
A DSTG team led by mathematician Dr. Neil Gordon set about developing a new technique to extract a path from a subset of the Inmarsat data called the Burst Timing Offset (BTO). This measured how quickly the aircraft responded each time the satellite pinged it, and was used to determine the distance between the satellite and the plane. Investigators used these calculations to draw a set of rings on the earth’s surface.
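The geometry behind those rings can be sketched roughly as follows. This is a simplified illustration, not the investigators' calculation: it assumes a spherical Earth, a satellite sitting at the nominal geostationary altitude, an aircraft at sea level, and a round-trip signal time that has already been corrected for any fixed equipment delays. All function and constant names are hypothetical.

```python
import math

SPEED_OF_LIGHT_KM_S = 299_792.458
EARTH_RADIUS_KM = 6371.0      # mean radius; spherical Earth assumed
GEO_ALTITUDE_KM = 35_786.0    # nominal geostationary altitude


def slant_range_km(round_trip_s: float) -> float:
    """Satellite-to-aircraft distance implied by a round-trip signal time."""
    return SPEED_OF_LIGHT_KM_S * round_trip_s / 2.0


def ring_radius_km(range_km: float) -> float:
    """Surface distance from the point beneath the satellite to the ring.

    Applies the law of cosines to the triangle formed by the Earth's
    center, the satellite, and the aircraft.
    """
    r_sat = EARTH_RADIUS_KM + GEO_ALTITUDE_KM
    cos_theta = (EARTH_RADIUS_KM**2 + r_sat**2 - range_km**2) / (
        2.0 * EARTH_RADIUS_KM * r_sat
    )
    if abs(cos_theta) > 1.0 + 1e-9:
        raise ValueError("range is not reachable from the Earth's surface")
    # Clamp tiny floating-point overshoot before taking the arc cosine.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return EARTH_RADIUS_KM * math.acos(cos_theta)


# Example: a hypothetical quarter-second round trip.
example_range = slant_range_km(0.25)
print(example_range, ring_radius_km(example_range))
```

Each ping yields one such radius, and every point at that surface distance from the spot beneath the satellite is an equally valid position, which is why a single measurement gives a ring rather than a fix.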
In theory, the plane could have been anywhere on each ring. However, Gordon’s team narrowed the possibilities considerably by taking into account the realities of airliner operation. For instance, although passenger jets can theoretically fly any which way from point A to point B, in practice, planes almost invariably travel in straight lines. They can turn, but between each change of heading, they fly on a beeline. This fact vastly reduced the number of paths that MH370 could have taken.
The DSTG used its computers to generate a huge number of possible routes and then test them to see which best fit the observed data. The routes’ endpoints were pooled to generate a probabilistic “heat map” of the plane’s most likely resting places using a technique called Bayesian analysis. These calculations allowed the DSTG team to draw a box 400 miles long and 70 miles across, which contained about 90 percent of the total probability distribution. The impressive thoroughness of their work gave the countries responsible for the search (Australia, Malaysia, and China) the confidence to commit $150 million to a scan of the remote ocean seabed.
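The generate-and-test idea can be illustrated with a toy version. This is not the DSTG's method, which modeled aircraft dynamics, a moving satellite, and a curved Earth. The sketch below assumes a flat plane, straight constant-speed routes from a known starting point, a fixed reference point standing in for the satellite, a uniform prior over a grid of headings and speeds, and Gaussian measurement noise. Every name and number in it is hypothetical.

```python
import math
from collections import defaultdict


def position(heading_deg, speed_kmh, t_h, start_xy):
    """Point reached after t_h hours on a straight, constant-speed track."""
    h = math.radians(heading_deg)
    dist = speed_kmh * t_h
    return (start_xy[0] + dist * math.sin(h), start_xy[1] + dist * math.cos(h))


def predicted_ranges(heading_deg, speed_kmh, ping_times_h, start_xy, ref_xy):
    """Distance from the reference point to the track at each ping time."""
    ranges = []
    for t in ping_times_h:
        x, y = position(heading_deg, speed_kmh, t, start_xy)
        ranges.append(math.hypot(x - ref_xy[0], y - ref_xy[1]))
    return ranges


def route_posterior(observed, ping_times_h, start_xy, ref_xy, sigma_km=10.0):
    """Weight every candidate route by how well it fits the observed ranges.

    With a uniform prior, the posterior weight is proportional to the
    Gaussian likelihood. Returns (heading, speed, endpoint, weight) tuples.
    """
    candidates = []
    for heading in range(0, 360):            # 1-degree heading grid
        for speed in range(600, 951, 10):    # assumed speed grid, km/h
            pred = predicted_ranges(heading, speed, ping_times_h, start_xy, ref_xy)
            log_like = sum(
                -0.5 * ((p - o) / sigma_km) ** 2 for p, o in zip(pred, observed)
            )
            end = position(heading, speed, ping_times_h[-1], start_xy)
            candidates.append((heading, speed, end, log_like))
    best = max(c[3] for c in candidates)
    # Subtract the best log-likelihood before exponentiating for stability.
    weights = [math.exp(c[3] - best) for c in candidates]
    total = sum(weights)
    return [(c[0], c[1], c[2], w / total) for c, w in zip(candidates, weights)]


def heat_map(posterior, cell_km=100.0):
    """Pool endpoint weights into square cells to form a probability map."""
    cells = defaultdict(float)
    for _, _, (x, y), weight in posterior:
        cells[(math.floor(x / cell_km), math.floor(y / cell_km))] += weight
    return dict(cells)


# Synthetic example: invent a "true" route, then try to recover it.
times = [1, 2, 3, 4, 5, 6]
start, ref = (0.0, 0.0), (-3000.0, 1000.0)
observed = predicted_ranges(190, 820, times, start, ref)
posterior = route_posterior(observed, times, start, ref)
print(max(posterior, key=lambda r: r[3])[:2])
```

One feature of the toy model is worth noting: range measurements alone cannot distinguish a route from its mirror image across the line joining the start point and the reference point, so a near-mirror candidate also receives some weight.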
A sea change
A massive task lay ahead. The search area was the size of Pennsylvania, and in places, the seabed lay nearly three miles deep.
The ships started their work in October 2014, towing torpedo-like devices called towfish on six-mile-long tethers. Scooting along 500 feet above the seabed, the towfish emitted beams of high-frequency sound much like the ultrasound used to image babies in utero. The raw data was then electronically packaged and transmitted via satellite to a secure cloud facility called Back2Base, where technicians working in Australia and the United States could analyze it in real time. For safety’s sake, the data was also backed up to a central directory on one of each vessel’s three data servers, with extra duplicates stored on external hard drives.
Day after day, week after week, month after month, the ships steamed up and down parallel to the 7th arc, the ring corresponding to the plane’s final exchange with the satellite, imaging the seabed in lawnmower strips. Where they encountered rough terrain, searchers deployed autonomous underwater vehicles, or AUVs, that could weave more nimbly amid obstructions. Through 2015 and into 2016, the ships continued to assemble their massive trove of data.
And then it was over. In January 2017, officials announced that the search zone had been thoroughly scanned and they were confident that the plane was not within it. “Despite every effort using the best science available, cutting-edge technology, as well as modeling and advice from highly skilled professionals who are the best in their field, unfortunately the search has not been able to locate the aircraft,” the jointly issued communiqué read. “Accordingly, the underwater search for MH370 has been suspended.”
In the aftermath, Australia issued several reports detailing the technical aspects of the search. Officials explained that the most likely reason for the failure was that the plane had actually flown somewhat to the north of the maximum probability area. They urged the governments of Malaysia and China to join them in funding a follow-up search. But having spent $150 million with nothing tangible to show, the partners were reluctant to fund a second round.
The largest, most expensive, and most technically complex search and rescue mission in history had ended in disappointment. Were investigators simply unlucky? Or had sophisticated hijackers outwitted them with a juke they’d been too slow to catch? We may never know.
That doesn’t mean efforts to find the plane were for naught. The scouring of the seabed yielded vast amounts of data about a quarter-million square miles of the earth that had never been observed in detail before, and the DSTG’s pioneering mathematical techniques opened new perspectives on electronic navigation.
In the long run, the most significant legacy of the search for MH370 may be in paving the way for the data-intensive investigations of the future. The world we live in has become complex beyond normal human comprehension, populated by machine-learning algorithms and armies of AI bots. Information technologies such as these can be used for good, but there will always be those who will try to use new techniques for nefarious ends. And the sophistication of these tools will only continue to increase.
The struggle to solve the mystery of MH370 may be over, but many more like it lie ahead.
Data science and the search for MH370: Lessons for leaders
Scientists pushed the envelope of information technology to try to crack an otherwise unsolvable case:
- As the world becomes increasingly information-intensive, data science will come to the fore as a forensic tool alongside traditional techniques.
- The powerful computational tools of cloud computing will also become available to those with nefarious intent, creating an arms race between those who want to do ill and those who want to stop them.
- The unique demands of each investigation and the constant evolution of information technology will require flexibility and innovation in order to achieve success.
This article/content was written by the individual writer identified and does not necessarily reflect the view of Hewlett Packard Enterprise Company.