Government data: How open is too open?

The notion of "open government" appeals to both citizens and IT professionals seeking access to freely available government data. But is there such a thing as data access being too open? Governments may want to be transparent, yet they need to avoid releasing personally identifiable information.

There's no question that open government data offers many benefits. It gives citizens access to the data their taxes paid for, enables government oversight, and powers the applications developed by government, vendors, and citizens that improve people's lives.

However, data breaches and concerns about the amount of data that government is collecting make some people wonder: When is it too much?

"As we think through the big questions about what kind of data a state should collect, how it should use it, and how to disclose it, these nuances become not some theoretical issue but a matter of life and death to some people," says Alexander Howard, deputy director of the Sunlight Foundation, a Washington nonprofit that advocates for open government. "There are people in government databases where the disclosure of their [physical] location is the difference between a life-changing day and Wednesday."

Open data supporters point out that much of this data has been considered a public record all along and tout the value of its use in analytics. But having personal data aggregated in a single place that is accessible online—as opposed to, say, having to go to an office and physically look up each record—makes some people uneasy.

Privacy breaches, wholesale

"We've seen a real change in how people perceive privacy," says Michael Morisy, executive director at MuckRock, a Cambridge, Massachusetts, nonprofit that helps media and citizens file public records requests. "It's been driven by a long-standing concept in transparency: practical obscurity." Even if something was technically a public record, effort needed to be expended to get one's hands on it. That amount of work might be worth it for records about, say, someone running for office, but on the whole, private citizens didn't have to worry. Things are different now, says Morisy. "With Google, and so much data being available at the click of a mouse or the tap of a phone, what was once practically obscure is now instantly available."

People are sometimes also surprised to find out that public records can contain their personally identifiable information (PII), such as addresses, phone numbers, and even Social Security numbers. That may be on purpose or because someone failed to redact the data properly.

That's had consequences. Over the years, there have been a number of incidents in which PII from public records, including addresses, was used to harass and sometimes even kill people. For example, in 1989, Rebecca Schaeffer was murdered by a stalker who learned her address from the Department of Motor Vehicles. Other examples of abuse of driver's license records include thieves who tracked down the addresses of owners of expensive cars and activists who sent anti-abortion literature to women who had visited health clinics that performed abortions.

In response, in 1994, Congress enacted the Driver's Privacy Protection Act to restrict the sale of such data. More recently, the state of Idaho passed a law protecting the identity of hunters who shot wolves, because the hunters were being harassed by wolf supporters. Similarly, the state of New York allowed concealed pistol permit holders to make their name and address private after a newspaper published an online interactive map showing the names and addresses of all handgun permit holders in Westchester and Rockland counties.

Other government open data issues are murkier. For example, many cities are equipping their police officers with body cameras to record encounters with the public. But what happens when police officers cover a demonstration or visit a crime scene? In the process, the police may record innocent people who could then be criticized or incriminated simply by being there.

During recent alt-right demonstrations, people used social media, video footage, and photographs to crowdsource the identities of attendees and then notified the attendees' employers. Several people were reportedly fired as a result, even though they were exercising their constitutional right to freely assemble and weren't doing so during work time. Such crowdsourced identification efforts have also falsely accused innocent people, as happened after the 2013 Boston Marathon bombing.

In response, some cities have proposed laws that would block police bodycam footage from being a public record, but that would make the footage less available to would-be watchdogs. "Even well-meaning exemptions have negative repercussions," Morisy says. "If you say, 'It needs to be private,' the people most likely to take advantage of that are better off or more well-connected. The exemptions end up covering up misuse or abuse by people who take advantage of them."

Weaponizing the FOIA

U.S. citizens are becoming more familiar with the Freedom of Information Act (FOIA), signed into law in 1966 and in effect since 1967, which gives them access to government data. Statistics on the use of the FOIA are remarkably difficult to find, but an AP report in 2015 said the number of requests had set a new record at more than 700,000.

Increasingly, though, people are "weaponizing" the FOIA by asking for large amounts of information as part of a fishing expedition, or even as a form of denial-of-service attack in which the request keeps an agency too busy to do its job. In response, some governments have sued people for using the FOIA, which means requesters have to hire a lawyer, and bear that expense, to gain access to public records.

But that type of attitude has the potential to backfire, MuckRock's Morisy warns. "Every town has its 'drive-by requesters,' the people you and I might consider cranks, who are trying to dig into something year after year," he says. Governments should consider measures such as putting such requests at the back of the queue or charging reasonable fees for large amounts of information. "You need to learn ways to negotiate and defuse and put it into perspective," he says. "When agencies put in a bunker mentality ('they're at war with us, we need to fight them'), that's when things escalate and aren't just worse for the requester, but worse for the agency." Instead, agencies should look at ways to proactively make data available without requiring a FOIA request, he says.

All that stipulated, how should governments protect the data of ordinary people while still making their operations transparent? While it might make people's eyes glaze over, the first step really needs to be a data inventory, according to the Sunlight Foundation's Howard. That way, governments have a better idea of what data they hold, what sort of protection it needs, what sort of people will legitimately request access to it, and how vulnerable it might be to attack.

For example, data such as PII and health records requires more protection, and town councils have a different threat model than military computers that could be targeted by other nation states or hackers, Howard says. As an example, he points to the 2015 revelation of a breach of Office of Personnel Management personnel records, data that should have been protected so it couldn't be exfiltrated once someone got behind the firewall. "We need to make sure these are treated differently," he says. "There's a different order of harm associated with these records."
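
As a rough illustration of the kind of inventory Howard describes, the sketch below records, for each dataset, how sensitive it is, who typically asks for it, and what threat it faces. The tier names, dataset names, and the rule that PII and health records need extra protection are assumptions made for the example, not an official classification scheme.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical sensitivity tiers, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    PII = 2
    HEALTH = 3


@dataclass(frozen=True)
class Dataset:
    """One inventory entry: what the data is and what it is exposed to."""
    name: str
    sensitivity: Sensitivity
    typical_requesters: tuple
    threat_model: str


def needs_extra_protection(ds: Dataset) -> bool:
    # Assumed rule for this sketch: PII and health records get stronger controls.
    return ds.sensitivity >= Sensitivity.PII


# Illustrative entries only; a real inventory would come from an agency survey.
inventory = [
    Dataset("council_meeting_minutes", Sensitivity.PUBLIC,
            ("residents", "journalists"), "tampering"),
    Dataset("employee_personnel_files", Sensitivity.PII,
            ("hr_staff",), "exfiltration"),
    Dataset("clinic_visit_records", Sensitivity.HEALTH,
            ("clinicians",), "exfiltration"),
]

protected = [ds.name for ds in inventory if needs_extra_protection(ds)]
print(protected)
```

Keeping the tiers ordered makes it easy to treat "different orders of harm" differently: a single comparison separates the records that can be published from the ones that need stronger handling.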

Governments also need to look at better solutions for redacting sensitive data, Howard says, citing the example in the movie "Hidden Figures" when someone read redacted data by holding a paper up to the light.
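
The digital equivalent of a blacked-out page that can still be read against the light is a redaction that only hides text instead of removing it. A minimal sketch of the safer approach, assuming plain-text records and covering only Social Security numbers written in the common 123-45-6789 form, replaces the sensitive characters outright. The function name and placeholder are hypothetical, and a real redaction pipeline would need to handle many more formats and field types.

```python
import re

# Matches SSNs written as three digits, two digits, four digits with hyphens.
# Other formats (no hyphens, spaces) are deliberately out of scope for this sketch.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_ssns(text: str, placeholder: str = "[REDACTED]") -> tuple:
    """Return the text with SSNs replaced, plus the number of replacements made."""
    return SSN_PATTERN.subn(placeholder, text)


# The number below is a made-up value used only to exercise the pattern.
sample = "Permit holder SSN 123-45-6789, parcel 44-1020."
redacted, count = redact_ssns(sample)
print(redacted)
print(count)
```

Because the original digits are replaced rather than covered, nothing remains in the released copy for a reader to recover.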

Ultimately, governments should look at setting up differential access that gives people different levels of access to open data, based on factors such as their roles, their location, and the time of day. "As data is disclosed more and more through public records and proactively published, governments will need to do differential disclosure, so the tax auditor can see something more, because they're coming from it internally, than someone coming from it externally," Howard says. This will require ongoing audits and log files that alert staff when someone gains access to data they shouldn't see. "Those kinds of systems are still in their infancy."
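
One way to picture differential disclosure is a filter that decides, field by field, what a requester may see based on role, location, and time of day, and that logs every decision for later audit. The sketch below is only an illustration under stated assumptions: the field tiers, role names, office hours, and the rule that non-public fields are visible only from the internal network are all hypothetical, not a description of any existing government system.

```python
from dataclasses import dataclass
from datetime import time

# Hypothetical sensitivity tier for each field of a property tax record.
FIELD_TIERS = {
    "parcel_id": "public",
    "assessed_value": "public",
    "owner_name": "internal",
    "owner_ssn": "restricted",
}

# Hypothetical mapping from role to the tiers that role may see.
ROLE_TIERS = {
    "public": {"public"},
    "tax_auditor": {"public", "internal", "restricted"},
}


@dataclass
class Request:
    user: str
    role: str
    on_internal_network: bool  # stands in for "location"
    at: time                   # time of day of the request


audit_log = []  # a real system would write to durable, monitored storage


def allowed_tiers(req: Request) -> set:
    tiers = set(ROLE_TIERS.get(req.role, {"public"}))
    # Assumed policy: non-public tiers only from inside, during office hours.
    in_hours = time(8, 0) <= req.at <= time(18, 0)
    if not (req.on_internal_network and in_hours):
        tiers &= {"public"}
    return tiers


def disclose(record: dict, req: Request) -> dict:
    tiers = allowed_tiers(req)
    # Fields missing from FIELD_TIERS are treated as restricted by default.
    visible = {k: v for k, v in record.items()
               if FIELD_TIERS.get(k, "restricted") in tiers}
    withheld = sorted(set(record) - set(visible))
    audit_log.append({"user": req.user, "role": req.role, "withheld": withheld})
    return visible


# Made-up record used only to exercise the filter.
record = {"parcel_id": "P-1", "assessed_value": 250000,
          "owner_name": "A. Resident", "owner_ssn": "000-00-0000"}

auditor = Request("auditor1", "tax_auditor", True, time(10, 30))
visitor = Request("anon", "public", False, time(10, 30))
print(disclose(record, auditor))
print(disclose(record, visitor))
```

The same auditor account asking from outside the network, or late at night, would get only the public fields under this assumed policy, and the audit log would show what was withheld, which is the kind of signal staff could review or alert on.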

Open data: Lessons for leaders

  • The benefits of open data have ensured that it is here to stay. Get used to it.
  • When making data open, government agencies should be sure to redact personally identifiable information so that people who don't need to see it can't.
  • Think about ways to give differential access to data, based on factors such as role, location, and time of day.

This article/content was written by the individual writer identified and does not necessarily reflect the view of Hewlett Packard Enterprise Company.