Eyes Wide Open But Algorithms Wide Shut?
Adobe's laudable push for open government butts up against the difficulty that machines have sussing out what's in its products.
Adobe hosted a one-day conference in Washington this week capping off an extravagant PR campaign — complete with billboards throughout the D.C. metro system and animated ads all over most local news Web sites — touting the idea that its tools help "open up government."
Barack Obama pledged to make bureaucracy more transparent, and the software provider wants to point out that its products will make this possible. The premise seems reasonable enough: Most technological neophytes at least know how to open a PDF. Government currently publishes everything from IRS forms to pages of the Federal Register in that format, viewable with the freely downloadable Adobe Reader.
But as the Sunlight Foundation's Clay Johnson suggested to Miller-McCune.com last week, Adobe's high-profile involvement in open government — ad campaign, conference and all — actually poses a big problem for the people who aren't neophytes. The programmers and developers who want to parse data released by government — turning it into databases that can be manipulated, or new applications for your iPhone — often can't work with PDFs or charts made through Adobe's technology.
The distinction may sound like inside baseball for computer geeks (as Johnson joked on the Sunlight Labs blog, he thought about picketing the Adobe conference with the chant, "Hey hey! ho ho! Your-binary-low-parsable-formats-for-government-data has got to go!").
In fact, the distinction has vast implications for entire markets that could be built upon public information. When government data is readable by humans, we're able to keep better tabs on the local congressman's voting record. But when it's readable by machines, the possibilities (many of them profitable) for new applications and information-sharing are endless.
After a speech he gave yesterday, The Idea Lobby asked new Federal Chief Technology Officer Aneesh Chopra whether the administration was sensitive to this issue and to the concerns of angrily blogging coders.
"I want to be careful in [answering the] question because there are innovations within Adobe that will make information in a PDF more accessible," he said. "So it's not a question of either/or, it's a question of at its core, what is the problem you're trying to solve? And that is really the formation of secondary applications born out of the data we're making transparent to make your lives better, faster, easier."
He then pulled up just such an application on his cell phone: a healthy-eating tracker that pulls dietary information straight from the USDA.
"I'm less concerned about whether one particular file format is better than the other," he said. "But I want at the end of the day to ensure that this entrepreneur can access that data with as little friction as possible so that they can create the value that we're seeing on apps like this."
That principle is in line with what the developers want, but Johnson argues that the particular file format does matter. "Here's a hint," he wrote on the Sunlight Labs blog, "if the data format has an ® by its name, it probably isn't great for transparency or open data." The alternative is a non-proprietary format like XML.
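To make the developers' point concrete, here is a minimal sketch of what "machine-readable" buys a programmer. It assumes a small, made-up XML snippet of dietary data; the element names, attribute names and calorie figures are invented for illustration and do not come from any real USDA feed. The only tool used is the XML parser that ships with Python's standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the structure and values are invented for
# illustration and are not taken from any real government data set.
SAMPLE_XML = """
<foods>
  <food name="apple" calories="95"/>
  <food name="banana" calories="105"/>
  <food name="doughnut" calories="250"/>
</foods>
""".strip()


def load_calories(xml_text):
    """Parse the XML text and return a {food name: calories} dictionary."""
    root = ET.fromstring(xml_text)
    return {
        food.get("name"): int(food.get("calories"))
        for food in root.iter("food")
    }


calories = load_calories(SAMPLE_XML)

# Once the data is structured, filtering it takes one line.
under_200 = sorted(name for name, kcal in calories.items() if kcal < 200)
print(under_200)
```

Getting the same three numbers out of a table embedded in a PDF would typically require extra extraction tools and some guesswork about where each row and column begins, which is the friction the developers are complaining about.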
All of this means the administration may need to have two simultaneous goals in mind as it continues opening up its deep vault of data.
"Should we provide more human-readable information? That has been public policy for a decade, and I'm happy to continue pushing for that innovation," Chopra said. "But second, to what extent might you enable machine-readable information to spur new application development? That's the new priority and one that I believe holds great promise for the American economy."