Data.gov: on becoming ‘mom-friendly’

Sean Gallagher has an interesting post at GCN on the current state of Data.gov, one of the Open Government initiatives (OGI). The primary thesis of Sean’s post comes from Steve Drucker of Fig Leaf Software, whom he quotes as saying that Data.gov “fails the ‘mom test’”. Fundamentally, although volumes of data are available on Data.gov, the information is in a relatively useless form. Put simply, it’s unrealistic to expect that the average citizen, or that citizen’s mother, would have any fun or, more importantly, learn anything from surfing the current Data.gov website. A related analogy from Drucker, described in Sean’s article, is how a large law firm might treat a small law firm during the discovery phase of a lawsuit: bury them in data, but don’t give them any real information [W.C. Fields once put it more colorfully].

Drucker and Dan Kasun, senior director of developer and platform evangelism at Microsoft’s U.S. Public Sector, further point out that the issue is not just the volume of information, but its quality. Paraphrasing, they contend that developers face a fairly significant ETL (extract, transform, load) challenge in getting the data into a form that can be usefully processed. However, each cites at least one tool that might help improve the quality of the data: Adobe’s open-government toolkit, based on the ColdFusion web server, and Microsoft’s similarly free Open Government Data Initiative tools.
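To make the ETL concern a bit more concrete, here is a minimal sketch in Python of the kind of cleanup a developer might have to do before a raw extract becomes usable. Everything in it is an assumption for illustration: the `RAW_CSV` sample, its column names, and its values are made up and do not come from any actual Data.gov dataset, and the helpers `clean_header`, `to_number`, and `transform` are hypothetical names.

```python
import csv
import io

# Made-up raw extract (NOT real Data.gov data): inconsistent header spacing,
# padded cells, thousands separators, and a placeholder for a missing value.
RAW_CSV = """State , Unemployment Rate (%),Population
TX, 8.2 ,"1,000,000"
CA,12.4,"2,500,000"
VT,N/A,"300,000"
"""


def clean_header(name):
    """Lower-case a header and collapse non-alphanumeric runs into one underscore."""
    out = []
    for ch in name.strip().lower():
        if ch.isalnum():
            out.append(ch)
        elif out and out[-1] != "_":
            out.append("_")
    return "".join(out).strip("_")


def to_number(text):
    """Parse a numeric cell; return None for blanks and common placeholders."""
    cleaned = text.strip().replace(",", "")
    if cleaned in ("", "N/A", "NA", "-"):
        return None
    return float(cleaned)


def transform(raw):
    """Extract rows from CSV text and return a list of cleaned dict records."""
    reader = csv.reader(io.StringIO(raw))
    headers = [clean_header(h) for h in next(reader)]
    records = []
    for cells in reader:
        if not cells:
            continue  # skip blank lines
        record = dict(zip(headers, (c.strip() for c in cells)))
        # In this sample, every column after the first is numeric.
        for key in headers[1:]:
            record[key] = to_number(record[key])
        records.append(record)
    return records


records = transform(RAW_CSV)
for record in records:
    print(record)
```

Even this toy case needs three separate decisions (how to name columns, how to parse numbers, how to represent missing values), and each one would have to be repeated or rediscovered for every dataset that follows its own conventions.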

Update [based on comment by Sean Gallagher, author of the discussed article]: the gist of Drucker & Kasun’s comments appears to be that Data.gov [and the Open Government initiative in general] could have a much larger long-term payoff if government focused on exposing data in-place by building web services, or on creating a public cloud to access data via web services. That would address the problem commercial developers face in building applications against the data in its current form, and drive use of the data in new and innovative ways [and in ways that are independent of government funding].

Such thoughts and many related approaches can move the nascent Data.gov initiative forward, which is the focus of the remainder of this post.

For instance, one approach that may be more useful than continuing with Data.gov’s current “build it and they will come” approach is to consider other major initiatives that successfully share volumes of data with U.S. citizens and the larger global community:

  • One of the first that comes to mind is the “This We Know” initiative, which is working to add a semantic layer of knowledge to the raw data posted at Data.gov. In other words, the folks involved with “This We Know” are seeking to construct Data.gov 2.0, or in the context of Sean’s post, the “mom interface”. The information more readily exposed from Data.gov at ThisWeKnow.org includes data about toxins, unemployment, cancer, and migration. Perhaps the most useful function is the simplest: type in a city/state combination, such as Austin, TX, to see ThisWeKnow’s “one face to Data.gov” objective in action. As a quick note, and perhaps to highlight Sean’s thesis further: although ThisWeKnow does get closer to a “mom interface”, only a subset of the Data.gov information is accessible on that site, in part due to the ETL concern.
  • Government contracting & finance data: some of the premier examples include USAspending.gov, SBIR/STTR awards, FollowTheMoney.org, MAPLight.org, and OpenSecrets.org. Each of these sites takes the morass of data available about finances and government and, as the colloquial saying goes, provides ways to “follow the money” in its respective niche area. The potential for these sites is perhaps best captured in the “Apps for America” challenge to develop a mash-up app that best improves access to the growing catalog of information on Data.gov. ThisWeKnow was the winner of the first “Apps for America” challenge; the current contest ends 8/7/2010, so “get your app together” and submit!
  • Archive sites: the Smithsonian, NSA, the Library of Congress, the Federation of American Scientists, and other sites each provide useful examples of how archive data can be presented to the public. Although Data.gov may not have the funding to develop such sites, perhaps a theme can be chosen for each year to focus on and thus improve data access over time.
  • Governance: various Congressional sites, such as the THOMAS site hosted by the Library of Congress and the Senate Armed Services Committee, along with other congressional portals described in the Mouse Awards 2010, also provide useful examples for Data.gov to consider.
  • Agency initiatives: many of the major government agencies have initiatives to share data. These include the EPA’s “Data Finder”, the DOT’s RITA data portal on all things transportation, the Census Bureau’s repository [which needs a name], and NASA’s spotlight on science site [which includes J-Track, WorldWind (NASA’s alternative to Google Earth), and many other data apps]. Some of this data is also available on Data.gov, although not in as beautiful or accessible a form as on the original agency site.

The challenge for Data.gov is to build a “one face to the world” portal. Some of the more interesting ideas on the Data.gov forums for building a world-class dynamic information portal are:

  • creating Data.gov evangelists (similar to roles at other organizations, e.g., Google’s SketchUp evangelists) to spread the word on Data.gov, gather requirements, and work with the Data.gov team to develop showcase apps
  • ensuring the existence of a data dictionary, e.g., via tagging or metadata, and providing access to data dictionaries and the associated data via a standardized taxonomy
  • introducing collaborative/social aspects on the data, e.g., tracking what data people are using and how they are using it (Socrata and DataMasher cited as examples)
  • creating separate/customized public, developer, and government portals (wikis, search interfaces, etc.), since the needs of producers and consumers are very different
  • consolidating or integrating initiatives, e.g., Data.gov, Imagery For The Nation, Geospatial One Stop, NASA WorldWind, NGA, and NTIS archives
  • publishing a roadmap covering the next steps for Data.gov: how it is increasing transparency, what its successes and failures are, and what’s coming next
  • creating better distribution channels, e.g., using BitTorrent or other large-scale distribution technologies, such as cloud computing
  • ensuring consistent and useful access for all consumers, including for Section 508 purposes

Bottom line, Data.gov has significant potential to increase transparency at the local, state, and national levels of government in the United States. Doing so in a way that improves the governance of the nation and increases our economic strength, while releasing only data that maintains our national security, is an audacious but achievable goal for the Open Government initiative. Hopefully these and other advances will help Data.gov pass the “mom test” in the near future.


2 Responses to “Data.gov: on becoming ‘mom-friendly’”

  1. Sean Gallagher said:

    May 11, 10 at 15:43

    Hi, Chris. Actually, it was Steve Drucker who made the “mom test” comment. And I think that what Drucker and Kasun were pitching the hardest was that there would be a much larger long-term payoff if government focused on exposing data in-place by building web services, or creating a public cloud that accessed the data from a database via web services. That would address the problem commercial developers face building applications against the data in its current form, and drive use of the data in new and innovative ways off the government’s dime.

  2. Chris Augeri said:

    May 11, 10 at 16:07

    Thanks, Sean. Appreciate the feedback! I’ve modified the post to properly attribute the ‘mom test’ comment to Drucker. I also included an update that summarizes your thoughts on the gist of what Drucker and Kasun were expressing.



 
 
