Bulk Data Downloads Passed in Omnibus Spending Bill

By passing the Omnibus Spending Bill, Congress has moved closer to a 21st-century level of transparency.  Rep. Mike Honda (D-CA) inserted a measure directing Congress and its affiliated components to make raw data available to the public.  Raw data feeds are the building blocks that programmers use to create mashups that bring various information sources together.

However, you shouldn’t expect tons of data to be dropped out of Congressional vaults tomorrow.  The Omnibus Bill requires that the Library of Congress (LOC), Congressional Research Service (CRS), the Government Printing Office (GPO), and appropriate entities of the House prepare a feasibility report for the Committees on Appropriations of both houses “within 120 days of the release of Legislative Information System 2.0.”  The fate of the entire measure rests in the hands of the Appropriations Committees.

The LOC database, THOMAS, provides a lot of good information and gives access to full-text bills and Congressional Research Summaries.  However, it is outdated and lacks a decent user interface and persistent URLs.  Browsing and searching are difficult, and don’t even think about asking for an RSS feed.  GovTrack.us, OpenCongress.org, and MAPLight.org provide similar Congressional information in a far more usable format.  The downside is that they are forced to rely on THOMAS as their source of information.  That is, until now.

Even more notably, public access to CRS and GPO documents has been a hot-button issue for over a decade.  CRS, which provides reports to Congress on issues relevant to legislation, does not make its research readily available to the public.  It views its duty as solely to inform Congress and considers releasing its reports to the public to be beyond that mission.  Thus the public remains largely unaware of what advice Congress is receiving.  Further, the GPO provides bill texts for free, but it charges heavily for the Official Journals of Government.

The raw data will come straight from the source and, as a result, will not be filtered by a government middleman who often interprets the data, makes it more confusing, or even omits relevant information.  The new measure should enable the public to piece this information together and put it in context, making it more useful than ever.  The question then is whether Congress will release data in formats that can be easily manipulated.  The bill does not include a measure for standardized data.

If the measure survives Appropriations, it will be a huge step forward for Congressional transparency.  As Josh Tauberer reports, other branches of the government have already recognized the usefulness of bulk data.  The Census Bureau and the Federal Election Commission already provide a great deal of raw information to the public.  It’s about time Congress followed suit.

Image by Flickr user Imamom used under a Creative Commons license.
