Data of the world, lay down your chains

Prison Planet by AZRainman (cc) (from Flickr)
Prison Planet by AZRainman (cc) (from Flickr)

GitHub, that open source free repository of software, is taking on a new role, this time as a repository for municipal data sets. At least that’s what a recent article on the Atlantic.com website (see Catch my Diff: GitHub’s New Feature Means Big Things for Open Data) after GitHub announced new changes in its .GeoJSON support (see Diffable, more customizable maps)

The article talks about the fact that maps in Github (using .GeoJSON data) can be now DIFFed, that is see at a glance what changes have been made to it. In the one example in the article (easier to see in GitHub) you can see how one Chicago congressional district has changed over time.

Unbeknownst to me, GitHub started becoming a repository for geographical data. That is any .GeoJson data file can be now be saved as a repository on GitHub and can be rendered as a map using desktop or web based tools. With the latest changes at GitHub, now one can see changes that are made to a .GeoJSON file as two or more views of a map or properties of map elements.

Of course all the other things one can do with GitHub repositories are also available, such as FORK, PULL, PUSH, etc. All this functionality was developed to support software coding but can apply equally well to .GeoJSON data files. Because .GeoJSON data files look just like source code (really more like .XML, but close enough).

So why maps as source code data?

Municipalities have started to use GitHub to host their Open Data initiatives. For example Digital Chicago has started converting some of their internal datasets into .GeoJSON data files and loading them up on GitHub for anyone to see, fork, modify, etc.

I was easily able to login and fork one of the data sets. But there’s a little matter of pushing your committed changes to the project owner that needs to happen before you can modify the original dataset.

Also I was able to render the .GeoJSON data into a viewable map by just clicking on a commit file (I suppose this is a web service). The ReadME file has instructions for doing this on your desktop outside of a web browser for R, Ruby and Python.

In any case, having the data online, editable and commitable would allow anyone with GitHub account to augment the data to make it better and more comprehensive. Of course with the data now online, any application could make use of it to offer services based on the data.

I guess that’s what Open Data movement is all about, make government, previously proprietary data freely available in a standardized format, and add tools to view and modify it, in the hope that businesses see a way to make use of it in new ways. As such, In  the data should become more visible and more useful to the world and the cities that are supporting it.

If you want to learn more about Project Open Data see the blog post from last year on Whitehouse.gov or the GitHub Project [wiki] pages.

Comments?