The principles and workflows that drive open source software can also be applied to government open data. Open sourcing data on GitHub provides developers with a forum to surface feedback, exposes the data’s change history, and can even allow for crowd-sourced open data efforts.
GitHub.com supports rendering tabular data in the form of .csv (comma-separated) and .tsv (tab-separated) files. When viewed, any .csv or .tsv file committed to a GitHub repository will automatically render as an interactive table, complete with headers and row numbering. CSVs can be easily generated from any tabular data including Microsoft Excel and Access files.
After creating a repository, there are several ways to commit a CSV file to GitHub:
For more information, see the CSV rendering help article.
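As a minimal sketch of generating a CSV from tabular data, the Python snippet below serializes a few records with the standard library's csv module. The records, field names, and the records_to_csv helper are hypothetical and only illustrate the shape of a file that would render as a table once committed.

```python
import csv
import io

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"agency": "Parks", "year": 2023, "permits_issued": 120},
    {"agency": "Transit", "year": 2023, "permits_issued": 87},
]


def records_to_csv(rows):
    """Serialize a list of dicts to CSV text, with a header row first."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(rows[0].keys()),  # column order follows the first record
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = records_to_csv(records)
print(csv_text)
```

Saving csv_text to a file with a .csv extension (for example, an assumed permits.csv) and committing it to a repository is enough for the table view described above; the header row supplies the column headings.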
Any .geojson file (an open, web-based geodata format) in a GitHub repository will be automatically rendered as an interactive, browsable map, annotated with your geodata. Most common geoformats (e.g., KML, ESRI Shapefile) can be converted to GeoJSON using freely available tools. Once committed, GeoJSON files can be embedded on any webpage that supports JavaScript. GeoJSON support is also available for Gists.
For more information, see the help article on mapping GeoJSON files on GitHub.
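To make the format concrete, here is a small Python sketch that builds a GeoJSON FeatureCollection containing one point. The site name, coordinates, and the make_point_feature helper are assumptions chosen for illustration. Note that GeoJSON positions are written as longitude first, then latitude.

```python
import json


def make_point_feature(name, longitude, latitude):
    """Build a GeoJSON Feature for a single point (hypothetical helper)."""
    return {
        "type": "Feature",
        # GeoJSON positions are ordered [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [longitude, latitude]},
        "properties": {"name": name},
    }


feature_collection = {
    "type": "FeatureCollection",
    "features": [make_point_feature("Example site", -77.0365, 38.8977)],
}

geojson_text = json.dumps(feature_collection, indent=2)
print(geojson_text)
```

Writing geojson_text to a file with a .geojson extension and committing it should produce the map view described above, with the name property attached to the plotted point.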
Another aspect of open data, beyond just the release of raw datasets, is opening up government data analysis. A common analysis format is the web-based, language-agnostic Jupyter Notebook, which commonly has the .ipynb file extension. When notebooks are pushed to a GitHub repository, their content is rendered as static HTML. This includes all of your code, the results, graphs (except for interactive JavaScript plots), and annotations.
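A notebook file is itself JSON text, which is why it versions well in a repository. The Python sketch below assumes a minimal version 4 notebook structure with one annotation cell and one code cell. The cell contents and the summarize_cells helper are hypothetical, and real notebooks usually carry more metadata and stored outputs.

```python
import json

# A minimal notebook, assuming the version 4 layout; contents are illustrative.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Permit counts by agency"],
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print(120 + 87)"],
        },
    ],
}


def summarize_cells(nb):
    """Return (cell_type, source_text) pairs for each cell in a notebook dict."""
    return [(cell["cell_type"], "".join(cell["source"])) for cell in nb["cells"]]


notebook_text = json.dumps(notebook, indent=1)
summary = summarize_cells(json.loads(notebook_text))
print(summary)
```

Saving notebook_text with an .ipynb extension gives a file that notebook tools can open; the markdown cell corresponds to the annotations and the code cell to the code mentioned above.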
Any text-based format can make a great candidate for collaboration on GitHub, even if it is not natively rendered. For some examples of other open data efforts, see the @unitedstates repository.