Government best practices

GitHub for open data

The principles and workflows that drive open source software can also be applied to government open data. Open sourcing data on GitHub provides developers with a forum to surface feedback, exposes the data’s change history, and can even allow for crowd-sourced open data efforts.

Tabular data

GitHub.com supports rendering tabular data in the form of .csv (comma-separated) and .tsv (tab-separated) files. When viewed, any .csv or .tsv file committed to a GitHub repository will automatically render as an interactive table, complete with headers and row numbering. CSV files can be exported from most tabular data sources, including Microsoft Excel and Access.
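When the data lives in a script or database rather than a spreadsheet, a CSV can also be produced programmatically. The following is a minimal sketch using Python's standard csv module; the function name rows_to_csv, the column names, and the sample rows are hypothetical and only illustrate the shape of a file that GitHub could render as a table.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize a header row and data rows into CSV text."""
    # newline="" leaves line endings to the csv writer, as the csv docs recommend.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)   # first row becomes the table header
    writer.writerows(rows)    # remaining rows are the table body
    return buffer.getvalue()


# Hypothetical sample data, for illustration only.
header = ["agency", "fiscal_year", "budget_usd"]
rows = [
    ["Parks Department", 2015, 1250000],
    ["Public Library", 2015, 830000],
]

csv_text = rows_to_csv(header, rows)
print(csv_text)
```

Saving the returned text to a file with a .csv extension and committing it is then enough for the table view described above. Fields that contain commas are quoted automatically by the writer.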

After creating a repository, there are several ways to commit a CSV file to GitHub:

  1. Copy and paste the content of the CSV into a new file via the GitHub.com web interface.
  2. Drag the file to the repository folder on your computer and use the GitHub for Windows or GitHub for Mac desktop clients to sync the repository.
  3. Use the Git command line interface to commit the file and push to GitHub.com.

For more information, see the CSV rendering help article.

Geodata

Any .geojson file (an open, web-based geodata format) in a GitHub repository will be automatically rendered as an interactive, browsable map, annotated with your geodata. Most common geodata formats (e.g., KML, ESRI Shapefile) can be converted to GeoJSON using freely available tools. Once committed, GeoJSON files can be embedded on any webpage that supports JavaScript. GeoJSON support is also available for Gists.
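For readers unfamiliar with the format, a GeoJSON file is ordinary JSON with a fixed set of keys. The sketch below builds a one-point FeatureCollection with Python's standard json module; the helper names, the place name, and the coordinates are hypothetical. Note that GeoJSON lists longitude before latitude.

```python
import json


def make_point_feature(name, longitude, latitude):
    """Build a GeoJSON Feature holding a single Point."""
    return {
        "type": "Feature",
        # GeoJSON coordinate order is [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [longitude, latitude]},
        "properties": {"name": name},
    }


def make_feature_collection(features):
    """Wrap features in a FeatureCollection, the usual top-level object."""
    return {"type": "FeatureCollection", "features": list(features)}


# Hypothetical location, for illustration only.
collection = make_feature_collection(
    [make_point_feature("Example City Hall", -77.0365, 38.8977)]
)

geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

Writing geojson_text to a file with a .geojson extension and committing it should produce a map with a single annotated marker.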

For more information, see the mapping GeoJSON files on GitHub help article.

Data analysis

Another aspect of open data, beyond just the release of raw datasets, is opening up government data analysis. A common analysis format is the web-based, language-agnostic Jupyter Notebook, which commonly has the .ipynb file extension. When a notebook is pushed to a GitHub repository, all of its content is rendered as static HTML. This includes your code, the results, graphs (except for interactive JavaScript plots), and annotations.
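A notebook file is itself a text-based JSON document, which is part of why it versions well in a repository. The sketch below, using only Python's standard library, counts the cell types in a notebook. The sample notebook is a hypothetical, simplified stand-in that assumes the version 4 notebook layout (a top-level "cells" list whose entries carry a "cell_type"); real notebooks contain more metadata.

```python
import json
from collections import Counter


def count_cell_types(notebook):
    """Count cells by type in a notebook that has been parsed from JSON."""
    cells = notebook.get("cells", [])
    return Counter(cell.get("cell_type", "unknown") for cell in cells)


# Hypothetical, simplified notebook content, for illustration only.
sample_notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Budget analysis"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["total = 1250000 + 830000"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["print(total)"]},
    ],
}

# Round-trip through text to mimic reading an .ipynb file from disk.
notebook_text = json.dumps(sample_notebook)
counts = count_cell_types(json.loads(notebook_text))
print(dict(counts))
```

The same function could be pointed at a real file by replacing the round-trip with json.load on an open .ipynb file.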

Other open data formats

Any text-based format can make a great candidate for collaboration on GitHub, even if it is not natively rendered. For some examples of other open data efforts, see the @unitedstates repository.