The Iceberg of Open Data (v0.2)


The motivation for this diagram is that the government open data movement has largely failed to live up to its early hype, though there’s still tremendous potential. That may sound harsh, but if we’re honest with ourselves, the explosion of apps promised by the original Obama “open by default” executive order didn’t happen.

There was a lot of consensus at the first CKAN summit in the US earlier this year, though, that we’ve gotten through the “trough of disillusionment” in the standard hype cycle and are on to the plateau of productivity. There are lots of less sexy, more subtle open data wins.

This is particularly true at the local government level, which requires collaboration across municipalities to provide the technical capacity (only 0.64% of cities have an open data portal) to deploy the data infrastructure required to make this work, rather than adding another burden on civil servants.

There are lots of exciting projects to securely share the raw underlying data, which often contains PII (UCLA Policy Lab, CUSP data facility, etc.). These provide the infrastructure for more meaningful open data as well as lots of research benefits – like ARGO’s “California model” for data collaboratives.

Those computational social science research benefits aren’t as sexy as a shiny new app, but they have huge long-term benefits in achieving delivery-driven government. Building that deep data infrastructure can also help streamline the existing reporting required of munis. For example:

Those are some early thoughts, and we’d greatly appreciate feedback, as we’d like to sharpen this notion of the iceberg of open data to be most useful to the community. Thanks much!




This is awesome!

How much was the human component of the process of opening public data discussed? I feel like the stumbling block often comes not in the policy or even in the technical obstacles, but in the mindset shift needed to compel departments to comply, and to comply in a manner that is consistent, standardized, and streamlined for efficiency.

Even when it’s a top-down mandate, it’s often hard to get departments and service units on board, because it does require a tad extra work at first, even though in the long run it can ultimately lead to productivity gains.

I’d also be interested in seeing any data out there on how open data initiatives have affected open records request volume. At least in my experience, handling open records requests can be a very time-consuming and frustrating process for the individuals in the department responsible for handling them. Just in Savannah, they have two (2) full-time staffers whose sole job it is to handle open records requests, and they’ve expressed a lot of frustration to me at the volume of requests they have to manage.

Part of that, too, involves shared responsibility and breaking down departmental silos. It’s every department’s job to share its public data. We need to make it easier for open data portals not only to be deployed, but also to be managed regularly by a layperson without a technical background. A good project that attempts this is DKAN.




While I love CKAN, it has massive barriers to entry in terms of deployment. Another interesting and lightweight project in this arena we use as a Brigade is JKAN by @timwis of City of Philadelphia.

While I’m leery of any moves to store public data on the server of a privately-owned company, I will also say that, which is a B-Company by some excellent folks in Austin, does an excellent job at facilitating data sharing, with an array of analysis options and connectors allowing users to contextualize the raw dataset.



@carlvlewis see here for a great analysis from Sunlight on FOIA requests. Also would encourage you to take a poke around which is an excellent model that addresses many of the curation and culture shift issues that you allude to. Thanks for the thoughtful reply!



Posting feedback from other argonauts here for a COP.

I think the iceberg could benefit from a little more context to be useful. Some context that comes to mind:

  1. An example (real or hypothetical) for each “level” to make these ideas more concrete

  2. Some sort of logic for the ordering. Does the ordering mean anything? Is data sharing for social science research inherently more secretive / deeper than collaboration across governments?

Also, Varun suggests digging deeper into Bianca Whiley’s work.



Sharing some relevant links on the need to dig deeper to improve open data initiatives:

There is also the highly relevant data spectrum from ODI:



We had a dialogue about unlocking the deeper potential of open data on International Open Data Day 3/2/19! See here for notes:



@patwater31 and @carlvlewis - thanks for the good exchange. A few things to add:
a) Within public administration research and practice, there is a big push on benchmarking, which allows for systematic comparison of jurisdictions on services. Example from my colleagues here at UNC-Chapel Hill for many NC cities: @patwater31: this seems to align with your “mid-low” part of the iceberg, “Collaboration across local governments to enable meaningful comparisons.” And here is the list of city services that are part of the project:
b) Where would you place the Western PA Regional Data Center on your iceberg diagram? The front page seems to aim for layperson education and engagement. It looks to have data mainly from government (Pittsburgh and its main county), but seems to have a few nonprofits contributing data.
c) Another group to follow is the NNIP network.
Am curious how all this relates to the above/below water parts of your Open Data thinking.
Thanks, John Stephens, Code for Durham