Talk:OpenDataPrinciples

One sentence formulation

One sentence formulation? Advantage: uses the verbs related to the sequence of the principles explicitly.

"An entity satisfies the Open Government Data Principles if it makes complete primary public data available in a timely fashion, retrievable in an accessible, machine processable and non-discriminatory form, and usable license-free."

Suggestions

The page is locked, so here are some suggestions:

Can we change /data is/data are/? [carl, passing on suggestions from Randy Bush]

Would "finest" be better than "highest" degree of granularity? [ditto]

- Fixed by JT. (On the question of "data is/are", I think there is no clearly better choice: as far as I know, usage varies from one linguistic community to another.)

On preservation and long-term access

1. Preservation will not just happen. Digital preservation in particular takes planning and resources. I worry that, if preservation is not addressed explicitly, it will not be addressed adequately.

2. Requiring "access" alone -- even open, complete, free access -- is not enough. Without planning and funding for long-term access and preservation, access today can turn into inadvertent loss tomorrow.

3. Relying on the government to provide the only means of long-term preservation and access will work sometimes but fail when it is most critical. If the government is the sole provider and it (intentionally or unintentionally) amends, alters, loses, abridges, or deletes content, the content is lost for everyone forever. The information that is most embarrassing, most valuable, and most useful to citizens in holding government accountable will be the most vulnerable to intentional control, alteration, and loss.

4. Relying on the government to provide the sole means of access endangers privacy, as it allows governments to record and track use of government information by individuals.

5. We can predict, based on what governments have done in the past, what will happen if we allow or encourage the government to recover costs for access (even the marginal cost of distribution). There will be two-tier access for users: some will be able to afford access and others will not. There will be two-tier access for content: 'popular' content will be free or less expensive, but there will be charges for less-popular or less-used content. There will be two-tier access to functionality: one user may be able to get one page of a hearing for free, but there will be a charge for citizens' groups, libraries, and others who wish to get mass content (e.g., all hearings for a congress). There will be two-tier access to formats, as well: "viewable" copies of content will be free, but there will be a charge for "actionable" (e.g., XML) versions of the same content. The government will rely on private sector vendors to provide access through outsourcing and will claim that availability of content through private vendors meets the requirements of 'access.' In an attempt to recover costs, governments will license access to data and, in doing so, impose licensing restrictions on redistribution and use, and will apply technological locks (i.e., DRM) to enforce license restrictions.

One possible component of a solution to the above problems is to require that the government make available en masse and distribute (without charge, licensing restrictions, or DRM) all government information to libraries, archives, and other memory organizations. The existing Federal Depository Library Program (FDLP), which is defined by U.S. Code Title 44 and administered by the Government Printing Office (GPO), provides a starting place for such a distribution system. The GPO has, however, been arrogating to itself the role once given to distributed depository libraries, and most depository libraries have been reluctant to ask for the responsibility of accepting deposit of digital government files. It will probably be necessary to write into plans for preservation the explicit role of government deposit and the role of depository libraries to accept and preserve that information. With lots of copies in lots of institutions, free of locks and restrictions on use, it will be harder to lose, destroy, or control access to government information. With multiple partners preserving and providing access to the information, there will be multiple budgets, multiple constituencies, and multiple technical preservation solutions.

Another possible solution: The Rabbit Defense... Make the data available to everyone and, with the rapidly dropping cost of storage, many organizations will archive it. Publish hashes of particular datasets, on paper. Hashes are small, and the data is cross-validating. (Generating the hash from the data is trivial, so anyone holding a copy can check it against the published hash, and a bad hash would be noticed.)
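
To make the hash idea concrete, here is a rough sketch of how an archive might check its copy of a dataset against a hash printed on paper. The choice of SHA-256 and the function names (dataset_digest, matches_published) are my own assumptions for illustration; nothing above says which hash function should be used.

```python
import hashlib
import io


def dataset_digest(stream, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a binary stream, read in chunks."""
    h = hashlib.sha256()
    # Read fixed-size chunks so that large datasets need not fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


def matches_published(stream, published_hex):
    """Compare a local copy against a hash transcribed from print.

    Whitespace and letter case are ignored, since a printed hash is
    likely to be typed in by hand in groups of characters.
    """
    normalized = "".join(published_hex.split()).lower()
    return dataset_digest(stream) == normalized


if __name__ == "__main__":
    # Stand-in for a real dataset file; in practice use open(path, "rb").
    copy = io.BytesIO(b"abc")
    printed = "BA7816BF 8F01CFEA 414140DE 5DAE2223 B00361A3 96177A9C B410FF61 F20015AD"
    print(matches_published(copy, printed))
```

Any two archives that get the same digest hold byte-identical copies, which is the cross-validating property mentioned above. A mismatch only shows that a copy differs from what was published; it does not say which copy is the altered one.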