Talk:OpenDataPrinciples/machine processable

From OpenGovData

Each record in the data should include an identifier. This identifier should be persistent across revisions to the data set so that external references to individual records can follow updates. The identifier can, for instance, be a globally unique URI following Semantic Web best practices.
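As a rough illustration of why persistence matters, here is a minimal Python sketch. The two revisions, the field names, and the example.org URIs are all made up for the example; the point is only that an external reference keyed on the identifier still resolves after the data set is reordered, corrected, and extended.

```python
# Hypothetical records from two revisions of the same data set.
# The "id" URIs and field names are invented for illustration.
revision_1 = [
    {"id": "http://example.org/record/1001", "agency": "Dept. of Exampl", "amount": 250},
    {"id": "http://example.org/record/1002", "agency": "Office of Samples", "amount": 400},
]
revision_2 = [
    # Rows are reordered, one is corrected, and one is new; the ids are unchanged.
    {"id": "http://example.org/record/1002", "agency": "Office of Samples", "amount": 450},
    {"id": "http://example.org/record/1001", "agency": "Dept. of Example", "amount": 250},
    {"id": "http://example.org/record/1003", "agency": "Bureau of Tests", "amount": 75},
]


def index_by_id(records):
    """Map each record's persistent identifier to the record itself."""
    return {record["id"]: record for record in records}


def changed_fields(old, new):
    """Return {field: (old_value, new_value)} for every field that differs."""
    fields = sorted(set(old) | set(new))
    return {f: (old.get(f), new.get(f)) for f in fields if old.get(f) != new.get(f)}


old_index = index_by_id(revision_1)
new_index = index_by_id(revision_2)

# An external site that stored only this identifier can still find the record.
external_reference = "http://example.org/record/1001"
updates = changed_fields(old_index[external_reference], new_index[external_reference])
added_ids = sorted(set(new_index) - set(old_index))

print(updates)
print(added_ids)
```

If the identifier were a row number instead, the reordering in the second revision would silently point the external reference at a different record.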

The data format should be documented so that those familiar with the domain of the data set can understand it. All columns, tags, and abbreviations should be described. However, an XML schema or the like is not necessary.
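One lightweight way to meet this, short of a formal schema, is a plain data dictionary. The sketch below is only an assumption about how that might look: the column names, descriptions, and the `undocumented_columns` helper are hypothetical. It checks that every column in a file header has a non-empty description.

```python
# Hypothetical data dictionary: column name -> plain-language description.
data_dictionary = {
    "id": "Persistent identifier for the record",
    "agency": "Name of the agency that reported the record",
    "amt_usd": "Amount in US dollars (amt = amount)",
}


def undocumented_columns(header, dictionary):
    """Return the columns in `header` that have no non-empty description."""
    return [column for column in header if not dictionary.get(column, "").strip()]


# Hypothetical header row from a published file; "fy" was never described.
header = ["id", "agency", "amt_usd", "fy"]
missing = undocumented_columns(header, data_dictionary)
print(missing)
```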

P Language Rule

You know you have a truly open format if you can build a parser for it in Perl, Python, or PHP in an afternoon. That parser should be able to crawl through the data set and dump the results into an SQL database. That doesn't necessarily mean the data is best handled with an SQL database (although most of this material will fall into that category), just that it can be easily imported into one.
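To give a sense of scale for the rule, here is a minimal Python sketch that parses a small CSV sample and loads it into an in-memory SQLite database using only the standard library. CSV is just one example of an open format, and the sample data, table name, and `load_csv_into_sqlite` function are assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample; a real data set would be read from a file or URL.
SAMPLE = """id,agency,amount
http://example.org/record/1001,Dept. of Example,250
http://example.org/record/1002,Office of Samples,400
"""


def load_csv_into_sqlite(text, connection, table="records"):
    """Parse CSV text and insert every row into `table`; return the row count.

    Column names come straight from the header row and are stored as TEXT.
    This sketch assumes the header is trusted, since the names are placed
    directly into the SQL statement.
    """
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames
    column_sql = ", ".join(f'"{name}" TEXT' for name in columns)
    connection.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({column_sql})')
    placeholders = ", ".join("?" for _ in columns)
    rows = [tuple(row[name] for name in columns) for row in reader]
    connection.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    connection.commit()
    return len(rows)


connection = sqlite3.connect(":memory:")
loaded = load_csv_into_sqlite(SAMPLE, connection)

# Once imported, ordinary SQL works on the data.
total = connection.execute(
    "SELECT SUM(CAST(amount AS INTEGER)) FROM records"
).fetchone()[0]
print(loaded, total)
```

A format that needs much more than this (proprietary libraries, undocumented binary layouts, screen scraping) would fail the afternoon test.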
