Persistent Identifiers for Scientific Data Provenance

In this week’s ebiquity meeting (10:00am EDT Wed 2/25, ITE 325), Curt Tilmes will talk on “Persistent Identifiers for Earth Science Provenance”.

Historically, published scientific research could include a description of an experiment that an independent party could use to reproduce the experiment with the same results, confirming the research. Modern research in the field of earth science often depends on terabytes of data captured from remote sensing instruments and on complex computer algorithms that undergo numerous changes over the years. A single result could be the product of the work of hundreds of individuals over decades. Representing the measurements, algorithms and all the other artifacts of experimentation leading to that result becomes a daunting problem. A key to handling this representation is a good scheme for persistent identifiers.

Persistent identifiers seem like a simple problem. Just make a good URL and don’t change it [1]. This sounds good in theory, but is difficult to maintain forever. Many other schemes have been proposed to attack various aspects of the problem of identification, with various advantages and disadvantages. I will introduce this topic and briefly describe some of the concerns with using identifiers specifically in the context described above, and some of the characteristics of various identifier schemes.

The presentation will be streamed live via ustream.tv.

References and some identifier schemes

[1] Cool URIs Don’t Change
[2] Naming and Addressing: URIs, URLs, …
[3] Object Identifier (OID)
[4] The Digital Object Identifier (DOI) System
[5] Persistent Uniform Resource Locator
[6] A Universally Unique IDentifier (UUID) URN Namespace
[7] XRI (Extensible Resource Identifier)
