Metadata standards is a huge topic and evaluation a difficult task, one I’ve been involved in for quite a while. So I was pretty excited when I saw the link for “DRAFT Principles for Evaluating Metadata Standards”, but after reading it? Not so much. If we’re talking about “principles” in the sense of ‘stating-the-obvious-as-a-first-step’, well, okay—but I’m still not very excited. I do note that the earlier version link uses the title ‘draft checklist’, and I certainly think that’s a bit more real than ‘draft principles’ for this effort. But even taken as a draft, the text manages to use lots of terms without defining them—not a good thing in an environment where semantics is so important. Let’s start with a review of the document itself, then maybe I can suggest some alternative paths forward.
First off, I have a problem with the preamble: “These principles are intended for use by libraries, archives and museum (LAM) communities for the development, maintenance, governance, selection, use and assessment of metadata standards. They apply to metadata structures (field lists, property definitions, etc.), but can also be used with content standards and value vocabularies”. Those tasks (“development, maintenance, governance, selection, use and assessment”) are pretty all-encompassing, yet the connection between those tasks and the overall “evaluation” is unclear. And, of course, without definitions, it’s difficult to understand how ‘evaluation’ relates to ‘assessment’ in this context. Are they the same thing?
Moving on to the second part, about what kinds of metadata standards might be evaluated, we have a very general term, ‘metadata structures’, with what look to be examples of such structures (field lists, property definitions, etc.). Some would argue (including me) that a field list is not a structure without a notion of connections between the fields; and although property definitions may be part of a ‘structure’ (as I understand it, at least), they are not a structure, per se. And what is meant by the term ‘content standards’, and how is that different from ‘metadata structures’? The term ‘value vocabularies’ goes by many names, and is not something that can go without a definition. I say this as an author/co-author of a lot of papers that use this term, and we always define it within the context of the paper for just that reason.
There are many more places in the text where fuzziness in terminology is a problem (maybe not a problem for a checklist, but certainly for principles). Some examples:
1. What is meant by ‘network’? There are many different kinds, and if you mean to refer to the Internet, for goodness’ sake say so. ‘Things’ rather than ‘strings’ is good, but it will take a while to make it happen in legacy data, which we’ll be dealing with for some time, most likely forever. Prospectively created data is a bit easier, but still not a cakewalk: if the ‘network’ is the global Internet, then “leveraging ‘by-reference’ models” presents yet-to-be-solved problems of network latency, caching, provenance, security, persistence, and, most importantly, stability. Metadata models for both properties and controlled values are an essential part of LAM systems and simply saying that metadata is “most efficient when connected with the broader network” doesn’t necessarily make it so.
2. ‘Open’ can mean many things. Are we talking about specific kinds of licenses, or the lack of a license? What kind of re-use are you talking about? Extension? Wholesale adoption with namespace substitution? How does semantic mapping fit into this? (In lieu of a definition, see the paper at (1) below.)
3. This principle seems to imply that “metadata creation” is the sole province of human practitioners and seriously muddies the meaning of the word creation by drawing a distinction between passive system-created metadata and human-created metadata. Metadata is metadata and standards apply regardless. What do you mean by ‘benefit user communities’? Whose communities? Please define what is meant by ‘value’ in this context. How would metadata practitioners ‘dictate the level of description provided based on the situation at hand’?
4. As an evaluative ‘principle’ this seems overly vague. How would you evaluate a metadata standard’s ability to ‘easily’ support ‘emerging’ research? What is meant by ‘exchange/access methods’ and what do they have to do with metadata standards for new kinds of research?
5. I agree totally with the sentence “Metadata standards are only as valuable and current as their communities of practice,” but the one following makes little sense to me. “ … metadata in LAM institutions have been very stable over the last 40 years …” Really? It could easily be argued that the reason for that perceived stability is the continual inability of implementers to “be a driving force for change” within a governance model that has at the same time been resistant to change. The DCMI Usage Board, MARBI, and the various boards advising the RDA Steering Committee all speak to the involvement of ‘implementers’. Yet there’s an implication in this ‘principle’ that stability is liable to no longer be the case and that implementers ‘driving’ will somehow make that inevitable lack of stability palatable. I would submit that stability of the standard should be the guiding principle rather than the democracy of its governance.
6. “Extensible, embeddable, and interoperable” sounds good, but each is more complex than this triad makes it seem. Interoperability in particular is something that we should all keep in mind but, although admirable as a goal, it rarely succeeds in practice because of the practical incompatibility of different models. DC, MARC21, BIBFRAME, RDA, and Schema.org are examples of this: despite their ‘modularity’ they generally can’t simply be used as ‘modules’ because of differences in the thinking behind the model and their respective audiences.
I would also argue that ‘lite style implementations’ make sense only if ‘lite’ means a dumbed-down core that can be mapped to by more detailed metadata. But stressing the ‘lite implementations’ as a specified part of an overall standard gives too much power to the creator of the standard, rather than the creator of the data. Instead we should encourage the use of application profiles, so that the particular choices and usages of the creating entity are well documented, and others can use the data in full or in part according to their needs. I predict that lossy data transfer will be less acceptable in reality than it is in the abstract, and would suggest that dumb data is more expensive over the longer term (and certainly doesn’t support ‘new research methods’ at all). “Incorporation into local systems” really can only be accomplished by building local systems that adhere to their own local metadata model and are able to map that model in/out to more global models. Extensible and embeddable are very different from interoperable in that context.
7. The last section, after the inarguable first sentence, describes what the DCMI ‘dumb-down’ principle defined nearly twenty years ago, and that strategy still makes sense in a lot of situations. But ‘graceful degradation’ and ‘supporting new and unexpected uses’ require smart data to start with. ‘Lite’ implementation choices (as in #6 above) preclude either of those options, IMO, and ‘adding value’ of any kind (much less by using ‘ontological inferencing’) is in no way easily achievable.
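To make the point about ‘dumb-down’ and lossy ‘lite’ data a bit more concrete, here is a minimal Python sketch. Everything in it is my own invention for illustration: the `local:` and `core:` property names, the mapping table, and the sample values come from no actual standard or application profile. It only shows the general shape of the idea: detailed properties can be mapped down to a broader core, but the distinctions they carried don’t survive the trip.

```python
# Hypothetical refinement map: each detailed local property names the
# broader core property it "dumbs down" to. All names are illustrative.
DUMB_DOWN = {
    "local:titleProper": "core:title",
    "local:translator": "core:contributor",
    "local:illustrator": "core:contributor",
    "local:dateOfPublication": "core:date",
    "local:dateOfCopyright": "core:date",
}


def dumb_down(record):
    """Map a detailed record onto the core: values are kept, distinctions are not."""
    core = {}
    for prop, values in record.items():
        target = DUMB_DOWN.get(prop)
        if target is None:
            continue  # no core equivalent, so the statement is dropped entirely
        core.setdefault(target, []).extend(values)
    return core


# Placeholder record in the (hypothetical) detailed local model.
detailed = {
    "local:titleProper": ["Example Title"],
    "local:translator": ["Person A"],
    "local:illustrator": ["Person B"],
    "local:dateOfPublication": ["2001"],
    "local:dateOfCopyright": ["2000"],
    "local:shelfMark": ["X-42"],
}

lite = dumb_down(detailed)
print(lite)
```

Going from `detailed` to `lite` is easy. Going back is not: once both people sit under `core:contributor` and both dates under `core:date`, nothing in the lite record says which was the translator or which was the copyright date, and the shelf mark is simply gone. That is the sense in which smart data can degrade gracefully while data that starts out ‘lite’ cannot be made smart again.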
I intend to be present at the session in Boston [9:00-10:00 Boston Convention and Exhibition Center, 107AB] and since I’ve asked most of my questions here I intend not to talk much. Let’s see how successful I can be at that!
It may well be that a document this short and generalized isn’t yet ready to be a useful tool for metadata practitioners (especially without definitions!). That doesn’t mean that the topics that it’s trying to address aren’t important, just that the comprehensive goals in the preamble are not yet being met in this document.
There are efforts going on in other arenas (the NISO Bibliographic Roadmap work, for instance) that should have an important impact on many of these issues, which suggests that it might be wise for the Committee to pause and take another look around. Maybe a good glossary would be an important step?
(1) Dunsire, Gordon, et al. “A Reconsideration of Mapping in a Semantic World”, paper presented at the International Conference on Dublin Core and Metadata Applications, The Hague, 2011. Available at: http://dcpapers.dublincore.org/pubs/article/view/3622/1848