Announcement: The vocabulary management mess needs your attention.
In 2013 I wrote a couple of posts critical of some updating practices LC was using in its id.loc.gov services. The first post focused on the relators vocabulary, and how change was handled within that vocabulary.
The follow-up post dealt with the unacknowledged gaps in the LCSH service on id.loc.gov.
So, here we are almost three years later, and not much has changed. I went back through the examples in pt. 1, and the issues pointed out there are still exactly as noted three years ago. For those three years there have been many discussions about linked open data, but the vocabulary infrastructure needed to support LOD is still largely not ready for prime time.
And it’s not as if nobody’s been thinking about solutions. We wrote a paper about how versioning management ought to work in vocabulary services, but it seems to have been overlooked by even the large established services. It’s hard to avoid the conclusion that there just hasn’t been much recognition of the problem we were trying to solve. Trust me, it’s a real problem, and a very big one at that. It’s not just local, or attached to one institution–we’re talking about an international problem, one that could delay the uptake of linked data far longer than we’d like.
For users of vocabularies, the absence of vocabulary services (aside from simple lookup and basic file download) is a large impediment to the actual use of LOD. How does a creator of bibliographic data, whom we’ve been encouraging to use vocabularies, actually use those vocabularies to manage their overall data needs over time? By downloading files and examining diffs every week (month, quarter, year)? Remember when we had to cruise websites to find out when a new software update was available? We’re at that stage right now in vocabulary management, and we’re not making progress towards a service environment that will actually support the use of vocabularies in data.
To think about ‘how big’ the problem is, consider how many vocabularies occur regularly in instance data: ISBD, FRBRer, RDA, schema.org, DC Terms, etc. When terms in those vocabularies change, how do the managers of that instance data know what has changed? Should they be expected to just leave the old data as is, or send it to the cleaners every year or so? Proper vocabulary management practices can be a big part of the answer, and specifically of the machine-assisted answer.
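To make the download-and-diff routine concrete, here is a minimal sketch in Python of what a data manager is left to do by hand today: compare two downloaded snapshots of a vocabulary and list what was added, removed, or relabeled. Everything in it is an assumption for illustration: the `example.org` URIs, the labels, and the `diff_snapshots` helper are hypothetical, and each snapshot is assumed to have already been parsed into a simple mapping of term URI to label.

```python
from dataclasses import dataclass, field


@dataclass
class VocabularyDiff:
    """Differences between two snapshots of the same vocabulary."""
    added: list = field(default_factory=list)    # URIs only in the new snapshot
    removed: list = field(default_factory=list)  # URIs only in the old snapshot
    changed: dict = field(default_factory=dict)  # URI -> (old label, new label)


def diff_snapshots(old: dict, new: dict) -> VocabularyDiff:
    """Compare two {term URI: label} mappings (hypothetical helper)."""
    old_uris, new_uris = set(old), set(new)
    return VocabularyDiff(
        added=sorted(new_uris - old_uris),
        removed=sorted(old_uris - new_uris),
        changed={
            uri: (old[uri], new[uri])
            for uri in sorted(old_uris & new_uris)
            if old[uri] != new[uri]
        },
    )


# Hypothetical snapshots; the URIs and labels are made up for illustration.
old_snapshot = {
    "http://example.org/vocab/aut": "Author",
    "http://example.org/vocab/clb": "Collaborator",
    "http://example.org/vocab/edt": "Editor",
}
new_snapshot = {
    "http://example.org/vocab/aut": "Author",
    "http://example.org/vocab/edt": "Editor of text",
    "http://example.org/vocab/ill": "Illustrator",
}

report = diff_snapshots(old_snapshot, new_snapshot)
print("added:", report.added)
print("removed:", report.removed)
print("changed:", report.changed)
```

Note what a diff like this cannot tell you: whether a removed term was deprecated, replaced by another term, or simply dropped, and when or why a label changed. That is the information a vocabulary service with real change management would need to supply.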
So, what’s to be done?
Just about 10 years ago, LC initiated a Working Group on the Future of Bibliographic Control, complete with blue-ribbon membership and a broad remit to look around and suggest a path for re-imagining what they and other libraries were facing as they worked towards a different future. Full-day hearings were held in three locations across the US, and these events drew so much interest that the streaming capabilities set up for the sessions were overwhelmed. I testified at one of those hearings, and (I’m sure you’re surprised) spoke about the value of vocabularies in this brave new world.
But the amazing thing was the level of interest and engagement of the library community in the issues discussed by the WG. I’m not sure I’ve ever seen anything like it, before or since. For a while there, every time I was asked to present at a meeting, the WG report was the desired topic. Literally everyone was talking about it–the community clearly recognized the importance of the effort and wanted to be part of it.
We definitely need something like that again: a place to bring together the community and its experts, to state the problems and brainstorm the solutions. Let’s call it a ‘Library Vocabulary Summit’, for the moment, and roll the possibilities around in our heads. We’d need funding, leadership, and marketing to make it happen. Let’s ALL talk! (Preferably around a large table, face to face, with a relevant agenda.)