This past Friday I dug out my go-to-meeting clothes and made one of my occasional forays down one hill and up the other to my former place of work, Cornell’s Olin Library. The occasion was the monthly meeting of the CUL Metadata Working Group, which I try to show up for when I’m not traveling. This month’s speaker was Sarah Shreeves, from the University of Illinois, Urbana-Champaign (UIUC). I’ve known Sarah for some years and have worked on a project or two with her, so this seemed like a perfect opportunity to dress up and get out of my cozy office for a morning to see what she was up to. It helps that the weather is improving, this being pre-Spring in Ithaca (still largely gray skies, but mostly without the white underlay).

Sarah’s presentation is described on a WG page devoted to her visit, though her slides aren’t up yet as I write this. She talked primarily about BibApp, a project UIUC is working on with the University of Wisconsin. The project’s aims center on research and collaboration on college campuses, looking for ways to bring together experts, their output, and information about them. Sarah’s role at UIUC includes their institutional repository (IR); the point person at UW is Dorothea Salo, a self-described “repository rat.” The presentation made quite clear how BibApp provides important support for IRs in general, as well as alternatives to the clunky interfaces most of them come with.

From my admittedly metadata-centric point of view, the most interesting things Sarah demonstrated were the backend work they do on author name disambiguation and on automating the determination of what can legally be deposited in the IR from the oeuvre of a particular author, based on the publisher of the item. Depending on the research community, these items may have metadata available from abstracting and indexing (A&I) providers, MARC metadata, or no metadata at all (book chapters and conference publications often fall in that last category). The project is developing a tool that can support the work of librarians or other trained staff in determining who an author with a particular surname and forename initial might be, using a combination of human and machine intelligence. Although it’s not authority work as we know it, once we get to a mashup phase in the new world of authorities, the data provided by this tool will be quite useful, based as it is on the occurrences in the publications themselves, as well as on linkages to additional information about the author (his/her institutional affiliation, email address, etc.). There’s already a somewhat primitive XML output, and even as it flashed by (not in the presentation, but as part of the discussion) I could see some possibilities.
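To make the disambiguation idea a little more concrete, here is a minimal sketch of the machine half of that work: grouping citations by surname and forename initial so that a person can review each group. This is not BibApp’s actual code, which I haven’t seen. The record layout, the function names, and the sample authors are all hypothetical, invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical citation records; the field names and people are invented.
SAMPLE_RECORDS = [
    {"author": "Smith, J.", "title": "Paper A", "affiliation": "Dept. of Chemistry"},
    {"author": "Smith, Jane", "title": "Paper B", "affiliation": "Dept. of Chemistry"},
    {"author": "Smith, Robert", "title": "Paper C", "affiliation": "Dept. of History"},
]


def name_key(author):
    """Reduce 'Surname, Forename(s)' to a (surname, first initial) pair."""
    surname, _, forenames = author.partition(",")
    forenames = forenames.strip()
    initial = forenames[0].lower() if forenames else ""
    return (surname.strip().lower(), initial)


def candidate_groups(records):
    """Group records whose authors share a surname and forename initial.

    Each group is only a set of candidates: a librarian still decides
    whether its members are really the same person.
    """
    groups = defaultdict(list)
    for record in records:
        groups[name_key(record["author"])].append(record)
    return dict(groups)
```

Here “Smith, J.” and “Smith, Jane” land in the same candidate group while “Smith, Robert” does not. The extra fields, such as affiliation, are the kind of evidence a reviewer would use to confirm or split a group.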

In this application, being able to match the publisher name in the metadata with data about publisher policies, to determine whether pre-prints, post-prints, author copies or all of the above can be legally deposited, saves a great deal of time for both librarians and faculty. As Sarah was talking about this, I was struck by how clear a case this made against the traditional library practice of transcribing publisher information. I suggested this to Sarah and she agreed that they really needed standard forms, not transcribed data, to make this sort of tool function well. RDA’s approach is still focused on the idea that transcription is the best way to assure uniformity, but in my opinion that approach provides only the illusion of uniformity. The number of LC Rule Interpretations for AACR2 on this point suggests that, in a world where human resources will be spread thinner than ever, it’s time to jettison transcription of important data elements. The alternative we need to implement is to treat publishers as corporate bodies and control their names as we do other names. This will certainly be technically possible in RDA, but it is not what the guidance instruction tells catalogers to do.
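A small sketch shows why transcribed publisher names get in the way of this kind of matching, and what a controlled form buys you. Everything here is an assumption for illustration: the publisher, the identifier scheme, the variant list, and the policy table are invented, and none of it reflects how BibApp or any real policy database is built.

```python
import re

# Hypothetical deposit policies, keyed by a controlled publisher identifier.
POLICIES = {
    "pub:example-press": {"preprint": True, "postprint": True, "publisher_pdf": False},
}

# Hypothetical authority file mapping normalized variant names to that identifier.
VARIANTS = {
    "example press": "pub:example-press",
    "example pr": "pub:example-press",
    "example press inc": "pub:example-press",
}


def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    name = re.sub(r"[^\w\s]", " ", name.lower())
    return " ".join(name.split())


def deposit_policy(transcribed_name):
    """Return the policy for a transcribed publisher name, or None.

    None means no controlled form was found, so a person has to look.
    """
    publisher_id = VARIANTS.get(normalize(transcribed_name))
    if publisher_id is None:
        return None
    return POLICIES[publisher_id]
```

With this in place, “Example Press, Inc.” and “Example Pr.” both resolve to the same policy, while an unlisted name such as “Unknown House” returns `None` and falls to a human. The variant table is exactly the name control that transcription alone never produces; without it, every new way of writing a publisher’s name is a failed match.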

Another really nifty feature Sarah demonstrated was the analysis of publication patterns of different groups of researchers, which the tool makes possible once the publications themselves are available in one place. The BibApp tools are being tested with a variety of research groups, some in the sciences and others in the humanities, highlighting important differences between the cultures of these research areas. I was impressed by how sensible the developers’ approach has been, and by how they appear to be working iteratively through their results while keeping their goals in mind. In a world where university administrators need reminding of the importance of libraries and the work they do, this sort of project stands out like a beacon, meeting a number of promotional and marketing goals through a very librarian-ish organizational approach, with some quick wins in the bargain.

That said, this is not the Ginsu knife that some may be looking for, and the developers are sensibly determined to keep it lightweight and focused. Nice work, Sarah, and a very effective presentation as well.

By Diane Hillmann, March 23, 2009, 8:50 am (UTC-5)
