I discussed the utility of the sub-property relationship in Getting to higher MARC branches, Netting more MARC fruit, and Adding MARC fruit to the cornucopia. Coincidentally, Bob DuCharme posted Simple federated queries with RDF which outlines the same technique and provides additional information on its use for resource discovery. Those posts are somewhat technical, and I tried to lighten up in my presentation Turtle dreaming at the recent Dublin Core Metadata Initiative (DCMI) seminar Five years on. This post is another attempt to demonstrate in a non-technical way (I hope) how useful and powerful the sub-property relationship can be.

A metadata attribute, like ‘title’, that is to be used for linked data in the Semantic Web is usually represented in Resource Description Framework (RDF) as a property. A property can be used as the predicate part of a triple: “Subject – predicate – object”, where ‘Subject’ is what the triple is about (e.g. a resource), ‘predicate’ is the aspect of the subject, and ‘object’ is the value of that aspect for the specified subject. For example:

“This resource – (has) title – ‘Using the sub-property ladder’”

is a single metadata statement in triple format. We can think of this as conforming to the triple template:

“Specified resource – (has) attribute – value”.

Note that prefixing the predicate with ‘has’ turns it into a verbal phrase and renders the statement in (near) natural language.

We can also make meta-metadata statements in triple format. These are ‘data about metadata’ rather than ‘data about data’, and are often referred to as ontological triples to distinguish them from data triples such as the example above. The triple template for one type of meta-metadata statement is:

“Specified RDF element – (has) relationship – Other specified RDF element”

Note that a relationship between metadata elements is also represented in RDF as a property. In particular, ‘sub-property’ is a pre-defined relationship between two RDF property elements, giving the ontological triple:

“Property 1 – (is) sub-property of – Property 2″

Furthermore, such relationships can embed semantic rules that can be processed automatically by software known as ‘semantic reasoners’ or just plain ‘reasoners’. The rule embedded in the sub-property relationship is: If “P1 – (is) sub-property of – P2″, then any data triple using P1 as its predicate can generate another data triple using P2 as its predicate, with the same subject and object. Let’s call this kind of ontological triple a mapping triple, because it effectively maps one property to another.

Suppose we have two attributes ‘title’ and ‘varying form of title’. I can create the mapping triple:

“‘varying form of title’ – (is) sub-property of – ‘title’”.

If we have a data triple:

“This resource – (has) varying form of title – ‘Pat presents cataloguing for beginners’”

then a reasoner will automatically generate the data triple:

“This resource – (has) title – ‘Pat presents cataloguing for beginners’”

In a similar way, we can create the mapping triple:

“‘title statement’ – (is) sub-property of – ‘title’”

and from the data triple:

“This resource – (has) title statement – ‘Cataloguing for beginners’”

generate:

“This resource – (has) title – ‘Cataloguing for beginners’”

So what? Further suppose that the ‘title’ attribute is from the DCMI metadata terms, and the ‘varying form of title’ and ‘title statement’ attributes are from the MARC21 tags 245 and 246. So a MARC21 record for the resource might contain the set of data triples:

“This resource – (has) 245 [title statement] – ‘Cataloguing for beginners’”
“This resource – (has) 246 [varying form of title] – ‘Pat presents cataloguing for beginners’”

A reasoner can generate the set of data triples:

“This resource – (has) [DC] title – ‘Cataloguing for beginners’”
“This resource – (has) [DC] title – ‘Pat presents cataloguing for beginners’”

In other words, we have generated a DC record from a MARC21 record. Or we have generated a title index for the MARC21 record. Or both.

Let’s add an RDA attribute and an ISBD attribute mapping to the mix:

“[RDA] ‘title proper’ – (is) sub-property of – [DC] ‘title’”
“[ISBD] ‘has title proper’ – (is) sub-property of – [DC] ‘title’”

The data triples:

“That resource – (has) [RDA] title proper – ‘Cataloguing for geeks’”
“Another resource – [ISBD] has title proper – ‘Does cataloguing have a future?’”

can generate the corresponding DC triples, and we end up with:

“This resource – (has) [DC] title – ‘Cataloguing for beginners’”
“That resource – (has) [DC] title – ‘Cataloguing for geeks’”
“Another resource – (has) [DC] title – ‘Does cataloguing have a future?”
“This resource – (has) [DC] title – ‘Pat presents cataloguing for beginners’”

So now we have a title index to metadata from multiple heterogeneous sources. And the beginnings of a set of records in Dublin Core format.

Note that the attribute which is the sub-property must be entirely narrower in its semantics than the related super-property. If we create the mapping triple:

“‘title’ – (is) sub-property of – ‘varying form of title’”

then we generate the data triple:

“This resource - (has) 246 [varying form of title] – ‘Cataloguing for beginners’”

which is incorrect.

As a result, a data triple generated by a sub-property mapping triple is usually ‘dumber’ than the original data triple; detail is lost because the generated triple uses an attribute which is broader in meaning than the original. This ‘dumbing-up’ is necessary to produce interoperable metadata from different schemas – but data is not permanently lost because the original triple is still available for use in other applications. Needless to say, data triples created with broad attributes cannot be “smartened-down”, at least on their own.

The sub-property relationship can be chained. We can create a new attribute property, MARC21 ‘title’, which could be used in an application for making a title index to MARC21 records, as already mentioned. This new attribute is a super-property of all the MARC21 title-type attributes, and is also a sub-property of the DC ‘title’ attribute:

“[MARC21] ‘title statement’ – (is) sub-property of – [MARC21] ‘title’”
“[MARC21] ‘varying form of title’ – (is) sub-property of – [MARC21] ‘title’”
“[MARC21] ‘title’ – (is) sub-property of – [DC] ‘title’”

Doing this does not affect the previous mapping triples relating each MARC21 title-type attribute directly to the DC ‘title’ attribute, although it  makes them redundant because this new set of mapping triples generates exactly the same data triples at the DC level from the MARC21 originals.

Different application can therefore re-use and, if necessary, augment the sub-property chains for each of the high-level core attributes found in most bibliographic metadata schemas, such as title, author/creator/agent, subject, target audience, etc. The chains form a net(work) of mappings, or map, which can automatically dumb-up triples from any level of semantic granularity to any higher level.

We should only have to publish such maps or part-maps once, openly so that anyone can use them and add to them. If the professional communities develop the maps first, much effort will be saved and much authority imparted. This requires collaboration and action real soon now – the ISBD Review Group and the Joint Steering Committee for Development of RDA have started with the development of a mapping between the ISBD and RDA element sets.

These maps should remain valid forever, so the effort is worth expending. The original data triples use the original properties based on the schema attributes at the time and they will be valid “for their time”, in the same way that many catalogues are likely to contain records created under the Anglo-American Cataloguing Rules, with its ‘general material designation’ attribute long after the successor standard RDA: resource description and access has been adopted with its ‘content type’ and ‘carrier type’ attributes.

And mappings from the MARC21 element sets will show, we hope, that it may not be necessary to convert the entire contents of every MARC21 record as a result of the Bibliographic Framework Transition Initiative!

But the professional communities lack a framework to help them collaborate as a super-community. A network of mappings is more (socially) efficient than an aggregation of one-to-one mappings between pairs of schemas. We need (name)spaces to add intermediary attribute properties and publish the mappings; we need protocols for managing semantic change as schemas evolve; we need lightweight protocols for authorizing mappings; we need systems for ensuring the long-term preservation of RDF element sets and mapping triples.

Be Sociable, Share!
By Gordon Dunsire, May 12, 2012, 1:17 pm (UTC-5)

Add your own comment or set a trackback

Currently 1 comment

  1. Pingback by 图书馆从传统数据观走向关联数据及语义网:五周年 » 编目精灵III

    [...] Metadata Matters: Using the sub-property ladder / by Gordon Dunsire (May 13, 2012) 该博客由Diane Hillmann和Gordon [...]

Add your own comment



Follow comments according to this article through a RSS 2.0 feed