Monday, April 4, 2011

News Ontology - Large Pieces, Loosely Joined

I chair the IPTC's SemWeb group. This is the second in a short series of posts about IPTC's work on using Semantic Web technologies for news. (The first post discussed where we stand with Linked Data for News.)

News Ontology - or, actually, ontologies
I am more-or-less aware of several projects that are underway to build ontologies relating to news. These various efforts seem more-or-less aware of each other and are, in some cases, working with each other directly. Specifically, I believe that PA, AFP, BBC, EBU, W3C and IPTC are each crafting ontologies - and there are probably more besides.

Wait - Onto What?
The word "ontology" often seems to scare people. But it just means a formal representation of knowledge in a particular domain, expressed using Semantic Web technologies (specifically OWL - the Web Ontology Language).
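To make "formal representation" a little more concrete, here is a minimal sketch in plain Python rather than in real OWL tooling. It models a tiny ontology as subject-predicate-object triples and shows one thing the formality buys you: a story typed as a sports report can be recognised as a news item without anyone saying so explicitly. Every "ex:" name is invented for illustration and does not come from any IPTC, BBC or W3C ontology; only the rdf:, rdfs: and owl: terms are real vocabulary.

```python
# Hypothetical mini-ontology expressed as (subject, predicate, object) triples.
# The "ex:" names are made up for this sketch; rdf:type, rdfs:subClassOf,
# rdfs:domain, rdfs:range, owl:Class and owl:ObjectProperty are real terms.
ONTOLOGY = {
    ("ex:NewsItem", "rdf:type", "owl:Class"),
    ("ex:SportsReport", "rdf:type", "owl:Class"),
    ("ex:SportsReport", "rdfs:subClassOf", "ex:NewsItem"),
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:mentions", "rdf:type", "owl:ObjectProperty"),
    ("ex:mentions", "rdfs:domain", "ex:NewsItem"),
    ("ex:mentions", "rdfs:range", "ex:Person"),
}

# Hypothetical instance data described using that ontology.
DATA = {
    ("ex:story42", "rdf:type", "ex:SportsReport"),
    ("ex:story42", "ex:mentions", "ex:someAthlete"),
}


def superclasses(cls, triples):
    """Return cls plus every class reachable from it via rdfs:subClassOf."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def types_of(resource, triples):
    """Asserted types of a resource, expanded through the class hierarchy."""
    result = set()
    for s, p, o in triples:
        if s == resource and p == "rdf:type":
            result |= superclasses(o, triples)
    return result


if __name__ == "__main__":
    # story42 is only asserted to be a SportsReport, but the subclass
    # statement lets us conclude that it is also a NewsItem.
    print(sorted(types_of("ex:story42", ONTOLOGY | DATA)))
```

A real OWL ontology would be written in a standard serialization and processed by a reasoner; the point of the sketch is only that the classes, properties and their relationships are stated explicitly enough for software to draw conclusions from them.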

An OWL or a WOL? by dullhunk

You may also find this discussion of how ontologies relate to controlled vocabularies and taxonomies helpful. (Although, capriciously, I have linked to an ontology definition that is outside of the W3C SemWeb Technology orthodoxy. Or maybe that was deliberate?)

So, What Good is an Ontology?
Mike Atherton, User Experience Designer at RedUXD, published a well-received presentation that nicely illustrates one powerful benefit of an ontology. However, his 100-slide Beyond the Polar Bear deck doesn't mention any ontologies until the 99th slide. Instead, it discusses the importance of domain modeling and how that helps him build compelling user experiences for substantial websites, chock full of different types of media and content types - he discusses various BBC websites and microsites. (I highly recommend you check it out - don't let the apparent length put you off.)

Won't you be my friend? by ucumari

Ontologies in themselves don't directly deliver a benefit. Instead, they are infrastructure: they are a helpful way to structure information that - in combination with other technologies and practices - can deliver significant advantages over shallower ways of working in a particular domain, such as news. And not just for user experience; there are ways to exploit ontologies in other areas, including mining news (such as for sentiment analysis) or for data journalism.

Examples, Please
Not all of the various ontologies are public at this time, but a few are.

For example, the BBC has ontologies for programmes, wildlife and sport. The W3C recently published version 1.0 of their Ontology for Media Resources.

The EBU have experimented with an ontology based on IPTC's NewsML-G2, as have others. And at the recent IPTC face-to-face meeting, Paul Kelly of XML Team discussed the potential for Sports and Semantic Technologies.

Can't We Just Have One?
Rather than having all these different, overlapping ontologies, is it possible to just have one super, unified ontology?

In fact, the decisions about what to include, omit, emphasize or downplay in your ontology depend on what you're trying to do. I believe that each ontology therefore reflects a particular point of view, a specific editorial voice, in deciding what is and isn't important within a domain. That is not to say there would be no benefit from a coordinated, standardized news ontology. The work to create (never mind understand and use) an ontology is significant; even when you agree on the key things to model, there are different choices about the best way to express that model (sometimes driven by the limitations in the tools you have to work with). And having a standard model promotes greater interoperability amongst providers and more choice for clients.

One of the key benefits of Semantic Web technologies is the ability to mix and match different ontologies. And modern methods for developing ontologies (such as NeOn) emphasize reuse and composition. (See also the interesting master's thesis "Analyzing and Ranking Multimedia Ontologies for their Reuse" by Ghislain Auguste Atemezin.) So, I see the independent but somewhat coordinated efforts to create news-related ontologies as being a strength.
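As a rough illustration of what "mix and match" can look like in practice, the sketch below describes a single story using terms from two independently maintained vocabularies and then merges the two sets of statements. It is plain Python, not a real RDF toolkit, and the "news:", "sport:" and "ex:" names, along with the sample headline, are hypothetical placeholders rather than terms from any published ontology.

```python
from collections import defaultdict

# Statements about a story using a hypothetical general news vocabulary.
NEWS_DATA = {
    ("ex:story42", "rdf:type", "news:Story"),
    ("ex:story42", "news:headline", "Local team wins final"),
}

# Statements about the same story using a hypothetical sports vocabulary.
SPORT_DATA = {
    ("ex:story42", "sport:reportsOn", "ex:final2011"),
    ("ex:final2011", "rdf:type", "sport:Match"),
}


def merge(*graphs):
    """Combine any number of triple sets; merging is just set union."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def properties_by_vocabulary(resource, triples):
    """Group the properties used on a resource by their vocabulary prefix."""
    grouped = defaultdict(set)
    for s, p, o in triples:
        if s == resource:
            prefix = p.split(":", 1)[0]
            grouped[prefix].add(p)
    return dict(grouped)


if __name__ == "__main__":
    combined = merge(NEWS_DATA, SPORT_DATA)
    # The one story is now described by terms from both vocabularies.
    print(properties_by_vocabulary("ex:story42", combined))
```

Because every statement has the same simple shape, neither vocabulary has to know about the other for their statements to sit side by side, which is why overlapping but independent ontologies need not be a problem.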
Mochuelo de hoyo by barloventomagico

Get Involved
If you would like to find out more about the work that the IPTC is doing to help standardize the use of Semantic Web technologies for news, then get in touch.
