Saturday, April 16, 2011

rNews - News Metadata in HTML

I chair the IPTC's SemWeb group. This is the third in a series of short posts about IPTC's work on using Semantic Web technologies for news. (The first post discussed where we stand with Linked Data for News and the second post looked at News Ontologies).

The big news at our March 2011 meeting was that the IPTC voted to approve Draft 0.1 of rNews. This kicks off an experimental phase, in which we ask people to learn about rNews and give us feedback via the rNews Forum. We plan to incorporate all the feedback (and fix a couple of little errors that we've found) so that Draft 0.2 of rNews will be ready for the Berlin IPTC meeting in June.

rNews and hNews
rNews is a set of specifications and best practices for using RDFa to embed news-specific metadata into HTML documents. It serves a similar purpose to hNews, although it uses a different technical approach. (hNews is a microformat - a set of conventions that make use of standard HTML elements and attributes to convey metadata.) Although it is tempting to see hNews and rNews as rivals, I actually see them as supporting one another. And my suspicion is that tools that support one will be able to support the other fairly easily.
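
To make the idea of embedding metadata in HTML a little more concrete, here is a minimal sketch in Python. The markup, the `rnews` prefix URI and the property names (`rnews:headline`, `rnews:creator`, `rnews:datePublished`) are illustrative assumptions, not quotes from Draft 0.1, and the small parser only collects `property` attributes. It is not a conforming RDFa processor.

```python
from html.parser import HTMLParser

# Hypothetical article markup: the "rnews" prefix URI and the property names
# are illustrative assumptions, not taken from rNews Draft 0.1.
SAMPLE_HTML = """
<div prefix="rnews: http://example.org/rnews#" typeof="rnews:NewsItem">
  <h1 property="rnews:headline">IPTC approves rNews draft</h1>
  <span property="rnews:creator">Jane Reporter</span>
  <time property="rnews:datePublished" datetime="2011-04-16">April 16, 2011</time>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collects RDFa-style property/value pairs from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.found = {}
        self._pending = None  # property whose text content we are waiting for

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("property")
        if name is None:
            return
        # Prefer a machine-readable attribute value when one is present.
        for key in ("content", "datetime"):
            if key in attributes:
                self.found[name] = attributes[key]
                return
        self._pending = name

    def handle_data(self, data):
        # Use the first non-blank text after the tag as the property value.
        if self._pending and data.strip():
            self.found[self._pending] = data.strip()
            self._pending = None


def extract_properties(html):
    """Return a dict mapping property names to their values."""
    parser = PropertyCollector()
    parser.feed(html)
    parser.close()
    return parser.found


if __name__ == "__main__":
    for name, value in extract_properties(SAMPLE_HTML).items():
        print(name, "=", value)
```

The point of the sketch is that the metadata travels inside the same HTML that readers see, so a tool only needs to look at attributes to recover the headline, byline and publication date.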

We're already starting to get some feedback on rNews, which is great. We've gotten some excellent questions about the rNews design philosophy from those with in-depth knowledge of the Semantic Web but little direct news publishing experience. And publishers who are old hands at the news business are looking at rNews as a way to learn more about semantic markup for their web content. Whatever your background, I encourage you to look at rNews and let us know what you think.

Getting There
The road to rNews started in the summer of 2010. Like any worthwhile technical standard, it required both articulating a vision of what we wanted to achieve and sitting through a lot of long conference calls, poring over the details in spreadsheets. I'd like to take this opportunity to thank Dave Compton, John Evans, Andreas Gebhard, Jayson Lorenzen, Evan Sandhaus and Michael Steidl in particular for their dedication in creating and refining rNews Draft 0.1.

rNews Live!
I am excited about rNews and hNews and their potential to stimulate an ecosystem of tools for news on the web. I have some ideas for how this might happen, some of which I talked about in the recently-published interview about rNews with
I plan to share some more ideas about rNews at the upcoming New York Semantic Meetup "Meet the IPTC and learn about rNews", hosted by the New York Times. You should come by and find out more.

Monday, April 4, 2011

News Ontology - Large Pieces, Loosely Joined

I chair the IPTC's SemWeb group. This is the second in a series of short posts about IPTC's work on using Semantic Web technologies for news. (The first post discussed where we stand with Linked Data for News.)

News Ontology - or, actually, ontologies
I am more-or-less aware of several projects that are underway to build ontologies relating to news. These various efforts seem more-or-less aware of each other and are, in some cases, working with each other directly. Specifically, I believe that PA, AFP, BBC, EBU, W3C and IPTC are each crafting ontologies - and there are probably more besides.

Wait - Onto What?
The word "ontology" often seems to scare people. But it just means a formal representation of knowledge in a particular domain, expressed using Semantic Web technologies (specifically OWL - the Web Ontology Language).
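
As a rough illustration of what "formal representation" buys you, here is a minimal sketch in Python. The class names (`SportsReport`, `NewsArticle`, `CreativeWork`) are hypothetical, and a plain dictionary stands in for OWL, which is far more expressive (for one thing, a class in OWL can have several superclasses). The idea is simply that once relationships are stated formally, software can draw conclusions that were never written down explicitly.

```python
# Hypothetical mini-ontology: each class maps to its single parent class.
# The class names are made up for illustration; they come from no real ontology.
SUBCLASS_OF = {
    "SportsReport": "NewsArticle",
    "NewsArticle": "CreativeWork",
}


def ancestors(cls, subclass_of=SUBCLASS_OF):
    """Return every class that `cls` inherits from, nearest first."""
    result = []
    seen = {cls}
    current = subclass_of.get(cls)
    # Walk up the hierarchy, guarding against accidental cycles.
    while current is not None and current not in seen:
        result.append(current)
        seen.add(current)
        current = subclass_of.get(current)
    return result


def is_a(cls, target, subclass_of=SUBCLASS_OF):
    """True if `cls` is `target` or a direct or indirect subclass of it."""
    return cls == target or target in ancestors(cls, subclass_of)


if __name__ == "__main__":
    # Nobody stated that a SportsReport is a CreativeWork; it follows from
    # the two subclass statements above.
    print(is_a("SportsReport", "CreativeWork"))
```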

You may also find this discussion of how ontologies relate to controlled vocabularies and taxonomies helpful. (Although, capriciously, I have linked to an ontology definition that is outside of the W3C SemWeb Technology orthodoxy. Or maybe that was deliberate?)

So, What Good is an Ontology?
Mike Atherton, User Experience Designer at RedUXD, published a well-received presentation that nicely illustrates one powerful benefit of an ontology. However, his 100-slide Beyond the Polar Bear deck doesn't mention any ontologies until the 99th slide. Instead, it discusses the importance of domain modeling and how that helps him build compelling user experiences for substantial websites, chock full of different types of media and content types - he discusses various BBC websites and microsites. (I highly recommend you check it out - don't let the apparent length put you off.)
Ontologies in themselves don't directly deliver a benefit. Instead, they are infrastructure: they are a helpful way to structure information that - in combination with other technologies and practices - can deliver significant advantages over shallower ways of working in a particular domain, such as news. And not just for user experience; there are ways to exploit ontologies in other areas, including mining news (such as for sentiment analysis) or for data journalism.

Examples, Please
Not all of the various ontologies are public at this time, but a few are.

For example, the BBC has ontologies for programmes, wildlife and sport. The W3C recently published version 1.0 of their Ontology for Media Resources.

The EBU have experimented with an ontology based on IPTC's NewsML-G2, as have others. And at the recent IPTC face-to-face meeting, Paul Kelly of XML Team discussed the potential for Sports and Semantic Technologies.

Can't We Just Have One?
Rather than having all these different, overlapping ontologies, is it possible to just have one super, unified ontology?

In fact, the decisions about what to include, omit, emphasize or downplay in your ontology depend on what you're trying to do. I believe that each ontology therefore reflects a particular point of view, a specific editorial voice, in deciding what is and isn't important within a domain. That is not to say there would be no benefit from a coordinated, standardized news ontology. The work to create (never mind understand and use) an ontology is significant; even when you agree on the key things to model, there are different choices about the best way to express that model (sometimes driven by the limitations in the tools you have to work with). And having a standard model promotes greater interoperability amongst providers and more choice for clients.

One of the key benefits of Semantic Web technologies is the ability to mix and match different ontologies. And modern methods for developing ontologies (such as NeOn) emphasize reuse and composition. (See also the interesting master's thesis "Analyzing and Ranking Multimedia Ontologies for their Reuse" by Ghislain Auguste Atemezin.) So, I see the independent but somewhat coordinated efforts to create news-related ontologies as being a strength.
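
Here is a minimal sketch, in Python, of what mixing and matching can look like in practice. The two namespaces, the property names and the article URI are all hypothetical, and plain tuples stand in for RDF triples. The point is that statements made with different vocabularies about the same resource can simply be pooled, with no need to merge the vocabularies themselves.

```python
# Hypothetical namespaces standing in for two independently built ontologies.
NEWS = "http://example.org/news#"
SPORT = "http://example.org/sport#"

# Hypothetical URI identifying one article.
ARTICLE = "http://example.org/articles/42"

# Statements about the article using the "news" vocabulary.
news_triples = {
    (ARTICLE, NEWS + "headline", "Cup final ends in draw"),
}

# Statements about the same article using the "sport" vocabulary.
sport_triples = {
    (ARTICLE, SPORT + "aboutCompetition", "http://example.org/competitions/cup"),
}


def merge_graphs(*graphs):
    """Pool several sets of (subject, predicate, object) triples into one."""
    return set().union(*graphs)


def describe(graph, subject):
    """Return a predicate-to-object mapping for one subject.

    Assumes at most one value per predicate, which holds for this sample data.
    """
    return {p: o for s, p, o in graph if s == subject}


if __name__ == "__main__":
    combined = merge_graphs(news_triples, sport_triples)
    for predicate, value in sorted(describe(combined, ARTICLE).items()):
        print(predicate, "->", value)
```

Because every property name is a full URI, the two vocabularies cannot collide, which is what lets independently developed ontologies be composed rather than unified.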

Get Involved
If you would like to find out more about the work that the IPTC is doing to help standardize the use of Semantic Web technologies for news, then get in touch.