February 8, 2008 3:47 AM
Is Calais an alternative to RDFa annotation?
Reuters has launched a web service for annotating content on the web. They call it Calais.
We want to make all the world's content more accessible, interoperable and valuable. Some call it Web 2.0, Web 3.0, the semantic web or the Giant Global Graph - we call our piece of it Calais [1].
What does the Calais web service do?
This web service accepts text content and scans it for data that can be semantically annotated, then finds appropriate metadata for that annotation. The content provider can suggest a vocabulary to be used. The service stores the RDF triples generated from the annotation in a central repository and gives the content provider a Globally Unique Identifier (GUID) for them. It also returns these RDF triples to the content provider.
What should a Calais web service user do?
A content provider who uses the Calais web service must provide the returned Calais GUID to anyone who needs the RDF triples corresponding to the published content. Any web application that needs those triples can retrieve them from the Calais central repository by supplying the GUID.
Using the Calais GUID, any downstream consumer is able to retrieve this metadata via a simple call to Calais [1].
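The submit-then-look-up workflow described above can be sketched with a small in-memory stand-in. This is not the Calais API: the class name, the method names, the toy entity table and the example.org URIs are all hypothetical, chosen only to show how a GUID ties published content to triples held in a central repository.

```python
import uuid

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Toy lookup table standing in for real entity extraction (assumption).
KNOWN_ENTITIES = {
    "Reuters": "http://example.org/vocab#Company",
    "Calais": "http://example.org/vocab#Service",
}


class MockAnnotationService:
    """In-memory stand-in for a central triple repository (hypothetical)."""

    def __init__(self):
        self._repository = {}  # GUID -> list of (subject, predicate, object)

    def annotate(self, text):
        """Generate triples for known names in the text, store them, return (guid, triples)."""
        triples = [
            (f"http://example.org/entity/{name}", RDF_TYPE, type_uri)
            for name, type_uri in KNOWN_ENTITIES.items()
            if name in text
        ]
        guid = str(uuid.uuid4())
        self._repository[guid] = triples
        return guid, list(triples)

    def lookup(self, guid):
        """Return the triples stored under a GUID, as a downstream consumer would."""
        return list(self._repository[guid])


service = MockAnnotationService()
guid, triples = service.annotate("Reuters has launched Calais.")
# The provider passes `guid` on; a consumer retrieves the same triples with it.
consumer_view = service.lookup(guid)
```

The point of the sketch is that the consumer never sees the original text, only the GUID, so everything depends on the repository entry staying in step with the published content.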
The content provider may also use RDFa to annotate the content with the metadata terms from the RDF triples that the Calais web service returns.
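As a rough illustration of that hand-annotation step, the sketch below turns one returned triple with a literal value into an RDFa-annotated span. The helper name rdfa_span, the example.org subject URI and the choice of the Dublin Core publisher term are assumptions made for the example; nothing here is taken from the Calais output format.

```python
from html import escape


def rdfa_span(subject, prefix, namespace, term, literal):
    """Wrap a literal in a span carrying RDFa attributes for one triple.

    subject   -> value of the `about` attribute
    prefix    -> CURIE prefix, declared with xmlns on the span
    namespace -> namespace URI the prefix expands to
    term      -> local name of the property
    literal   -> visible text, which is also the object of the triple
    """
    return (
        f'<span xmlns:{prefix}="{escape(namespace, quote=True)}" '
        f'about="{escape(subject, quote=True)}" '
        f'property="{prefix}:{term}">{escape(literal)}</span>'
    )


markup = rdfa_span(
    "http://example.org/post/1",          # hypothetical page URI
    "dc",
    "http://purl.org/dc/elements/1.1/",   # Dublin Core elements namespace
    "publisher",
    "Reuters",
)
```

An RDFa-aware consumer reading this span would obtain a triple whose subject is the page URI, whose predicate is dc:publisher and whose object is the literal "Reuters", without any call to a central repository.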
What are the advantages of the Calais web service?
The Calais web service has the following advantages:
- It shall find appropriate metadata for semantic annotation of the input content. The content provider does not have to search for appropriate vocabularies and metadata for semantic annotation of the content.
- It shall generate and store RDF triples for the content in a central repository. The content provider does not need to include GRDDL transformations for RDF generation.
Conclusion: The Calais web service provides RDF triples for the annotatable data in the input content, but it does not itself use RDFa to annotate that content. The web content provider must still annotate the content by hand, using RDFa and the metadata from the RDF triples returned by the Calais web service. An unanswered question is whether the Calais web service will use normative metadata from a standard ontology. RDFa and normative metadata are both key to building the semantic web/Giant Global Graph. The GUID should not be considered a replacement for a GRDDL transformation. With a GUID, the RDF triples stored in the Calais repository must be updated every time the content is updated, and if the GUID changes, all users of that GUID must be notified. With a GRDDL transformation, the triples are derived from the published content itself, so users of the RDF triples need not be notified about changes to the content. The web server cache may store the latest updates to the RDF triples.
References:
[1] Overview. 2008. Calais. <http://opencalais.mashery.com/Overview>



I am going to be blogging live from a couple of days of the