April 25, 2007 8:20 AM
Transition to semantic web - part 1
We discussed the concept of an "open market" and identified some issues that must be addressed. We also discussed the RDF language and its schema. We will discuss security in the semantic web and the advantages of the semantic web later. Here we start the discussion on the transition to the semantic web.
The evolution of the semantic web requires a transition from one type of data representation to another: HTML/XHTML and XML elements will be converted into RDF web resources with associated metadata. Information that was earlier owned by the web content provider may be shared, and the provider may no longer hold all the intellectual property rights. The use of shared information may transform the appearance of data on the web.
According to Tim Berners-Lee the evolution of the semantic web is about "converting things". We identified three steps for conversion.
Step 1: How to get data? - Identification of web resources
Data is present in every written or spoken statement. It can be extracted by converting each statement into RDF triples and then breaking those triples down as far as possible, so that the RDF graph ends on a non-divisible literal value (one that cannot be broken into a further RDF triple) for every property. On the non-semantic web, a statement written on a web page may contain links to the details of the important terms used.
Example: 256Kbps DSL broadband internet connection.
Broadband internet connection - DSL - speed 256Kbps
Internet connection - broadband - 256Kbps DSL
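The decomposition in this example can be sketched in a few lines of Python. This is only an illustration: the http://example.org/telecom# namespace, the property names (category, technology, speed, value, unit) and the helper literal_values are all hypothetical and do not come from any standard vocabulary. The sketch stores triples as plain tuples and follows each property until it reaches a literal value.

```python
# Hypothetical namespace; every URI below is an illustrative assumption.
EX = "http://example.org/telecom#"

# Each triple is (subject, predicate, object). An object is either the URI
# of another resource or a literal value that cannot be broken down further.
triples = [
    (EX + "connection1", EX + "category", EX + "Broadband"),
    (EX + "connection1", EX + "technology", EX + "DSL"),
    (EX + "connection1", EX + "speed", EX + "speed1"),
    (EX + "speed1", EX + "value", 256),
    (EX + "speed1", EX + "unit", "Kbps"),
]

def is_literal(obj):
    """Treat anything that is not a URI in our namespace as a literal."""
    return not (isinstance(obj, str) and obj.startswith(EX))

def literal_values(subject, graph):
    """Collect (property, literal) pairs reachable from `subject`.

    Assumes the graph has no cycles, which holds for this small example.
    """
    found = []
    for s, p, o in graph:
        if s != subject:
            continue
        if is_literal(o):
            found.append((p, o))
        else:
            # The object is itself a resource: keep breaking it down.
            found.extend(literal_values(o, graph))
    return found

speed_parts = literal_values(EX + "connection1", triples)
```

Here "256Kbps" is split into a number and a unit, which are the only literals in the graph. Broadband and DSL remain resources identified by URIs; they could be described further by triples of their own.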
This information may be present on more than one web site, phrased differently on each to avoid plagiarism. Some sites may contain links that explain terms like DSL or broadband. These links may point to a local definition on the web page or to a standard definition provided by a telecommunication standards organisation; the former is more likely. Different web pages may explain these terms in different ways, and most internet users will not register at a standards organisation's web site to learn the correct meaning. Every word in the given statement can be identified by a URI; for example, "connection" can be linked to a dictionary entry. While the granularity of RDF web resource identification cannot be enforced on the web content provider, the metadata in the RDF vocabulary can guide the provider about the properties of a web resource. The choice of RDF web resources to include in the web content will depend on the web applications that are to be supported.
If a word (including an alphanumeric one) can be visualized as some physical or non-physical entity in this universe, it can be identified as an RDF web resource.
Some methods for web resource identification are:
screen scraping - XML tags, attributes and values in the web page content can form the first level of RDF web resources. These can then be refined based on the associated common vocabulary.
references - Anchor text can be converted into an RDF web resource. The context of this anchor text (the words around it) may contain more RDF web resources.
data collected from consumers - This information can be used to define metadata for the services. The values provided by the consumer may be RDF web resource literal values, and the options in a drop-down menu may be RDF web resources. An application for a DSL broadband connection may take the subscriber's phone number or address as input and fill in the rest of the required data from the corresponding RDF web resource. Implementing standard service records can reduce the data security and control effort for both consumer and service provider.
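The first two methods can be illustrated with a rough Python sketch built on the standard library's html.parser module. The class name CandidateExtractor, the sample page and the example.org link are made up for illustration. The sketch only lists candidate resources (tags, attributes and anchor text with its link target); refining them against a common vocabulary, and collecting the words around each anchor, is left out.

```python
from html.parser import HTMLParser

class CandidateExtractor(HTMLParser):
    """Collects tags, attributes and anchor text as candidate RDF resources."""

    def __init__(self):
        super().__init__()
        self.candidates = []      # tuples of (kind, name, value)
        self._in_anchor = False
        self._href = None
        self._anchor_text = []

    def handle_starttag(self, tag, attrs):
        # Screen scraping: every tag and attribute is a first-level candidate.
        self.candidates.append(("tag", tag, None))
        for name, value in attrs:
            self.candidates.append(("attribute", name, value))
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._anchor_text = []
            self._in_anchor = True

    def handle_data(self, data):
        # Only text inside an anchor is collected here.
        if self._in_anchor:
            self._anchor_text.append(data)

    def handle_endtag(self, tag):
        # References: anchor text plus its target becomes a candidate.
        if tag == "a" and self._in_anchor:
            text = "".join(self._anchor_text).strip()
            self.candidates.append(("anchor", text, self._href))
            self._in_anchor = False

# Hypothetical page fragment based on the DSL example above.
page = ('<p class="offer">256Kbps '
        '<a href="http://example.org/terms#DSL">DSL</a> '
        'broadband internet connection</p>')

extractor = CandidateExtractor()
extractor.feed(page)
extractor.close()
```

After parsing, extractor.candidates holds the p and a tags, the class and href attributes, and one anchor entry that pairs the text "DSL" with its link target. That link target is a natural starting point for the URI of the corresponding RDF web resource.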
...


