The annual British Computer Society Roger Needham Lecture, given by Ian Horrocks at the Royal Society on 7 December 2005, offered some key insights into the reasoning behind - and challenges of - the semantic web, reports Deirdre Molloy...
Some of this talk – the annual British Computer Society
Roger Needham Lecture delivered by Professor Ian Horrocks at the
Royal Society on 7 December 2005 – went a little over my head,
but other parts percolated the grey matter and I learned a lot
more than I usually do at public events...
By Deirdre Molloy
By way of introduction, Dr Peter Kay, Assistant Director of
Microsoft Research (UK) mentioned that the late Professor Roger Needham (1935-2003), in whose
honour the lecture is held, created the concept of “clumps”, the
“clusters” of today.
Horrocks began by asking what the semantic
web is, and answered by first looking at some of the
problems and limitations of the current web.
The current web isn’t much more than distributed hypermedia. The
information on it is designed to be consumed by human beings but
is not accessible to automated processes.
Tim Berners-Lee’s original vision of the web was more ambitious,
Horrocks explained, offering this quote from Berners-Lee about the WWW to
support his assertion: “A set of connected applications forming
a consistent, logical web of data… given well-defined
meaning.”
The messy, syntactic web
It’s hard work using this “syntactic web”, as Horrocks termed
it. A search for his colleague Alan Rector on Google Images turns up images
of someone called ‘Reverend Alan’. What else is impossible to
find using the syntactic web? Complex queries involving
background knowledge; and information held in data repositories,
eg travel enquiries, prices of goods and services, or the results
of human genome experiments.
It’s also hard to find and use web services. For example, given a DNA
sequence, identifying its genes, determining the proteins and so on: it’s
very difficult, if not impossible, to get a coherent pathway to
this information.
The problem here lies in the fact that mark-up is all about
presenting the information, not about the semantic content. The
web page is very accessible to us but very difficult for a
machine to understand. Even worse, text is sometimes buried
inaccessibly inside images and graphics.
Delegating complex tasks to web agents
So what is the proposed solution? To add semantic annotation to
web resources. In the case of the pictures, that means semantic information
along the lines of: “Dr Alan Rector is a Professor of Computer
Science at the University of Manchester”.
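As a rough illustration of what such an annotation could look like to a machine, the statement above can be broken into subject-predicate-object triples, the data model used by RDF. The sketch below uses plain Python tuples rather than a real RDF library, and every identifier in it (`ex:photo42`, `ex:depicts`, `ex:worksAt` and so on) is made up for this example rather than taken from the lecture.

```python
# Hypothetical annotation for a photo, written as (subject, predicate, object) triples.
# All identifiers are illustrative assumptions, not a real vocabulary.
triples = [
    ("ex:photo42", "ex:depicts", "ex:AlanRector"),
    ("ex:AlanRector", "ex:hasTitle", "Dr"),
    ("ex:AlanRector", "ex:hasRole", "ex:ProfessorOfComputerScience"),
    ("ex:AlanRector", "ex:worksAt", "ex:UniversityOfManchester"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def photos_of_people_at(triples, org):
    """Find photos depicting anyone who works at the given organisation."""
    people = {s for s, _, _ in match(triples, p="ex:worksAt", o=org)}
    return [s for s, _, o in match(triples, p="ex:depicts") if o in people]


print(photos_of_people_at(triples, "ex:UniversityOfManchester"))
# prints ['ex:photo42']
```

A query like this follows the meaning of the annotations rather than matching the text "Alan Rector", which is the kind of search the syntactic web struggles with.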
What does giving semantic annotation to web resources mean?
Horrocks explained that one approach is an external
agreement on the meaning of annotations, for instance the Dublin Core
Metadata Initiative. But this has limited flexibility and
extensibility, and only a limited number of things can be
expressed. The alternative is to use ontologies to specify the meaning of
annotations.
He then elaborated on the role of ontology in information
science. An ontology is an engineering artefact consisting of a
vocabulary used to describe (a particular view of) some domain,
and an explicit specification of the intended meaning of
that vocabulary.
Horrocks listed some applications of ontologies currently in
operation, such as e-Science (bioinformatics) and medicine, where
they are used to build and maintain terminologies such as SNOMED,
NCI and GALEN. Ontologies are also used for organising complex
and semi-structured information, such as that held by the UN,
NASA, Ordnance Survey, General Motors and Lockheed
Martin.
What are ontology languages?
Next he turned to the Semantic Web itself, and began by considering
‘ontology languages’. If ontology languages are to serve their
purpose for the web, they need to be agreed standards. One of the first to
be agreed was RDF Schema, but it’s very weak and doesn’t
allow us to describe many things. It also has a higher-order flavour,
which makes it difficult to explain and difficult to provide support for.
Two languages were developed to address the deficiencies
and problems of RDF. OIL was developed by a group of largely
European researchers, and DAML-ONT by a group of
largely US-based researchers. The two were merged to produce DAML+OIL,
which was submitted to the W3C as the basis for OWL.
Both were based on the same underlying description logics, a
family of logic-based knowledge representation
formalisms. These were descendants of semantic networks and
KL-ONE. They describe the domain in terms of concepts (classes),
roles (properties, relationships) and individuals. Operators
allow complex concepts to be composed from simpler ones, and names can be
given to complex concepts.
Eg “happy parent” = Parent-child-smart-fit.
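To make the idea of composing concepts concrete, here is a minimal Python sketch that evaluates a composed concept over a small, invented set of individuals. It assumes, purely for illustration, that "happy parent" means a parent all of whose children are smart or fit; the exact operators used in the lecture are not recorded here, and the individuals and the `has_child` role are hypothetical.

```python
# Toy interpretation: which individuals belong to which named concepts,
# and which pairs are related by the role has_child. All data is invented.
domain = {"ann", "bob", "cat", "dan", "eve"}
concepts = {
    "Parent": {"ann", "bob"},
    "Smart": {"cat"},
    "Fit": {"dan"},
}
has_child = {("ann", "cat"), ("ann", "dan"), ("bob", "cat"), ("bob", "eve")}


def union(a, b):
    """Concept disjunction: individuals belonging to either concept."""
    return a | b


def intersection(a, b):
    """Concept conjunction: individuals belonging to both concepts."""
    return a & b


def all_values_from(role, filler, domain):
    """Universal restriction: individuals whose every role-successor is in filler."""
    return {
        x for x in domain
        if all(y in filler for (s, y) in role if s == x)
    }


# Assumed reading: HappyParent = Parent AND (all children are Smart OR Fit).
smart_or_fit = union(concepts["Smart"], concepts["Fit"])
happy_parent = intersection(
    concepts["Parent"],
    all_values_from(has_child, smart_or_fit, domain),
)
print(sorted(happy_parent))
# prints ['ann']
```

Here `bob` is excluded because one of his children, `eve`, is neither smart nor fit. The point is only that a named complex concept is built from simpler concepts, a role and a few operators.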
Reasoning & logic
Description logics are distinguished by formal semantics
(which are typically model theoretic) and by being decidable fragments of
FOL, or first-order logic (often contained in C2). [Note: at this point I was totally
lost and I must apologise if my recap contains errors! But
Horrocks gradually began to talk in less specialist language.]
They are also distinguished by the provision of inference services, in the form of decision
procedures for key problems (satisfiability, subsumption, etc)
and highly optimised implemented systems.
But why description logic? OWL exploits the results of 15
years of DL research and is based on well-defined (model
theoretic) semantics. Its formal properties are well understood
(complexity; decidability). We know the reasoning algorithms,
and there is an implemented system – Cerebra.
Why all the strange names? Description logics are a family of KR
formalisms – mainly distinguished by available operators. The
available operators are indicated by letters in the name, eg
S, H, O, I, N. OWL had to come up with a web-friendly syntax.
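The naming convention can be sketched as a simple lookup. The letter meanings below are the conventional ones from the description logic literature, not details recorded from the lecture, and the function name `describe` is just a label chosen for this example.

```python
# Conventional meanings of the letters used in description logic names.
DL_FEATURES = {
    "S": "ALC extended with transitive roles",
    "H": "role hierarchy",
    "O": "nominals (classes defined by listing individuals)",
    "I": "inverse roles",
    "N": "number restrictions",
}


def describe(name):
    """Expand a DL name such as 'SHOIN' into the features its letters stand for."""
    unknown = [c for c in name if c not in DL_FEATURES]
    if unknown:
        raise ValueError(f"unknown letters: {unknown}")
    return [DL_FEATURES[c] for c in name]


for feature in describe("SHOIN"):
    print(feature)
```

Read this way, a name such as SHOIN is shorthand for the set of operators the logic supports.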
Ontologies for the working web
In turn Horrocks explained why they chose ontology reasoning.
Given the key role of ontologies in many applications, it’s
essential to provide tools and services that help users
to design and maintain high-quality ontologies that
are meaningful, ie all named classes can have instances.
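One check of this kind is spotting named classes that could never have an instance. The sketch below is a deliberately simplified stand-in for real description logic reasoning: it only understands subclass and disjointness statements, and the class names are invented for the example.

```python
# Toy ontology: subclass axioms and disjointness axioms (invented example).
subclass_of = {
    "Penguin": {"Bird", "Swimmer"},
    "FlyingPenguin": {"Penguin", "Flyer"},
    "Bird": {"Animal"},
    "Swimmer": set(),
    "Flyer": set(),
    "Animal": set(),
}
disjoint = [("Penguin", "Flyer")]


def ancestors(cls):
    """All superclasses of cls, including cls itself (transitive closure)."""
    seen, stack = set(), [cls]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(subclass_of.get(c, ()))
    return seen


def unsatisfiable_classes():
    """Classes that inherit from two classes declared disjoint, so can have no instances."""
    bad = []
    for cls in subclass_of:
        supers = ancestors(cls)
        if any(a in supers and b in supers for a, b in disjoint):
            bad.append(cls)
    return sorted(bad)


print(unsatisfiable_classes())
# prints ['FlyingPenguin']
```

A tool that flags `FlyingPenguin` in this way tells the ontology author that the class is contradictory before anyone tries to use it for annotation.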
What were the research challenges? The first was increasing expressive
power, which involves complex role inclusion axioms; concrete
domains / data types; database-style keys; and rule language
extensions.
The second challenge was improving scalability, requiring
optimisation techniques; reduction to disjunctive datalog; and
hybrid DL-DB systems. Tools and infrastructure were another challenge, as
were design methodologies (based on foundational
ontologies).
Q&A with the audience:
A delegate raised the question of the commercial barriers to asking
high-quality questions on the WWW, and wondered whether
the commercial cost of annotating the source information would be met.
He added that some people and organisations won’t want
source information they have annotated to then be given away for
free, suggesting that it won’t find its way onto the open
web.
Horrocks responded that he hoped the process will bootstrap
itself, and interest in ontology annotation will grow and become
a groundswell.
Elsewhere in the audience David Holsworth noted that Google’s
translation service from French into English had recently
improved, and said this indicated improving semantic
algorithms.
[NB: Notes on the lecture can also be downloaded here]
Further thoughts on this lecture can be found on the
Beers & Innovation blog
About Professor Ian Horrocks:
Ian is Professor in the School of Computer Science at the
University of Manchester. His primary research interest is
Knowledge Representation; in particular ontologies and ontology
languages, tableaux algorithms for Description Logics (DLs),
optimisation techniques for such algorithms, and the application
of all of the above to e-Science and the Semantic Web. He was a
member of the W3C Web Ontology working group that developed the
OWL language (now a W3C recommendation), and is an author
of the SWRL (Semantic Web Rule Language) proposal. For more
information visit his web page.
Further reading:
The British Computer Society
The Semantic Web Community portal
The Semantic Web: An Introduction – article by Sean B Palmer
What Is the Semantic Web? – article from Altova
Tim Berners-Lee started blogging in January 2006 on the DIG research journal group blog