Google is in the news for starting to introduce the Semantic Web to the masses.
I think I first came across the term “Semantic Web” around 1999 or early in 2000. I worked for the Institute for Image Data Research (no longer extant) at Northumbria University (very much extant) for 12 months from late summer 2000.
We were having discussions about the concept then. I think that I was the one who said then that we must “bridge the semantic gap.” I meant two things by this phrase:
- to bridge the gap between information delivered via Web pages and the person trying to access it, especially when it is factual information held in complex databases, by using metadata to describe the content, including controlled vocabulary that has clear definitions;
- to bridge the gaps between those who create the database content, even when they come from different academic backgrounds, or come without an academic background.
At the time I was working on a project to research a method for evaluating museums’ image databases, more specifically their searchability from a user’s perspective. The users then tended to be staff and volunteers, because most museums did not have their collections online.
I had noticed that one recurring search issue, which few of the users even noticed at the time, was the lack of consistency in how things were described. That inconsistency is inevitable when many people create records about objects over many years, especially when they have different types of knowledge. Even people with degrees in the same subject can understand a term differently, and those from similar disciplines can take quite different information from the same term. One example that I often use is “Modern.” The dictionaries are vague, and yet to archaeologists and historians it has a much more precise meaning. The problem is that the meaning differs depending on the discipline, and on what geographical area and what broad span of time someone studies (and, sometimes, even on where they studied).
I have tried to get people to clarify these ambiguities in online cultural heritage records by using controlled vocabularies with more precise definitions, so that there is at least a way of knowing what a word means within that database or set of databases. My theory was that if one embeds these vocabularies into databases and puts them into the metadata of a website’s pages, it will one day make linking up data between databases more feasible.
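A minimal sketch in Python of what that embedding could look like, offered only as an illustration. It assumes the page metadata is expressed as JSON-LD using the schema.org `DefinedTerm` type. The vocabulary entries, their identifiers, the placeholder definitions, the `example.org` address and the function names are all hypothetical and do not come from any real database.

```python
import json

# Hypothetical controlled vocabulary: each entry pins one meaning of an
# ambiguous label to a stable identifier. The identifiers, definitions and
# the example.org URL are placeholders, not real vocabulary entries.
VOCABULARY_URL = "https://example.org/vocab/periods"
VOCABULARY = {
    "modern-archaeology": {
        "label": "Modern",
        "definition": "Placeholder: 'Modern' as this database's archaeologists define it.",
    },
    "modern-art-history": {
        "label": "Modern",
        "definition": "Placeholder: 'Modern' as this database's art historians define it.",
    },
}


def describe_record(title, term_id):
    """Return JSON-LD-style metadata that links a record to one defined term."""
    term = VOCABULARY[term_id]  # raises KeyError for terms outside the vocabulary
    return {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
        "name": title,
        "about": {
            "@type": "DefinedTerm",
            "termCode": term_id,
            "name": term["label"],
            "description": term["definition"],
            "inDefinedTermSet": VOCABULARY_URL,
        },
    }


def to_script_tag(metadata):
    """Wrap the metadata in the script element used to embed JSON-LD in a page."""
    body = json.dumps(metadata, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    record = describe_record("Example object record", "modern-archaeology")
    print(to_script_tag(record))
```

Two records can both carry the label “Modern” and still be told apart, because each points to a different term identifier with its own definition. A free-text label that is not in the vocabulary is rejected rather than silently accepted.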
I want a world of good-quality, interoperable databases that produce information most people have the potential to understand. I have been trying to persuade people for over a decade that the Semantic Web is the way forward and that we need to embed descriptive metadata into our culture websites.
Now Google is heading towards the Semantic Web approach. Maybe people will start to understand why I have bored them on the subject of metadata when they see it in action.