Google is Building an Ontology

Since the spring of 2012, Google has been moving from searching for particular combinations of letters to searching for things. This move from string-based search to search over things means that Google is building an ontology.

The advantage of searching for things instead of strings is that Google can tell homonyms apart: words that are spelled or sound the same but refer to different things. For example, "Google" names both a search engine and a company. These are two different things, with different attributes.

This is not the first time that computer science has turned to metaphysics. With the advent of the object-oriented paradigm it was up to programmers to find and define the qualities of objects they called upon to perform tasks.

But with Google's approach, it's the other way around: computer science is coming back to the roots of ontology, classic Aristotelian metaphysics, looking at the objects in the world and trying to figure out what their properties are. It is no longer up to the programmer to decide what an object is like. Instead, they have to map out reality, find the essential features of objects and disregard the inessential ones. And as any philosopher dealing with ontology can tell you, that is no easy task.

If you were a nigh-omniscient search engine, how would you approach the problem? Simple: with data. Google creates a hundred-dimensional model of objects and then determines their relations to other objects. Similar things are closer together in some of the dimensions, different things further apart. And as the old axiom goes: anything whose attributes are all identical to those of some object is that object.
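To make the idea of "closer in some dimensions" concrete, here is a minimal sketch in Python. It is not Google's model: the entity names, the four dimensions and every number are invented for illustration, and cosine similarity is just one common way to measure closeness.

```python
import math

# Hypothetical 4-dimensional vectors; a real model would use far more dimensions.
# Invented dimensions: [answers queries, has employees, indexes web pages, is traded on a stock exchange]
ENTITIES = {
    "google_search_engine": [0.9, 0.1, 0.8, 0.0],
    "bing_search_engine": [0.8, 0.2, 0.7, 0.1],
    "google_company": [0.1, 0.9, 0.2, 0.7],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(name):
    """Return the name of the other entity most similar to `name`."""
    target = ENTITIES[name]
    others = (other for other in ENTITIES if other != name)
    return max(others, key=lambda other: cosine_similarity(target, ENTITIES[other]))


if __name__ == "__main__":
    # The search engine sits closer to another search engine than to the company
    # that shares its name, which is how the two "Googles" can be told apart.
    print(nearest("google_search_engine"))
```

With these made-up numbers, the search engine called Google ends up next to another search engine rather than next to the company called Google, even though the two share a string.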

A direct result of this is that some of the attributes things have may be highly counterintuitive to any human intelligence. In general, similar things are grouped together because they appear in similar contexts. But because the approach is statistical, it will occasionally bring up false positives, at least from a human viewpoint.
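Grouping things by the contexts they appear in can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of Google's pipeline: the corpus, the window size and the helper names `context_vectors` and `cosine` are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Invented toy corpus; a real system would use vastly more text.
CORPUS = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases mice",
    "the dog chases cats",
    "the car needs fuel",
]


def context_vectors(sentences, window=2):
    """Count, for every word, the words that occur within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(c1, c2):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(c1[key] * c2[key] for key in c1.keys() & c2.keys())
    norm1 = math.sqrt(sum(v * v for v in c1.values()))
    norm2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0


if __name__ == "__main__":
    vectors = context_vectors(CORPUS)
    print("cat vs dog:", round(cosine(vectors["cat"], vectors["dog"]), 2))
    print("cat vs car:", round(cosine(vectors["cat"], vectors["car"]), 2))
```

In this corpus "cat" and "dog" share most of their neighbours, so they score higher than "cat" and "car". The only thing "cat" and "car" share is the word "the", yet their score is still well above zero: a small-scale version of the statistical false positives described above.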

Over time, some of these will be corrected, but some connections can be persistent thanks to the kind of data Google is using: it is not cataloguing reality, but a representation of that reality. Some of the connections in that representation can seem weird at first glance, like connecting hipsters with Hitler.