The Knowledge Vault was announced at the end of last month as a potential replacement for the Knowledge Graph. The Knowledge Graph has been part of Google search results since May 2012, and since then it has garnered quite a lot of unwanted attention. Prone to mistakes initially, it was accused of displaying an error 20% of the time just a month after launch. Over the last two years the Knowledge Graph seems to have improved, but problems with inaccurate information persist.
What is Google’s Knowledge Graph?
The Knowledge Graph is the information box on the right-hand side of the search results. A search for a person or film, for example, will trigger the Knowledge Graph. To find out a little more about the Knowledge Graph, take a look at our more detailed explanation here.
For example, a search for “Richard Branson” will show this:
You can see that the source of the information is Wikipedia, and if you follow the link you will visit the page the text is copied from. This is how the Knowledge Graph works: it uses a mixture of information from databases like Freebase and information from across the web.
Why do we need the Knowledge Vault?
In a few words: Google doesn’t trust the Knowledge Graph anymore.
Google’s Knowledge Graph seems to have improved over time, but it still often displays inconsistencies and errors. The main issue seems to be the dragging in of inappropriate information from external websites. That information usually falls into two categories: incorrect and (genuinely) inappropriate.
Examples of incorrect Knowledge Graph displays include:
- Paris Jackson (Michael Jackson’s daughter) featuring information from Paris Jackson the Canadian football player, see here.
- Brandy the singer confused with brandy the drink, see here.
- Even Google’s Head of Webspam is reported to have attended the University of North Carolina School of Law, which is untrue. Again, read more about this in this article.
Examples of (genuinely) inappropriate information include:
- Photos involving nudity appearing in the Knowledge Graph.
- Questionable sources leading to swearing and offensive descriptions, including Greggs’ offensive spoof logo appearing as their official logo in the Knowledge Graph. Read more about this and how they handled it here.
These bad results are quite regularly found and shared across social media, so much so that it doesn’t really seem like a big deal when it does happen.
When it came to the iPhone 6 and Apple Watch, Google did not risk dragging in any information from Wikipedia or any other website. Instead they decided to write the information manually and check it over with Apple before putting it live. This may not seem like a noteworthy action, but it speaks volumes. A company that not only prides itself on automation, but for which automation is everything it stands for, does not trust its own automated process to find the correct and relevant information.
What is Google’s Knowledge Vault?
The Knowledge Vault is Google’s own database of facts. By crawling the web and combining and merging the information and data it finds, Google is building its own database of facts instead of relying on databases like Freebase or websites like Wikipedia and dragging in their information. The aim is to rely less on editors and publications, as these are not reliable enough, and to build a database that is more likely to display correct, verified information.
Search Engine Land reports that Google has assembled 1.6 billion facts so far and that each one is being scored by confidence in its accuracy.
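To make the idea of merging facts and scoring them by confidence a little more concrete, here is a minimal Python sketch. It is not Google's method, which has not been described here. The function names (score_facts, best_values), the sample data, the 0.7 threshold and the "noisy-OR" way of combining scores are all assumptions chosen purely for illustration.

```python
from collections import defaultdict
from math import prod


def score_facts(extractions):
    """Merge repeated claims and combine their confidences.

    Each extraction is (subject, predicate, value, confidence), where
    confidence is a hypothetical 0..1 trust score for one source.
    Identical claims are merged with a noisy-OR: the claim is treated
    as wrong only if every source that made it is wrong.
    """
    grouped = defaultdict(list)
    for subject, predicate, value, confidence in extractions:
        grouped[(subject, predicate, value)].append(confidence)
    return {
        fact: 1.0 - prod(1.0 - c for c in confidences)
        for fact, confidences in grouped.items()
    }


def best_values(scores, threshold=0.7):
    """Keep the highest-scoring value per (subject, predicate),
    but only if it clears the (assumed) confidence threshold."""
    best = {}
    for (subject, predicate, value), score in scores.items():
        key = (subject, predicate)
        current_best = best.get(key, (None, 0.0))[1]
        if score >= threshold and score > current_best:
            best[key] = (value, score)
    return best


# Made-up example: two sources agree, one disagrees.
extractions = [
    ("Example Person", "born_in", "London", 0.6),
    ("Example Person", "born_in", "London", 0.5),
    ("Example Person", "born_in", "Paris", 0.3),
]

scores = score_facts(extractions)
facts = best_values(scores)
```

In this toy example the two sources that agree reinforce each other, so their shared claim ends up with a higher score than either source alone, while the lone conflicting claim stays below the threshold and is dropped. A real system would need far more than this, such as judging how independent the sources are.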
How will this play out? We’re not sure yet. Google clearly wants to improve these results by working on automating the verification of facts by calculating a confidence score. This should help to avoid the embarrassing mistakes the Knowledge Graph has been (very publicly) making.