November 8, 2013 by Baawraman
Almost a year and a half ago, I wrote up my research on “Structured Data”, and my conclusion was as follows:
“A human can search for a lowest priced book, can understand “Operating System”, when documents say “OS”, but machines cannot accomplish this task without human intervention, because web-pages are designed for humans, not for machines. As I said, semantically structured web is required which will enable machines to understand and respond to complex search queries. Google is now actually trying to make happen what Tim Berners-Lee initially expressed the vision of Semantic Web, which is as follows:
I have a dream for the Web [in which computers] become capable of analysing all the data on the Web – the content, links, and transactions between people and computers. A “Semantic Web”, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The “intelligent agents” people have touted for ages will finally materialize.
Google is now moving towards being a “Knowledge Engine” from a traditional “Search Engine”, Google renamed its Search Team to Knowledge Team recently, which further solidifies the aforementioned fact. Google Knowledge Graph, Carousel list are some more concrete evidences for the same.”
There was a time when search engines focused their efforts on effectively indexing the vast amount of HTML content across the web. Today, it is not only about indexing; it is about understanding the information in that HTML content, and this is where structured/semantic data comes into the picture. Google’s Hummingbird algorithm update can understand long, question-form queries (yes, conversational search) and can identify synonyms and substitute terms. Hummingbird is more of a query expansion approach that better understands natural-language queries. The new algorithm is well equipped to identify the user intent behind a query. For example, if you do a voice search for “which is the best place to eat pizza in Jaipur”, Google can understand that “place” in this query means “restaurants” and will return results accordingly.
In this way, Google uses synonyms to provide better search results. Google uses “Statistical Language Translation”, in which a query is translated into a different language and then translated back into the original language. If both variations correspond to the same results, they can be used as synonyms. In its Hummingbird patent, Google clearly states the following:
A computer-implemented method comprising: identifying a particular query term of an original search query; identifying a candidate synonym for the particular query term; accessing stored data that specifies, for a pair of terms that includes the particular query term and the candidate synonym of the particular query term, a confidence value for a non-adjacent query term of the original search query that is not adjacent to the particular query term in the original search query; determining that, in the stored data that specifies, for the pair of terms that includes the particular query term and the candidate synonym of the particular query term, the confidence value for the non-adjacent query term satisfies a threshold; and determining to revise the original search query to include the candidate synonym of the particular query term, based on determining that the confidence value for the non-adjacent query term satisfies the threshold.
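To make the claim above a little more concrete, here is a rough Python sketch of the steps it lists: pick a query term, look up a candidate synonym, check the stored confidence value for a query term that is not adjacent to it, and revise the query only if that value meets a threshold. This is only my reading of the claim, not Google's implementation. The `CONFIDENCE` table, its numbers, the `THRESHOLD` value, and the function names are all invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical stored data: (query term, candidate synonym) -> {context term: confidence}.
# Every value here is made up for the example.
CONFIDENCE: Dict[Tuple[str, str], Dict[str, float]] = {
    ("place", "restaurant"): {"pizza": 0.9, "eat": 0.8},
    ("place", "venue"): {"concert": 0.7},
}

THRESHOLD = 0.5  # assumed cut-off, not taken from the patent


def non_adjacent_terms(terms: List[str], index: int) -> List[str]:
    """Return the query terms that are neither at `index` nor directly next to it."""
    return [t for i, t in enumerate(terms) if abs(i - index) > 1]


def revise_query(query: str) -> str:
    """Add a synonym for a term when a non-adjacent term supports it."""
    terms = query.lower().split()
    revised = []
    for i, term in enumerate(terms):
        context = non_adjacent_terms(terms, i)
        synonyms = []
        for (original, synonym), scores in CONFIDENCE.items():
            if original != term:
                continue
            # Accept the synonym only if some non-adjacent term meets the threshold.
            if any(scores.get(c, 0.0) >= THRESHOLD for c in context):
                synonyms.append(synonym)
        if synonyms:
            revised.append("(" + " OR ".join([term] + synonyms) + ")")
        else:
            revised.append(term)
    return " ".join(revised)


if __name__ == "__main__":
    print(revise_query("best place to eat pizza"))
    print(revise_query("pizza place"))
```

In this toy version, “best place to eat pizza” gets “restaurant” added because “pizza” is a non-adjacent term with a high confidence value, while “pizza place” is left alone because its only other term sits right next to “place”.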
In May 2013, Google rolled out Conversational Search for Chrome, which pushed users toward conversational and long-tail, question-form queries. Then, in September 2013, Google announced that it was using a new algorithm called “Hummingbird”. If we connect these two events, we can conclude that Google first encouraged users to use long-tail conversational queries and then launched an algorithm that handles them effectively.
Today, when I see an article like this, this or this, I feel that if criticizing yourself is a herculean task, then by the same school of thought there is no harm in blowing your own trumpet (at times). What I concluded a year ago is going to be the future of search (needless to say, Sir Tim Berners-Lee said the same years ago).