What is Latent Semantic Indexing?
What is Latent Semantic Indexing? It certainly sounds impressive and mysterious. Yet, like most jargon, it’s quite logical, once you understand the meaning. More to the point: once you understand it, you can use it to boost your search engine listings.
“Latent” means “hidden” and “semantic” relates to “meaning”. So LSI is simply “hidden meaning indexing”, and it’s a phrase you will hear more and more in relation to search engines, particularly Google.

Google, having set the standard for search engines, and shown the others how to monetize their store of information with the AdWords program, now find both MSN and Yahoo snapping at their heels.
But Google didn’t get to be top dog in the search engine wars without staying on the cutting edge of information retrieval technology, which is why they have been quietly acquiring technology and knowledge for the last few years, even to the extent of buying complete companies. This technology is already in use, and it will be used increasingly to ensure Google stay ahead of the competition.
What took Google from a standing start, seven years ago, to the lion’s share of the search engine market was their strict adherence to the twin mantras of “relevance” and “quality”.
If you use a computer, you’ll understand the paradox: computers manage to be amazingly clever at some tasks and totally stupid at others. So, whilst Google’s computers have been pretty good at checking relevancy (matching keywords to the text of web pages, for example), they have been pretty poor at measuring the quality of a web page.
So, to gauge this, they had to rely on information they could get a handle on, such as links to a web page from sites run by humans. The logic cannot be faulted: if a human thought the content was good and signaled their approval by linking to the page, Google took this as a vote for the content. Several links from several different web sites were even better.
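To make that “links as votes” idea concrete, here is a toy sketch in Python that simply counts how many distinct outside sites link to a page. It is a deliberate simplification for illustration only, not Google’s actual link analysis, and the example URLs are invented.

```python
# Toy illustration of "links as votes": count how many distinct outside
# sites link to a given page. Illustrative only; the URLs are invented.
from urllib.parse import urlparse

# (linking page, linked-to page) pairs a crawler might have collected.
links = [
    ("http://blog-a.example/post1", "http://mysite.example/guide"),
    ("http://blog-a.example/post2", "http://mysite.example/guide"),
    ("http://news-b.example/story", "http://mysite.example/guide"),
    ("http://mysite.example/home", "http://mysite.example/guide"),  # internal link, not counted
]

def link_votes(target, link_pairs):
    """Count distinct external domains linking to `target`."""
    target_host = urlparse(target).netloc
    voters = {
        urlparse(source).netloc
        for source, destination in link_pairs
        if destination == target and urlparse(source).netloc != target_host
    }
    return len(voters)

print(link_votes("http://mysite.example/guide", links))  # prints 2
```

Counting distinct domains rather than raw links reflects the point above: several links from several different sites carry more weight than many links from one site.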
Unfortunately, Google are in a battle against the finest computer the world has known: the human brain. So it wasn’t long before these strategies were reverse engineered and keyword stuffing and link farms made their appearance.
In addition, Google were using some less refined techniques, such as the age of a web site, the length of time a domain name has been registered for, and how quickly a web site grows. These criteria are meant to flag sites created specifically to make income from Google’s AdSense program, by spotting web sites that apparently arrive overnight, having been created by content-generating software for the sole purpose of providing a hollow shell of a web site upon which to place AdSense ads.
Unfortunately, this is sometimes a blunt weapon and penalizes quite legitimate web sites that may have merely changed servers, or have published a sudden burst of content, not via some nefarious method, but simply because the webmaster has been burning the proverbial midnight oil.
Think about it: how logical is it to deny the searcher really good, fresh, original content that seamlessly matches their requirements, and which the webmaster worked on for many months prior to launch, simply because the web site is only a few weeks old, the domain name may only be registered for a year or so, and the whole web site (despite being many months in preparation) was all loaded in a day or so?
There had to be a better way.
There is. It’s here. And its name is . . . yes, you’ve guessed it: latent semantic indexing. This is a smarter way of judging the content of web pages, looking at each page in the context of the entire web site to determine a common theme that the site covers in depth.
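For readers who like to see the mechanics, here is a minimal sketch of the classic LSI technique: a truncated singular value decomposition (SVD) of a term-document matrix, which maps pages and queries into a small number of hidden “themes”. This illustrates the general idea only, not Google’s implementation; the sample pages, the gardening theme, and the choice of two latent components are all invented for the example.

```python
# Minimal LSI sketch: project pages into a small latent "theme" space via
# truncated SVD of a term-document matrix, then compare a query in that space.
# Illustrative only; the pages and parameters below are made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Pages from a hypothetical web site, all loosely on one theme (gardening).
pages = [
    "growing tomatoes in raised beds and improving garden soil",
    "pruning roses and feeding flowering plants through the summer",
    "composting kitchen waste to enrich vegetable garden soil",
    "choosing the right fertiliser for tomatoes and roses",
]

vectorizer = TfidfVectorizer(stop_words="english")
term_doc = vectorizer.fit_transform(pages)          # documents x terms
lsi = TruncatedSVD(n_components=2, random_state=0)  # 2 hidden "themes"
page_themes = lsi.fit_transform(term_doc)           # documents x themes

# A query is compared in the theme space, so pages sharing related (not just
# identical) vocabulary can still score as relevant.
query = lsi.transform(vectorizer.transform(["enriching the soil for vegetables"]))
print(cosine_similarity(query, page_themes)[0])
```

The point of the projection is the one made above: what counts is not a single exact keyword match but whether a page sits inside a consistent overall theme for the site.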
This means that a web site with excellent content isn’t penalized simply because it hasn’t been around for long, doesn’t have many links (having never sought a truckload of artificial ones), and adds fresh content on a very regular basis.
From Google’s point of view this will ensure they retain their position as the most popular search engine, because a searcher will be presented with the very best web site: one totally relevant to their keywords, providing the best and most comprehensive quality content currently available, even if that site doesn’t have many links and only arrived fairly recently.
Of course, if there are two sites of equal relevance and depth of quality, the longer-established one with the most back links will be first choice. What LSI means, however, is a dramatic shift in the 80/20 rule. Until now, this was taken to mean that 80% of what the search engine took into account were off-page factors, such as links, with only a modest 20% coming from the web site itself, no matter how good and relevant the content was.
Perhaps we will now see a reversal of the 80/20 rule, with those crude off-page factors falling to around 20% of the whole and the entire searching process properly centered where Google always wanted it to be: on good, relevant, quality content.
Copyright 2006 Paul Hooper-Kelly and InternetMarketingMagician.com
Paul Hooper-Kelly owns http://www.InternetMarketingMagician.com, helping people achieve their dream lifestyle by creating automated websites that provide passive income.
Visit his internet marketing blog and his website for more helpful internet marketing tips and insider secrets.