As smart as Google is, its search engine relies on an algorithm based on natural language and the relationships, or connectivity, of semantic occurrences (otherwise known as shingles, i.e. groups of words or phrases) to ascertain context.
These semblances, i.e. relationships, are the fabric of relevance and intent. They are the point where search engines start working for you: by theming your website’s content to reinforce a topic, you develop website authority as a by-product.
You hear people talking about theming a website, but what is a theme? A theme is a cluster of words (keywords and phrases) which have overlapping continuity that reinforces topical relevance. In other words, a masterful collection of organized content which drives home a specific subject or series of subjects.
While this makes perfect sense from a logical perspective for consumers, it also makes sense from the viewpoint of search engines (which strive to understand and rate context) to provide the most relevant result for a query.
Consider keywords as symbols or as a lexicon. When certain words are combined, they construct a meaningful context of expression. Search engines use these relationships as a precursor for sorting and sifting for relevance, using a process known as natural language processing.
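To make the idea of shingles from the top of this post concrete, here is a minimal sketch in Python. It splits two short page snippets into two-word shingles and measures how much they overlap. The function names, the sample text and the use of Jaccard overlap are my own assumptions for illustration; this is not how Google actually computes anything.

```python
def shingles(text, size=2):
    """Return the set of word-level shingles (groups of `size` adjacent words)."""
    words = text.lower().split()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles two sets have in common (0.0 = none, 1.0 = identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical page snippets, invented for this example.
page_a = "loose diamond engagement rings"
page_b = "diamond engagement rings on sale"

shared = shingles(page_a) & shingles(page_b)
overlap = jaccard(shingles(page_a), shingles(page_b))

print(sorted(shared))  # the phrases both snippets have in common
print(overlap)         # a rough measure of topical overlap
```

The more shingles two pages share, the more likely they belong to the same theme, which is the overlapping continuity described above.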
Taken a step further, these relationships (in a vector space model) equate to mathematical values which aid in determining the document’s relevance score. So, if four words exist on a page that have a relationship to another phrase which has meaning, those values are then calculated to produce a more or less favorable connection/score by enumerating the variations, or the related vectors.
In essence, the higher a page’s relevance score for a query, as determined by chronology, citation, target discrimination, trust and authority, the higher that page will rank when results are fetched from the index of the Google server cloud.
To provide an example, if I use the word “diamond” on a page and the word “mitt” appears, then a search engine understands the context in relation to the game of baseball (which shares the context). If the words rich, luster or shiny appear in context, then the model shifts to interpret the page as the literal meaning of the word “diamond”, as in carbon compressed deep in the earth over geological time to create “highly sought after gems”.
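A toy version of the vector space model can be sketched in a few lines of Python. Each text becomes a vector of word counts, and the cosine of the angle between the query vector and the page vector serves as a relevance score. The raw word counts, the function names and the sample texts are assumptions made for this illustration; real engines use far more refined term weights.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a text as a bag-of-words vector (word -> count)."""
    return Counter(text.lower().split())


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two term vectors (0.0 to 1.0 for counts)."""
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical query and pages, invented for this example.
query = term_vector("diamond engagement rings")
on_topic = term_vector("diamond engagement rings and loose diamond stones")
off_topic = term_vector("baseball diamond mitt and bat")

score_on = cosine_similarity(query, on_topic)
score_off = cosine_similarity(query, off_topic)

print(round(score_on, 3), round(score_off, 3))
```

Both pages contain the word “diamond”, but the page that shares more of the query’s terms scores noticeably higher, which is the favorable versus less favorable connection described above.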
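The “diamond” example can be mimicked with a very crude sketch: count how many cue words for each sense appear alongside the ambiguous term and pick the sense with the most hits. The cue lists below are hand-made assumptions for illustration (only “mitt”, “rich”, “luster” and “shiny” come from the example itself), and a real search engine learns such associations from a large corpus rather than from a fixed list.

```python
# Hypothetical cue words per sense; hand-picked for this example only.
CONTEXT_CUES = {
    "baseball": {"mitt", "bat", "pitcher", "inning", "umpire"},
    "gemstone": {"rich", "luster", "shiny", "carat", "gem"},
}


def guess_sense(text, cues=CONTEXT_CUES):
    """Return the sense whose cue words co-occur most with the text, or None."""
    words = set(text.lower().replace(",", " ").replace(".", " ").split())
    scores = {sense: len(words & vocab) for sense, vocab in cues.items()}
    best = max(scores, key=scores.get)  # on a tie, the first sense listed wins
    return best if scores[best] > 0 else None


print(guess_sense("He dropped his mitt on the diamond."))
print(guess_sense("A shiny diamond with a rich luster."))
print(guess_sense("diamond"))  # no context words, so no sense is chosen
```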
Aside from the on-page considerations, citation or “links from other sites” also impacts the rank order of the target page within a website. The key, or rather the cipher, is to (a) create the most relevant topical nodes selectively through site architecture, (b) create selective and specific internal linking to reinforce the purpose of the site and its themes and (c) mirror that by building links or “attracting links” from social media, bloggers or other sources to provide enough peer review to cross the tipping point of the website authority model.
Once your pages have enough citation, trust and relevance (based on semantic connectivity, inverse document frequency, term weights, target discrimination and hundreds of other asynchronous algorithms), the conclusion ripples across data centers and shows up in the search results as a direct change in where you rank.
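Inverse document frequency and term weights sound intimidating, but the textbook idea fits in a short Python sketch: a word that appears in every document tells you little, while a word that appears in only a few documents discriminates well. The formula below is the plain textbook variant (term count multiplied by the log of total documents over documents containing the term). The three-document corpus and the function names are assumptions for illustration, not the weighting any particular search engine uses.

```python
import math
from collections import Counter


def inverse_document_frequency(term, documents):
    """log(N / n): rarer terms across the corpus get a higher value."""
    containing = sum(1 for doc in documents if term in doc)
    if containing == 0:
        return 0.0
    return math.log(len(documents) / containing)


def term_weights(document, documents):
    """Weight each term in a document by its count times its IDF."""
    counts = Counter(document)
    return {t: counts[t] * inverse_document_frequency(t, documents) for t in counts}


# Hypothetical three-page corpus, each page already split into words.
corpus = [
    "diamond engagement rings".split(),
    "diamond wedding rings".split(),
    "baseball diamond mitt".split(),
]

weights = term_weights(corpus[2], corpus)
print(weights)
```

In this tiny corpus “diamond” appears on every page, so its weight drops to zero, while “baseball” and “mitt” carry all of the discriminating power for the third page.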
From this perspective SEO is easy once you understand the nodes of relevance, which keywords belong to which clusters and where the co-occurrence overlap is (when building out a site, its architecture and supporting content).
If you are really curious about how search engines or information retrieval systems really work, then I suggest you read Dr. E. Garcia at http://www.miislita.com/, My Little Island of IR (information retrieval) goodies.
The gist is to understand what search engines assign relevance to, what corpus of documents they use to determine “The Authority Set” as the apex of page one results and how you can structurally emulate this process from the outset to create theme-relevant websites from the inside out (baking the SEO into the process) instead of trying to fix a broken wagon after the fact.
My suggestion is that you read the following posts for further insight, so that SEO becomes less of a guessing game and more of a science. I am no expert on IR, but I understand enough to rank for keywords in extremely competitive markets in a fraction of the time, using a fraction of the links, by applying many of the principles of theming a website with silo site architecture.
This is ultimately where search engines are heading: as you make their job easier, they reward you for presenting the continuity they are programmed to reward, quantitatively and qualitatively alike. In other words, useful and laser focused (not nebulous like a flashlight with no topic in sight).
Articles from the archives include:
- SEO and Information Retrieval
- Relevance and Retrieval
- Taxonomy, Relevance Score & Website Authority
- Search Engine Optimization Concepts
- SEO and Search Engine Algorithms
And last but not least, go back to the source which created this series of events: “How Semantic Connectivity Affects SEO”. We hope you enjoy this blast from the past. In the meantime, comments are open and welcomed.
This article was very helpful. It helped me refresh my knowledge of SEO, and I’ve been a pal. Shared!
Thanks for introducing me to Dr. Garcia’s tutorials Jeffrey, awesome stuff. Even my Mathematician/Astro-Physicist wife was impressed! I really wish I had paid attention in advanced math in high school now, mind-bending stuff, but it helped me to understand how LSI works and debunked a lot of the myths floating around. I liked the word he invented for all the mythology… “blogonomies”. Fun times we are in.
Hi Matthew:
Sorry for the late reply. Agreed, it’s almost as if the quote from the new movie “Limitless” has credence here.
I now found math “useful” but in this sense, it applied to search engines.