Anatomy Of An Internet Search Engine  

Posted by Perfect Domain

Author: Dave Davies

For some unfortunate souls SEO is simply the learning of tricks

and techniques that, according to their understanding, should

propel their site into the top rankings on the major search

engines. This understanding of the way SEO works can be

effective for a time; however, it contains one basic flaw: the

rules change. Search engines are in a constant state of

evolution in order to keep up with the SEOs in much the same

way that Norton, McAfee, AVG or any of the other anti-virus

software companies are constantly trying to keep up with the

virus writers.

Basing your entire website's future on one simple set of rules

(read: tricks) about how the search engines will rank your site

contains an additional flaw: there are more factors being

considered than any SEO is aware of and can confirm. That's

right, I will freely admit that there are factors at work that I

may not be aware of and even those that I am aware of I cannot

with 100% accuracy give you the exact weight they are given in

the overall algorithm. Even if I could, the algorithm would

change a few weeks later and, what's more (hold on to your hats for

this one), there is more than one search engine.

So if we cannot base our optimization on a set of hard-and-fast

rules, what can we do? The key, my friends, is not to understand

the tricks but rather what they accomplish. Reflecting back on

my high school math teacher, Mr. Barry Nicholl, I recall a silly

story that had a great impact. One weekend he had the entire

class watch Dumbo The Flying Elephant (there was actually going

to be a question about it on our test). Why? The lesson we were

to get from it is that formulas (like tricks) are the feather in

the story. They are unnecessary and yet we hold on to them in

the false belief that it is the feather that works and not the

logic. Indeed, it is not the tricks and techniques that work but

rather the logic they follow, and that is their shortcoming.

And So What Is Necessary?

To rank a website highly and keep it ranking over time one must

optimize it with one primary understanding: that a search engine

is a living thing. Obviously this is not to say that search

engines have brains (I will leave those tales to Orson Scott

Card and other science fiction writers); however, their very

nature results in a lifelike being with far more storage

capacity.

Consider for a moment how a search engine functions: it

goes out into the world, follows the road signs and paths to get

where it's going, and collects all of the information in its

path. From this point, the information is sent back to a group

of servers where algorithms are applied in order to determine

the importance of specific documents. How are these algorithms

generated? They are created by human beings who have a great

deal of experience in understanding the fundamentals of the

Internet and the documents it contains and who also have the

capacity to learn from their mistakes, and update the algorithms

accordingly. Essentially we have an entity that collects data,

stores it, and then sorts through it to determine what's

important, which it's happy to share with others, and what's

unimportant, which it keeps tucked away.
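
The collect, store and sort cycle described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the example URLs, the `build_index` and `rank` helpers, and the simple word-count scoring are all hypothetical stand-ins, and no real engine weighs documents this crudely.

```python
from collections import Counter, defaultdict

# Hypothetical "collected" pages: URL -> text gathered by a spider.
PAGES = {
    "example.com/tea": "green tea and black tea brewing guide",
    "example.com/coffee": "coffee brewing guide for beginners",
    "example.com/about": "about this site",
}

def build_index(pages):
    """Store step: map each word to the pages containing it, with counts."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] += 1
    return index

def rank(index, query):
    """Sort step: score pages by how often the query words appear (toy scoring)."""
    scores = Counter()
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))

index = build_index(PAGES)
print(rank(index, "tea brewing"))
```

Pages that never match the query simply stay "tucked away": they remain in the index but are not returned.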

So Let's Break It Down ...

To gain a true understanding of what a search engine is, it's

simple enough to compare it to the human anatomy as, though not

breathing, it contains many of the same core functions required

for life. And these are:

The Lungs & Other Vital Organs - The

lungs of a search engine and indeed the vast majority of vital

organs are contained within the datacenters in which they are

housed, be it in the form of power, Internet connectivity, etc.

As with the human body, we do not generally consider these

important in defining who we are; however, we're certainly

grateful to have them and need them all to function properly.

The Arms & Legs - Think of the links

from the engine itself as the arms and legs. These are the

vehicles by which we get where we need to go and retrieve what

needs to be accessed. While we don't commonly think of these as

functions when we're considering SEO, these are the purpose of

the entire thing. Much as the human body is designed primarily

to keep you mobile and able to access other things, so too is

the entire search engine designed primarily to access the

outside world.

The Eyes - The eyes of the search

engine are the spiders (AKA robots or crawlers). These are the

1s and 0s that the search engines send out over the Internet to

retrieve documents. In the case of all the major search engines

the spiders crawl from one page to another following the links,

as you would look down various paths along your way. Fortunately

for the spiders they are traveling mainly over fiber optic

connections and so their ability to travel at light speed

enables them to visit all the paths they come across whereas we

as mere humans have to be a bit more selective.
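
To make the link-following idea concrete, here is a minimal sketch of a spider walking a made-up, in-memory web. The page names and the `crawl` helper are hypothetical, and a real spider would fetch pages over the network and extract links from them rather than read a ready-made dictionary.

```python
from collections import deque

# Hypothetical miniature web: each page maps to the pages it links to.
LINKS = {
    "home": ["products", "about"],
    "products": ["home", "widget"],
    "about": ["home"],
    "widget": [],
    "orphan": ["home"],  # nothing links here, so the spider never finds it
}

def crawl(start, links):
    """Visit every page reachable from `start` by following links (breadth-first)."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order

print(crawl("home", LINKS))
```

Note that the "orphan" page is never visited: with no inbound link there is no path for the spider to follow to it.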

The Brain - The brain of a search

engine, like the human brain, is the most complex of its

functions and components. The brain must have instinct, must

know, and must learn in order to function properly. A search

engine (and by search engine we mean the natural listings of the

major engines) must also include these critical three components

in order to survive.

The Instinct - The instinct of a

search engine is defined in its core functions, that is, the

crawling of sites and either the inability to read specific

types of data, or the programmed response to ignore files

meeting specific criteria. Even the programmed responses

become automated by the engines and thus fall under the category

of instinct much the same as the westernized human instinct to

jump from a large spider is learned. An infant would probably

watch the spider or even eat it, meaning this is not an automatic

human reaction.

The instinct of a search engine is important to understand;

however, once one understands what can and cannot be read and how

the spiders will crawl a site, this will become instinct for you

too and can then safely be stored in the "autopilot" part of

your brain.
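
One familiar example of a "programmed response to ignore files" is a robots.txt file. It is not mentioned above, so treat it here as my own illustration rather than part of the original argument. The sketch below uses Python's standard `urllib.robotparser` module; the rules, the URLs and the `ExampleBot` agent name are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for an example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def may_crawl(url, agent="ExampleBot"):
    """Return True if the rules above allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)

print(may_crawl("http://example.com/index.html"))
print(may_crawl("http://example.com/private/notes.html"))
```

A well-behaved spider runs a check of this kind before every fetch, which is why the response is automatic enough to count as instinct.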

The Knowing - Search

engines know by crawling. What they know goes far beyond what is

commonly perceived by most users, webmasters and SEOs. While the

vast storehouse we call the Internet provides billions upon

billions of pages of data for the search engines to know, they

also pick up more than that. Search engines know a number of

different methods for storing data, presenting data,

prioritizing data and, of course, ways of tricking the engines

themselves.

While the search engine spiders are crawling the web they are

grabbing the stores of data that exist and sending it back to

the datacenters, where that information is processed through

existing algorithms and spam filters, where it will attain a

ranking based on the engine's current understanding of the way

the Internet and the documents contained within it work.

Similar to the way we process an article from a newspaper based

on our current understanding of the world, the search engines

process and rank documents based on what they understand to be

true in the way documents are organized on the Internet.

The Learning - Once it is understood

that search engines rank documents based on a specific

understanding of the way the Internet functions, it then follows

that in order to ensure that new document types and technologies

are able to be read and that the algorithm be changed as new

understandings of the functionality of the Internet are

uncovered, a search engine must have the ability to "learn".

Aside from a search engine needing the ability to properly

spider documents stored in newer technologies, search engines

must also have the ability to detect and accurately penalize

spam, as well as to accurately rank websites based on new

understandings of the way documents are organized and links

arranged. Examples of areas where search engines must learn on

an ongoing basis include, but are most certainly not limited to:

- Understanding the relevancy of the content between sites where a link is found.

- Attaining the ability to view the content of documents contained within new technologies such as database types, Flash, etc.

- Understanding the various methods used to hide text, links, etc. in order to penalize sites engaging in these tactics.

- Learning, from current results and any shortcomings in them, what tweaks to current algorithms or what additional considerations must be taken into account to improve the relevancy of the results in the future.

The learning of a search engine generally comes from the

uber-geeks hired by the search engines and from their users. Once a

factor is taken into account and programmed into the algorithm

it then moves into the "knowing" category until the next round

of updates.

How This Helps in SEO

This is the point at which you may be asking yourself, "This is

all well and good, but exactly how does this help ME?" An

understanding of how search engines function, how they learn,

and how they live is one of the most important understandings

you can have in optimizing a website. This understanding will

ensure that you don't simply apply random tricks in hopes that

you've listened to the right person in the forums that day but

rather that you consider what the search engine is trying to do

and whether this tactic fits with the long-term goals of the engine.

For a while keyword density spamming was all the rage among the

less ethical SEOs as was building networks of websites to link

together in order to boost link popularity. Neither of these

tactics works today, and why? They do not fit with the long-term

goals of the search engine. Search engines, like humans, want to

survive. If the results they provide are poor then the engine

will die a slow but steady death and so they evolve.
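
For anyone who has not met the term, keyword density is simply the share of a page's words taken up by one keyword. Here is a rough sketch of the calculation. The `keyword_density` helper and both sample sentences are made up for illustration, and nothing here reflects the threshold or method any particular engine uses to spot stuffing.

```python
import re

def keyword_density(text, keyword):
    """Fraction of the words in `text` that equal `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

# Hypothetical page copy: one written for readers, one stuffed with a keyword.
natural = "Our shop sells handmade candles and soap made from local beeswax."
stuffed = "Candles candles cheap candles buy candles best candles candles online."

print(round(keyword_density(natural, "candles"), 2))
print(round(keyword_density(stuffed, "candles"), 2))
```

The stuffed copy scores several times higher than the natural copy, yet it tells the searcher nothing, which is exactly why an engine that wants to survive learns to discount it.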

When considering any tactic, you must ask: does this fit

with the long-term goals of the engine? Does this tactic in

general serve to provide better results for the largest number

of searches? If the answer is yes then the tactic is sound.

For example, the overall relevancy of your website (i.e. does

the majority of your content focus on a single subject) has

become more important over the past year or so. Does this help

the searcher? The searcher will find more content on the subject

they have searched on larger sites with larger amounts of

related content and thus this shift does help the searcher

overall. A tactic that includes the addition of more content to

your site is thus a solid one as it helps build the overall

relevancy of your website and gives the visitor more and updated

information at their disposal once they get there.

Another example would be in link building. Reciprocal links are

becoming less relevant and reciprocal links between unrelated

sites are virtually irrelevant. If you are engaging in

reciprocal link building, ensure that the sites you link to are

related to your site's content. As a search engine I would want

to know that a site in my results also provided links to other

related sites, thus increasing the chance that the searcher is

going to find the information that they are looking for one way

or another without having to switch to a different search

engine.
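
As a rough illustration of the reciprocal-link point, the sketch below finds pairs of sites that link to each other and labels each pair as related or unrelated. Everything in it is hypothetical: the site names, the one-word topic labels and the `related` check are stand-ins, and how an engine actually judges relatedness is not something I can confirm.

```python
# Hypothetical outbound links and topic labels for a handful of sites.
OUTLINKS = {
    "teashop.example": {"teablog.example", "carparts.example"},
    "teablog.example": {"teashop.example"},
    "carparts.example": {"teashop.example"},
}
TOPICS = {
    "teashop.example": "tea",
    "teablog.example": "tea",
    "carparts.example": "cars",
}

def reciprocal_pairs(outlinks):
    """Return sorted (a, b) pairs, a < b, where each site links to the other."""
    pairs = set()
    for site, targets in outlinks.items():
        for target in targets:
            if site in outlinks.get(target, set()):
                pairs.add(tuple(sorted((site, target))))
    return sorted(pairs)

def related(pair, topics):
    """Toy relatedness check: both sites carry the same topic label."""
    a, b = pair
    return topics.get(a) is not None and topics.get(a) == topics.get(b)

for pair in reciprocal_pairs(OUTLINKS):
    print(pair, "related" if related(pair, TOPICS) else "unrelated")
```

In this made-up data the tea shop's exchange with the tea blog is the one worth keeping; the exchange with the car parts site is the kind the paragraph above calls virtually irrelevant.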

In Short

In short, think ahead. Understand that search engines are

organic beings that will continue to evolve. Help feed them when

they visit your site and they will return often and reward your

efforts. Use unethical tactics and you may hold a good position

for a while but in the end, if you do not use tactics that

provide for good overall results, you will not hold your

position for long. They will learn.

Article Source: http://www.articlesbase.com/seo-articles/anatomy-of-an-internet-search-engine-2388.html
