Page Cloaking - To Cloak or Not to Cloak
Teacher: Sumantra Roy
Page cloaking can
broadly be defined as a technique used to deliver
different web pages under different circumstances.
There are two primary reasons that people use page
cloaking:
i) It allows them
to create a separate optimized page for each search
engine and another page which is aesthetically
pleasing and designed for their human visitors.
When a search engine spider visits a site, the page
which has been optimized for that search engine is
delivered to it. When a human visits a site, the
page which was designed for the human visitors is
shown. The primary benefit of doing this is that
the human visitors don't need to be shown the pages
which have been optimized for the search engines,
because the pages which are meant for the search
engines may not be aesthetically pleasing, and may
contain an over-repetition of keywords.
ii) It allows
them to hide the source code of the optimized pages
that they have created, and hence prevents their
competitors from being able to copy the source
code.
Page cloaking is
implemented with specialized cloaking
scripts. A cloaking script is installed on the
server, which detects whether it is a search engine
or a human being that is requesting a page. If a
search engine is requesting a page, the cloaking
script delivers the page which has been optimized
for that search engine. If a human being is
requesting the page, the cloaking script delivers
the page which has been designed for
humans.
There are two
primary ways by which the cloaking script can
detect whether a search engine or a human being is
visiting a site:
i) The first and
simplest way is by checking the User-Agent
variable. Each time anyone (be it a search engine
spider or a browser being operated by a human)
requests a page from a site, it reports a
User-Agent name to the site. Generally, if a search
engine spider requests a page, the User-Agent
variable contains the name of the search engine.
Hence, if the cloaking script detects that the
User-Agent variable contains the name of a search
engine, it delivers the page which has been
optimized for that search engine. If the cloaking
script does not detect the name of a search engine
in the User-Agent variable, it assumes that the
request has been made by a human being and delivers
the page which was designed for human
beings.
However, while
this is the simplest way to implement a cloaking
script, it is also the least safe. The User-Agent
value is supplied by whoever makes the request, so
it is easy to fake, and hence
someone who wants to see the optimized pages that
are being delivered to different search engines can
easily do so.
ii) The second
and more complicated way is to use I.P. (Internet
Protocol) based cloaking. This involves the use of
an I.P. database which contains a list of the I.P.
addresses of all known search engine spiders. When
a visitor (a search engine or a human) requests a
page, the cloaking script checks the I.P. address
of the visitor. If the I.P. address is present in
the I.P. database, the cloaking script knows that
the visitor is a search engine and delivers the
page optimized for that search engine. If the I.P.
address is not present in the I.P. database, the
cloaking script assumes that a human has requested
the page, and delivers the page which is meant for
human visitors.
Although more
complicated than User-Agent based cloaking, I.P.
based cloaking is more reliable and safer because
it is very difficult to fake an I.P.
address.
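To make the two detection methods described above concrete, here is a minimal sketch in Python. It is only an illustration of the checks, not the code of any real cloaking script. The spider names in SPIDER_AGENT_NAMES and the addresses in SPIDER_ADDRESSES are hypothetical placeholders (the addresses come from ranges reserved for documentation), and a real I.P. database would be far larger.

```python
import ipaddress

# Hypothetical spider names, for illustration only.
SPIDER_AGENT_NAMES = ("examplebot", "samplespider")

# Hypothetical "I.P. database": a small set of known spider addresses.
# These values come from ranges reserved for documentation.
SPIDER_ADDRESSES = frozenset(
    ipaddress.ip_address(addr) for addr in ("192.0.2.10", "198.51.100.7")
)


def is_spider_by_user_agent(user_agent: str) -> bool:
    """Return True if the User-Agent string contains a known spider name."""
    lowered = user_agent.lower()
    return any(name in lowered for name in SPIDER_AGENT_NAMES)


def is_spider_by_ip(remote_addr: str) -> bool:
    """Return True if the visitor's address is present in the I.P. database."""
    return ipaddress.ip_address(remote_addr) in SPIDER_ADDRESSES
```

The first function trusts a value that the visitor supplies, which is why it is so easy to fool. The second relies on the address the connection actually came from, but it is only as good as the list of addresses behind it.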
Now that you have
an idea of what cloaking is all about and how it is
implemented, the question arises as to whether you
should use page cloaking. The one word answer is
"NO". The reason is simple: the search engines
don't like it, and will probably ban your site from
their index if they find out that your site uses
cloaking. The reason that the search engines don't
like page cloaking is that it prevents them from
being able to spider the same page that their
visitors are going to see. And if the search
engines are prevented from doing so, they cannot be
confident of delivering relevant results to their
users. In the past, many people have created
optimized pages for some highly popular keywords
and then used page cloaking to take people to their
real sites which had nothing to do with those
keywords. If the search engines allowed this to
happen, they would suffer because their users would
abandon them and go to another search engine which
produced more relevant results.
Of course, a
question arises as to how a search engine can
detect whether or not a site uses page cloaking.
There are three ways by which it can do
so:
i) If the site
uses User-Agent cloaking, the search engines can
simply send a spider to a site which does not
report the name of the search engine in the
User-Agent variable. If the search engine sees that
the page delivered to this spider is different from
the page which is delivered to a spider which
reports the name of the search engine in the
User-Agent variable, it knows that the site has
used page cloaking.
ii) If the site
uses I.P. based cloaking, the search engines can
send a spider from an I.P. address which it
has never used previously.
Since this is a new I.P. address, the I.P. database
that is used for cloaking will not contain this
address. If the search engine detects that the page
delivered to the spider with the new I.P. address
is different from the page that is delivered to a
spider with a known I.P. address, it knows that the
site has used page cloaking.
iii) A human
representative from a search engine may visit a
site to see whether it uses cloaking. If she sees
that the page which is delivered to her is
different from the one being delivered to the
search engine spider, she knows that the site uses
cloaking.
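All three checks above come down to the same comparison: request the same page under two different identities and see whether the results match. The following Python sketch shows that comparison under stated assumptions. It does not describe how any particular search engine actually works. The names page_fingerprint and appears_cloaked are hypothetical, and fetch stands for any function you supply that takes a URL and a User-Agent string and returns the page body.

```python
import hashlib
from typing import Callable


def page_fingerprint(body: str) -> str:
    """Hash the page body after collapsing runs of whitespace."""
    normalized = " ".join(body.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def appears_cloaked(
    fetch: Callable[[str, str], str],
    url: str,
    spider_agent: str,
    browser_agent: str,
) -> bool:
    """Return True if the page differs between the two identities.

    `fetch(url, user_agent)` is a caller-supplied function (an assumption
    of this sketch) that returns the page body served to that User-Agent.
    """
    as_spider = fetch(url, spider_agent)
    as_browser = fetch(url, browser_agent)
    return page_fingerprint(as_spider) != page_fingerprint(as_browser)
```

A mismatch is a signal rather than proof, since pages with dynamic content can legitimately differ from one request to the next. The same comparison applies to I.P. based cloaking if the two requests are sent from different addresses instead of with different User-Agent names.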
Hence, when it
comes to page cloaking, my advice is simple: don't
even think about using it.
About the teacher:
Sumantra
is one of the most respected search engine
positioning specialists on the Internet. To have
Sumantra's company place your site at the top of
the search engines, go to http://www.1stSearchRanking.com/
For more advice on how you can take your web site
to the top of the search engines, subscribe to his
FREE newsletter by going to http://www.1stSearchRanking.com/newsletter.htm