The Problem with Automated Content
Article: October 9, 2006
Cast your mind back
about three years.
Shortly after the
dawning of the Google Adsense Age, webmasters learned that their sites
were effectively little gold mines or "virtual real estate"
as one expert put it. The more cyber-property you had, the more virtual
billboards you were able to put up (also called Adsense blocks). And
so if you made $n by owning one web page with an Adsense ad
(or any ad) on it, then it was reasonable to assume that you would make
$n x 10,000 if you had 10,000 pages with similar ads on them.
Similarly, reason
suggested that 1 million such pages would make you $n x 1,000,000.
Webmasters were
eager to rise to this Gold Rush challenge, and so were those present-day
providers of picks and shovels, the software developers. Applications
were developed which could produce thousands of web pages in less than
an hour from a keyword list. All you had to do was a little research
using Overture's keyword tool or its many free derivatives - the more
sophisticated practitioner of this art would have added Wordtracker
into the mix - and you had your keyword list.
Add some comparatives
or superlatives such as "better" or "best" or "latest"
before each keyword and you had an even bigger list. Then after each
keyword add "in New York" or "in London" or even
all the place names in the English-speaking world (there are over 30,000
of them) and you had a massive list. The software which was available
at the time could, and still can, produce whole websites consisting
of tens of thousands of pages from your own such bloated keyword handiwork.
Each page of that site would be highly optimized for one keyword phrase,
so that you could more or less guarantee that your page would be in
number one position on all the search engines, simply because it was
so specific. Such websites could be cranked out and uploaded to your
server all in the same day. You could produce 50 such websites, each
with thousands of pages, in a single month; all of them with Adsense
blocks on each page.
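To see how quickly such a list balloons, here is a minimal sketch of the expansion described above. It is not the code of any particular product of the time; the function name and the sample keywords, modifiers and places are all hypothetical, chosen only for illustration.

```python
from itertools import product


def expand_keywords(keywords, modifiers, places):
    """Build every 'modifier keyword in place' phrase from the three lists."""
    return [
        f"{modifier} {keyword} in {place}"
        for modifier, keyword, place in product(modifiers, keywords, places)
    ]


# Hypothetical sample data: two base keywords, three modifiers, two places.
keywords = ["digital camera", "car insurance"]
modifiers = ["better", "best", "latest"]
places = ["New York", "London"]

phrases = expand_keywords(keywords, modifiers, places)
print(len(phrases))  # 3 modifiers x 2 keywords x 2 places = 12
print(phrases[0])    # better digital camera in New York
```

The list grows as the product of the three input sizes, so a few hundred keywords combined with a handful of modifiers and 30,000 place names already runs into tens of millions of phrases, one page apiece.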
The problem was,
they were all unreadable.
Pages that were
manufactured at that speed could hardly rely on human dexterity in creating
their content. So the software which produced them - and it was ingenious
software - had to resort to other means. These largely fell into two
groups: RSS feeds and what came to be called "scraped" content.
The problem with RSS feeds was that lots of other people were using
the same feed. The problem with scraped content was that it belonged
to someone else. In both cases, the hyperlink which was obligatory (but
which could be turned off in the case of the scraped content) bled Pagerank
away and in other ways compromised the integrity of your site. Both
practices also had the habit of leaving footprints for the search engines
to spot. Lawyers' purses bulged a bit as well.
At about the same
time, people searching the Internet complained of seeing bland web pages
with content that was either non-existent, meaningless or repetitive
(even, heaven forbid, duplicate). The search engines addressed this
by punishing web sites that displayed those tendencies, and so raised
the informational quality of their listings for a while. This punishment
consisted of altering their algorithms so that sites or pages which
demonstrated such blandness were either pushed so far down the listings
that they effectively could not be seen, or delisted altogether (banned).
Along came a flurry
of remedies. You could pay ghost-writers at Elance or Rentacoder to
produce the content for you according to a specified keyword density
(but even at $3 an hour it was expensive if you wanted to replace all
those thousands of pages which had just been banned by Google). Then
a whole mini-industry of private label membership sites came along, charging
you a monthly fee to use their thousands of stock articles without any
copyright questions being asked. (But those articles seldom contained the
specific keyword phrases you wanted, and you could never control
the keyword density; also you just knew that lots of other people were
using the same articles from the same membership sites.)
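For readers unfamiliar with the term, keyword density can be sketched as follows. There is no single agreed definition; this version assumes one common convention (words belonging to the phrase, divided by total words), and the function name and sample text are hypothetical.

```python
import re


def keyword_density(text, phrase):
    """Share of the words in `text` that belong to occurrences of `phrase`.

    Assumed convention: (occurrences x words in phrase) / total words.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = phrase.lower().split()
    n = len(target)
    if not words or n == 0:
        return 0.0
    # Slide a window of n words across the text and count exact matches.
    hits = sum(
        1 for i in range(len(words) - n + 1) if words[i:i + n] == target
    )
    return hits * n / len(words)


# Hypothetical sample text of 12 words containing the phrase twice.
sample = "Cheap car insurance is easy to find. Compare car insurance quotes today."
density = keyword_density(sample, "car insurance")
print(f"{density:.1%}")  # 4 of 12 words -> 33.3%
```

A ghost-writer working to a specified density would simply be asked to keep this figure within an agreed range for the target phrase.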
Other software came
along and inserted random text at the top and bottom of each article,
so that each page became unique in its own way. Still more software
was produced which replaced common words in existing PLR (private label
rights) articles with stock synonyms (word was going round that if a page
differed from another page by 28 percent or more then you were okay). The problem
was that if the page was read as a whole, it made no sense at all. But
this could still fool the search engines. Just.
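The synonym trick is simple enough to sketch in a few lines. This is only an illustration under stated assumptions: the synonym table is hypothetical, and the "percent changed" measure (the share of word positions that differ) is a guess at how such a threshold might have been checked, since nobody outside the search engines knew what they actually measured.

```python
# Hypothetical stock synonym table; real tools shipped far larger lists.
SYNONYMS = {"big": "large", "quick": "fast", "buy": "purchase"}


def substitute(text, synonyms):
    """Swap each word that has a stock synonym, leaving the rest untouched."""
    return " ".join(synonyms.get(word, word) for word in text.split())


def percent_changed(original, rewritten):
    """Percentage of word positions that differ between two equal-length texts."""
    a, b = original.split(), rewritten.split()
    if len(a) != len(b):
        raise ValueError("texts must have the same number of words")
    if not a:
        return 0.0
    changed = sum(x != y for x, y in zip(a, b))
    return 100.0 * changed / len(a)


original = "buy a big camera with quick delivery"
spun = substitute(original, SYNONYMS)
print(spun)                                      # purchase a large camera with fast delivery
print(round(percent_changed(original, spun), 1)) # 3 of 7 words changed -> 42.9
```

A short sentence survives this treatment; a whole article run through blind word-for-word replacement is where the sense falls apart, because the table knows nothing about context.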
The search engines
were reported to have recruited thousands of student "editors"
to manually weed out such aberrations from their indices. More emphasis
was placed on non-reciprocal inbound links with the appropriate keywords
in the anchor text (or within ten words left or right of the anchor
text), and other "off-page" considerations. And so it went
on. And on.
There were all sorts
of "solutions" offered to those webmasters who had known the
heady days of the big-figure Google checks for doing very little, and
were willing to pay almost any price to return to them. Accordingly,
the software became more ambitious. In turn, the search engines became
more demanding, and there were increasing signs that perfectly legitimate
sites were being punished as well as the spam pages.
We seem to have
reached a point where something has to give. The browsing public does
deserve better than scraped content, RSS feeds and the abundance of
proto-plagiarism that it still gets. The need is for content that makes
sense, is readable by real people and is of value, as well as ticking
all the boxes of the search engine bots' latest algorithms. Equally,
webmasters need such content, yet they also have
an understandable need to be able to produce it on demand
for their increasingly information-hungry readers. To satisfy such demands
it is unlikely that one piece of software alone will suffice. Instead,
it seems clear that a system of content delivery needs to exist which
is actually sophisticated enough to produce content which is of value
to all concerned.
Gordon Goodfellow
is an Internet marketer and technologist, writer and researcher. His
Content Artist website explores
these issues further.
You are free to reproduce the above article as long as it is reproduced
in its entirety (including the author's biographical resource box)
and with any hyperlinks live and intact.
For more information and a free downloadable guide, contact:
Gordon Goodfellow, CEO
Content Artist
Suite 323, 258 Belsize Road,
London, UK, NW6 4BT.
Telephone: +44 (0) 208 421 3194
Fax: +44 (0) 208 428 8280
email: support [ at ] contentartist.com
Copyright © 2006 Content Artist