Are You Being Scraped? Should You Care?
Writing by Nick Stamoulis on Thursday, 3 of January, 2008 at 1:35 pm
You’ve seen ‘em. The scraper sites that are stealing your content. They’re a big pain in the you-know-what. You know it, I know it, they know it. But does it matter?
It might.
Scraper sites exist to make money. But how do they make money off of your content? It depends.
Some scraper sites will steal your content and place AdSense ads on those pages to gain the revenue from people clicking on those ads. Other scraper sites will sell advertising. But how do they get your content?
How Content Scrapers Operate
There are bots that will crawl your site looking for specific keywords. When they find them they scrape that content and place it on the blog or website of the person who programmed the bot. This is called content theft and it’s a big, big problem. Just visit some forums and you’ll see webmasters cursing up a storm about this stolen content. But what can you do about it?
Some of those scraper sites are banking on bloggers clicking through from their admin areas to see who is linking to them. They may not have any PageRank (PR) or clout at all and exist only to get the traffic from the people whose sites they are scraping. These scrapers aren’t really the problem. They’re a nuisance, but if you don’t click their ads then they likely aren’t making any money.
Identifying The Real Problem Scrapers
The problem sites are the scrapers that have a high PR. A high PageRank signals credibility to the search engines, so it’s possible that they will crawl those sites before they crawl yours. When that happens, the scraper can be credited with the original content even though it was stolen from you. If that is the case then you have something to worry about.
You’ll have to download the Google toolbar in order to see what the PR of those sites is. I recommend that you visit every content scraper site you find that has your content on it without your approval. Send them a letter asking them to remove your content. If they do not do so then find out who their ISP is and report them to it. You should also report to the search engines that someone is scraping your content and give them the URL of the page that carries it.
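Before you send that letter, it helps to confirm how much of your post a suspect page actually carries. Here is a rough sketch of one way to check, assuming you have already pasted the text of your post and the text of the suspect page into two strings. The function names (shingles, copied_fraction) and the five-word window are my own choices for illustration, not part of any tool mentioned above.

```python
import re

def shingles(text, size=5):
    """Return the set of overlapping word runs ("shingles") found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one window: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def copied_fraction(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in the suspect text."""
    source = shingles(original, size)
    if not source:
        return 0.0
    return len(source & shingles(suspect, size)) / len(source)

# Hypothetical sample strings, for illustration only.
my_post = "Scraper sites exist to make money off of content they did not write"
scraped_page = "Click here! " + my_post + " Buy now"
unrelated_page = "A completely different paragraph about gardening tips for spring"

print(copied_fraction(my_post, scraped_page))
print(copied_fraction(my_post, unrelated_page))
```

A fraction close to 1 means the page reproduces nearly all of your post word for word, and a fraction close to 0 means little or nothing was lifted. Where you draw the line in between is a judgment call, so treat the number as a prompt to go look at the page, not as proof.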
How To Fix The Scraping Problem
Those high PR sites scraping your content may have other content that you can’t see. They could have a blog or article directory in a folder on that site that isn’t linked to the scraped content. It’s invisible to human eyes, but the search engines can see it. That content could cause their site to be crawled more often than yours, in which case, if they get crawled before you do, they’ll get credited for original content and you’ll be dinged for duplicate content. That’s your real problem. And to fix that you have to communicate with the search engines to let them know that your site is the originator of the content. Be patient, though. You can rectify the situation with the search engines, but it doesn’t happen in two hours.


searchengineoptimizationjournal.com