This article on pagejacking is the result of a recent experience we had with a competitor who thought it would be a good idea to copy an entire page from our site using a sneaky method. It turned out he had done the same to dozens of others. I think he's regretting that strategy now :).
What is pagejacking?
In essence, pagejacking is the copying of a page by unauthorized parties in order to siphon off traffic to another site. The copying doesn't include just the wording - it's the whole box and dice. Traffic to the illegitimate page is then usually redirected to a competing or, at times, totally unrelated offer.
Why do people pagejack?
When you have the good fortune of having a page that ranks highly in the SERPs (Search Engine Results Pages), it brings you both good and bad attention. Some unscrupulous individuals may take copies of your pages in an attempt to get equally high or higher rankings, thereby capturing some of the traffic that really should have gone to your site.
Where the pagejacker is also well versed in search engine optimization, the *majority* of the search engine traffic that usually arrives on your site can end up redirected to the pagejacker. As you can imagine, this can be very costly to your online business.
How is pagejacking executed?
The "newbie" pagejacker simply copies your page in its entirety and pastes it into another page on their own site. They may add some of their own offers to the page and adjust the links in your content to point to other pages on their site. Only the most stupid of pagejackers use this process.
The more advanced pagejacking strategy is quite clever. First, a copy of your page is taken. A page is then created on the pagejacker's site that is basically a carbon copy of your content, including meta tags. The pagejacker then adds extra scripting so that only search engine robots can read the content of the page. A 302 redirect (set up in .htaccess) or a meta-refresh is then used to automatically send human viewers to a totally different page - they never see your content.
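The cloaking behaviour described above can be checked mechanically once you have two views of the same URL: one fetched while identifying as a search engine robot and one fetched as an ordinary browser. The following is a minimal sketch of that comparison, not a definitive detector. The Response class and both function names are hypothetical helpers invented for this example, and header names are assumed to be stored exactly as "Location".

```python
import re
from dataclasses import dataclass, field

# Matches a meta-refresh tag such as <meta http-equiv="refresh" content="0;url=...">
META_REFRESH = re.compile(r'<meta[^>]+http-equiv\s*=\s*["\']?refresh', re.IGNORECASE)

HTTP_REDIRECT_CODES = (301, 302, 303, 307, 308)


@dataclass
class Response:
    """Simplified HTTP response (hypothetical container, not a library type)."""
    status: int
    body: str = ""
    headers: dict = field(default_factory=dict)


def redirects_visitor(resp: Response) -> bool:
    """True if the response sends the visitor elsewhere,
    either with an HTTP redirect or with a meta-refresh tag."""
    if resp.status in HTTP_REDIRECT_CODES and "Location" in resp.headers:
        return True
    return bool(META_REFRESH.search(resp.body))


def looks_cloaked(as_robot: Response, as_browser: Response) -> bool:
    """Heuristic: the robot is served readable content with a 200 status,
    while the browser view of the same URL is redirected away."""
    robot_gets_content = as_robot.status == 200 and bool(as_robot.body.strip())
    return robot_gets_content and redirects_visitor(as_browser)
```

How the two responses are obtained is left out on purpose. One option is to request the page twice with different User-Agent headers, but a pagejacker who filters on robot IP addresses rather than on the User-Agent will not be caught that way, so treat a negative result with caution.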
How do I detect pagejacking?
You can detect pagejacking quite easily as most pagejackers will only bother with pages that have decent search engine rankings. Use the following process:
How do I deal with pagejacking?
Pagejackers by nature are a snivelling, cowardly breed and easy to deal with if you go about it in the right way.
If you have identified pagejacked content, the first thing you need to do is to save the cached copy of the page - this is very important as it is solid evidence.
One of the great features of Google is that when it displays cached copies of pages, it adds a box to the top of it with identifying information, including the URL and the date the cached copy was taken.
If you are using Internet Explorer, to save a copy of the cached page, simply go to "File", select "Save as" and in the "Save as type" dropdown option, choose "Web archive, single file (*.mht)". This option will download everything, including images and the Google info box into a single file. Having a single file makes it easier to transmit to other parties during the follow up process.
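Since the saved archive is your evidence, it can help to note exactly what you saved and when. The sketch below is one possible way to do that, under the assumption that a SHA-256 fingerprint and a UTC timestamp stored next to the archive are enough for your purposes. The function name, the field names and the ".evidence.json" suffix are all invented for this example.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def record_evidence(archive_path: str, infringing_url: str, original_url: str) -> dict:
    """Fingerprint a saved archive file and write the details to a JSON
    file next to it (hypothetical naming: <archive name>.evidence.json)."""
    path = Path(archive_path)
    data = path.read_bytes()
    record = {
        "archive_file": path.name,
        # The hash lets you show later that the file has not been altered.
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
        "infringing_url": infringing_url,
        "original_url": original_url,
        "recorded_at_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }
    sidecar = path.with_name(path.name + ".evidence.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return record
```

A call such as record_evidence("cached-copy.mht", their_url, your_url) leaves the archive untouched and only adds the small JSON file beside it.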
Once you have the archive file safely stored on your own computer, it's time to swing into action.
The first thing you should do is to contact the owner of the site. There is no need to be overly polite in the notification, but also do not be abusive. Bear in mind that in some cases, the pagejacker may *not* be the actual site owner. The owner of the site may have employed an unethical optimization company who used the pagejacking technique. Regardless, it is the site owner's responsibility to deal with the situation.
I recommend writing a brief note along these lines:
Subject = "Copyright infringement - (Domain Name)"
"It has come to my attention that you have made an unauthorized use of my copyrighted work located at (copyrighted work URL), by reproducing it on your site at (their URL with infringing copy). At no time have I given permission for you to reproduce my original content in such a way.
A cached copy from Google of the illegally copied content on your site is attached, along with details as to its location on your site and the date it was gathered. It appears that my content is being used on your site as part of a pagejacking strategy and is visible only to search engines.
As the legal owner of this copyrighted content, I demand that you remove my property from your site immediately.
You have 72 hours to remove this content. If the content is not removed within this time frame, then I will find it necessary to take further action, including contacting Google and your hosting service and pursuing any other legal avenues I have at my disposal."
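If you would rather assemble the note with a script than by hand, the sketch below builds it with Python's standard email package and attaches the cached copy. It is an illustration only: the function name, the parameter names and the condensed body text are assumptions made for this example, and nothing is sent. The Importance and X-Priority headers are only hints that mail software may ignore, and Disposition-Notification-To merely requests a read receipt, which the recipient can decline.

```python
from email.message import EmailMessage

# Condensed from the note above; adjust the wording to suit your case.
BODY = """It has come to my attention that you have made an unauthorized use of my
copyrighted work located at {original_url}, by reproducing it on your site at
{infringing_url}. At no time have I given permission for you to reproduce my
original content in such a way.

A cached copy from Google of the copied content on your site is attached.

As the legal owner of this copyrighted content, I demand that you remove my
property from your site immediately. You have 72 hours to remove this content.
"""


def build_notice(sender: str, recipient: str, domain: str,
                 original_url: str, infringing_url: str,
                 archive_bytes: bytes,
                 archive_name: str = "cached-copy.mht") -> EmailMessage:
    """Build (but do not send) the infringement notice."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Copyright infringement - {domain}"
    # Urgency hints; honoured by many mail clients but not guaranteed.
    msg["Importance"] = "high"
    msg["X-Priority"] = "1"
    # Read receipt request; the recipient's software may ask before replying.
    msg["Disposition-Notification-To"] = sender
    msg.set_content(BODY.format(original_url=original_url,
                                infringing_url=infringing_url))
    # Attach the saved archive as a generic binary file.
    msg.add_attachment(archive_bytes, maintype="application",
                       subtype="octet-stream", filename=archive_name)
    return msg
```

The returned message can be handed to smtplib or saved to disk with bytes(msg), whichever suits the way you send mail.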
Ensure you flag the email as urgent and select the read receipt option in your email software. If the content has not been removed after 72 hours, you should first contact the company hosting the site. The hosting company, as well as the domain name registrant, can usually be identified from the WHOIS record for the domain name by looking at the nameserver information, or by running a trace on the domain name.
If you do find it necessary to contact the hosting service, check the host's site first for its copyright complaint guidelines. The complaints process differs slightly from company to company, and it's important that you follow the submission guidelines carefully - a US company will usually direct you to follow the process laid out in the DMCA (Digital Millennium Copyright Act).
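For those comfortable with a little scripting, the WHOIS lookup can be done directly. The sketch below sends a plain WHOIS query over TCP port 43 and pulls the nameserver and registrar lines out of the reply. Several things here are assumptions: the default server, whois.verisign-grs.com, only answers for .com and .net names, the "Name Server" and "Registrar" labels follow that registry's output and differ elsewhere, and both function names are made up for this example.

```python
import socket


def whois_query(domain: str, server: str = "whois.verisign-grs.com",
                timeout: float = 10.0) -> str:
    """Send a WHOIS query: the domain name followed by CRLF on TCP port 43.
    The default server is an assumption and covers .com and .net only."""
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def extract_fields(whois_text: str, labels=("Name Server", "Registrar")) -> dict:
    """Collect the values of 'Label: value' lines for the given labels.
    Label matching ignores case; other lines are skipped."""
    wanted = {label.lower(): label for label in labels}
    found = {label: [] for label in labels}
    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        label = wanted.get(key.strip().lower())
        if sep and label and value.strip():
            found[label].append(value.strip())
    return found
```

Calling extract_fields(whois_query("example.com")) gives a dictionary of nameservers and registrar names. The nameservers often, though not always, belong to the hosting company, so treat them as a lead rather than as proof of who hosts the site.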
If the infringement has caused you a major loss in profit, then it is advisable that you contact your lawyer before taking any sort of action if it is within your means to do so.
How do I prevent pagejacking?
In short - you don't. It gets to a point where you can spend so much time in trying to protect your online business from parasites and copycats that you may as well not bother with having a site at all. Monitoring is the key in relation to pagejacking.
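Part of that monitoring can be automated by comparing the text of your own page with the text of a suspect page. The sketch below scores the overlap between two pieces of text using runs of five consecutive words, a common way of spotting near copies. The function names, the run length of five and any threshold you choose to act on are assumptions for this example, and collecting the suspect text (for instance from a search engine's cached copy) is left to you.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word runs ('shingles') in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one run: treat the whole text as a single run.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 5) -> float:
    """Jaccard similarity of the two shingle sets: 0.0 means no shared
    runs of words, 1.0 means the same set of runs."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

A score close to 1.0 suggests a carbon copy, while a copy padded with the pagejacker's own offers will score somewhat lower, so it is worth reviewing anything well above zero by hand rather than relying on a fixed cut-off.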
Other possible negative effects of pagejacking
I've read a number of reports on the subject of pagejacking that appear to indicate that some search engines will favor the pagejacked page over the original one to the point that the original page will be dropped from the SERPs altogether. The reason for this is that most search engines employ duplicate content filters - and the way some work is that the higher ranking page is usually the one that is kept.
One very important negative effect of pagejacking is damage to your brand. For instance, a pagejacker may copy a page that contains multiple instances of your business or product name. If the pagejacker is successful in achieving consistently higher rankings than your own content, unsuspecting surfers may begin to associate the brand with misleading content and steer clear of it altogether.
Protecting your site from online parasites is an ongoing battle; I hope this article has assisted you in dealing with one aspect of this multi-faceted war.
In Loving Memory - Mignon Ann Bloch
copyright (c) 1999-2011 Taming the Beast Adelaide - South Australia