
Why some sites aren't listed in search engines

It's one thing to have a low search engine ranking, quite another to have none at all. In a previous article, I summarized ethical and proven search engine submission/optimization strategies; but what if you've submitted your site and it hasn't been listed?

You're not alone - the question of why a site doesn't appear in search engine results, or why only a few of its pages do, is fairly common, and tracking down the cause can be frustrating. If you're having this issue, this article may provide some valuable clues and strategies for fixing it.

How do search engines gather information?

First of all, it doesn't hurt to have a basic understanding of how search engines gather information. 

Automated software programs, also known as bots, crawlers or spiders, are released by search engines and spend their time following links on web sites. A spider's visit can also be triggered by submitting a site's URL to a search engine.

When a manual search engine submission is performed, the bot doesn't spider the entire site immediately - the request is queued. Once the bot starts spidering, the process can take days, weeks, or even months. Even if the site is only partially spidered, results can then become available in the SERPs (Search Engine Results Pages).

As the spiders follow links throughout a site, they gather information about the content of the pages. The content includes not just what humans can see but also other aspects including meta tag and title tag information, who is linking to that page and where the links on the page are linking to. 
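To make the idea concrete, here is a minimal Python sketch of the kind of information a spider might pull from a single page: the title, the named meta tags and the links it would follow next. It is only an illustration using the standard library; the class and function names (PageSummary, summarize) are made up for this example, and real search engine spiders are far more sophisticated.

```python
from html.parser import HTMLParser


class PageSummary(HTMLParser):
    """Collects the title, named meta tags and link targets of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            # Store meta tags by lower-cased name, e.g. "description".
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            # These are the links a spider would queue up to follow.
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarize(html):
    """Parse an HTML string and return the populated PageSummary."""
    parser = PageSummary()
    parser.feed(html)
    return parser
```

Feeding it the source of one of your own pages shows roughly what a text-only visitor has to work with.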

If you're interested in learning how to identify whether a search engine robot is visiting your site, read our tutorial on search engine spider identification.

How do search engines rank?

The results of the search engine spiders' forays throughout your site are then processed by complex algorithms, or formulas. Even the text you use in file names and links, or the text that people use to link to you, plays a role in the algorithm (see anchor text optimization).

Each search engine company has a different set of algorithms and the specifics are closely guarded secrets. This is to help prevent site owners from "spamming" the engine; i.e. artificially inflating their rankings for particular queries.

After the page/site information is processed, the information is added to the search engine's database and is then available to searchers.

Sounds pretty simple I guess, but when you take into account that the most popular search engines have billions of pages in their indices - many competing for no.1 positions on particular queries, you can imagine just how much computing power is needed and how complex the algorithms are.

Given the complexity of indexing and ranking, things are bound to go wrong from time to time and some sites will be excluded for a variety of reasons. If you're having problems getting your site listed, one or more of the following may have occurred. This information is fairly generic and has been provided with the big 3 search engines in mind - Google, Yahoo and MSN.


How long ago was the site launched?

While search engines are much faster these days with indexing sites, it can still take a few months to get listed - depending on the search engine. If it's been less than 12 weeks since you submitted your site - don't panic. During the 12 weeks, check to see if spiders have been visiting your site.

If no spiders have been visiting your site, or they are only hitting the home page, and it's been more than 12 weeks since your submission, you can contact the search engine company to ask for some advice as to why you weren't listed.
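One way to check for spider visits is to scan your server's access log. The Python sketch below is a rough illustration only. It assumes the log is in the common "combined" format (request and user agent both in double quotes), and the function name spider_paths and the short list of user-agent tokens are illustrative choices rather than a complete list. Bear in mind that user-agent strings can be faked, so treat the results as a hint.

```python
from collections import defaultdict

# Substrings found in the user-agent strings of the big 3 engines' spiders.
# Illustrative only; engines run other bots as well.
SPIDER_TOKENS = {
    "googlebot": "Google",
    "slurp": "Yahoo",
    "msnbot": "MSN",
}


def spider_paths(log_lines):
    """Return {engine: set of requested paths} for recognised spiders."""
    paths = defaultdict(set)
    for line in log_lines:
        # Combined format: ... "GET /page HTTP/1.1" 200 123 "referer" "agent"
        parts = line.split('"')
        if len(parts) < 6:
            continue  # not in the assumed format, skip it
        request_fields = parts[1].split()
        user_agent = parts[5].lower()
        if len(request_fields) < 2:
            continue
        for token, engine in SPIDER_TOKENS.items():
            if token in user_agent:
                paths[engine].add(request_fields[1])
                break
    return dict(paths)
```

If an engine shows up with only "/" in its set of paths, the spider is hitting your home page but not going any deeper.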

When approaching a search engine company, always be very polite and to-the-point; begging is very unbecoming. Bear in mind that they are under no obligation to list your site. Also be prepared for an extended wait in relation to getting a response from them and for a lack of specifics if they mention that your site hasn't been listed as it doesn't meet their submission quality guidelines.

Have you checked properly? 

It may be that your site is listed, but just ranking very low. To check whether your site is listed, run a search on:

"site.com"

With "site" being your domain name. Include the quotes.

If you find your site, it's a good start - it just means you either have a naturally low ranking, ranking calculations haven't been finalized or that perhaps the site has been penalized. The latter part of this article covers some issues pertaining to penalties. If your site is squeaky clean, but still buried in the results, try reviewing some of our other optimization strategies.

Major changes or moved server?

Have you made major changes to your site recently? By major changes, I mean site-wide. Even moving to another server can cause search engines to hiccup. This is usually temporary, but you need to ensure that the rest of the site is squeaky clean in the interim. Also, when you move servers, it's best to leave the site live on the old server for a couple of weeks, as search engines may still look for it there while the new location information is propagating - search engines have a habit of caching IP addresses for a while. Learn more about issues related to moving servers.

No inbound links

As mentioned, search engine spiders find your site not only through direct submissions, but also through links from other listed sites that point to yours. If nobody else links to your site, this can contribute to the problem of getting listed. In these cases, it's wise to implement a linking campaign. Learn more about requesting reciprocal links and link exchange tips & software.

The Sandbox effect

There's been much talk of late amongst industry experts of some major engines being hesitant to rank newer websites until they have existed for more than a period of x months. If your site appears in one engine, but not another, then this can be an indicator of the "sandbox effect". There's not much you can do in this situation except continue working on your site and just wait.

Duplicate, non-original content

For search engines, it's not just the amount of content, but the quality. If your site consists mainly of someone else's content, this can prevent you from being listed or reduce your ranking. In extreme cases, it can even lead to a ban.

Lack of content, image/Flash based site

Search engines feed best on text - they cannot read images and have very limited abilities when it comes to dealing with Flash based sites. If your site is primarily image based, it may be time to rethink its design to allow for some textual content. If this isn't possible, then the use of "alt" text on all your images is strongly advised. Here's an example:

<a href="http://www.tamingthebeast.net"><img border="0" src="../images/picture.gif" alt="A description of this image including keywords" width="458" height="39"></a>
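If your site has a lot of images, checking each one by hand gets tedious. As a rough sketch, the following Python snippet lists the images on a page that have no "alt" text or only empty "alt" text. The names AltTextChecker and images_missing_alt are made up for this example, and it checks a single HTML string rather than crawling a whole site.

```python
from html.parser import HTMLParser


class AltTextChecker(HTMLParser):
    """Records the src of every <img> tag without usable alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        alt = attrs.get("alt") or ""
        if not alt.strip():
            # No alt attribute, or one that is empty or whitespace only.
            self.missing.append(attrs.get("src") or "(no src)")


def images_missing_alt(html):
    """Return the src values of images lacking alt text, in page order."""
    checker = AltTextChecker()
    checker.feed(html)
    return checker.missing
```

An empty list means every image on the page carries some descriptive text for the spiders to read.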

Dynamic sites

Dynamic sites are those where content is generated from a database instead of the traditional method of having the content "hard coded" on the actual page (known as static content). 

In the early days of dynamic site technology, search engine spiders would choke on long URLs or would shy away from URLs containing many parameters and special characters such as "?", "&", "id", "%", "+" and "=". While basic dynamic URLs are now handled without a problem, I've still seen many instances where pages with multiple parameters in the URL are ignored.

A good rule of thumb is that your URLs should not display more than 2 such parameters when viewed in the browser address bar. If you have more and your site is having problems with getting listed, consult your web developer about the issue.
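Checking a list of URLs against this rule of thumb is easy to automate. Here's a small Python sketch using the standard library; the function names and the default limit of 2 simply mirror the guideline above and are not any kind of official threshold.

```python
from urllib.parse import parse_qsl, urlsplit


def count_query_params(url):
    """Count the name=value pairs in the query string of a URL."""
    query = urlsplit(url).query
    return len(parse_qsl(query, keep_blank_values=True))


def too_many_params(url, limit=2):
    """True if the URL has more parameters than the rule of thumb allows."""
    return count_query_params(url) > limit
```

For instance, too_many_params("http://site.com/main.php?id=stuff&t=bleh&p=ick") flags the URL because it carries three parameters, while a static-looking URL such as "http://site.com/bleh-ick.htm" has none.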

Many dynamic sites are powered by off-the-shelf CMSs (Content Management Systems). In this situation, there are usually special bolt-on scripts, called mods, available to turn output from the CMS from dynamic style URLs to static. For example:

http://site.com/main.php?id=stuff&t=bleh&p=ick

can be changed to:

http://site.com/bleh-ick.htm

If there isn't a mod available, manual coding of your .htaccess file can produce the same result.
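Whichever way it's done, the idea is the same: the visitor and the spider see the static-looking address, and the server quietly translates it back into the dynamic one. The Python sketch below only illustrates that translation; on a live site the work is done by the web server's rewrite rules, not by Python. The pattern and the fixed "id" value of "stuff" are assumptions based on the example URLs above.

```python
import re

# Matches static-looking paths of the form /<t>-<p>.htm
STATIC_PATTERN = re.compile(r"^/([a-z0-9]+)-([a-z0-9]+)\.htm$")


def to_dynamic(path, page_id="stuff"):
    """Translate a static-looking path into the dynamic URL behind it.

    Returns None when the path doesn't match the assumed pattern.
    """
    match = STATIC_PATTERN.match(path)
    if match is None:
        return None
    t, p = match.groups()
    return f"/main.php?id={page_id}&t={t}&p={p}"
```

With these assumptions, to_dynamic("/bleh-ick.htm") gives "/main.php?id=stuff&t=bleh&p=ick", which is the mapping a rewrite rule for the example above would need to perform.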

Incorrect redirects, modifications or broken links

If you are using redirects or other forms of .htaccess redirection and/or rewriting, it's important to get it right. Each time a page is requested from your site, the server sends back a response header containing a status code. The codes returned to the spider include:

200 (OK)
404 (File not found) 
301 (moved permanently) 
302 (moved temporarily) 
403 (forbidden)

There are many other codes, but what search engine spiders want to see is either 200 or 301. If you are redirecting pages or domains, it's always wise to use a 301 redirect.
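If you'd like to check a response code yourself, the following Python sketch sends a HEAD request and reports the status code without following any redirects. It's a minimal illustration using the standard library; the function names are invented for this example, and some servers respond differently to HEAD requests than to normal page requests, so treat the result as a guide.

```python
import http.client
from urllib.parse import urlsplit

# The two codes the article says spiders want to see.
SPIDER_FRIENDLY = {200, 301}


def is_spider_friendly(status):
    """True for 200 (OK) and 301 (moved permanently)."""
    return status in SPIDER_FRIENDLY


def fetch_status(url, timeout=10):
    """Send a HEAD request; return (status code, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        # http.client does not follow redirects, so a 301 or 302 is
        # reported as-is along with the address it points to.
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

A redirect that comes back as 302 when you intended a permanent move is worth fixing in your server configuration.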

Here's a tool you can use to check your server header responses.

Also, don't forget to check your site for broken links; another issue that can impede a spider's progress.

Identical title and meta tags

If you're using the same title, description and meta tags on each page of your site, this can also cause ranking and listing issues. These tags should be different on each page. Learn more about meta tag and title tag optimization.
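Spotting repeated titles across a large site by eye is difficult. As a simple sketch, this Python function takes a mapping of page addresses to their title text (how you collect the titles is up to you) and reports any title shared by more than one page. The function name and the input format are assumptions made for this example; the same approach works for description tags.

```python
from collections import defaultdict


def find_duplicate_titles(titles_by_url):
    """Map each title used on more than one page to the pages sharing it.

    Titles are compared after trimming whitespace and lower-casing.
    """
    urls_by_title = defaultdict(list)
    for url, title in titles_by_url.items():
        urls_by_title[title.strip().lower()].append(url)
    return {
        title: sorted(urls)
        for title, urls in urls_by_title.items()
        if len(urls) > 1
    }
```

An empty result means every page in the mapping has its own title.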

Robots exclusion

While sites are in development, it's not unusual for the developer (if they have some search engine savvy) to add robots exclusion tags to pages, or entries to the robots.txt file. This is to prevent the site from being spidered while under development. If these are in place, they'll need to be adapted or removed altogether before the search engine spider will be able to retrieve data from your site.

Learn more about the robots exclusion meta tag or the robots.txt file
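To see whether a leftover robots.txt file is locking spiders out, you can test its rules with Python's standard robots.txt parser. The sketch below works on the text of the file, so you can paste in your own; the function name is_blocked is made up for this example, and it only covers robots.txt, not robots exclusion meta tags on individual pages.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, url, user_agent="*"):
    """True if the given robots.txt text forbids user_agent from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)
```

A typical development-time file contains "User-agent: *" followed by "Disallow: /", which blocks every page for every spider. Changing the second line to an empty "Disallow:" opens the site up again.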

Server issues

Murphy's law states, "if anything can go wrong, it will". If the server your site is hosted on experiences difficulties during the time that the search engine spider visits your site, this can cause the spider to "think" that the site doesn't exist. If you are aware of serious server downtime during the period after submission, it may be wise to try submitting your site again.

Note: when manually submitting your site, it's advisable not to repeatedly do so within a short space of time. Once every couple of weeks is enough. Unlike the old days, once your site is listed, there is no need to continue with manual submission. See our tutorial for free submission pages for Google, Yahoo and MSN + other search engine submission tips.

Once you're in, you're in and you should remain in unless your site is banned. Your site can be banned for a number of reasons, the same reasons that can prevent it from being listed in the first place; so let's now examine some of those:

Getting unbanned

Before we discuss the various issues that can get you banned, it's important to note that if your site has any of these issues and you do rectify them, you shouldn't assume that you'll be relisted automatically.

You'll need to contact the search engine company, state that you found an issue on your site that may have caused it to incur a penalty or ban and ask for it to be reconsidered for listing. Again, bear in mind that the search engine is under no obligation to list your site; so be very polite and very patient in waiting on a response.

Invisible text

If you have large amounts of text or keyphrases that are the same color as the background of your page, many search engines will view this as an attempt to spam. Whether intentional or unintentional, you should rectify this immediately. 

Other advertising campaigns

I've seen this happen many, many times - a site owner submits their web site to the engines, gets impatient after a couple of weeks and then decides to take out a "1,000,000 visitors for $24" campaign.

Many of these cheap traffic strategies can actually get you banned. Rule of thumb - if it sounds too good to be true, then it probably is - and you should steer clear of such schemes. This is especially the case where the method of delivery of traffic is through a page on your site being displayed. Not all such advertising campaigns are shonky, you just need to be careful.

Be patient - there really is no magic bullet solution to rankings and listings; solid and ethical search engine optimization strategies are still the only way to go.

Linking to bad neighborhoods

It's hard to define this and there are no hard and fast rules; but there has been a lot of anecdotal evidence to suggest that a new site linking to many banned sites can prevent the new site from being listed. It's wise to check out any sites you are thinking of linking to before doing so. If you find that you have linked to many banned sites, just remove the links. Learn more about linking and bad neighborhoods.

Link farms

If your site has little content and consists mainly of link exchanges, this can get you labeled as a link farm and can prevent you from being listed. Remove as many links as you can and focus on building content. If you want to engage in reciprocal linking the safe way, learn more in our link exchange tutorial.

Cross linking

For site owners who have a number of sites, it's not unusual to see them linking to each site from each site on every page in an effort to keep visitors within their "network" but also to boost search engine rankings. 

This can not only impede getting a new site listed, but also have a negative effect on all cross-linked sites. My advice is that if you do wish to cross-link, do it from one page only - your "about" page is probably most suitable for this.

Banned domain name

Another hard one to pin down, but this definitely happens on some engines. It's not uncommon for the owner of a banned site to simply dump the domain name. That domain name is then registered by someone else, but the ban may still remain in place. The only free way I know of to try and track a domain name's history is to enter the domain name at archive.org and see what it brings up. If you find that the previous owner of the domain name had illegal or questionable content, you can either dump the name yourself and register a new one, or contact the search engine and let them know that the ownership of the name and the content of the site have changed.

Improper use of search optimization software

Over the last couple of years, many software packages guaranteeing to increase search engine rankings have hit the market. These applications have their place, but I've also read of sites being banned through the over usage of certain packages. Learn more about search engine optimization software - the benefits, the dangers and responsible use.

Cloaked/doorway pages

A couple of years ago, there was a huge trend of people creating pages specifically for search engines, some that humans would never see. This usually worked one of two ways:

a) a series of pages would be created that were loaded with keywords and phrases, and no other content.

b) a special "sniffer" script was installed in a "real" page. When the script detected that a search engine robot had requested that page, it would actually deliver another page, most likely one described in a)

Search engine companies caught on to this and added detection methods to their algorithms to combat this kind of spamming.

Shady SEOP work

If you hired a Search Engine Optimization Professional to tweak your site, bear in mind that like in any other trade there are good, bad and evil practitioners. Just because a "professional" worked on the site, it doesn't mean they used ethical strategies. Run a search on the company to see if other people have had similar problems. Learn more about the sometimes shady world of SEOP and SEM companies.

This is just a brief listing of some of the points that can keep your site from being listed by search engines. I'll be updating it from time to time as other points come to mind. If you have an experience you'd like to share with others about why your site wasn't listed and what you did to rectify the issue, I'd love to hear from you!

Further learning resources

Review all our search engine optimization tutorials

Michael Bloch
Taming the Beast
http://www.tamingthebeast.net 
Tutorials, web content, tools and software.
Web Marketing, Internet Development & Ecommerce Resources
____________________________

Copyright information.... This article is free for reproduction but must be reproduced in its entirety, including live links & this copyright statement must be included. Visit http://www.tamingthebeast.net  for free Internet marketing and web development articles, tutorials and tools! Subscribe to our popular ecommerce/web design ezine!


In Loving Memory - Mignon Ann Bloch

copyright (c) 1999-2011  Taming the Beast  Adelaide - South Australia 
