Prevent duplicate indexing of phpBB3 threads by Google

In a default phpBB3 installation (without an SEO mod), the same thread can be reached through many URLs. For example, the thread Favourite Animorphs fanfiction can be reached through all of these URLs:

  • http://animorphsfanforum.com/viewtopic.php?f=5&t=60
  • http://animorphsfanforum.com/viewtopic.php?f=5&p=659
  • http://animorphsfanforum.com/viewtopic.php?f=5&t=60&p=659
  • http://animorphsfanforum.com/viewtopic.php?f=5&t=60&p=659#p659
  • http://animorphsfanforum.com/viewtopic.php?t=60
  • http://animorphsfanforum.com/viewtopic.php?p=659
  • http://animorphsfanforum.com/viewtopic.php?f=5&t=60&st=0&sk=t&sd=a
  • http://animorphsfanforum.com/viewtopic.php?f=5&t=60&st=0&sk=t&sd=a#p659

The aggressive Googlebot might pick up almost all of those links, which can lead to ranking problems in SERPs.

One way to resolve this duplicate indexing is to install an SEO mod such as the PhpBB-SEO.com SEO mod or Handyman’s SEO mod. But installing a mod and then updating it with every phpBB3 update and/or mod update can become burdensome. An easier alternative is to add these lines to your robots.txt:

User-agent: *
Disallow: /viewtopic.php?p=
Disallow: /viewtopic.php?=&p=
Disallow: /viewtopic.php?t=
Disallow: /viewtopic.php?start=
Disallow: /*&view=previous
Disallow: /*&view=next
Disallow: /*&sid=
Disallow: /*&p=
Disallow: /*&sd=a
Disallow: /*&start=0

This forbids all bots that follow robots.txt directives from crawling the redundant URLs. Your threads will be accessible to those bots only through URLs of the form http://www.domain.com/viewtopic.php?f=X&t=X.
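To check which of the example URLs these rules actually block, a small script can help. The following is only a sketch: it assumes Google-style matching (a rule is a prefix of the path plus query string, `*` matches any run of characters, a trailing `$` anchors the end), and the names `rule_to_regex` and `is_blocked` are made up for this illustration. Real crawlers may differ in the details.

```python
import re
from urllib.parse import urlsplit

# The Disallow values from the robots.txt above.
DISALLOW = [
    "/viewtopic.php?p=",
    "/viewtopic.php?=&p=",
    "/viewtopic.php?t=",
    "/viewtopic.php?start=",
    "/*&view=previous",
    "/*&view=next",
    "/*&sid=",
    "/*&p=",
    "/*&sd=a",
    "/*&start=0",
]

def rule_to_regex(rule):
    """Turn one Disallow value into a regex (assumed Google-style semantics)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape everything literally, then let '*' stand for "any characters".
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_blocked(url, rules=DISALLOW):
    """True if the path and query string of url match any Disallow rule."""
    parts = urlsplit(url)  # the #fragment is split off and ignored
    target = parts.path + ("?" + parts.query if parts.query else "")
    return any(rule_to_regex(rule).match(target) for rule in rules)

if __name__ == "__main__":
    base = "http://animorphsfanforum.com/viewtopic.php"
    for query in ["?f=5&t=60", "?f=5&p=659", "?t=60", "?p=659",
                  "?f=5&t=60&st=0&sk=t&sd=a"]:
        print(query, "blocked" if is_blocked(base + query) else "allowed")
```

Under those assumptions, only the ?f=5&t=60 form is reported as allowed; the other example URLs are caught by one of the rules.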

Note (1): If you display Google AdSense ads on your forum pages, you might need to allow the AdSense bot (Mediapartners-Google) to access the ‘forbidden’ URLs so that it can crawl the pages and display relevant ads. To do so, add these extra lines to your robots.txt:

User-agent: Mediapartners-Google
Disallow:

This will allow the Mediapartners-Google bot to access the forbidden URLs.
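The reason this works is that a bot uses the group that names it and ignores the generic `*` group, and an empty Disallow blocks nothing. The sketch below illustrates that selection with a deliberately simplified parser. The function names `parse_groups` and `rules_for` are hypothetical, the robots.txt is shortened to two rules, and user-agent matching is reduced to an exact, case-insensitive name comparison, which is looser in real crawlers.

```python
# A shortened robots.txt with a generic group and an AdSense-specific group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /viewtopic.php?p=
Disallow: /viewtopic.php?t=

User-agent: Mediapartners-Google
Disallow:
"""

def parse_groups(text):
    """Map each lower-cased user-agent name to its list of Disallow values."""
    groups, agents, in_rules = {}, [], False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (s.strip() for s in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field == "disallow":
            in_rules = True
            if value:  # an empty Disallow adds no restriction
                for agent in agents:
                    groups[agent].append(value)
    return groups

def rules_for(agent, groups):
    """Pick the group named after the agent, else fall back to '*'."""
    return groups.get(agent.lower(), groups.get("*", []))

if __name__ == "__main__":
    groups = parse_groups(ROBOTS_TXT)
    print("Mediapartners-Google:", rules_for("Mediapartners-Google", groups))
    print("Googlebot:", rules_for("Googlebot", groups))
```

Here the Mediapartners-Google bot ends up with an empty rule list, while any other bot falls back to the rules in the `*` group.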

Note (2): If your forum is in a subdirectory (say /forum/) rather than the root, prepend “/forum” to the paths in the robots.txt directives. It should look like this:

User-agent: *
Disallow: /forum/viewtopic.php?p=
Disallow: /forum/viewtopic.php?=&p=
Disallow: /forum/viewtopic.php?t=
Disallow: /forum/viewtopic.php?start=
Disallow: /forum/*&view=previous
Disallow: /forum/*&view=next
Disallow: /forum/*&sid=
Disallow: /forum/*&p=
Disallow: /forum/*&sd=a
Disallow: /forum/*&start=0
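If you would rather not edit every line by hand, the prefixed file can be generated. This is a minimal sketch; the helper name `robots_for` and its `base_path` parameter are invented for this example, and the rule list is simply copied from the directives in this post.

```python
# Disallow values for a forum installed at the site root.
RULES = [
    "/viewtopic.php?p=",
    "/viewtopic.php?=&p=",
    "/viewtopic.php?t=",
    "/viewtopic.php?start=",
    "/*&view=previous",
    "/*&view=next",
    "/*&sid=",
    "/*&p=",
    "/*&sd=a",
    "/*&start=0",
]

def robots_for(base_path=""):
    """Build the robots.txt text with every rule prefixed by base_path."""
    cleaned = base_path.strip("/")
    prefix = "/" + cleaned if cleaned else ""  # "" keeps the root layout
    lines = ["User-agent: *"]
    lines += [f"Disallow: {prefix}{rule}" for rule in RULES]
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    print(robots_for("/forum/"))
```

Calling `robots_for("/forum/")` should reproduce the listing above, and `robots_for()` the root version.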

If you face any problems, leave a comment and I’ll look into it.

13 Comments

  1. Pingback: phpbb3 SEO
  2. This post is about how to prevent duplicate indexing of threads, not how to get Google to crawl more of your threads.

    However, considering that your forum is 6 months old and Google didn’t index any of your threads, there seems to be something amiss.

    There could be several reasons for this:
    • You have very few backlinks.
    • You engaged in black-hat SEO (keyword spamming, etc.).
    • You accidentally blocked Googlebot through a robots.txt directive.

    I’d recommend that you add your site to Google Webmaster Tools and monitor the crawling errors encountered by Googlebot.

  3. I’ve had a forum running for more than 6 months, and so far only 4 pages have been indexed by Google, all of them static pages like the main page, FAQ and member list. My forum has more than 200 topics so far.

    Is your solution posted above able to solve this indexing problem?

  4. I have a Flash website.
    I’m looking for a script to show Google ads with Flash.
    Do you know of such a script?

  5. Well, Google’s TOS clearly states that:

    You shall not, and shall not authorize or encourage any third party to:
    (ii) edit, modify, filter, truncate or change the order of the information contained in any Ad, Link, Ad Unit, Search Result, or Referral Button, or remove, obscure or minimize any Ad, Link, Ad Unit, Search Result, or Referral Button in any way without authorization from Google; (iii) frame, minimize, remove or otherwise inhibit the full and complete display of any Web page accessed by an end user after clicking on any part of an Ad (“Advertiser Page”), any Search Results Page, or any Referral Page.

    So, I don’t think displaying Google ads that have been converted to Flash would be a good idea.

  6. I will try this and will be back to comment about the indexing results.
    Thanks for sharing, great job!
