Modules : News Module - Potential Duplicate Content Problem
Posted by tl001 on 2005/7/31 22:51:02 (4260 reads)

If you have set the Meta Robots option to "index,follow" under Preferences »» Meta Tags and Footer, be sure to add the following three lines to article.php (all versions) and newsbythisauthor.php (new version):




Quote:


// Switch ",follow" to ",nofollow" for this page only; the "index" part is kept
$meta_robots = str_replace( ',follow' , ',nofollow' , $xoopsTpl->get_template_vars( "xoops_meta_robots" ) ) ;
$xoopsTpl->assign( "xoops_meta_robots" , $meta_robots ) ;


just before the following lines:
Quote:

include_once XOOPS_ROOT_PATH.'/footer.php';
?>


This tells robots to index the page ONLY and not to follow any links on it, including "send to a friend", "print", "pdf", and other links.
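
As a rough illustration, assuming your theme prints the xoops_meta_robots value inside a standard robots meta tag (the exact markup depends on the theme and is an assumption here), the head of an article page would change roughly like this:

```html
<!-- Before the change: value taken from Preferences -->
<meta name="robots" content="index,follow" />

<!-- After the change: value rewritten by the str_replace() call -->
<meta name="robots" content="index,nofollow" />
```

Pages served by scripts you have not edited should keep the site-wide setting, since only the edited scripts reassign the value.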

With the default XOOPS settings, robots index your page and then follow the "print" and "pdf" links as well, producing three versions of the same document. You could easily get penalized by Google for duplicate content.

The issue has been brought to Hervé's attention, and hopefully a new version will incorporate the index,nofollow rule.

credit: the index,nofollow idea is from GIJOE's piCal module.

Chappy
Posted: 2005/8/5 23:49  Updated: 2005/8/5 23:49
Just popping in
Joined: 2004/8/17
From: Rowlett, TX
Posts: 20
 Re: News Module - Potential Duplicate Content Problem
As usual, great tip. Just updated my sites news and it went flawlessly.

Thanks!
Anonymous
Posted: 2005/8/15 4:42  Updated: 2005/8/15 4:42
 Re: News Module - Potential Duplicate Content Problem
hello,

Just one question: how do you know that you have a duplicate content problem?

bye,
Hervé
tl001
Posted: 2005/8/15 9:25  Updated: 2005/8/15 10:33
Webmaster
Joined: 2004/6/10
From:
Posts: 282
 Re: News Module - Potential Duplicate Content Problem
Hervé:

I don't know for sure (as I mentioned in the email), and that is why I said it could potentially be a problem. But why not just remove the possibility? As I have mentioned, robots do not need to click on the 'print' and 'pdf' links to create two more copies of the same content. They don't need to "rate/vote", "email", or "comment" either.

My only concern is whether the robots have indexed my article. I don't need the print version and the pdf version indexed. By the way, if you search Google you will find some XOOPS articles listed with the print version only. I don't think it helps a site to attract visitors through the print version. Personally I would like them to come in through the site with the theme and full menu displayed, so I would have a better chance of getting them to stay a bit longer.

tl
[edit] Also, if you have a subscription site with "robot noarchive" enabled, the PRINT and PDF versions will be cached, and that just defeats the no-cache attempt.[/edit]
zoullou
Posted: 2005/8/23 6:25  Updated: 2005/8/23 6:25
Just popping in
Joined: 2005/8/23
From:
Posts: 1
 Re: News Module - Potential Duplicate Content Problem
Hi,
I found this tip on the Googlebot page:
Quote:
How do I tell Googlebot not to crawl a single outgoing link on a page?

Meta tags can exclude all outgoing links on a page, but you can also instruct Googlebot not to crawl individual links by adding rel="nofollow" to a hyperlink. When Google sees the attribute rel="nofollow" on hyperlinks, those links won't get any credit when we rank websites in our search results. For example a link,

<a href=http://www.example.com/>This is a great link!</a>

could be replaced with

<a href=http://www.example.com/ rel="nofollow"> I can't vouch for this link</a>.


You can add this attribute to the print, pdf, and similar links so that Google doesn't index those pages.

Cheers
Anonymous
Posted: 2005/8/26 12:02  Updated: 2005/8/26 12:02
 Re: News Module - Potential Duplicate Content Problem
Hello Ted,

The more I learn about bots, the less sure I am!
I've read many articles, books and forums about this and, in the end, nobody is sure how bots work.
That's the only certainty.

In a recent article in the "Site Pro" newsletter I read that bots completely ignore "rel=nofollow" and that it's not useful in any way... ???!!!!

I also read that MSN and Yahoo need meta keywords while Google doesn't care about them.

Where the truth lies, I don't know.

The next major release of the News module will include options to let the user decide what to do and how to do it.

Bye,
Hervé
tl001
Posted: 2005/8/26 16:27  Updated: 2005/8/26 16:44
Webmaster
Joined: 2004/6/10
From:
Posts: 282
 Re: News Module - Potential Duplicate Content Problem
Based on their crawling patterns, Googlebot and MSNbot both obey

index,nofollow

I don't know much about Yahoo! robots, but I think they do too.

Keywords: Google does not weight them, but it might penalize sites that are overly stuffed with them (my own speculation).

MSN and Yahoo! may still weight keywords. I think that as long as Google remains the king of search engines in most cases (one of my sites gets over 90% of its search engine traffic from Google), the News module should not generate too many keywords.