Mar 29, 2006
Session IDs in PHP can cause some real problems when search engines index your pages. For this reason, you should disable PHPSESSID in URLs on your sites and keep session IDs in cookies instead. Disabling PHPSESSID in the URL can become a usability issue, since every visitor must have cookies enabled to use any code that requires sessions (such as login scripts). This is unfortunate, but in my mind worth the sacrifice.
Why disable PHPSESSID?
If a visitor comes to your site with cookies off, PHP will automatically add PHPSESSID to the URL of every page (when session.use_trans_sid is enabled). This is its way of maintaining state between pages.
Because many search engines spider your site with cookies off, they will see the version of your URL with the session ID included. A number of search engines will include the session ID in the index, and it can be hard to get rid of.
Session IDs in search engine indexes are ugly. They don't look good, and they confuse the visitor as to the real content on the page. They clutter your pristine search result with technical rubbish, and you lose that lovely whitespace around your listing that attracts searchers to it.
Worse, having PHPSESSID in a search result can cause duplicate content issues, and Google is unlikely to give out PageRank to a URL including a PHPSESSID. In my experience, pages with PHPSESSID in the URL seem to be in the supplemental index more often too.
Other reasons
If having ugly-looking search results and being in the supplemental index weren't enough reason to change, there are other reasons too.
When you have a session ID in your URL, it makes session hijacking a little easier for a hacker. Also, if you copy and paste the URL to a friend, they may end up sharing a shopping cart with you while they browse the site - sure to generate unexpected results.
Session IDs in URLs are a good idea if your site must work without cookies enabled, and search engine rankings are not that important to you.
How to Disable PHPSESSID
It's always best to approach the problem from several angles. Consider the following objectives...
- Prevent search engines from being given a PHPSESSID in the first place.
- Redirect any visitor that comes into the site with a PHPSESSID in the URL.
- Remove existing listings in the search engine indexes that already have PHPSESSID included
I'll assume for this you have PHP, since we are talking about PHP sessions here...
Step 1. Preventing PHPSESSID from appearing
Also assuming your webserver is Apache, insert the following code into .htaccess to prevent session IDs from appearing:
php_value session.use_trans_sid 0
This code tells the server to store the PHPSESSID in a cookie, or not to bother. If the browser does not have cookies enabled (eg Googlebot), then the session id won't move to the URL. This does mean that all functionality that relies on sessions will not work (such as session based logins). Keep this in mind.
Also note that I have had trouble getting this code to work on PHP hosts that run PHP in CGI mode. As a result, I moved all my sites to hosts that run PHP as an Apache module. Seems like overkill, but I really don't like these session IDs.
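Where the .htaccess directive is ignored (as on the CGI hosts mentioned above), the same setting can usually be applied from PHP itself before the session starts. This is a minimal sketch rather than what this site runs: disable_url_session_ids() is a hypothetical helper name, and it assumes it is called before any output is sent.

```php
<?php
// Hypothetical helper (not from the article): apply cookie-only session
// settings at runtime, for hosts where php_value in .htaccess is ignored.
function disable_url_session_ids(): bool
{
    // Session settings can only be changed before the session starts
    // and before any output has been sent.
    if (session_status() !== PHP_SESSION_NONE || headers_sent()) {
        return false;
    }
    // Never rewrite links or forms to carry the session ID.
    $transSid = ini_set('session.use_trans_sid', '0');
    // Accept session IDs from cookies only, not from the query string.
    $onlyCookies = ini_set('session.use_only_cookies', '1');
    return $transSid !== false && $onlyCookies !== false;
}

// Typical call site, at the top of a shared include:
// disable_url_session_ids();
// session_start();
```

A false return value means the settings were not applied, so it is worth checking during testing rather than assuming the call worked.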
Step 2. Redirecting visitors
Step 2 is to redirect all visitors that come into the site with a session ID from an outside link. If you allow the link to work as is, then you have a duplicate content problem (one piece of content available with 2 or more URLs).
The logic I use here is a little more general, because I firmly believe in the rule "one page, one URL".
Consider the following code, on every page of your site...
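$correcturl = 'http://www.ragepank.com/articles/26/disable-phpsessid/';
// The URL that was actually requested, including any query string
// (assumed here to be rebuilt from the request; adjust the scheme for HTTPS)
$actualurl = 'http://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'];
if ($correcturl != $actualurl) {
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: " . $correcturl);
    exit();
}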
If you try coming into this page with a PHPSESSID attached, this code will detect that the URL is wrong, and 301 redirect you to where you should be. This code takes care of session IDs, but also ALL other kinds of duplicate content issues. This code has the URL hard-coded into the script, but you would automate this on dynamic sites.
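On a dynamic site, one way to automate the hard-coded URL is to rebuild it from the request and drop the session parameter. The following is a minimal sketch, not the code this site uses: canonical_url() is a hypothetical helper, the http:// scheme is an assumption, and it only removes PHPSESSID, so other duplicate-URL cases would still need their own rules.

```php
<?php
// Hypothetical helper: rebuild the requested URL without the session parameter.
function canonical_url(string $host, string $requestUri, string $sessionName = 'PHPSESSID'): string
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    $query = parse_url($requestUri, PHP_URL_QUERY);
    if (!is_string($path) || $path === '') {
        $path = '/';
    }
    // Split the query string into parameters and drop the session ID.
    $params = [];
    if (is_string($query)) {
        parse_str($query, $params);
    }
    unset($params[$sessionName]);
    // Assumption: the site is served over plain http.
    $url = 'http://' . $host . $path;
    if ($params !== []) {
        $url .= '?' . http_build_query($params);
    }
    return $url;
}
```

The result could then stand in for $correcturl in the comparison above, for example canonical_url($_SERVER['HTTP_HOST'], $_SERVER['REQUEST_URI']).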
Step 3. Telling Google
Now that your redirections are working (and you have tested them), you need to tell Google to update its index and get rid of those ghastly session IDs. You don't want your 50 page website having 500 pages indexed in Google.
This next concept might seem a little strange, so bear with me.
You need to link to the pages that contain the PHPSESSID, including the PHPSESSID. Because search engines will never be given the same PHPSESSID twice, they are unlikely to find the exact page with the indexed PHPSESSID again. This is why you should link to it.
- Visit Google, Yahoo, MSN and search for all indexed pages on your site, eg... "site:ragepank.com".
- Make a list of all pages containing a PHPSESSID.
- Create a new page on your website, and link to it from somewhere obscure.
- Add links to all these PHPSESSID pages on this new page
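The link page from the steps above can be generated from the list you collected. This is only an illustrative sketch: render_link_page() is a hypothetical helper, and the URLs you pass in are whatever the search engines actually returned for your site.

```php
<?php
// Hypothetical helper: turn a list of indexed PHPSESSID URLs into an HTML list
// of links, so crawlers can follow each one and meet the 301 redirect.
function render_link_page(array $urls): string
{
    $items = '';
    foreach ($urls as $url) {
        // Escape the URL so "&" and quotes are valid inside HTML.
        $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
        $items .= "  <li><a href=\"{$safe}\">{$safe}</a></li>\n";
    }
    return "<ul>\n{$items}</ul>\n";
}
```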
Alternatively, put all the pages you want removed from Google into your XML sitemap. This does more or less the same thing.
Search engine robots will follow all the links you created (to the PHPSESSID pages). They will see the 301 redirection in place, and update their index accordingly.
Or so the theory goes anyway. In practice, allow search engines some time: it can take several months, even on a small website, before they remove the PHPSESSID pages from their listings.
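For the sitemap alternative mentioned above, the file can be generated from the same list of PHPSESSID URLs. A minimal sketch, assuming the sitemaps.org 0.9 format; build_sitemap() is a hypothetical helper name, not something from this site.

```php
<?php
// Hypothetical helper: build a minimal XML sitemap from a list of URLs.
function build_sitemap(array $urls): string
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($urls as $url) {
        // Sitemap entries must be XML-escaped, so "&" becomes "&amp;".
        $loc = htmlspecialchars($url, ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $xml .= "  <url><loc>{$loc}</loc></url>\n";
    }
    return $xml . "</urlset>\n";
}
```

Once the search engines have dropped the session URLs, the sitemap should go back to listing only the clean URLs.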
Results
This does take time. Once Google indexes a page, it can be difficult to change or get rid of the page. It can happen over time, but normally you need to treat Google like a child, and explicitly say (by using 301 redirects) which URLs you want changed.
This technique does work. The results are nice, clean search engine listings, and they are definitely worth the effort.
25 Comments
The simplest way to remove PHPSESSID is to put this in your config file of your site:
ini_set('session.use_trans_sid', false);
Saves you a lot of time!!
Should PHPSESSID be disabled for every website you create, or only for dynamic websites?
In other words, for a static website that does not use a login feature or dynamic web pages, should I disable PHPSESSID?
Thank you
try this instead:
ini_set("url_rewriter.tags","");
put this output_reset_rewrite_vars(); or ini_set('url_rewriter.tags', ''); before session_start();
Fili - Jun 7, 2007
Option 1 is not a good solution in my opinion.
PHPSESSID is not a BUG its a FEATURE.
Some application actually depend on it.
There must be some other way (maybe user-agent checking?) to solve this issue.
It all depends on the application of course - some web apps totally rely on session IDs to work, and by using option 1 you are breaking your app for those visitors without cookies.
Even now, I'm still coming across indexed pages with session IDs in the URLs, and aside from the duplicate content issues, it looks darn ugly and opens you up to session hijacking.
But PHPSESSID is much less of an issue for SEO now than it was a year ago (Google is getting smarter), so if your app relies on sessions, it might be worth leaving it alone.
Fili - Jun 7, 2007
I've heard that when using Google Sitemaps, only the URLs specified therein are used in the Google index. If this is true, then there is no need for fighting PHPSESSID in the first place.
Why doesn't the Google-bot just automatically strip PHPSESSID from an url?
Shouldn't be that hard right?
Leif - Jul 28, 2007
I came up with the following for a site I manage. I did not need session id's in the url and happen to get them indexed accidentally because I somehow got a 'session_start()' in my code...got rid of that but need to 301 redirect all the pages indexed with the 'phpsessid' junk in them. Anyway, I just did something like this:
$uri = $_SERVER['REQUEST_URI'];
$findme = 'PHPSESSID';
$checkPHPSESSID = strpos($uri, $findme);
if ($checkPHPSESSID !== false) {
$correcturl = 'http://www.sitedomain.com/'.$page;
header("HTTP/1.1 301 Moved Permanently");
header("Location: " . $correcturl);
exit();
}[\code]
Leif - Jul 28, 2007
oops, I put the wrong slash in that code tag. Maybe you can fix that for me Harvey? Thanks.
fyi - $page is a variable I set up to call each page into a master template.
I have made a new PageRank checking site and am facing the same PHPSESSID problem. Thanks for this useful post.
Anze - Aug 28, 2007
To remove the PHPSESSID from URLs you can use this redirect code:
RewriteCond %{QUERY_STRING} ^PHPSESSID=.*$
RewriteRule ^(.*)$ %{REQUEST_URI}? [L,R=301]
RewriteCond %{QUERY_STRING} ^(.*)&PHPSESSID=.*$
RewriteRule ^(.*)$ %{REQUEST_URI}?%1 [L,R=301]
Best,
Anze
Dallas - Sep 6, 2007
xoip's solution worked for me!
ini_set("url_rewriter.tags","");
Everything else was causing errors with my configuration.
Thanks!
Alex - Oct 21, 2007
ini_set("url_rewriter.tags",""); also worked for me! I don't know anything about PHP so I pasted it in near the beginning of my global.php file and crossed my fingers. ta-da! This has been bothering me for a while. Thanks guys.
To remove the PHPSESSID from the URL wherever it is (beginning, middle or end of the query string):
RewriteEngine On
RewriteCond %{QUERY_STRING} (.*)phpsessid=[^&]+&*(.*)$ [NC]
RewriteRule (.*) http://www.mydomain.com/$1?%1%2 [R=301,L]
xoip's solution also worked for me!
code:
ini_set("url_rewriter.tags","");
I set this before calling the session_start function and it has successfully removed the session ID from all of my URLs, and this is on a huge project that deals with large amounts of information and uses sessions.
This article itself is taking a risky step in attempting to remove sessions with the "tell the server to store the PHPSESSID in a cookie, or not to bother" approach when you are looking for a quick and cheap way to get rid of messy URLs.
ini_set('url_rewriter.tags', ''); has worked for me.
Thanks everyone, much appreciated. This was bugging me badly as the search engines have indexed my site really quickly
Gacrux - Sep 2, 2008
Great article, and a big thanks to Gilles for his apache redirect code.
Gee - Sep 18, 2008
Hello. I tried the redirect supplied by Gilles and it seems to work fine when this is the case:
www.mysite.com/page.html?PHPSESSID=in6mo8 (it strips off the correct part)
However on URLs like the following it doesn't seem to work:
www.mysite.com/pages/page1/?PHPSESSID=in6mo8
In that instance the URL still shows the sessid. Can someone tell me how to modify the mod_rewrite to take that condition into consideration? Thanks!
P.S. Before I posted my comment I noticed that this comment box was already populated with what appears to be SPAM. Not sure why but it seemed very strange!
Xoip. You sir, are a genius!!
Here is quick solution.
Try it
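if (isset($_GET['PHPSESSID'])) {
    $bad_url = $_SERVER['REQUEST_URI'];
    // Strip the PHPSESSID parameter whether it follows "?" or "&"
    $good_url = preg_replace('/([?&])PHPSESSID=[^&]*&?/', '$1', $bad_url);
    // Remove a trailing "?" or "&" left behind by the replacement
    $good_url = rtrim($good_url, '?&');
    header("Location: " . $good_url);
    exit();
}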
Thanks for the article, the best way it seems is to change the config file settings to:
ini_set('session.use_trans_sid', false);
Hi, i have a slight modification on the code that instead of specifying the original and supposed-to-be URL of a page, you just pass the original URL and it will be automatically cleaned...
http://japaalekhin.llemos.com/writings/how-to-remove-phpsessid-in-the-url
But then again an apache redirect is better... i'm just not into regular expressions
@Fili:
"Why doesn't the Google-bot just automatically strip PHPSESSID from an url?"
IMHO, it's not a good idea for Google to work around faults made by other engines, individuals or companies, because not all sites have pages with phpsessid in the URL. Imagine if Google had to process all URLs, those with and without phpsessid... it would be a burden for their crawlers.
Hi, as a follow-up to my previous comment: you can actually make Google ignore some parameters on your site using Webmaster Tools. I just saw it right now, so I guess I'll let you guys know.
Hey,
thanks for the post, that was very helpful. I changed the option in php.ini and rewrote the PHPSESSID pages in the index with the expression from Anze. Hope that will help!
Option 2
Include the URLs in a Google XML sitemap and submit it to Google - this uses the same logic as linking to them from a page.
Option 3
You can use the remove url function on Google http://services.google.com:8882/urlconsole/controller?cmd=siteDown
To remove a url, you need to make the page return a 404 error for that session id url. The page will be removed in 3-5 business days and will be removed for 6 months.
Add the following code at the start of the file for which the url has the session id cached on Google. Immediately after using the remove url tool, remove the code. It only needs to be on there for the moment of the submission of the url to Google
I.e., to remove the URL /file.php?PHPSESSID=1345345146
Add to the start of the file.php, the following code:
if (isset($_GET['PHPSESSID'])) {
header("HTTP/1.1 404 Not Found");
exit();
}
Then remove the code from the file.php file.