[TYPO3] mod_rewrite - Mass Redirection Help Needed (I'm Desparate!)

Oliver Rowlands oliver at liquidlight.co.uk
Wed May 23 23:34:59 CEST 2007


Hi Tim,

The matching can occur at any level.

Here's a quick guide on achieving SEO friendly site URI migrations:


1. The RewriteMap mapping file

First you'll need to create a RewriteMap file. This is essentially a 
mapping file containing the redirection values from the old site's URIs 
to the new Typo3 site's RealURL paths. The easiest way to implement this 
is with a simple text file containing mapping tuples, one tuple per line:

/old/uri_1 /new/uri_1
/old/uri_2 /new/uri_2
[...]

Based on the URI structure of your old site I suggest you target only 
the value of the 'doc_id' GET parameter. Since you have already created 
a MySQL database containing the redirects you need, it shouldn't be too 
hard for you to generate a text file with the following structure:

10 /new/realurl/path1/
20 /new/realurl/path2/
[...]

Save this file as 'redirects.map' and upload it to your web server.
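
If it helps, here is a minimal sketch of how such a file could be 
generated. It is written in Python purely for illustration; the function 
name render_rewrite_map and the sample rows are assumptions, and in 
practice the rows would come from a query against your own redirects 
database, whose table layout I don't know.

```python
def render_rewrite_map(rows):
    """Return RewriteMap text: one 'doc_id new_path' pair per line."""
    lines = []
    for doc_id, new_path in sorted(rows):
        doc_id = int(doc_id)  # keys must be plain integers
        # A value containing whitespace would be split by Apache, so reject it.
        if not new_path.startswith("/") or any(c.isspace() for c in new_path):
            raise ValueError(f"unusable target path: {new_path!r}")
        lines.append(f"{doc_id} {new_path}")
    return "\n".join(lines) + "\n"


# Hypothetical rows, standing in for the result of a database query.
rows = [(20, "/new/realurl/path2/"), (10, "/new/realurl/path1/")]
map_text = render_rewrite_map(rows)

if __name__ == "__main__":
    with open("redirects.map", "w", encoding="utf-8") as fh:
        fh.write(map_text)
```

Run as a script, this writes 'redirects.map' to the current directory.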

You can also convert this RewriteMap into a hashed DBM file, which is 
faster to look up, but that is beyond the scope of this guide.


2. The RewriteMap configuration

Next you need to define your RewriteMap in your Apache/virtual-host 
configuration. The RewriteMap directive cannot be used in a .htaccess 
file, so this might be a problem if your site is hosted on a shared 
server. If it is, contact your hosting provider and they should be able 
to assist you.

In your Apache/virtual-host configuration file add the following 
statement to your site's configuration:

RewriteMap redirects txt:/root/path/to/redirects.map

This tells Apache to instantiate a new RewriteMap with the alias 
'redirects', backed by the mapping file located at 
'/root/path/to/redirects.map'.

Make sure the 'redirects.map' file is readable by Apache (chmod a+r 
redirects.map), then reload Apache's configuration (/etc/init.d/apache 
reload).
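
Before reloading, it can be worth sanity-checking the mapping file. The 
sketch below is only an illustration in Python (check_map_text is a 
made-up helper, not part of Apache). It assumes the plain-text map format 
of whitespace-separated key/value pairs, where blank lines and lines 
starting with '#' are ignored, and it flags duplicate keys because they 
make the intended target ambiguous.

```python
def check_map_text(text):
    """Return a list of problems found in RewriteMap text (empty if none)."""
    problems = []
    seen = set()
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments carry no mapping
        fields = stripped.split()
        if len(fields) < 2:
            problems.append(f"line {number}: missing value")
            continue
        key = fields[0]
        if key in seen:
            problems.append(f"line {number}: duplicate key {key}")
        seen.add(key)
    return problems


if __name__ == "__main__":
    with open("redirects.map", encoding="utf-8") as fh:
        for problem in check_map_text(fh.read()):
            print(problem)
```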


3. The mod_rewrite .htaccess configuration

Finally you need to configure mod_rewrite to use the RewriteMap when a 
URI in the format '/its/index.cfm?doc_id=*' is requested.

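In your site's local .htaccess configuration (or, even better, in the 
Apache/virtual-host configuration, since this is faster and more memory 
efficient) define the following mod_rewrite configuration settings above 
Typo3's default RewriteCond/RewriteRule settings. Note that in the 
virtual-host context the RewriteRule pattern is matched against the full 
URL path, so it must start with a slash (^/its/index\.cfm$):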

RewriteCond %{QUERY_STRING} ^doc_id=([0-9]+).*$
RewriteRule ^its\/index\.cfm$ ${redirects:%1|/}? [R=301,L]

This configuration essentially tells Apache to do the following:

- If the GET query string starts with 'doc_id=' followed by an 
integer;
- And the requested URI is 'its/index.cfm';
- Then use the RewriteMap 'redirects' and search for the 'doc_id' value 
%1 (the integer value targeted in the RewriteCond);
- If a value for the integer is found in the redirect map then the 
visitor is redirected, using a permanent redirect (HTTP/1.x 301), to the 
new URI and all other GET parameters are removed (?);
- If no value is found then the visitor is redirected back to the 
homepage ('/').
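
To make the decision logic above concrete, here is a rough simulation of 
it in Python. It is not how Apache implements the rule; redirect_target 
is an invented name, the dictionary stands in for the 'redirects' map, 
and the path is given without a leading slash, as mod_rewrite sees it in 
a .htaccess context.

```python
import re

# The same patterns as the RewriteCond and RewriteRule above.
QUERY_RE = re.compile(r"^doc_id=([0-9]+).*$")
PATH_RE = re.compile(r"^its/index\.cfm$")


def redirect_target(path, query_string, redirects):
    """Return the 301 target for a request, or None if the rule does not apply."""
    cond = QUERY_RE.match(query_string)
    if cond is None or PATH_RE.match(path) is None:
        return None  # request falls through to the rules that follow
    # ${redirects:%1|/} : mapped value if the key exists, otherwise '/'
    return redirects.get(cond.group(1), "/")


# Hypothetical map contents, matching the example file in section 1.
redirects = {"10": "/new/realurl/path1/", "20": "/new/realurl/path2/"}
```

With this map, 'its/index.cfm' with the query string 'doc_id=10' maps to 
'/new/realurl/path1/', an unknown doc_id falls back to '/', and a query 
string that does not begin with 'doc_id=' is left alone.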


That should be it. Changes to the RewriteMap directive itself require a 
reload of Apache's configuration. For the mapping file, Apache caches 
looked-up keys in memory until the file's modification time changes or 
the server restarts, so edits should be picked up automatically; reload 
Apache if they are not.

The 301 permanent redirects are very important from an SEO point of 
view. Temporary 302 redirects, which Apache issues by default if you use 
the R flag without a status code, simply do not cut it. They will cause 
havoc in your search engine site indexing and will probably result in a 
loss of ranking due to duplicate content.

Keep in mind that using RewriteMaps has a performance impact though this 
will only be an issue if your site is receiving very large amounts of 
traffic.

Hope this helps!

Regards,

Oliver

Timothy Patterson wrote:
> Oliver,
> 
> I've seen this page/technique before, but from the looks of things I 
> cannot send http://www.domain.com/its/index.cfm?doc_id=10 and 
> http://www.domain.com/its/index.cfm?doc_id=20 to different pages due to 
> matching only taking place at the file level...  Is this in fact the case?
> 
> Thanks,
> Tim
> 
> Oliver Rowlands wrote:
>> Hi Timothy,
>>
>> It seems you didn't look hard enough. I suggest you have a look at 
>> Apache's mod_rewrite rewrite maps[1]. I've used them in the past to 
>> migrate sites with 1000's of pages to Typo3.
>>
>> Let me know if you want more information on how to set this up.
>>
>> Regards,
>>
>> Oliver
>>
>> [1] http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html#RewriteMap
>>
>> Timothy Patterson wrote:
>>> We are moving away from a CMS that utilizes URLs in the form of 
>>> "http://www.domain.com/its/index.cfm?doc_id=10" in favor of Typo3.
>>>
>>> I have RealURL working perfectly (great job devs!) and I am quite 
>>> satisfied with my setup.
>>>
>>> My problem is the fact that I have to keep all of the old URLs valid 
>>> on my site.  Due to the 1000s of pages on my site I would like to 
>>> implement a type of redirection database to keep the old URLs valid 
>>> while increasing manageability of the redirections.
>>>
>>> After doing some research online, I have found that there is no 
>>> clean-cut way to redirect a URL in the style that our old CMS uses.
>>>
>>> Example:
>>> You can only redirect "http://www.domain.com/its/index.cfm" to any 
>>> URL.  You cannot redirect 
>>> "http://www.domain.com/its/index.cfm?doc_id=10" to a URL and 
>>> "http://www.domain.com/its/index.cfm?doc_id=20" to a different URL.  
>>> Catch my drift?  "Standard" redirects only allow you to match file 
>>> names and not the parameters after the file name.
>>>
>>> My solution was to create a MySQL database, then use a combination of 
>>> mod_rewrite with PHP.
>>>
>>> If you view my sample .htaccess file and PHP script below, you can 
>>> see my problem...  My setup works great except for the fact that I 
>>> get a ?myredirect=0 appended to every single URL on my site.  I've 
>>> played with mod_rewrite for hours to try to get rid of that, but it 
>>> seems to be the only way to avoid an infinite rewriting loop (it is 
>>> the only way I can communicate from the PHP script back to 
>>> mod_rewrite).  I really don't want the ?myredirect=0 on every page.
>>>
>>> Does anyone have any suggestions?  Any better ways of accomplishing 
>>> this goal?  Does anyone know of any other mass redirection 
>>> solutions?  Any other hybrid PHP / mod_rewrite possibilities?  Am I 
>>> just crazy?
>>>
>>> If I can get some help I'd be glad to write up a How-to doc that will 
>>> help others migrating away from a different CMS.
>>>
>>> Here is what I have so far...  (It works, just adds a ?myredirect=0 
>>> to every page's url.  Grr!)
>>>
>>> mod_rewrite (in a .htaccess file):
>>> RewriteEngine On
>>> RewriteCond %{QUERY_STRING} !(myredirect)
>>> RewriteRule .* /redirect.php?dbredirection=%{REQUEST_URI} [L,QSA]
>>>
>>> And my PHP code:
>>> <?php
>>> // DB Connectivity Include...
>>> include "config.php";
>>>
>>> // Reads QUERY_STRING environment variable and grabs everything after
>>> // 'dbredirection='
>>> list($junk, $request) = split("dbredirection=", getenv('QUERY_STRING'));
>>>
>>> // Replace the first instance of a & with a ? (caused by mod_rewrite)
>>> $request = preg_replace("/(\&)/", "?", $request, 1);
>>>
>>> // Connect to MySQL...
>>> $connection = mysql_connect($db_host, $db_user, $db_pass);
>>> mysql_select_db($db);
>>>
>>> // Get today's date in MySQL format...
>>> $expdate = date('Y-m-d');
>>>
>>> // Build Query...
>>> // Select & Check Expiration - Default is 2050-01-01 so we should be ok
>>> // with this...
>>> $query = "SELECT newurl FROM redirects WHERE oldurl='$request' AND 
>>> expires >= '$expdate' LIMIT 1";
>>>
>>> // Execute Query...
>>> $result = mysql_query($query);
>>> if(mysql_num_rows($result) != 0)
>>> {
>>>     // Retrieve Row...
>>>     $row = mysql_fetch_row($result);
>>>
>>>     //echo "Redirecting to: $row[0]";
>>>     //die;
>>>
>>>     // Perform Safe, "Permanent" Redirect...
>>>     header("HTTP/1.1 301 Moved Permanently");
>>>     header("Location: $row[0]");
>>>     die;
>>> } else {
>>>     // No entries exist...  Try to go to page...
>>>     // Add ?myredirect=0 to the URL to prevent mod_rewrite loops!!!
>>>
>>>     // Does $request contain a '?'
>>>     $qmark = strstr($request, '?');
>>>     if(!$qmark)
>>>     {
>>>         $request .= "?myredirect=0";
>>>     } else {
>>>         $request .= "&myredirect=0";
>>>     }
>>>     header("Location: $request");
>>>     die;
>>> }
>>> ?>
>>
>>


-- 
Oliver Rowlands
:: Liquid Light ::

E - oliver at liquidlight.co.uk
W - http://www.liquidlight.co.uk

T - 00 44 (0)845 6 58 88 35
F - 00 44 (0)845 6 58 44 35

