[TYPO3] mod_rewrite - Mass Redirection Help Needed (I'm Desperate!)

Timothy Patterson tjpatter at svsu.edu
Thu May 24 14:24:34 CEST 2007


Oliver,

Thank you very much for the detailed guide.  It has gotten me back on 
the right track.

I do however have one question...  Is it possible to match on the entire 
REQUEST_URI (an example: /folder/index.php?url=blah) instead of only the 
doc_id?  I would essentially like to centralize all of my site's 
redirections into this system if possible.  I'm just not sure if the 
RewriteMap file supports exact URL matching...

I can see how I could grab the %{REQUEST_URI} variable in mod_rewrite's 
config and pass it to the RewriteMap as the lookup key.  Would that do 
what I want, though?
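
For reference, a minimal sketch of the kind of full-URL lookup I have in mind. It is untested and rests on assumptions: mod_rewrite's %{REQUEST_URI} holds only the path, so the query string is appended from %{QUERY_STRING}; the map keeps the name 'redirects' and the file path used in the guide below; and the map file would contain lines such as '/folder/index.php?url=blah /new/realurl/path/', where the target path is made up for illustration.

```apache
# Server or virtual-host context; RewriteMap cannot be declared in .htaccess.
RewriteEngine On
RewriteMap redirects txt:/root/path/to/redirects.map

# Requests with a query string: use "path?query" as the lookup key.
# The second condition only matches when the map returns a path for that key.
RewriteCond %{QUERY_STRING} .
RewriteCond ${redirects:%{REQUEST_URI}?%{QUERY_STRING}} ^(/.+)$
RewriteRule ^ %1? [R=301,L]

# Requests without a query string: use the path alone as the key.
RewriteCond %{QUERY_STRING} ^$
RewriteCond ${redirects:%{REQUEST_URI}} ^(/.+)$
RewriteRule ^ %1? [R=301,L]
```

With this layout, requests that have no entry in the map are left alone and continue on to the normal rules. The match is exact, so the same parameters in a different order would need their own line in the map.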

Thanks again for all of your help!
~Tim

Oliver Rowlands wrote:
> Hi Tim,
> 
> The matching can occur at any level.
> 
> Here's a quick guide on achieving SEO friendly site URI migrations:
> 
> 
> 1. The RewriteMap mapping file
> 
> First you'll need to create a RewriteMap file. This is essentially a 
> mapping file containing the redirection values from the old site's URIs 
> to the new Typo3 site's RealURL paths. The easiest way to implement this 
> is with a simple text file containing mapping tuples, one tuple per line:
> 
> /old/uri_1 /new/uri_1
> /old/uri_2 /new/uri_2
> [...]
> 
> Based on the URI structure of your old site I suggest you target only 
> the value of the 'doc_id' GET parameter. Since you have already created 
> a MySQL database containing the redirects you need, it shouldn't be too 
> hard for you to generate a text file with the following structure:
> 
> 10 /new/realurl/path1/
> 20 /new/realurl/path2/
> [...]
> 
> Save this file as 'redirects.map' and upload it to your web server.
> 
> You can also create hashed (DBM) versions of this RewriteMap, which are 
> faster to look up, but they are out of scope for this guide.
> 
> 
> 2. The RewriteMap configuration
> 
> Next you need to define your RewriteMap in your Apache/virtual-host 
> configuration. It cannot be defined in a .htaccess file, so this might 
> be a problem if your site is hosted on a shared server. If it is, 
> contact your hosting provider and they should be able to assist you.
> 
> In your Apache/virtual-host configuration file add the following 
> statement to your site's configuration:
> 
> RewriteMap redirects txt:/root/path/to/redirects.map
> 
> This tells Apache to create a new RewriteMap named 'redirects', backed 
> by the mapping file located at '/root/path/to/redirects.map'.
> 
> Make sure the 'redirects.map' file is readable by Apache (chmod a+r 
> redirects.map) then reload Apache's configuration (/etc/init.d/apache 
> reload).
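
A minimal sketch of where the directive might sit, assuming a name-based virtual host for www.domain.com. The '*:80' address is a placeholder and not taken from the real server.

```apache
<VirtualHost *:80>
    ServerName www.domain.com

    # Enable the rewrite engine for this virtual host.
    RewriteEngine On

    # Declare the map under the name 'redirects' so that rewrite rules
    # for this site can look keys up in it.
    RewriteMap redirects txt:/root/path/to/redirects.map
</VirtualHost>
```
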
> 
> 
> 3. The mod_rewrite .htaccess configuration
> 
> Finally you need to configure mod_rewrite to use the RewriteMap when a 
> URI in the format '/its/index.cfm?doc_id=*' is requested.
> 
> In your site's local .htaccess configuration (or even better in the 
> Apache/virtual-host configuration since this is faster and more memory 
> efficient) define the following mod_rewrite configuration settings above 
> Typo3's default RewriteCond/RewriteRule settings:
> 
> RewriteCond %{QUERY_STRING} ^doc_id=([0-9]+).*$
> RewriteRule ^its\/index\.cfm$ ${redirects:%1|/}? [R=301,L]
> 
> This configuration essentially tells Apache to do the following:
> 
> - If the GET query string begins with 'doc_id=' followed by an 
> integer;
> - And the requested URI is 'its/index.cfm';
> - Then use the RewriteMap 'redirects' and search for the 'doc_id' value 
> %1 (the integer value captured in the RewriteCond);
> - If a value for the integer is found in the redirect map then the 
> visitor is redirected, using a permanent redirect (HTTP/1.x 301), to the 
> new URI and all other GET parameters are removed (the trailing '?');
> - If no value is found then the visitor is redirected back to the 
> homepage ('/').
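
If the two directives are placed in the virtual-host configuration rather than in .htaccess, the rule pattern changes slightly, because in that context mod_rewrite matches against the full URL path including its leading slash. A sketch of that variant, otherwise the same as the rules in the guide:

```apache
# Virtual-host context: the pattern starts with a slash.
RewriteCond %{QUERY_STRING} ^doc_id=([0-9]+).*$
RewriteRule ^/its/index\.cfm$ ${redirects:%1|/}? [R=301,L]
```
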
> 
> 
> That should be it. You should not need to reload Apache every time you 
> alter the mapping file: mod_rewrite caches looked-up keys in memory and 
> discards that cache when the file's modification time changes. A reload 
> is only needed after changing the RewriteMap directive itself.
> 
> The 301 permanent redirects are very important from an SEO point of 
> view. Temporary 302 redirects, which Apache sends by default if you use 
> the R flag without a status code, simply do not cut it. They will cause 
> havoc in your search engine site indexing and will probably result in a 
> loss of ranking due to duplicate content.
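
To illustrate the difference, a small sketch written for .htaccess context. The file names and target paths are made-up placeholders.

```apache
# R without a status code: Apache answers with a 302 (temporary) redirect.
RewriteRule ^old-temporary\.html$ /new/temporary/ [R,L]

# R=301: Apache answers with a 301 (permanent) redirect.
RewriteRule ^old-permanent\.html$ /new/permanent/ [R=301,L]
```
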
> 
> Keep in mind that using RewriteMaps has a performance impact though this 
> will only be an issue if your site is receiving very large amounts of 
> traffic.
> 
> Hope this helps!
> 
> Regards,
> 
> Oliver
> 
> Timothy Patterson wrote:
>> Oliver,
>>
>> I've seen this page/technique before, but from the looks of things I 
>> cannot send http://www.domain.com/its/index.cfm?doc_id=10 and 
>> http://www.domain.com/its/index.cfm?doc_id=20 to different pages due 
>> to matching only taking place at the file level...  Is this in fact 
>> the case?
>>
>> Thanks,
>> Tim
>>
>> Oliver Rowlands wrote:
>>> Hi Timothy,
>>>
>>> It seems you didn't look hard enough. I suggest you have a look at 
>>> Apache's mod_rewrite rewrite maps[1]. I've used them in the past to 
>>> migrate sites with thousands of pages to Typo3.
>>>
>>> Let me know if you want more information on how to set this up.
>>>
>>> Regards,
>>>
>>> Oliver
>>>
>>> [1] http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html#RewriteMap
>>>
>>> Timothy Patterson wrote:
>>>> We are moving away from a CMS that utilizes URLs in the form of 
>>>> "http://www.domain.com/its/index.cfm?doc_id=10" in favor of Typo3.
>>>>
>>>> I have RealURL working perfectly (great job devs!) and I am quite 
>>>> satisfied with my setup.
>>>>
>>>> My problem is the fact that I have to keep all of the old URLs valid 
>>>> on my site.  Due to the 1000s of pages on my site I would like to 
>>>> implement a type of redirection database to keep the old URLs valid 
>>>> while increasing manageability of the redirections.
>>>>
>>>> After doing some research online, I have found that there is no 
>>>> clean-cut way to redirect a URL in the style that our old CMS uses.
>>>>
>>>> Example:
>>>> You can only redirect "http://www.domain.com/its/index.cfm" to any 
>>>> URL.  You cannot redirect 
>>>> "http://www.domain.com/its/index.cfm?doc_id=10" to a URL and 
>>>> "http://www.domain.com/its/index.cfm?doc_id=20" to a different URL.  
>>>> Catch my drift?  "Standard" redirects only allow you to match file 
>>>> names and not the parameters after the file name.
>>>>
>>>> My solution was to create a MySQL database, then use a combination 
>>>> of mod_rewrite with PHP.
>>>>
>>>> If you view my sample .htaccess file and PHP script below, you can 
>>>> see my problem...  My setup works great except for the fact that I 
>>>> get a ?myredirect=0 appended to every single URL on my site.  I've 
>>>> played with mod_rewrite for hours to try to get rid of that, but it 
>>>> seems to be the only way to avoid an infinite rewriting loop (it is 
>>>> the only way I can communicate from the PHP script back to 
>>>> mod_rewrite).  I really don't want the ?myredirect=0 on every page.
>>>>
>>>> Does anyone have any suggestions?  Any better ways of accomplishing 
>>>> this goal?  Does anyone know of any other mass redirection 
>>>> solutions?  Any other hybrid PHP / mod_rewrite possibilities?  Am I 
>>>> just crazy?
>>>>
>>>> If I can get some help I'd be glad to write up a How-to doc that 
>>>> will help others migrating away from a different CMS.
>>>>
>>>> Here is what I have so far...  (It works, just adds a ?myredirect=0 
>>>> to every page's url.  Grr!)
>>>>
>>>> mod_rewrite (in a .htaccess file):
>>>> RewriteEngine On
>>>> RewriteCond %{QUERY_STRING} !(myredirect)
>>>> RewriteRule .* /redirect.php?dbredirection=%{REQUEST_URI} [L,QSA]
>>>>
>>>> And my PHP code:
>>>> <?php
>>>> // DB Connectivity Include...
>>>> include "config.php";
>>>>
>>>> // Reads QUERY_STRING environment variable and grabs everything after
>>>> // 'dbredirection='
>>>> list($junk, $request) = explode("dbredirection=", 
>>>> getenv('QUERY_STRING'), 2);
>>>>
>>>> // Replace the first instance of a & with a ? (caused by mod_rewrite)
>>>> $request = preg_replace("/(\&)/", "?", $request, 1);
>>>>
>>>> // Connect to MySQL...
>>>> $connection = mysql_connect($db_host, $db_user, $db_pass);
>>>> mysql_select_db($db);
>>>>
>>>> // Get today's date in MySQL format...
>>>> $expdate = date('Y-m-d');
>>>>
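>>>> // Escape the requested URL before it goes into the SQL string...
>>>> $request_sql = mysql_real_escape_string($request, $connection);
>>>>
>>>> // Build Query...
>>>> // Select & Check Expiration - Default is 2050-01-01 so we should be
>>>> // ok with this...
>>>> $query = "SELECT newurl FROM redirects WHERE oldurl='$request_sql' AND 
>>>> expires >= '$expdate' LIMIT 1";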
>>>>
>>>> // Execute Query...
>>>> $result = mysql_query($query);
>>>> if(mysql_num_rows($result) != 0)
>>>> {
>>>>     // Retrieve Row...
>>>>     $row = mysql_fetch_row($result);
>>>>
>>>>     //echo "Redirecting to: $row[0]";
>>>>     //die;
>>>>
>>>>     // Perform Safe, "Permanent" Redirect...
>>>>     header("HTTP/1.1 301 Moved Permanently");
>>>>     header("Location: $row[0]");
>>>>     die;
>>>> } else {
>>>>     // No entries exist...  Try to go to page...
>>>>     // Add ?myredirect=0 to the URL to prevent mod_rewrite loops!!!
>>>>
>>>>     // Does $request contain a '?'
>>>>     $qmark = strstr($request, '?');
>>>>     if(!$qmark)
>>>>     {
>>>>         $request .= "?myredirect=0";
>>>>     } else {
>>>>         $request .= "&myredirect=0";
>>>>     }
>>>>     header("Location: $request");
>>>>     die;
>>>> }
>>>> ?>
>>>
>>>
> 
> 

