[TYPO3-core] Patch: simulateStaticDocuments and SEO

Dmitry Dulepov dima at spamcop.net
Wed Jan 18 20:08:26 CET 2006


Hi!

Second version of the patch. Changes:
- "prefixChar" -> "replacementChar"
- now using the compatibility feature to default to a hyphen if compatibility
mode is set to 4.0.0 or newer
- removed extra EOL at the EOF
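
For illustration, the selection rule described by these changes could look
roughly like the sketch below. This is not the code from the attached patch:
the function name chooseReplacementChar() and its parameters are hypothetical,
the compatibility check is reduced to a plain version_compare() call, and the
rejection of an empty value is an extra assumption.

```php
<?php
// Hypothetical helper, NOT the patch's actual code.
// $configured:    value of the replacement character option, or null if unset
// $compatVersion: compatibility version of the installation, e.g. '3.8.1'
function chooseReplacementChar($configured, $compatVersion) {
    // Assumed default rule: hyphen from compatibility version 4.0.0 on,
    // underscore for anything older.
    $default = version_compare($compatVersion, '4.0.0', '>=') ? '-' : '_';

    if ($configured === null) {
        return $default;
    }

    $char = trim($configured);
    // Accept only a non-empty value that urlencode() leaves unchanged,
    // i.e. one that is safe to place in a URL as-is.
    if ($char === '' || urlencode($char) !== $char) {
        return $default;
    }
    return $char;
}
```

With this sketch, an explicitly configured character always wins if it is
URL-safe, and the compatibility version only decides the fallback.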

Dmitry.

Dmitry Dulepov wrote:
> This is a CVS patch request
> 
> Type: feature
> 
> Branch: TYPO3_4-0
> 
> Problem: TYPO3 uses the underscore character as a separator when generating
> URLs if simulateStaticDocuments is enabled. Google, however, recognizes the
> hyphen as a word separator in URLs, so it cannot find keywords in URLs
> produced by TYPO3.
> 
> Solution: make this character configurable. A new option is added
> (config.simulateStaticDocuments_prefixChar) with a default value of underscore.
> If this value is set and is equal to its urlencode() result, then it is used as
> the prefix separator character.
> 
> This patch also converts ereg_replace to preg_replace in the function
> responsible for handling simulateStatic file names.
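
A minimal standalone sketch of that cleanup step, for readers who want to try
the behaviour outside of TYPO3. The function name asciiPrefix() and its
parameters are hypothetical, the input is assumed to be ASCII already (the
charset conversion step is left out), and the regular expressions are a
simplified variant, not a copy of the patch below.

```php
<?php
// Hypothetical standalone version of the file name prefix cleanup.
function asciiPrefix($title, $maxChars, $replacementChar = '_', $mergeChar = '.') {
    // Character class of separators to collapse and trim:
    // underscore, hyphen and the configured replacement character.
    $class = '[' . preg_quote('_-' . $replacementChar, '/') . ']';

    // Replace everything that is not alphanumeric, underscore or hyphen.
    $out = preg_replace('/[^A-Za-z0-9_-]/', $replacementChar, trim(substr($title, 0, $maxChars)));
    // Collapse each run of separators to the first separator of the run.
    $out = preg_replace('/(' . $class . ')' . $class . '+/', '$1', $out);
    // Strip separators from both ends.
    $out = preg_replace('/^' . $class . '+|' . $class . '+$/', '', $out);

    // Append the merge character only if something is left.
    return strlen($out) ? $out . $mergeChar : $out;
}

echo asciiPrefix('Hello World!', 20, '-'), "\n"; // Hello-World.
echo asciiPrefix('Hello World!', 20, '_'), "\n"; // Hello_World.
```

Only the readable prefix changes with the chosen character; whatever is
appended after it (page id or alias) is not touched by this step.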
> 
> FAQ:
> Q. Will it change existing URLs?
> A. No, unless you explicitly specify this character and set it to something
> other than underscore. Existing installations are not affected by this patch.
> 
> Q. If I set this character to hyphen, will external references to my pages work?
> A. Yes, they will. TYPO3 does not care about the prefix; it is meant for humans
> (and search engines). The ID part of the URL stays the same.
> 
> Q. How does Google treat hyphens and underscores in the URL?
> A. A hyphen separates keywords, while an underscore glues keywords together so
> that they form one exact sequence.
> 
> Q. Are you sure about all this? Who told you?
> A. See for yourself. Here:
> http://www.prweaver.com/blog/2004/08/26/2-hyphen-and-underscore
> and here:
> http://www.webmasterworld.com/forum3/23564.htm
> (Googleguy is a Google employee who gives carefully selected information about
> Google internals)
> 
> Dmitry.
> 
> 
> ------------------------------------------------------------------------
> 
> Index: class.tslib_fe.php
> ===================================================================
> RCS file: /cvsroot/typo3/TYPO3core/typo3/sysext/cms/tslib/class.tslib_fe.php,v
> retrieving revision 1.104.2.2
> diff -u -r1.104.2.2 class.tslib_fe.php
> --- class.tslib_fe.php	12 Jan 2006 16:03:12 -0000	1.104.2.2
> +++ class.tslib_fe.php	17 Jan 2006 20:42:51 -0000
> @@ -1613,7 +1613,11 @@
>  					}
>  						// if .simulateStaticDocuments was not present, the default value will rule.
>  					if (!isset($this->config['config']['simulateStaticDocuments']))	{
> -						$this->config['config']['simulateStaticDocuments'] = $this->TYPO3_CONF_VARS['FE']['simulateStaticDocuments'];
> +						$this->config['config']['simulateStaticDocuments'] = trim($this->TYPO3_CONF_VARS['FE']['simulateStaticDocuments']);
> +						if ($this->config['config']['simulateStaticDocuments']) {
> +								// Set prefix char only if it is needed
> +							$this->setSimulPrefixChar();
> +						}
>  					}
>  
>  							// Processing for the config_array:
> @@ -3166,6 +3170,23 @@
>  		$url.=$this->makeSimulFileName($this->page['title'], $this->page['alias']?$this->page['alias']:$this->id, $this->type).'.html';
>  		return $url;
>  	}
> +	
> +	/**
> +	 * Checks and sets prefix character for simulateStaticDocuments. Default is underscore.
> +	 * 
> +	 * @return	void
> +	 */
> +	function setSimulPrefixChar() {
> +		$replacement = '_';
> +		if (isset($this->config['config']['simulateStaticDocuments_prefixChar'])) {
> +			$replacement = trim($this->config['config']['simulateStaticDocuments_prefixChar']);
> +			if (urlencode($replacement) != $replacement) {
> +					// Invalid character
> +				$replacement = '_';
> +			}
> +		}
> +		$this->config['config']['simulateStaticDocuments_prefixChar'] = $replacement;
> +	}
>  
>  	/**
>  	 * Converts input string to an ASCII based file name prefix
> @@ -3177,11 +3198,16 @@
>  	 */
>  	function fileNameASCIIPrefix($inTitle,$titleChars,$mergeChar='.')	{
>  		$out = $this->csConvObj->specCharsToASCII($this->renderCharset, $inTitle);
> -		$out = ereg_replace('[^[:alnum:]_-]','_',trim(substr($out,0,$titleChars)));
> -		$out = ereg_replace('[_-]*$','',$out);
> -		$out = ereg_replace('^[_-]*','',$out);
> -		$out = ereg_replace('([_-])[_-]*','\1',$out);
> -		if (strlen($out))	$out.=$mergeChar;
> +			// Get prefix character
> +		$prefixChar = &$this->config['config']['simulateStaticDocuments_prefixChar'];
> +		$replacementChars = '_\-' . ($prefixChar != '_' && $prefixChar != '-' ? $prefixChar : '');
> +		$out = preg_replace('/[^A-Za-z0-9_-]/', $prefixChar, trim(substr($out, 0, $titleChars)));
> +		$out = preg_replace('/([' . $replacementChars . ']){2,}/', '\1', $out);
> +		$out = preg_replace('/[' . $replacementChars . ']?$/', '', $out);
> +		$out = preg_replace('/^[' . $replacementChars . ']?/', '', $out);
> +		if (strlen($out)) {
> +			$out .= $mergeChar;
> +		}
>  
>  		return $out;
>  	}
> 
> 
> ------------------------------------------------------------------------
> 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ssd-seo-2.txt
Url: http://lists.netfielders.de/pipermail/typo3-team-core/attachments/20060118/000c6b1f/attachment.txt 

