[TYPO3-core] Patch: simulateStaticDocuments and SEO
Dmitry Dulepov
dima at spamcop.net
Wed Jan 18 20:08:26 CET 2006
Hi!
Second version of the patch. Changes:
- "prefixChar" -> "replacementChar"
- now uses the compatibility feature to set the hyphen if the compatibility
mode is set to 4.0.0 or newer
- removed extra EOL at the EOF
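
For illustration, here is a minimal standalone sketch of the idea, not the patch itself. The function names pickReplacementChar() and makeAsciiPrefix() are hypothetical, the input is assumed to be ASCII already (the real code converts it first), and, unlike the patch, the sketch insists on exactly one character and collapses any run of separators to the configured character.

```php
<?php
// Hypothetical helpers for illustration only; these are not TYPO3 functions.

/**
 * Returns $char if it is a single URL-safe character, otherwise $default.
 * "URL-safe" here means urlencode() leaves it unchanged, as in the patch.
 * Assumption: exactly one character is required (the patch does not check this).
 */
function pickReplacementChar($char, $default = '_') {
    $char = trim((string)$char);
    if (strlen($char) !== 1 || urlencode($char) !== $char) {
        return $default;
    }
    return $char;
}

/**
 * Builds a file name prefix from an (already ASCII) page title.
 */
function makeAsciiPrefix($title, $maxChars, $replacementChar = '_', $mergeChar = '.') {
    $replacementChar = pickReplacementChar($replacementChar);

    // Separator characters: hyphen, underscore and the configured character.
    $class = '\-_' . preg_quote($replacementChar, '/');

    // Truncate first, then trim surrounding whitespace.
    $out = trim(substr($title, 0, $maxChars));
    // Replace everything that is not alphanumeric, underscore or hyphen.
    $out = preg_replace('/[^A-Za-z0-9_-]/', $replacementChar, $out);
    // Collapse runs of two or more separators into one replacement character.
    $out = preg_replace('/[' . $class . ']{2,}/', $replacementChar, $out);
    // Strip separators from both ends.
    $out = preg_replace('/^[' . $class . ']+|[' . $class . ']+$/', '', $out);

    // Append the merge character only if something is left.
    return strlen($out) ? $out . $mergeChar : '';
}

echo makeAsciiPrefix('Hello World & Friends', 30, '-'), "\n";
echo makeAsciiPrefix('Hello World & Friends', 30, '_'), "\n";
```

In TypoScript the renamed option would presumably be set as config.simulateStaticDocuments_replacementChar = - (the full option name is inferred from the rename above, so treat it as an assumption).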
Dmitry.
Dmitry Dulepov wrote:
> This is a CVS patch request
>
> Type: feature
>
> Branch: TYPO3_4-0
>
> Problem: TYPO3 uses the underscore character as the separator when generating
> URLs if simulateStaticDocuments is enabled. Google, however, recognizes the
> hyphen as the word separator in URLs. Thus it cannot find keywords in URLs
> produced by TYPO3.
>
> Solution: make this character configurable. A new option is added
> (config.simulateStaticDocuments_prefixChar) with a default value of underscore.
> If this value is set and is equal to its own urlencode() result, then it is
> used as the prefix separator character.
>
> This patch also converts ereg_replace to preg_replace in the function
> responsible for handling simulateStatic file names.
>
> FAQ:
> Q. Will it change existing URLs?
> A. No, unless you explicitly specify this character and set it to something
> other than underscore. No existing installations are affected by this patch.
>
> Q. If I set this character to hyphen, will external references to my pages work?
> A. Yes, they will. TYPO3 does not care about the prefix; it is meant for humans
> (and search engines). The ID part of the URL stays the same.
>
> Q. How does Google treat the hyphen and the underscore in a URL?
> A. A hyphen separates keywords, while an underscore glues keywords together so
> that they are matched as one exact sequence.
>
> Q. Are you sure about all this? Who told you?
> A. See for yourself. Here:
> http://www.prweaver.com/blog/2004/08/26/2-hyphen-and-underscore
> and here:
> http://www.webmasterworld.com/forum3/23564.htm
> (Googleguy is a Google employee who gives carefully selected information about
> Google internals)
>
> Dmitry.
>
>
> ------------------------------------------------------------------------
>
> Index: class.tslib_fe.php
> ===================================================================
> RCS file: /cvsroot/typo3/TYPO3core/typo3/sysext/cms/tslib/class.tslib_fe.php,v
> retrieving revision 1.104.2.2
> diff -u -r1.104.2.2 class.tslib_fe.php
> --- class.tslib_fe.php 12 Jan 2006 16:03:12 -0000 1.104.2.2
> +++ class.tslib_fe.php 17 Jan 2006 20:42:51 -0000
> @@ -1613,7 +1613,11 @@
> }
> // if .simulateStaticDocuments was not present, the default value will rule.
> if (!isset($this->config['config']['simulateStaticDocuments'])) {
> - $this->config['config']['simulateStaticDocuments'] = $this->TYPO3_CONF_VARS['FE']['simulateStaticDocuments'];
> + $this->config['config']['simulateStaticDocuments'] = trim($this->TYPO3_CONF_VARS['FE']['simulateStaticDocuments']);
> + if ($this->config['config']['simulateStaticDocuments']) {
> + // Set prefix char only if it is needed
> + $this->setSimulPrefixChar();
> + }
> }
>
> // Processing for the config_array:
> @@ -3166,6 +3170,23 @@
> $url.=$this->makeSimulFileName($this->page['title'], $this->page['alias']?$this->page['alias']:$this->id, $this->type).'.html';
> return $url;
> }
> +
> + /**
> + * Checks and sets prefix character for simulateStaticDocuments. Default is underscore.
> + *
> + * @return void
> + */
> + function setSimulPrefixChar() {
> + $replacement = '_';
> + if (isset($this->config['config']['simulateStaticDocuments_prefixChar'])) {
> + $replacement = trim($this->config['config']['simulateStaticDocuments_prefixChar']);
> + if (urlencode($replacement) != $replacement) {
> + // Invalid character
> + $replacement = '_';
> + }
> + }
> + $this->config['config']['simulateStaticDocuments_prefixChar'] = $replacement;
> + }
>
> /**
> * Converts input string to an ASCII based file name prefix
> @@ -3177,11 +3198,16 @@
> */
> function fileNameASCIIPrefix($inTitle,$titleChars,$mergeChar='.') {
> $out = $this->csConvObj->specCharsToASCII($this->renderCharset, $inTitle);
> - $out = ereg_replace('[^[:alnum:]_-]','_',trim(substr($out,0,$titleChars)));
> - $out = ereg_replace('[_-]*$','',$out);
> - $out = ereg_replace('^[_-]*','',$out);
> - $out = ereg_replace('([_-])[_-]*','\1',$out);
> - if (strlen($out)) $out.=$mergeChar;
> + // Get prefix character
> + $prefixChar = &$this->config['config']['simulateStaticDocuments_prefixChar'];
> + $replacementChars = '_\-' . ($prefixChar != '_' && $prefixChar != '-' ? $prefixChar : '');
> + $out = preg_replace('/[^A-Za-z0-9_-]/', $prefixChar, trim(substr($out, 0, $titleChars)));
> + $out = preg_replace('/([' . $replacementChars . ']){2,}/', '\1', $out);
> + $out = preg_replace('/[' . $replacementChars . ']?$/', '', $out);
> + $out = preg_replace('/^[' . $replacementChars . ']?/', '', $out);
> + if (strlen($out)) {
> + $out .= $mergeChar;
> + }
>
> return $out;
> }
>
>
> ------------------------------------------------------------------------
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ssd-seo-2.txt
Url: http://lists.netfielders.de/pipermail/typo3-team-core/attachments/20060118/000c6b1f/attachment.txt