[TYPO3-core] RFC: utf8 in log files

Michael Stucki michael at typo3.org
Fri Nov 24 17:09:56 CET 2006


Hi Martin,

> To get meaningful results in web analyzers for non-latin charsets the URL
> should be encoded in UTF8. The attached patch allows this.

The patch looks good to me, though I have not tested it yet. First, I have
some questions:

> If you set config.stat_apache_niceTitle to 'utf-8' it will not do a
> transliteration (which is quite pointless for e.g. Chinese) but will store
> the path and page title in UTF8.

Is there any reason not to use this when forceCharset is already set to
utf-8? I mean: would it make sense to enable it by default in that case?

About the enabling value "utf-8": I often have to check whether I should
write "utf8" (MySQL) or "utf-8" (mail encoding). Sometimes even the case
has to be correct. So, wouldn't it be OK to accept any of these spellings:

preg_match('/utf-?8/i', $value)   // $value = the configured string
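
Roughly what I have in mind, as an untested sketch. The function name
isUtf8Setting() is made up and is not part of the patch. I have also
anchored the pattern with ^ and $ (the one-liner above does not), on the
assumption that a value which merely contains "utf8" somewhere should not
switch the feature on:

```php
<?php
// Hypothetical helper, not part of Martin's patch.
// Returns true if the configured value names UTF-8 in any common
// spelling: "utf-8", "utf8", "UTF-8", "Utf8", ...
function isUtf8Setting($value) {
    // Cast to string so that boolean settings (true => "1") are handled,
    // and trim stray whitespace from the TypoScript value.
    $value = trim((string)$value);

    // Optional hyphen, case-insensitive, whole string must match.
    return preg_match('/^utf-?8$/i', $value) === 1;
}
```

With this, 'utf-8', 'utf8' and 'UTF8' would all enable the UTF-8 mode,
while a plain boolean setting such as 1 would still select the old
transliteration behaviour.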

> Additionally I have added config.stat_pageLen. Works like
> config.stat_titleLen, but affects the length of the actual (leaf) page
> title, not the length of a (node) page title in the path. Before this
> change the page title length was fixed to 30, while the path titles could
> be configured up to 100 chars.
> 
> Property: stat_apache_niceTitle
> Data type: boolean / string
> Description:
> If set, the URL will be transliterated from the renderCharset to ASCII
> (e.g. ä => ae, à => a), which yields nice and readable page titles in
> the log. All non-ASCII characters that cannot be converted will be changed
> to underscores.
> If set to 'utf-8', the page title will be converted to UTF-8, which results
> in even more readable titles if your log analysis software supports it.
> 
> Property: stat_pageLen
> Data type: int 1-100
> Description:
> The length of the page name (at the end of the path) written to the
> logfile/database.

Very good!!

Regards, michael
-- 
Use a newsreader! Check out
http://typo3.org/community/mailing-lists/use-a-news-reader/


