[TYPO3-core] RFC: #2302: substitute all strtoupper/strtolower with the t3lib_div-method

Martin Kutschker masi-no at spam-typo3.org
Wed Jul 2 14:04:39 CEST 2008


Ernesto Baschny [cron IT] wrote:
> Martin Kutschker wrote: on 28.06.2008 18:44:
> 
>>> This is an SVN patch request.
>>>
>>> Type: Bugfix
>>>
>>> Bugtracker reference:
>>> http://bugs.typo3.org/view.php?id=2302
>>>
>>> Branches: trunk
>>>
>>> Problem: strtoupper and strtolower are not multibyte-safe. For this
>>> reason we added 2 methods in t3lib_div sometime ago.
>>
>> No, this is a misunderstanding! The sole purpose of these two functions is
>> to provide a locale-independent conversion (needed for Turkish).
>>
>> They are only meant for 7-bit ASCII data, i.e. strings that are known to
>> contain only a-z and A-Z (e.g. by definition: markers, etc.).
>>
>> Do NOT use them on arbitrary strings, especially not utf8.
>>
>> If you need to upper-case utf8 or other charsets, use t3lib_cs->case().
> 
> I would suggest to make the function documentation more precise. It
> currently says:
> 
>          * Converts string to lowercase
>          * The function converts all Latin characters (A-Z, but no accents, etc) to
>          * lowercase. It is safe for all supported character sets (incl. utf-8).
>          * Unlike strtolower() it does not honour the locale.
> 
> First it shouldn't say it works on "all Latin characters", but on all
> ASCII (7-bit) characters. Then "it is safe for all supported character
> sets (incl. utf-8)" is even more misleading.

OK, sorry, my fault.

> We know what it means,

Do we?

> but
> someone that doesn't might think this function can convert even UTF-8
> data to lowercase. So I suggest to have it like:
> 
>     * Converts ASCII strings to lowercase
>     * Only A-Z characters are considered. Only use this method for
>     * strings where you expect only ASCII characters (e.g. markers).
>     * Unlike strtolower() it does not honour the locale (i.e. it also
>     * works with a Turkish locale).

Well, what I meant by utf8-safe is that it won't garble any utf8 data,
as it works only on the ASCII/7-bit part. So in fact you can use it on
any supported charset without trouble, as long as you keep in mind that
only the letters A-Z are converted to lower case.
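
To illustrate the point, here is a minimal sketch. It assumes the
conversion is done with a fixed A-Z translation table; the function name
asciiToLower is made up for this example and is not meant as the actual
t3lib_div code.

```php
<?php
// Hypothetical helper (illustration only): lowercases just the bytes A-Z
// through a fixed translation table, so setlocale() has no influence on it.
function asciiToLower($str) {
    return strtr((string)$str, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz');
}

// A marker consisting of ASCII characters only is converted completely.
$marker = asciiToLower('###MY_MARKER###');

// "\xC3\x84" is U+00C4 (A with diaeresis) encoded as utf8. Both bytes are
// >= 0x80, so they pass through untouched; only "B" and "C" are lowercased.
$mixed = asciiToLower("\xC3\x84BC");

var_dump($marker === '###my_marker###'); // bool(true)
var_dump($mixed === "\xC3\x84bc");       // bool(true)
```

The utf8 string stays valid, but the accented letter is still upper case,
which is why this kind of function is no replacement for a charset-aware
case conversion.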

Masi


