[TYPO3-core] RFC: #2302: substitute all strtoupper/strtolower with the t3lib_div-method
Martin Kutschker
masi-no at spam-typo3.org
Wed Jul 2 14:04:39 CEST 2008
Ernesto Baschny [cron IT] wrote:
> Martin Kutschker wrote: on 28.06.2008 18:44:
>
>>> This is an SVN patch request.
>>>
>>> Type: Bugfix
>>>
>>> Bugtracker reference:
>>> http://bugs.typo3.org/view.php?id=2302
>>>
>>> Branches: trunk
>>>
>>> Problem: strtoupper and strtolower are not multibyte-safe. For this
>>> reason we added 2 methods in t3lib_div sometime ago.
>>
>> No, this is a misunderstanding! The sole purpose of these two functions is
>> to provide a locale-independent conversion (needed for Turkish).
>>
>> It's only meant for 7-bit ASCII data, i.e. strings that are known to contain
>> only a-z and A-Z (e.g. by definition: markers, etc.)
>>
>> Do NOT use it on arbitrary strings, especially not utf8.
>>
>> If you need to upper-case utf8 or other charsets, use t3lib_cs->case().
>
> I would suggest to make the function documentation more precise. It
> currently says:
>
> * Converts string to lowercase
> * The function converts all Latin characters (A-Z, but no accents, etc) to
> * lowercase. It is safe for all supported character sets (incl. utf-8).
> * Unlike strtolower() it does not honour the locale.
>
> First it shouldn't say it works on "all Latin characters", but on all
> ASCII (7-bit) characters. Then "it is safe for all supported character
> sets (incl. utf-8)" is even more misleading.
OK, sorry, my fault.
> We know what it means,
Do we?
> but
> someone who doesn't might think this function can convert even UTF-8
> data to lowercase. So I suggest to have it like:
>
> * Converts ASCII strings to lowercase
> * Only the characters A-Z are considered. Only use this method for
> * strings where you expect only ASCII characters (e.g. markers).
> * Unlike strtolower() it does not honour the locale (i.e. it also
> * works with a Turkish locale).
Well, what I meant by utf8-safe is that it won't garble any utf8 data,
as it works only on the ASCII/7-bit part. So in fact you can use it on
any supported charset without trouble, as long as you keep in mind that
only the letters A-Z are converted to lower case.
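
To illustrate the point, here is a minimal sketch in Python (not the
t3lib_div code, which is PHP; the name ascii_lower is made up for this
example). It maps only the bytes for A-Z to a-z and passes every other
byte through. In utf8 every byte of a multi-byte character is 0x80 or
higher, so such a byte-wise mapping cannot hit one by accident:

```python
import string

# Translation table covering only the 26 ASCII letters A-Z.
# (Hypothetical helper for illustration, not the TYPO3 implementation.)
_ASCII_LOWER = bytes.maketrans(
    string.ascii_uppercase.encode("ascii"),
    string.ascii_lowercase.encode("ascii"),
)


def ascii_lower(data: bytes) -> bytes:
    """Lowercase the bytes A-Z only; all other bytes stay unchanged."""
    return data.translate(_ASCII_LOWER)


# A marker consisting of ASCII only is converted completely.
marker = "###MY_MARKER###".encode("utf-8")
print(ascii_lower(marker).decode("utf-8"))

# Non-ASCII letters are left alone, but the result is still valid utf8.
text = "ÄRGER İstanbul".encode("utf-8")
print(ascii_lower(text).decode("utf-8"))
```

The second call leaves the non-ASCII capitals as they are, which is
exactly the limitation: the data is not damaged, but it is not fully
lower-cased either.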
Masi