[TYPO3-english] vge_tagcloud and proper handling of UTF-8 tags
François Suter
fsu-lists at cobweb.ch
Tue Apr 12 22:29:28 CEST 2011
Hi Alex,
> But in the tagcloud section, special characters like "à ù ñ ì" in words
> like pàgina or bùsqueda are not displayed correctly.
>
> Some of them are split at the position of these characters, others are
> displayed with those famous and ugly unknown-character symbols.
I could reproduce the problem, but only with "à" as in "pàgina". None of
the other letters you mention caused a problem, nor did many other
characters from other languages. I even tried "Jóhanna Sigurðardóttir"
(Iceland's prime minister) and it works fine. So there's got to be
something weird with the "à". I dug a bit deeper, and the thing is that
PHP's regexp functions (unfortunately) treat the string as latin-1,
i.e. one character per byte. In UTF-8, "à" is the two bytes 0xC3 0xA0,
which read as latin-1 become 2 characters: one weird letter ("Ã") and
one blank (0xA0, the non-breaking space). So the word gets split on
that blank. This could happen to other characters too, namely any whose
UTF-8 encoding contains a byte that counts as a blank, but definitely
not all.
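
To make the byte-level effect concrete, here is a small illustration in
Python (not the extension's PHP code). It is only a sketch: the function
name split_as_latin1 is made up for this example, and the whitespace
pattern is an assumption, not the rule vge_tagcloud really uses.

```python
import re

def split_as_latin1(text):
    """Read the UTF-8 bytes of `text` as if they were latin-1, then split
    on whitespace. Hypothetical helper that mimics a byte-oriented splitter."""
    misread = text.encode("utf-8").decode("latin-1")
    # For str patterns, \s also matches U+00A0 (non-breaking space).
    return re.split(r"\s+", misread)

# "à" is 0xC3 0xA0 in UTF-8; 0xA0 is a blank in latin-1, so the word breaks.
broken = split_as_latin1("pàgina")
# "ù" is 0xC3 0xB9 in UTF-8; 0xB9 is not a blank, so the word stays whole.
intact = split_as_latin1("bùsqueda")
```

Here `broken` ends up with two pieces and `intact` with one (still
garbled, but unsplit), which matches what I saw with the tag cloud.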
> Since this is not limited to Spanish, but also affects German (äüö) and
> French (Çé...) and UTF-8 in general, I'm wondering what to do?
One workaround could be to use the "extractKeywords" hook from
vge_tagcloud. This lets you provide your own method for splitting the
words. Inside your hook you could then convert all the strings to
latin-1 (which should be ok assuming your site is entirely in Spanish),
split just like the tag cloud does, then convert back to UTF-8 for
proper display.
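
The idea behind that workaround, sketched in Python rather than as real
hook code. Everything here is an assumption for illustration: the
function name extract_keywords_latin1 is made up and is not the hook's
signature, and the separator pattern is just a guess at what "split
like the tag cloud does" could look like.

```python
import re

def extract_keywords_latin1(text):
    """Convert to latin-1, split on separators, convert the pieces back.
    Hypothetical stand-in for a custom keyword-splitting method."""
    # A real conversion: "à" becomes the single byte 0xE0, not 0xC3 0xA0.
    # Raises UnicodeEncodeError if the text has characters outside latin-1.
    latin1 = text.encode("latin-1")
    # Assumed separators: whitespace, commas, semicolons. For bytes
    # patterns, \s only matches ASCII whitespace.
    pieces = re.split(rb"[\s,;]+", latin1)
    # Decode each word back to text for display, dropping empty pieces.
    return [piece.decode("latin-1") for piece in pieces if piece]

keywords = extract_keywords_latin1("pàgina bùsqueda, niño")
```

Note the limitation: this only works as long as every character in the
tags exists in latin-1, which is why it depends on the language of your
site.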
HTH
--
Francois Suter
Cobweb Development Sarl - http://www.cobweb.ch