[TYPO3-english] contagged Extension has error while parsing joined words (words joined with dashes)
Jochen Rau
j.rau at web.de
Tue Mar 10 09:54:33 CET 2009
Hi Parakash,
> In the Content parser and tagger (Glossary) contagged extension there
> seems to be some sort of error while parsing joined words (word joined
> using dashes).
> This is clearly noticeable especially when second word contains special
> characters such as ( ê, à, u', é, etc...)
>
> For example consider the word " elle-même " the term is defined as
> "elle" with a link to example.com then the link is getting rendered as
> follows:
>
> <dfn><a target="_top" href="http://www.example.com">Elle-m</a></dfn>ême
>
> I doubt this could have something related with the preg_match() used in
> getPositions() function of class.tx_contagged.php.
>
> What could be the problem? Anyone?
I have uploaded contagged v0.2.1 to the TER (it should be available in a
few hours). It improves the handling of UTF-8 in combined words.
Don't forget to activate UTF-8 support by adding "u" to the Regular
Expression Modifier in the TS constants:
contagged.modifier = Uisu <--
UTF-8 handling was deactivated by default because some old versions of
PHP used on shared hosting do not have the necessary libraries activated.
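
To illustrate why the modifier matters, here is a minimal sketch in Python (not the extension's PHP code). It contrasts byte-oriented matching with character-oriented matching on the word from the report. The character class `[\w-]+` is an assumption chosen for illustration; it is not the pattern contagged actually uses.

```python
import re

# "elle-même", with the accented letter written as an escape so the
# example does not depend on the encoding of this file.
text = "elle-m\u00eame"

# Byte-oriented matching: roughly what a regex engine sees when UTF-8
# mode is off. "ê" is encoded as two bytes (0xC3 0xAA), and neither
# byte counts as a word character, so the match stops after "m".
byte_match = re.match(rb"[\w-]+", text.encode("utf-8"))
matched_bytes = byte_match.group(0)

# Character-oriented matching: comparable to enabling UTF-8 handling.
# "ê" is one letter, so the whole combined word is matched.
char_match = re.match(r"[\w-]+", text)
matched_chars = char_match.group(0)

print(matched_bytes)  # b'elle-m'
print(matched_chars)  # elle-même
```

The truncated byte-level match corresponds to the "Elle-m" link text in the report. In PHP's preg functions, the "u" modifier is what switches pattern and subject to UTF-8 handling.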
Cheers
Jochen