[TYPO3-english] contagged Extension has error while parsing joined words (words joined with dashes)
    Jochen Rau 
    j.rau at web.de
       
    Tue Mar 10 09:54:33 CET 2009
    
    
  
Hi Parakash,
> In the Content parser and tagger (Glossary) contagged extension there 
> seems to be an error when parsing joined words (words joined using 
> dashes).
> This is especially noticeable when the second word contains special 
> characters such as ( ê, à, u', é, etc...)
> 
> For example, consider the word "elle-même". If the term is defined as 
> "elle" with a link to example.com, the link is rendered as 
> follows:
> 
> <dfn><a target="_top" href="http://www.example.com">Elle-m</a></dfn>ême
> 
> I suspect this could be related to the preg_match() used in the 
> getPositions() function of class.tx_contagged.php.
> 
> What could be the problem? Anyone?
I have uploaded contagged v0.2.1 to the TER (should be available in a 
few hours). It improves the handling of UTF-8 in combined words.
Don't forget to activate UTF-8 support by adding "u" to the Regular 
Expression Modifier in the TS constants:
contagged.modifier = Uisu <--
UTF-8 handling was deactivated by default because some old versions of 
PHP used on shared hosting do not have the necessary libraries activated.
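
To illustrate why the "u" modifier matters, here is a minimal sketch in Python (not PHP, and not contagged's actual pattern; the variable names are made up for this example). A byte-oriented regex sees "ê" as two non-word bytes, so a word match on "même" stops after "m", which resembles the truncated link quoted above. A character-oriented match, roughly what "u" enables, treats "ê" as one letter.

```python
import re

word = "même"                # second half of "elle-même"
raw = word.encode("utf-8")   # b'm\xc3\xaame': "ê" becomes two bytes

# Byte-oriented matching (an analogy for a pattern without "u"):
# \w only knows ASCII here, so the match stops at the first byte of "ê".
byte_match = re.match(rb"\w+", raw).group()

# Character-oriented matching (an analogy for a pattern with "u"):
# "ê" counts as a single letter, so the whole word matches.
text_match = re.match(r"\w+", word).group()

print(byte_match)   # b'm'
print(text_match)   # même
```

Whether PHP's preg functions split the word at exactly the same place depends on the pattern and the server locale, so treat this as an analogy only.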
Cheers
Jochen
    
    