[TYPO3-RTE] Cleaning pasted content

Stanislas Rolland stanislas.rolland at fructifor.ca
Sat Jan 7 22:59:48 CET 2006


Hi Robert,
> 
> There are many settings for rtehtmlarea or the RTE API in general which 
> enable the admin to control the input to a certain degree for the sake 
> of a consistent output in the FE (like removing attributes from certain 
> tags or removing certain tags in general). Which is very good, since a 
> consistent output is very important.
> 
> However, when pasting content from other sources (websites, Word, 
> OpenOffice.org etc.), the current input control may not be sufficient, 
> especially when the source is not well-formed from the perspective of 
> the RTE (even more when tables are disabled with 'removeTags = table, 
> tbody, td, th, thead, tr').
> 
In the freshly released version 1.1.0 of htmlArea RTE (although at the 
time of writing the documentation has not yet been updated), the Page 
TSConfig property "enableWordClean" can be configured with a TYPO3 
htmlparser: the pasted text is sent to the server, parsed there by the 
TYPO3 htmlparser, and the parsed text is sent back into the editing 
area. Since the cleaning is done on the server, there may be a slight 
delay depending on the connection speed and on the size of the pasted 
text.
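
As a rough illustration only, such a configuration might look like the 
following Page TSConfig. This is a sketch, not taken from the (not yet 
updated) documentation: the "HTMLparser" sub-property name and the 
particular tags chosen here are assumptions, while removeTags, 
keepNonMatchedTags and tags.[tagname].allowedAttribs are standard 
options of the TYPO3 htmlparser.

```typoscript
RTE.default {
  # Turn on cleaning of pasted content
  enableWordClean = 1
  # Assumed sub-property: htmlparser configuration applied on the server
  enableWordClean.HTMLparser {
    # Strip table markup and typical Word leftovers
    removeTags = table, tbody, td, th, thead, tr, font, span
    # Keep every other tag that is not configured explicitly
    keepNonMatchedTags = 1
    # On paragraphs, keep only the class attribute
    tags.p.allowedAttribs = class
  }
}
```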

This will not solve all the problems you report. However, it sets a base 
to work from. For example, it would be possible to insert a hook in this 
parser-invoking script in order to do some further processing. The 
transformations you propose could be performed by such a hook.

What do you think of this approach?

Stanislas


