[TYPO3-core] RFC #7984: Bug: stdWrap.crop now closes opened tags and counts chars correctly

Jochen Rau j.rau at web.de
Tue Aug 12 04:06:07 CEST 2008


Hi Benjamin,

I was in contact with Peter from mid-April to the beginning of May. We
worked together on his extension: he sent me his latest snapshot, and I
ran my unit tests against it and gave him feedback. So, in the end, both
solutions behave nearly the same.

IMO there are still some disadvantages of pmkhtmlcrop:
- It uses the DOMDocument parser, which is not very configurable.
- Custom tags like <link> will be removed if you don't parse RTE-content 
through the parseFunc before applying the htmlCrop.
- Parsing the DOM is expensive. Parsing text with pmkhtmlcrop takes
120-160% of the time the proposed solution does (tested with Zend
Studio).
- The extension counts entities incorrectly (e.g. &nbsp; is counted as
6 chars but should be counted as one char). The resulting text is
shorter than expected.
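
To illustrate the entity-counting point, here is a minimal sketch in
Python (not the PHP code of either implementation). The helper name
visible_length and the ENTITY pattern are assumptions made up for this
example; the idea is only that each character entity should contribute
one character to the count.

```python
import re

# Hypothetical pattern for this sketch: named, decimal and hex entities.
ENTITY = re.compile(r"&(?:#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def visible_length(text: str) -> int:
    """Count each character entity as a single character."""
    # Replace every entity with a one-character placeholder, then measure.
    return len(ENTITY.sub("x", text))


raw = "a&nbsp;b"
naive = len(raw)               # counts "&nbsp;" as 6 characters
visible = visible_length(raw)  # counts "&nbsp;" as 1 character
```

With the naive count, a crop limit is reached too early whenever the
text contains entities, which is why the cropped result comes out
shorter than requested.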

But on the other hand there are some advantages of pmkhtmlcrop:
- Crawling a tree is more sophisticated than building a flat array from
a programmer's point of view.
- Peter introduced a useful feature: cropping to the last '.'.


The disadvantages of my proposed solution are:
- The Regular Expressions are hard to read and understand.
- Cropping to the last white space (sliding back to the previous white
space to avoid broken words) cannot "slide" over a tag.

The advantages of my proposed solution are:
- It doesn't change the original HTML code apart from cropping it ;-) .
- It does what is promised (see the unit tests).
- It avoids the disadvantages of pmkhtmlcrop ;-)
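
The general idea of cropping via a flat token list can be sketched as
follows. This is a Python illustration under my own assumptions, not
the regular expressions from the patch: the names crop_html, TOKEN and
VOID are hypothetical, HTML comments are not handled, and sliding back
to a white space is left out. Tags pass through unchanged and do not
count, entities count as one character, and tags still open at the
crop point are closed afterwards.

```python
import re

# Hypothetical tokenizer: a tag, a character entity, or any single character.
TOKEN = re.compile(
    r"(?P<tag></?(?P<name>[A-Za-z][A-Za-z0-9]*)[^>]*>)"
    r"|(?P<entity>&(?:#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);)"
    r"|(?P<char>.)",
    re.DOTALL,
)
# Assumed set of tags that never get a closing tag.
VOID = {"br", "hr", "img", "input"}


def has_text(markup: str) -> bool:
    """True if the markup contains anything other than tags."""
    return any(m.group("tag") is None for m in TOKEN.finditer(markup))


def crop_html(markup: str, limit: int, suffix: str = "...") -> str:
    """Crop to `limit` visible characters and close any open tags."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    out, open_tags, count = [], [], 0
    for m in TOKEN.finditer(markup):
        token = m.group(0)
        if m.group("tag"):
            name = m.group("name").lower()
            if token.startswith("</"):
                if name in open_tags:
                    # Pop up to and including the matching opening tag.
                    while open_tags.pop() != name:
                        pass
            elif not token.endswith("/>") and name not in VOID:
                open_tags.append(name)
            out.append(token)  # tags are copied verbatim, never counted
            continue
        out.append(token)      # an entity or a single character
        count += 1
        if count == limit and has_text(markup[m.end():]):
            out.append(suffix)
            break
    # Close whatever is still open, innermost tag first.
    out.extend(f"</{name}>" for name in reversed(open_tags))
    return "".join(out)


cropped = crop_html("<p>Hello <b>big&nbsp;world</b></p>", 10)
```

Because nothing is re-serialized, custom tags such as <link> survive
as they are, and markup that fits within the limit is returned
unchanged.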

Greetings
Jochen




Benjamin Mack wrote:
> Hey Dmitry,
> 
> it is a bit complex, I remember Peter Klein having an extension that 
> does that exact thing, having a new stdWrap option doing this in a more 
> sophisticated way (see extension "pmkhtmlcrop").
> 
> I'm not familiar with the material (also haven't read the patch yet),
> but it would be great to see the advantages and disadvantages of both
> implementations.

