[TYPO3-Solr] Crop viewhelper not cropping Tika PDF abstract in results
Christoph Moeller
moeller at network-publishing.de
Thu May 24 21:36:24 CEST 2012
Hello newsgroup,
we have problems getting the crop viewhelper working on teaser text in
the result list originating from PDF content extracted by Tika.
Our config:
viewhelpers {
  crop {
    maxLength = 300
    cropIndicator = ...
    cropFullWords = 0
  }
}
That works fine with regular indexed TYPO3 DB content.
However, it has no effect at all on some (not all!) of the indexed PDF
file results - the abstract text will not be cropped.
The indexed file text seems to include all kinds of charset garbage,
which is non-optimal for displaying in SERPs anyway.
I think this might be related to tslib_cObj::cropHTML() used in the
helper, which has problems with newlines (and probably other
non-printable stuff) - see http://forge.typo3.org/issues/28741
We tried stripping all non-printable characters with a regex before the
cropHTML() call in the helper class, but to no avail.
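To illustrate the idea, here is a minimal sketch of that kind of cleanup followed by a plain character crop. It is written in Python purely for illustration (the helper itself is PHP) and is not the code from the helper class; the names clean_abstract and crop are hypothetical, and the crop only approximates what maxLength = 300, cropIndicator = ... and cropFullWords = 0 are meant to do:

```python
import re
import unicodedata


def clean_abstract(text: str) -> str:
    """Replace control/format characters with spaces, then collapse whitespace."""
    # Unicode general categories starting with "C" cover control characters
    # (including newlines and tabs), format characters, private-use and
    # unassigned code points.
    cleaned = "".join(
        " " if unicodedata.category(ch).startswith("C") else ch
        for ch in text
    )
    return re.sub(r"\s+", " ", cleaned).strip()


def crop(text: str, max_length: int = 300, indicator: str = "...") -> str:
    """Cut after max_length characters; cutting mid-word is allowed."""
    if len(text) <= max_length:
        return text
    return text[:max_length] + indicator


# Hypothetical sample of extracted PDF text containing a NUL byte,
# blank lines and a zero-width space.
raw = "Annual report\x00 2011\n\n\nSummary\u200b of results"
teaser = crop(clean_abstract(raw))
```

Cleaning first and cropping the plain string afterwards keeps the length check independent of whatever the extraction left behind.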
After all, we only want indexed PDF abstract text to be cropped just the
way it works with TYPO3 content.
Anyone having similar issues?
Thanks and cheers
Chris