[TYPO3-Solr] Tika App vs Tika solr server integration

Claus Fassing claus at fassing.eu
Fri Feb 10 11:32:12 CET 2012


Hi Olivier,

On 10.02.2012 10:51, Olivier Dobberkau wrote:
> Hi Claus,
>
> Are you sure that your Servlet Container handles UTF8 correctly?

Yes, content extraction with the Tika app works as expected.

> Which Version of Apache Solr Server are you using?

Live environment: 3.3.0
Development environment: 3.4.0

But both fail to extract content correctly from some (not all!) PDF files
when using the Solr server extraction. With the Tika app, it works fine.
I couldn't figure out how these PDF files differ from the working ones,
but I get correct results with the Tika app.
I also read somewhere that the Tika library integrated in Solr is much
older than the app, and thought this might be the reason.

> Are you sure that no other extracting service in TYPO3 is interfering here?

Yes, absolutely. The same happens on the development environment, which
is a mostly clean installation where I never tried other extraction services.

Is there any way to make the Solr server use the Tika app through
configuration?

Regards Claus

