[TYPO3-Solr] Solr and Tika

Ingo Renner ingo at typo3.org
Mon Mar 29 14:57:05 CEST 2010


Hi all,

To shed some light on the situation:

* The current version of EXT:solr in TER (1.0.1) does not support file 
indexing.
* The soon-to-be-released version of EXT:solr (1.1.0) will not support 
file indexing either, because it is not implemented yet (not even in 
2.0-dev) and because it will need some testing.
* Version 2.0, to be released later this year (tentatively at the end 
of the year), will support file indexing.
* If you need file indexing now, you can join the 2.0 dev program with 
dkd to get early access and influence what we work on next 
(file indexing, for example).


EXT:tika, as it is on Forge, will be used by EXT:solr 2.0 to extract 
content from files: EXT:solr hands the files over to EXT:tika.
EXT:tika, in turn, can use either a local Tika jar or a remote Solr 
server; the latter is handled by EXT:solr 2.0. Both scenarios, local 
and remote extraction, have pros and cons.

For local extraction you need Java on your host. That might not be 
available, for example on a shared host, or you may have a distributed 
setup where you don't want to install Java on each host.

For remote extraction, though, you need to send the files over the 
network, which might not be the smartest approach either: consider 
sending many MB of audio or video files over the network only to get a 
few bytes of information back.
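
The trade-off above could be expressed as a simple rule of thumb. The 
following is only a sketch under my own assumptions: the function name, 
the mode labels and the 10 MB limit are hypothetical and not part of 
EXT:tika or EXT:solr.

```python
import shutil

# Hypothetical limit: above this size, shipping a file over the network
# just to get a few bytes of metadata back is considered not worth it.
DEFAULT_MAX_REMOTE_BYTES = 10 * 1024 * 1024


def choose_extraction_mode(file_size_bytes, java_available=None,
                           max_remote_bytes=DEFAULT_MAX_REMOTE_BYTES):
    """Pick "local", "remote" or "skip" for a single file.

    If java_available is None, look for a java binary on the PATH.
    """
    if java_available is None:
        java_available = shutil.which("java") is not None

    if java_available:
        # Java is on this host, so nothing has to cross the network.
        return "local"
    if file_size_bytes <= max_remote_bytes:
        # No Java here (e.g. shared hosting), but the file is small
        # enough to send to the remote Solr server.
        return "remote"
    # No Java and a large file (audio, video): skip extraction.
    return "skip"


if __name__ == "__main__":
    print(choose_extraction_mode(200 * 1024, java_available=False))
    print(choose_extraction_mode(700 * 1024 * 1024, java_available=False))
```

Where exactly the size limit should sit depends on the network between 
the web host and the Solr server.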


However, it should already be possible to use EXT:tika with local 
extraction for EXT:dam. "Should" means that I implemented it so that it 
ought to work, but I haven't had the time to actually test it yet.


Hope that helps :)
Ingo

-- 
Ingo Renner
TYPO3 Core Developer, Release Manager TYPO3 4.2


