[TYPO3-Solr] Solr und Tika
Ingo Renner
ingo at typo3.org
Mon Mar 29 14:57:05 CEST 2010
Hi all,
to shed some light on the situation:
* the current version of EXT:solr in TER (1.0.1) does not support file
indexing
* the soon to be released version of EXT:solr (1.1.0) will not support
file indexing either because it's not implemented yet (not even in
2.0-dev), and because it will need some testing
* version 2.0, to be released sometime later this year (end of the year
tentatively) will support file indexing
* if you need file indexing now, you can join the 2.0 dev program with
dkd to get early access and influence what we're going to work on next
(file indexing, for example)
EXT:tika as it is on forge will be used by EXT:solr 2.0 to extract
content from files. So EXT:solr hands over the files to EXT:tika.
EXT:tika, in turn, can use either a local Tika jar or a remote Solr server
which is handled by EXT:solr 2.0. Both scenarios, local and remote
extraction, have pros and cons...
For local extraction you need Java on your host. This might not be
available, for example on a shared host, or you may have a distributed
setup where you don't want to install Java on each host.
For remote extraction, though, you need to send the files over the
network, which might not be the cleverest thing either... consider
sending many MB of audio or video files over the network only to get a
few bytes of information back.
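
To make the two options a bit more concrete, here is a rough sketch in
Python. It is not how EXT:tika is implemented; it only illustrates the
idea. The jar path, the Solr base URL and all function names are made up
for this example. The only assumptions taken from the tools themselves
are the Tika app's "--text" option and Solr's extracting request handler
at /update/extract with extractOnly=true.

```python
import shutil
from urllib.parse import urlencode

# Hypothetical locations, adjust to your own setup.
TIKA_JAR = "/opt/tika/tika-app.jar"
SOLR_URL = "http://localhost:8983/solr"


def local_command(path, jar=TIKA_JAR):
    """Build the command line for local extraction (requires Java on the host)."""
    return ["java", "-jar", jar, "--text", path]


def remote_url(base=SOLR_URL):
    """Build the URL the file would be POSTed to for remote extraction.

    extractOnly=true asks Solr to return the extracted content
    instead of indexing the document.
    """
    query = urlencode({"extractOnly": "true", "wt": "json"})
    return base.rstrip("/") + "/update/extract?" + query


def choose_mode(java_available=None):
    """Prefer local extraction when Java is present, else fall back to remote."""
    if java_available is None:
        java_available = shutil.which("java") is not None
    return "local" if java_available else "remote"
```

The local command would be run with something like subprocess.run(),
while the remote URL would receive the whole file as the request body,
which is where the network cost mentioned above comes from.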
However, it should be possible to use EXT:tika with local extraction for
EXT:dam already. "Should" means that I implemented it in such a way that
it "should" work, but I haven't had the time to actually test it yet...
Hope that helps :)
Ingo
--
Ingo Renner
TYPO3 Core Developer, Release Manager TYPO3 4.2