[TYPO3] Indexing external files with indexed search
Steven Bagshaw
steven.bagshaw at unv.org
Fri Feb 3 12:05:27 CET 2006
Hello,
I have a previously functioning Indexed Search setup, to which I want to add
the ability to index external files (PDF, Word, Excel, PPT).
I have gotten a few steps along the way, but have hit a brick wall that I
hope someone can help with.
I have index_externals = 1 set, and the paths to the various tools are
correct. But the actual indexing of the external files does not work
properly.
For PDFs, I always get this message in the debug output:
------------------------------------------
Indexing needed, reason: Page has never been indexed (is not represented in
the index_phash table).
Could not index file! Unsupported extension
Looking at the code in class.indexer.php, this happens because the check
    if (is_array($contentParts))
fails. So I assume the output returned by pdftotext.exe is not what the
indexer expects.
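
One way to narrow this down might be to run the extraction tool directly from PHP and look at what comes back. The sketch below is not taken from indexed_search: the helper name checkExtractor() is made up for illustration, and it assumes exec() is permitted on the server.

```php
<?php
// Hypothetical diagnostic helper (not part of indexed_search).
// Runs an external command and reports its exit code and whether
// it produced any text output.
function checkExtractor($command)
{
    $lines = array();
    $exitCode = -1;

    // Redirect stderr to stdout so error messages from the tool are captured too.
    exec($command . ' 2>&1', $lines, $exitCode);

    $text = trim(implode("\n", $lines));

    return array(
        'exitCode' => $exitCode,              // 0 normally means success
        'hasText'  => ($text !== ''),         // false if nothing came back
        'preview'  => substr($text, 0, 200),  // first characters of the output
    );
}
```

A call such as checkExtractor('C:\xpdf\pdftotext.exe -q test.pdf -') would then show whether the tool returns any text at all. The path and file name here are placeholders; the trailing "-" asks pdftotext to write the text to standard output instead of a file. An empty result or a non-zero exit code would point at the tool or its path rather than at the indexer.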
For Word documents, I get messages like this:
----------------------------------------
s/Index: AFR_monthly_HPV_Feb_ver3_doc 4101 5 +78 =83 Indexing needed,
reason: Page has never been indexed (is not represented in the index_phash
table).
...onthly_HPV_Feb_ver3_doc/Split content 4102 59
...b_ver3_doc/Extract words from content 4162 3
..._ver3_doc/Analyse the extracted words 4165 1
...thly_HPV_Feb_ver3_doc/Submitting page 4166 5
..._doc/Check word list and submit words 4171 10 Inserting words: 3
So, that all looks good.
But neither the BE index monitoring tools nor the FE search show this
document as indexed.
I am running on Windows, TYPO3 3.8, Indexed Search 2.1.3.
Thanks for any help!
Steven