[TYPO3] Indexing external files with indexed search

Steven Bagshaw steven.bagshaw at unv.org
Fri Feb 3 12:05:27 CET 2006


Hello,

I have a previously functioning Indexed Search setup, to which I want to add
the ability to index external files (PDF, Word, Excel, PPT).

I have gotten a few steps along the way, but have hit a brick wall that I
hope someone can help with.

I have index_externals = 1 set, and the paths to the various tools are
correct. But the actual indexing of the external files does not work
properly.

For PDFs, I always get this message in the debug output:
------------------------------------------

Indexing needed, reason: Page has never been indexed (is not represented in
the index_phash table).
Could not index file! Unsupported extension

Looking at the code in class.indexer.php, this happens because the check

 if (is_array($contentParts))

fails. So I assume the output returned by pdftotext.exe is not what the
indexer expects.
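
One way to test that assumption outside TYPO3 is to run pdftotext by hand on
one of the affected PDFs and see whether it produces any text at all. Below is
a minimal sketch in Python, not anything taken from Indexed Search itself. The
function names and the example paths are hypothetical. The only thing it
relies on is that pdftotext writes the extracted text to standard output when
the output file is given as "-".

```python
import os
import subprocess
import sys


def build_pdftotext_command(tool_dir, pdf_path):
    """Build the argument list that dumps a PDF's text to stdout.

    tool_dir is assumed to be the same directory that is configured as the
    PDF tools path for Indexed Search (hypothetical parameter name).
    """
    exe_name = "pdftotext.exe" if os.name == "nt" else "pdftotext"
    # A trailing "-" tells pdftotext to write to stdout instead of a file.
    return [os.path.join(tool_dir, exe_name), pdf_path, "-"]


def run_and_report(command):
    """Run a command and return (exit_code, stdout_text, stderr_text).

    exit_code is None if the program could not be started at all, which
    usually means a wrong path or missing execute permission.
    """
    try:
        result = subprocess.run(command, capture_output=True)
    except OSError as err:
        return (None, "", str(err))
    return (
        result.returncode,
        result.stdout.decode("utf-8", errors="replace"),
        result.stderr.decode("utf-8", errors="replace"),
    )


def looks_usable(exit_code, stdout_text):
    """True if the tool exited cleanly and produced non-blank text."""
    return exit_code == 0 and bool(stdout_text.strip())


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("usage: check_pdftotext.py <tool_dir> <pdf_file>")
    else:
        code, out, err = run_and_report(
            build_pdftotext_command(sys.argv[1], sys.argv[2])
        )
        print("exit code:", code)
        print("characters of text:", len(out))
        if err:
            print("error output:", err.strip())
        print("usable:", looks_usable(code, out))
```

If this reports that the tool cannot be started or returns no text, the
problem is more likely in the tool path, permissions, or the PDF itself than
in the indexer. If it does return text, the place to look is how the web
server account runs the same command.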

For Word documents, I get messages like this:
----------------------------------------
s/Index: AFR_monthly_HPV_Feb_ver3_doc    4101  5  +78  =83  Indexing needed,
reason: Page has never been indexed (is not represented in the index_phash
table).
...onthly_HPV_Feb_ver3_doc/Split content    4102  59
...b_ver3_doc/Extract words from content    4162  3
..._ver3_doc/Analyse the extracted words    4165  1
...thly_HPV_Feb_ver3_doc/Submitting page    4166  5
..._doc/Check word list and submit words    4171  10      Inserting words: 3



So, that all looks good.

However, neither the backend (BE) index monitoring tools nor a search in the
frontend (FE) shows this document as indexed.

I am running on Windows, with TYPO3 3.8 and Indexed Search 2.1.3.

Thanks for any help!

Steven
