[Typo3-dev] Indexed Search Engine and indexing of external files

Glen Gibb grg at stanford.edu
Sun Jul 18 22:58:46 CEST 2004


Hi Kasper,

I've run into a problem with the indexing of external files: external 
files were not being indexed at all.

After some debugging and tracing through the source, I found that the 
problem lies in how the module decides whether a link refers to an 
indexable file when the link has an absolute path. For example, a link 
to "/mysite/fileadmin/upload/somefile.pdf" would not be indexed. This is 
because class.indexer.php calls the PHP function is_file(), passing in 
the link itself as the path to the file. For the example above, it 
therefore looks for a directory "mysite" under the filesystem root, 
rather than for a subdirectory of my HTTP document root.
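
The mismatch described above can be reproduced with a small standalone 
script. This is only an illustrative sketch, not code from 
class.indexer.php: the temporary directory standing in for the HTTP 
document root and the variable names are assumptions made for the 
example.

```php
<?php
// Hypothetical layout: a temp directory stands in for the HTTP document root.
$docRoot = sys_get_temp_dir() . '/docroot_' . uniqid();
$uploadDir = $docRoot . '/mysite/fileadmin/upload';
mkdir($uploadDir, 0777, true);
file_put_contents($uploadDir . '/somefile.pdf', 'dummy');

// The link as it appears in the page when it has an absolute path.
$link = '/mysite/fileadmin/upload/somefile.pdf';

// is_file() treats the link as a filesystem path starting at "/",
// so it looks for /mysite/... on disk, not under the document root.
$asFilesystemPath = is_file($link);

// Resolving the same link below the document root finds the file.
$underDocRoot = is_file($docRoot . $link);

var_dump($asFilesystemPath, $underDocRoot);
```

On a machine with no real /mysite directory at the filesystem root, the 
first check should fail while the second succeeds.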

I've attached two patches to class.indexer.php that attempt to fix the 
problem. Both introduce a new function, toSitePath, which takes a 
filename and attempts to convert it to a filename relative to the site 
path. The first (indexed_search.1.diff) simply looks for the site path 
(/mysite/ in the example above) and strips it. The second 
(indexed_search.2.diff) does the same thing, but also returns an empty 
string if the link does not refer to a file within the site. (Note: the 
function isn't smart enough to handle links containing ".." path 
segments.)
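
For readers without the attachments, the idea behind toSitePath can be 
sketched roughly as follows. This is NOT the code from either diff: the 
$sitePrefix and $strict parameters, the modern PHP type hints, and the 
decision to pass relative links through unchanged are all assumptions 
made for the illustration. Like the patches, it does not handle "..".

```php
<?php
/**
 * Sketch of a toSitePath-style helper (hypothetical signature).
 *
 * $sitePrefix: URL prefix of the site, e.g. "/mysite/" (assumed parameter).
 * $strict:     false mimics the first variant (strip the prefix only),
 *              true mimics the second (empty string for links outside the site).
 */
function toSitePath(string $link, string $sitePrefix, bool $strict = false): string
{
    $len = strlen($sitePrefix);

    // Link starts with the site prefix: strip it to get a site-relative path.
    if ($len > 0 && strncmp($link, $sitePrefix, $len) === 0) {
        return (string) substr($link, $len);
    }

    // Absolute link that does not start with the prefix: outside the site.
    if ($strict && $link !== '' && $link[0] === '/') {
        return '';
    }

    // Anything else (e.g. an already relative link) is returned unchanged.
    return $link;
}

echo toSitePath('/mysite/fileadmin/upload/somefile.pdf', '/mysite/'), "\n";
```

The result could then be checked with is_file() relative to the site 
directory instead of the filesystem root.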

I should also point out that links of the form above are generated if 
config.absRefPrefix is set.

Thanks in advance
Glen Gibb
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: indexed_search.2.diff
URL: <http://lists.typo3.org/pipermail/typo3-dev/attachments/20040718/81e59e62/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: indexed_search.1.diff
URL: <http://lists.typo3.org/pipermail/typo3-dev/attachments/20040718/81e59e62/attachment.asc>

