[TYPO3-dam] does Indexed search index pdf files with DAM enabled

Pascal Voitot pascal.voitot at free.fr
Tue Aug 22 11:03:11 CEST 2006


Hello,

Daniel, you are right when you say the indexed_search is not meant for what I
want to do!
In fact, I'm trying to use TYPO3 and its DAM as a basic, simple and free DMS,
without having to put documents in a DB managed by a full DMS architecture
(such as SharePoint, Alfresco etc.), and with very powerful features for
building the FE presentation of the documents. This is not the original role of
TYPO3, but it does not seem such a bad idea when I see how expensive and complex
document management systems are and how basic people's needs can be.

My problem is that the DAM indexes documents by their metadata and even an
extract of the content, but I also need full-text indexing and search for
DOC/PDF/PPT/XLS/OO etc.

The indexed_search is not meant for this at all, for the reason you give: it is
tied to the FE. And you are right: putting DAM document information into the
indexed_search tables would only bring rubbish into the search results!

So my idea is to create an "indexed_search" specifically linked to the DAM, and
also to think about a real versioning system etc.
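
To make the idea a bit more concrete, here is a minimal sketch in Python (not
TYPO3 code). It assumes the text of each file has already been extracted by
tools such as pdftotext or catdoc. The class name DamFulltextIndex, its methods
and the integer uids standing in for DAM records are all hypothetical and only
there for illustration.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"\w+", re.UNICODE)


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


class DamFulltextIndex:
    """Toy inverted index: word -> set of DAM record uids (hypothetical)."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, uid, text):
        """Index the extracted text of one file under its record uid."""
        for word in tokenize(text):
            self._postings[word].add(uid)

    def search(self, query):
        """Return the uids of records that contain every word of the query."""
        words = tokenize(query)
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result


index = DamFulltextIndex()
index.add(1, "Annual report 2006 for the board")
index.add(2, "Board meeting minutes")
print(index.search("annual board"))
```

The point of the sketch is that each hit is one file record, not a rendered
page, so a document cannot show up several times in the results the way it
could if it were pushed into the page-based indexed_search tables.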

here we are :)

br
Pascal


Quoting Daniel Thomas <dt at dpool.net>:

> Hi Pascal,
>
> > As far as I know, there is no relationship between the DAM and the
> > indexed search... The DAM only tries to retrieve some metadata from
> > the files with tools such as catdoc, pdftotext etc.
> > So you can generally get the header of the file, the language and an
> > extract of the content. But PDF files are often protected against
> > reading their content, and then the DAM can only retrieve basic
> > information.
> >
> > BUT there is no way for the DAM to index a document in the
> > indexed_search. But I think this should be the logical evolution!!
> >
>
> Could you outline your idea here?
> What exactly should indexed_search do?
> So far it is closely wedded to the TYPO3 page or page-hash paradigm.
> Both the indexing mechanism and the frontend-search functionality are
> built to support an output format and not a data storage format.
>
> Personally, I see no use in fulltext-indexing of DAM records if this
> is not combined with a major concept change in the way the
> indexed_search operates or, if it comes to that, a different search
> engine. In the end the indexed_search plugin would have to evolve,
> not the DAM.
>
> However, I am not sure if this is a good solution for TYPO3. The
> primary value of the indexed_search is that with its
> "format-blindness" it integrates easily with virtually every output a
> plugin could produce. The downside is of course lack of precision and
> redundancy in search results. If you wanted to have a search engine
> which is not format-blind, but which would connect indexed
> information with unique occurrences in the database or file system,
> both the indexer and the search engine would be much more complex and
> harder to use.
>
> Regards
>
> Daniel
>
> > br
> > Pascal
> >
> > Quoting Sacha Vorbeck <Vorbeck at moduleBox.com>:
> >
> >> Hi,
> >>
> >> does anybody know if the indexed search will index PDF files when
> >> using DAM? Would be interesting to know.
> >>
> >> --
> >> thank you - all the best,
> >> Sacha
> >> _______________________________________________
> >> TYPO3-project-dam mailing list
> >> TYPO3-project-dam at lists.netfielders.de
> >> http://lists.netfielders.de/cgi-bin/mailman/listinfo/typo3-project-dam
> >>
> >
> >
> > --
> > Pascal Voitot
> > Computer engineering graduate of ISMRA ENSI de Caen
> >
> >
>
> --/
>
> Daniel Thomas dpool
>
> Hinderink und Thomas Partnerschaft IT-Berater und Projektmanager
>
> Eduard-Schmid-Str. 9 | D-81541 München
> t 08945227582 | m 01793918781 | fax 08945227583
>
> http://www.dpool.net | http://www.typergy.com
> http://typo3partner.net
>
> /--
>
>


--
Pascal Voitot
Computer engineering graduate of ISMRA ENSI de Caen


