[TYPO3-english] Tesseract project and the Google Query data provider

Roberto Presedo rpresedo.typo3 at gmail.com
Wed Oct 27 19:48:03 CEST 2010


Hello Søren,

Sorry for my late answer... Quite busy times around here...

> On 20/10/2010, at 16.00, Roberto Presedo wrote:
>
> > Hi Søren,
> >
> > Here are some answers to your questions...
> >
> > 1) For the caching issue, this is indeed related to session. Tesseract
> > keeps filter's information in the session in order to reuse the filter
> > when browsing page results. If you want to clear the session's
> > information, you can use the clear_cache parameter in the url (either
> > by GET or POST) example : &clear_cache=1
>
> Ok - that's clear. I tried your other suggestions, but as I wrote to you, that clashed with using pagebrowse, since the query parameter is not transferred to the pagebrowser URL parameters, thus not sending a q parameter, and therefore clearing the cache.

I'm trying to find a clean solution for this issue, but I don't have
one yet. I need this functionality for one of my customers too, so
I'm quite confident you'll have a solution for that very soon.

>
> > 2) Synonyms and Keymatches are available by mapping the googleSynonymes
> > or googleKeymatches tables, which both contain a "label" and a "link" field.
> > This information is only available if the query returns a synonym or a
> > keymatch.
>
> Yes - I get that, but I need to place that inside the googleInfos loop, right? But - then it's output for each loop.
> I tried looping googleKeymatches (both inside and outside the googleInfos loop), but that returns nothing.
>
> I obviously have some HTML elements containing my keymatches and/or synonyms, and I only want to output this, if there are matching results. How do I do this?
>
>
> Here is my current template for reference:
>
> <div class="au_gsa">
> <!--LOOP(googleInfos)-->
> <div class="au_gsa_keymatch"><a href="###FIELD.keylink###">###FIELD.keylabel###</a></div>
> <!--IF ( ###COUNTER(googleInfos)### == 0 ) -->
> <p>Side ###RECORD_OFFSET### af <strong>###TOTAL_RECORDS###</strong> fundne resultater</p>
> <p>###PAGE_BROWSER###</p>
> <ul>
> <!--ENDIF-->
>  <li>
>      <span class="title"><a href="###FIELD.url###">###FIELD.title###</a></span><br />
>      <span class="snippet">###FIELD.snippet###</span><br/>
>      <span class="url"><a href="###FIELD.url###">FUNCTION:parse_url(###FIELD.lit_url###,1)</a> (###FIELD.pagelang###) - ###FIELD.Author### - ###FIELD.DC-date###</span><br/>
>  </li>
> <!--IF ( ###COUNTER(googleInfos)### == ( ###TOTAL_RECORDS### - 1 ) ) -->
> </ul>
> <!--ENDIF-->
> <!--ENDLOOP-->
> ###PAGE_BROWSER###
> </div>

Keymatches are in a different "table" than "googleInfos", called
"googleKeymatches". So in order to display those keymatches, and
because there can be more than one, you must put the fields in a loop
on "googleKeymatches", like this:

[...]

<!--LOOP(googleInfos)-->
<!--IF ( ###COUNTER(googleInfos)### == 0 ) -->
<!--LOOP(googleKeymatches)-->
<div class="au_gsa_keymatch"><a href="###FIELD.keylink###">###FIELD.keylabel###</a></div>
<!--ENDLOOP-->
<p>Side ###RECORD_OFFSET### af <strong>###TOTAL_RECORDS###</strong> fundne resultater</p>
<p>###PAGE_BROWSER###</p>

[...]

The same structure must be used for synonyms, with a loop on the table
"googleSynonymes".

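Applied to synonyms, the loop might look like the sketch below. The marker
names ###FIELD.synlink### and ###FIELD.synlabel### and the CSS class
"au_gsa_synonym" are placeholders for illustration only; use whatever names
are mapped to the "link" and "label" fields of "googleSynonymes" in the
template controller.

```html
<!-- placeholder marker names: map them to the "link" and "label" fields of googleSynonymes -->
<!--LOOP(googleSynonymes)-->
<div class="au_gsa_synonym"><a href="###FIELD.synlink###">###FIELD.synlabel###</a></div>
<!--ENDLOOP-->
```

As with the keymatches, this block would sit inside the first-record IF of
the googleInfos loop, so that it is output only once.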

> > 3) To set up proper paging of results, you must first have
> > "pagebrowse" extension installed, and then configure the datafilter. A
> > typical configuration would be
> > "Max items per view" = 10
> > and
> > "Start at page (offset)" = "vars:page" (this is the value of the
> > "tx_displaycontroller[page]" parameter provided by the "pagebrowse"
> > extension.
>
> vars:page - Thanks!
>
> This re-introduces an old problem we have with the pagebrowse extension though. We have a lot of websites that we want to provide search for individually via collections, but on the same page we also wish to show results from all of the University. Therefore we have at least two search listings on the same page, but the pagebrowse extension can't distinguish between these two, so when I hit "Next" for one listing, the other listing also gets "Nexted"

Indeed. Maybe with AJAX calls you'll be able to update only one
list. Otherwise, you'll need to hack pagebrowse.

>
> > 4) and 5) By passing a "debug=1" parameter to a page containing a
> > tesseract element, a table containing data structure information will
> > be displayed (you must be logged in BE to see that table). Using this,
> > you'll be able to see what kind of information is available for each
> > loop, and also how metadatas are stored in the googlequery provider
> > (TIP : take a close look at the record table)
>
> I still don't get how to output a given marker only if it contains data. Some snippets are e.g. empty, so I don't want to output the empty containing HTML.
>

You can use the IF marker for that:
<!--IF(###FIELD.myField### <> '' )-->I display the field
###FIELD.myField###<!--ENDIF-->
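
For the snippet line of the template quoted earlier, that could look like
the sketch below. It assumes ###FIELD.snippet### is mapped as in that
template and has not been tested against a GSA.

```html
<!-- assumes ###FIELD.snippet### is mapped as in the template quoted above -->
<!--IF(###FIELD.snippet### <> '' )--><span class="snippet">###FIELD.snippet###</span><br/><!--ENDIF-->
```

The surrounding span and line break are inside the IF, so no empty HTML
is output when the snippet is empty.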


> I also don't get how to map the meta tags.

> The record table (and the XML) reveals the following example meta tags:
> DC.Title persons
> DC.Language da
> DC.Date 2010-09-28T20:41:40+02:00
> viewport width=1000;
> rating general
> DC.Type text/html
>
> So I've tried to enter Author,generator, rating etc. in "Selected meta tags" in the DP, and these are available to me in the template controller - but they don't output anything.

I can see that too. I'll look into why we cannot get the metatag
values and get back to you asap.

>
>
> > Then, regarding the features provided by GSA, here is what I can say :
> >
> > 1) Clusters (The "Narrow your search" part)
> > This is provided by a Javascript included in the GSA. I didn't find a
> > way to get that information in the XML output provided by the GSA.
> > Maybe there is, but I couldn't find it.
>
> Both clusters and dynamic search suggestions (which I haven't mentioned earlier) are provided by javascript, so perhaps we could figure out a way to interact with the GSA directly?

Well, right now googleQuery only parses the XML response for a
search on GSA or GoogleMini (and soon also Google Custom Search).
But the JS doesn't output XML (it's JSON, if my memory is right).
Maybe we should create a new data provider for JSON? (That could be
a good idea :) )

>
> > 2) Spelling suggestions for searches, 3) Sort by date / sort by
> > relevance, 4) Display file types and 5) Cached version link
> > All those features are actually not supported by the googlequery
> > provider, but this can easily be added in a future version.
>
> So - they are possible - great. Are these features in a roadmap, or is this something we would need to sponsor the development of?
>

This is not in the roadmap yet, but feel free to send a request on
forge :) and of course, sponsorship would surely accelerate the
development of those features ;)


> > I hope this helps...


Sure it does

Roberto PRESEDO
Cobweb

