[TYPO3-Solr] Code for Apache Nutch for TYPO3 CMS has been released

Lienhart Woitok Lienhart.Woitok at netlogix.de
Wed Apr 30 17:12:25 CEST 2014


Hi Thomas,

You should have a look at the log file at logs/hadoop.log. I got the same exception earlier today and found useful information in that log file; in my case it was caused by errors in my own Nutch plugin.
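
If the log is long, a small script can pull out the interesting parts. The following is only a rough sketch, not part of Nutch: it assumes the log sits at logs/hadoop.log relative to the directory you ran Nutch from, and that problems show up as lines containing ERROR, FATAL, "Exception" or "Caused by:". The function names and the search pattern are my own choices, so adjust them to what your log actually looks like.

```python
import re
from pathlib import Path

# Assumption: problems are marked by ERROR/FATAL levels, Java exception
# names, or "Caused by:" lines. Adjust the pattern to your log format.
PROBLEM = re.compile(r"\b(ERROR|FATAL)\b|Exception|Caused by:")


def find_problems(lines, context=5):
    """Return (line_number, text) for each matching line and the
    `context` lines that follow it (usually the stack trace)."""
    hits = []
    remaining = 0
    for number, line in enumerate(lines, start=1):
        if PROBLEM.search(line):
            # Restart the window so overlapping matches are kept whole.
            remaining = context + 1
        if remaining > 0:
            hits.append((number, line.rstrip("\n")))
            remaining -= 1
    return hits


def scan_log(path="logs/hadoop.log", context=5):
    """Scan a log file on disk; the default path is an assumption."""
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        return find_problems(handle, context)


if __name__ == "__main__":
    log = Path("logs/hadoop.log")
    if log.exists():
        for number, text in scan_log(log):
            print(f"{number}: {text}")
    else:
        print(f"{log} not found; run this from the directory Nutch was started in")
```

The lines following the first "Caused by:" entry are usually the ones that name the real cause, which the generic "Job failed!" message on the console hides.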

Regards
Lienhart


Lienhart Woitok
Web Developer

Phone: +49 (911) 539909 - 0
E-Mail: Lienhart.Woitok at netlogix.de
Website: media.netlogix.de






--
netlogix GmbH & Co. KG
IT-Services | IT-Training | Media
Neuwieder Straße 10 | 90411 Nürnberg
Phone: +49 (911) 539909 - 0 | Fax: +49 (911) 539909 - 99
E-Mail: info at netlogix.de | Internet: http://www.netlogix.de

netlogix GmbH & Co. KG is registered at the Nürnberg District Court (HRA 13338)
General partner: netlogix Verwaltungs GmbH (HRB 20634)
VAT identification number: DE 233472254
Managing directors: Stefan Buchta, Matthias Schmidt



-----Original Message-----
From: typo3-project-solr-bounces at lists.typo3.org [mailto:typo3-project-solr-bounces at lists.typo3.org] On behalf of thomas macaigne
Sent: Tuesday, 29 April 2014 11:44
To: typo3-project-solr at lists.typo3.org
Subject: Re: [TYPO3-Solr] Code for Apache Nutch for TYPO3 CMS has been released

I carefully followed the included README.
I downloaded the Nutch 1.8 source and compiled it.
I also tried the precompiled Nutch binaries.
I modified conf/nutch-site.xml, adding the API key and baseURL.
The crawl seems to run fine, but right at the end I get this error:


SOLRIndexWriter
        solr.server.url : URL of the SOLR instance (mandatory)
        solr.commit.size : buffer size when sending to SOLR (default 1000)
        solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
        solr.auth : use authentication (default false)
        solr.auth.username : use authentication (default false)
        solr.auth : username for authentication
        solr.auth.password : password for authentication


Getting siteHash for domain: nutch.apache.org
Constructed TYPO3 API URL: http://wiki.mydomain.net/index.php?eID=tx_solr_api&api=siteHash&domain=nutch.apache.org&apiKey=f329b64134351933fbaa916fc05fcbc0d88258d2
TYPO3 Solr API Request sent.
TYPO3 Solr siteHash retrieved: f1c579b9422e0173b0799d5998bd65bc9210f1b2
Indexer: java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
        at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:114)
        at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:176)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:186)


