solr - Nutch (2.2.1) Inject URLs Hangs


I'm running Ubuntu 14.04 and I'm trying to get a basic Nutch web crawl running, to no avail. I'm following this tutorial, which is set up with the following building blocks:

  • Ubuntu 14.04
  • HBase 0.90.4
  • Nutch 2.2.1
  • Solr 4.3.1

I can confirm that both HBase and Solr are running, and I have populated the urls/seed.txt file. When I call:

bin/nutch inject urls 

I'm presented with the following output, and it seems that Nutch hangs.

InjectorJob: starting at 2014-06-09 23:38:49
InjectorJob: Injecting urlDir: urls/seed.txt

This Stack Overflow question seems similar to mine, but I'm not behind a proxy, so its answer is not applicable.

Any help in resolving this issue would be appreciated.

By default, Ubuntu maps the machine's hostname to the loopback IP address 127.0.1.1 in the hosts file. HBase (according to this page) requires the loopback IP address to be 127.0.0.1.

By default, the Ubuntu /etc/hosts file contains (with mycomputername being your computer's name):

127.0.0.1   localhost
127.0.1.1   mycomputername
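To check how a particular machine is configured before changing anything, something like the following could be used. It is only a sketch: `hosts_addr_for` is a hypothetical helper name, not part of Ubuntu, HBase, or Nutch, and it simply prints the address that a hosts file assigns to a given name.

```shell
# Hypothetical helper (not a standard tool): print the address(es) that a
# hosts file assigns to a name.
# Usage: hosts_addr_for NAME [FILE]    (FILE defaults to /etc/hosts)
hosts_addr_for() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        {
            sub(/#.*/, "")                 # drop comments, full-line or trailing
            for (i = 2; i <= NF; i++)      # field 1 is the address, the rest are names
                if ($i == n) { print $1; break }
        }
    ' "$file"
}
```

Calling `hosts_addr_for "$(hostname)"` on a default Ubuntu install would be expected to print 127.0.1.1, which is the entry HBase has trouble with.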

Use sudo gedit /etc/hosts to update the hosts file as follows:

127.0.0.1   localhost
127.0.0.1   mycomputername
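The same change can also be made without a graphical editor. The sketch below assumes a sed that accepts `-i` with a backup suffix (as GNU sed on Ubuntu does); `fix_loopback` is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: rewrite a leading 127.0.1.1 to 127.0.0.1 in a hosts
# file, keeping the original content in FILE.bak.
# Usage: fix_loopback [FILE]    (FILE defaults to /etc/hosts)
fix_loopback() {
    file=${1:-/etc/hosts}
    # Only touch lines that start with 127.0.1.1 followed by whitespace.
    sed -i.bak 's/^127\.0\.1\.1\([[:space:]]\)/127.0.0.1\1/' "$file"
}
```

Editing the real /etc/hosts this way requires root privileges, so it is worth trying the function on a copy of the file first and comparing the result with the listing above.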

Reboot Ubuntu. Nutch should no longer have trouble injecting URLs into HBase.

