Thursday, January 5, 2012

Errors that occurred in Hadoop and its sub-projects

1. OOZIE job failed:

Error message : ERROR is considered as FAILED for SLA
Cause 1 : The hadoop namenode (master) and jobtracker machine cannot be found.
Suppose you are running oozie, the hadoop-master and the jobtracker on one machine, while the datanode and tasktracker run on another machine.

Your file refers to the hadoop-master through localhost (localhost:9000).
In this case, an FS action works fine because an FS action performs no map-reduce operation. But if you run a map-reduce action, the tasktracker on the other machine looks for the hadoop-master on its own localhost, because localhost:9000 is used in the file.
Solution : Use the IP address of the hadoop-namenode and jobtracker machine in the file instead of localhost.
Cause 2 : Oozie is not able to find the MySQL server.
Suppose you are using MySQL as the metastore for Hive.
The Hive hive-default.xml file has the following lines :
<description>JDBC connect string for a JDBC metastore</description>
Solution : Use the IP address of the MySQL machine instead of localhost in the JDBC connect string.
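
Both causes above come down to an address that points at localhost. A minimal sketch of a check for that, assuming you paste the address values from your own files into it: the function names and the sample addresses (IP, port, database name) are hypothetical and only for illustration.

```python
from urllib.parse import urlsplit

# Names that always resolve to the machine the code is running on.
LOOPBACK_NAMES = {"localhost", "127.0.0.1", "::1"}


def host_of(address: str) -> str:
    """Return the host part of a host:port pair, a URI, or a JDBC URL."""
    # JDBC URLs carry an extra "jdbc:" prefix in front of the real URL.
    if address.startswith("jdbc:"):
        address = address[len("jdbc:"):]
    # A bare "host:port" has no scheme; prefix "//" so urlsplit sees a netloc.
    if "//" not in address:
        address = "//" + address
    return urlsplit(address).hostname or ""


def points_to_loopback(address: str) -> bool:
    """True if the address only makes sense on the local machine."""
    return host_of(address) in LOOPBACK_NAMES


if __name__ == "__main__":
    # Sample values; replace them with the ones from your own files.
    for value in ("localhost:9000",
                  "hdfs://192.168.1.10:9000",
                  "jdbc:mysql://localhost/hive"):
        print(value, "->", "loopback" if points_to_loopback(value) else "ok")
```

Any value reported as loopback will be resolved by each remote machine against itself, which is the situation described in both causes.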

2. ZooKeeper server not running:
Error message: Could not find my address: zk-serevr1 in list of ZooKeeper quorum servers
Cause :
HBase tries to start a ZK server on some machine, but that machine is not able to find itself in the hbase.zookeeper.quorum configuration. This is a name lookup problem.

Solution : In hbase.zookeeper.quorum, use the hostname presented in the error message instead of the value you used (zk-server1). If you have a DNS server, you can set hbase.zookeeper.dns.interface and hbase.zookeeper.dns.nameserver in hbase-site.xml to make sure the machine resolves to the correct FQDN.
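
The name lookup problem can be illustrated with a small sketch. It assumes the quorum is a comma-separated list of hostnames, as in hbase.zookeeper.quorum, and it uses a plain exact-match comparison of my own, which is not necessarily how HBase compares names. The function name and the second quorum host (zk-server2) are hypothetical.

```python
def is_in_quorum(hostname: str, quorum: str) -> bool:
    """Check whether a hostname appears in a comma-separated quorum list."""
    servers = [entry.strip() for entry in quorum.split(",") if entry.strip()]
    return hostname.strip() in servers


if __name__ == "__main__":
    # Hypothetical quorum value; the first hostname is the one from the error.
    quorum = "zk-server1,zk-server2"
    for name in ("zk-serevr1", "zk-server1"):
        state = "found" if is_in_quorum(name, quorum) else "NOT found"
        print(f"{name}: {state} in quorum")
```

The name reported in the error message is the one the machine believes it has, so it is that name that has to appear in the list.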

3. Hadoop datanode job failed or datanode not running:
Error message: File ../mapred/system/ could only be replicated to 0 nodes, instead of 1
Cause 1: No datanode is running.
Solution : Make sure at least one datanode is running.

Cause 2: The namespaceID of the master and slave machines is not the same.
If you see the error Incompatible namespaceIDs in the logs of a datanode, chances are you are affected by bug HADOOP-1212 (well, I’ve been affected by it at least).
Solution :
If the namespaceID of the master and slave machines is not the same, then replace the namespaceID on the slave machines with the master's namespaceID.
- The dfs/name/current/VERSION file contains the namespaceID of the master machine.
- The dfs/data/current/VERSION file contains the namespaceID of the slave machine.
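
The comparison of the two VERSION files can be sketched as follows. This assumes the files are plain key=value text with a namespaceID line. The function names and all sample values are hypothetical, and the sketch only compares the IDs; it does not modify any file.

```python
from typing import Optional


def read_namespace_id(version_text: str) -> Optional[str]:
    """Extract the namespaceID value from the text of a VERSION file."""
    for line in version_text.splitlines():
        line = line.strip()
        # Skip blank lines and the timestamp comment at the top of the file.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        if key.strip() == "namespaceID":
            return value.strip()
    return None


def namespace_ids_match(master_text: str, slave_text: str) -> bool:
    """True only if both files carry the same, non-missing namespaceID."""
    master_id = read_namespace_id(master_text)
    slave_id = read_namespace_id(slave_text)
    return master_id is not None and master_id == slave_id


if __name__ == "__main__":
    # Hypothetical file contents; read the real files from
    # dfs/name/current/VERSION (master) and dfs/data/current/VERSION (slave).
    master = "#sample\nnamespaceID=1234567\nstorageType=NAME_NODE\n"
    slave = "#sample\nnamespaceID=7654321\nstorageType=DATA_NODE\n"
    print("master:", read_namespace_id(master))
    print("slave: ", read_namespace_id(slave))
    print("match: ", namespace_ids_match(master, slave))
```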
Cause 3: Datanode instance running out of space.
Solution : Free some space.

Cause 4 : You may also get this message because of permissions. The JobTracker may be unable to create the file named in the error message on startup.

4. Sqoop export command failed:
Error message:
attempt_201101151840_1006_m_000001_0, Status : FAILED
at java.util.AbstractList$
at impressions_by_zip.__loadFromFields(
at impressions_by_zip.parse(

Cause : The given field separator is not valid.
Solution : Specify the correct field delimiter in the sqoop export command.
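
Before re-running the export, it can help to confirm which delimiter the input files really use. The sketch below is only an illustration: it counts fields per line for a candidate delimiter and reports the lines that do not match the expected column count. The function name and the sample rows are hypothetical. In Sqoop itself the input delimiter is passed with the --input-fields-terminated-by option.

```python
from typing import Iterable, List


def lines_with_wrong_field_count(lines: Iterable[str],
                                 delimiter: str,
                                 expected_fields: int) -> List[int]:
    """Return the 1-based numbers of lines that do not split into
    exactly expected_fields fields with the given delimiter."""
    bad_lines = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) != expected_fields:
            bad_lines.append(number)
    return bad_lines


if __name__ == "__main__":
    # Hypothetical tab-separated rows with three columns each.
    sample = ["94105\t12\t2011-01-15\n", "10001\t7\t2011-01-15\n"]
    print("tab  :", lines_with_wrong_field_count(sample, "\t", 3))
    print("comma:", lines_with_wrong_field_count(sample, ",", 3))
```

An empty list for one delimiter and every line listed for another is a strong hint about which separator the export command should be given.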

5. HBase regionserver not running :

Error message: 2012-01-02 13:48:49,973 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Master rejected startup because clock is out of sync
org.apache.hadoop.hbase.ClockOutOfSyncException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server hadoop-datanode2,60020,1325492317440 has been rejected; Reported time is too far out of sync with master.  Time difference of 206141ms > max allowed of 30000ms

Solution: The clocks of the regionservers are not in sync with the master machine. Synchronize the clocks of the HBase master and regionserver machines (for example, with NTP).
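
To see how far off a regionserver is, the two numbers can be pulled out of the log line above. This is a minimal sketch that assumes the message keeps the wording shown in this post; the function name and the regular expression are my own and not part of HBase.

```python
import re
from typing import Optional, Tuple

# Matches the tail of the ClockOutOfSyncException message quoted above.
SKEW_PATTERN = re.compile(
    r"Time difference of (\d+)ms > max allowed of (\d+)ms")


def parse_clock_skew(log_line: str) -> Optional[Tuple[int, int]]:
    """Return (time difference, max allowed), both in ms, or None."""
    match = SKEW_PATTERN.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    line = ("Reported time is too far out of sync with master.  "
            "Time difference of 206141ms > max allowed of 30000ms")
    skew = parse_clock_skew(line)
    if skew is not None:
        difference_ms, allowed_ms = skew
        print(f"difference: {difference_ms} ms, allowed: {allowed_ms} ms")
        print(f"over the limit by {difference_ms - allowed_ms} ms")
```

For the message in this post the difference is 206141 ms against an allowed 30000 ms, so the regionserver clock is a little over three minutes away from the master.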


Anonymous said...

starting namenode, logging to /usr/local/hadoop/hadoop-
localhost: starting datanode, logging to /usr/local/hadoop/hadoop-
localhost: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-
hduser@hadoop-ThinkCentre-A51:~$ jps
12799 SecondaryNameNode
12837 Jps

Please help us correct this error.

Ankit Jain said...

Please share the namenode and datanode logs.

Anonymous said...

ACTION[0000017-120306175650364-oozie-oozi-W@mr-node] checking action, external ID [job_201203062014_0014] status [RUNNING]
2012-03-07 22:06:23,420 INFO CallbackServlet:525 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000017-120306175650364-oozie-oozi-W] ACTION[0000017-120306175650364-oozie-oozi-W@mr-node] callback for action [0000017-120306175650364-oozie-oozi-W@mr-node]
2012-03-07 22:06:23,577 INFO MapReduceActionExecutor:525 - USER[hdfs] GROUP[users] TOKEN[] APP[map-reduce-wf] JOB[0000017-120306175650364-oozie-oozi-W] ACTION[0000017-120306175650364-oozie-oozi-W@mr-node] action completed, external ID [job_201203062014_0014]
2012-03-07 22:06:23,625 WARN MapReduceActionExecutor:528 - USER[hdfs] GROUP[users] TOKEN[] APP[map-reduce-wf] JOB[0000017-120306175650364-oozie-oozi-W] ACTION[0000017-120306175650364-oozie-oozi-W@mr-node] LauncherMapper died, check Hadoop log for job []
2012-03-07 22:06:23,803 INFO ActionEndCommand:525 - USER[hdfs] GROUP[users] TOKEN[] APP[map-reduce-wf] JOB[0000017-120306175650364-oozie-oozi-W] ACTION[0000017-120306175650364-oozie-oozi-W@mr-node] ERROR is considered as FAILED for SLA

Hi, I am Avinash. I installed Oozie from the tarball and ran an Oozie job as the hdfs user, and I got this error:
ERROR is considered as FAILED for SLA

Can you help me?

Ankit Jain said...

Hi Avinash,

Which Oozie action did you run? Look into the Hadoop JobTracker log; you may find a clue there.

Renata Ghisloti Duarte de Souza said...

Nice post. Thanks!
