Monday, March 9, 2009

HBase: Too many open files

While running a job that was inserting a large number of records into HBase (~7,000,000 records), I kept getting this exception in the datanode log file:

2009-03-09 20:26:21,072 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(127.0.0.1:50010, storageID=DS-1758498865-127.0.1.1-50010-1236539641254
, infoPort=50075, ipcPort=50020):DataXceiveServer: java.io.IOException: Too many open files
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:145)
at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:130)
at java.lang.Thread.run(Thread.java:619)
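
Before touching any configuration, it is worth confirming that the datanode really is exhausting its file descriptors. A quick sanity check from a shell (the <datanode-pid> placeholder is mine; you can get the real PID with jps):

  # Current soft limit on open files for this shell (often 1024 by default on Ubuntu)
  ulimit -n

  # Count the file descriptors currently held by the datanode process
  lsof -p <datanode-pid> | wc -l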


The error is verbose, but I had trouble raising this limit until I found this post. It seems you have to modify two files to raise the limit on Ubuntu. First of all, you need to add the following line to /etc/pam.d/common-session (as root):
  • session required pam_limits.so
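If you would rather append that line from a shell than open an editor, a one-liner like this (run as root) writes exactly the same content:

  echo "session required pam_limits.so" >> /etc/pam.d/common-session
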
After enabling the pam_limits module, you can simply edit the limits.conf file located at /etc/security/limits.conf to set a higher limit on open files for the user running Hadoop (in my case, hadoop). Add the following lines, adjusted to your installation:
  • hadoop hard nofile 65000
  • hadoop soft nofile 30000
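
Equivalently, from a root shell (this sketch assumes your Hadoop user is literally named hadoop):

  echo "hadoop hard nofile 65000" >> /etc/security/limits.conf
  echo "hadoop soft nofile 30000" >> /etc/security/limits.conf
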
Log out and log back into your hadoop user account, and your limit should be raised. To check that it really is the case, execute the following command: ulimit -n. It should return 30000 (the soft limit we set).
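
Once logged back in, the verification looks something like this; note that a regular user can also raise the soft limit up to the hard limit for the current shell:

  # Should now print 30000, the soft limit we set in limits.conf
  ulimit -n

  # Optionally raise the soft limit up to the hard limit (65000) for this session
  ulimit -n 65000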