When using S4, we found sometimes it ends up with one process node owns 2 tasks. I did some investigation, it seems that the handling of ConnectionLossException when creating the ephemeral node is problematic. Sometimes when the response from zookeeper server times out, zookeeper.create() will fail with ConnectionLossException while the creation request might already be sent to server(see http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/src/java/main/org/apache/zookeeper/ClientCnxn.java line 830). From our logs this is the case we ran into.
Maybe we should handle it in the way that HBase is handling it (http://svn.apache.org/viewvc/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java?view=markup), just simply exit the process when got that exception to let the whole process restart.
When using S4, we found sometimes it ends up with one process node owns 2 tasks. I did some investigation, it seems that the handling of ConnectionLossException when creating the ephemeral node is problematic. Sometimes when the response from zookeeper server times out, zookeeper.create() will fail with ConnectionLossException while the creation request might already be sent to server(see http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/src/java/main/org/apache/zookeeper/ClientCnxn.java line 830). From our logs this is the case we ran into.
Maybe we should handle it in the way that HBase is handling it (http://svn.apache.org/viewvc/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java?view=markup), just simply exit the process when got that exception to let the whole process restart.