@rimms
Created October 18, 2012 03:05
Verifying Jubatus against a ZooKeeper ensemble

Preparation

  1. Install ZooKeeper

     $ cd $HOME/local/src
     $ wget http://ftp.riken.jp/net/apache/zookeeper/zookeeper-3.4.4/zookeeper-3.4.4.tar.gz
     $ cd ../
     $ tar xvfz src/zookeeper-3.4.4.tar.gz
     $ cp -pr zookeeper-3.4.4 zookeeper-node1
     $ cp -pr zookeeper-3.4.4 zookeeper-node2
     $ cp -pr zookeeper-3.4.4 zookeeper-node3
    
  2. Configure ZooKeeper

     $ vi zookeeper-node1/conf/zoo.cfg
     tickTime=2000
     initLimit=5
     syncLimit=2
     dataDir=/tmp/zookeeper-node1
     clientPort=2181
     server.1=localhost:2888:3888
     server.2=localhost:4888:5888
     server.3=localhost:6888:7888
     $ vi zookeeper-node2/conf/zoo.cfg
     tickTime=2000
     initLimit=5
     syncLimit=2
     dataDir=/tmp/zookeeper-node2
     clientPort=2182
     server.1=localhost:2888:3888
     server.2=localhost:4888:5888
     server.3=localhost:6888:7888
     $ vi zookeeper-node3/conf/zoo.cfg
     tickTime=2000
     initLimit=5
     syncLimit=2
     dataDir=/tmp/zookeeper-node3
     clientPort=2183
     server.1=localhost:2888:3888
     server.2=localhost:4888:5888
     server.3=localhost:6888:7888
     $ mkdir /tmp/zookeeper-node{1,2,3}
     $ for i in 1 2 3; do echo ${i} > /tmp/zookeeper-node${i}/myid ;done;
    
  3. Verify that ZooKeeper starts

     $ for i in 1 2 3; do cd ~/local/zookeeper-node${i}/bin; ./zkServer.sh start; sleep 2; done;
     JMX enabled by default
     Using config: $HOME/local/zookeeper-node1/bin/../conf/zoo.cfg
     Starting zookeeper ... STARTED
     JMX enabled by default
     Using config: $HOME/local/zookeeper-node2/bin/../conf/zoo.cfg
     Starting zookeeper ... STARTED
     JMX enabled by default
     Using config: $HOME/local/zookeeper-node3/bin/../conf/zoo.cfg
     Starting zookeeper ... STARTED
     $ for i in 1 2 3; do echo stat | nc localhost 218${i}; done;
     Zookeeper version: 3.4.4-1386507, built on 09/17/2012 08:33 GMT
     Clients:
      /127.0.0.1:36368[0](queued=0,recved=1,sent=0)
    
     Latency min/avg/max: 0/0/0
     Received: 1
     Sent: 0
     Connections: 1
     Outstanding: 0
     Zxid: 0x0
     Mode: follower
     Node count: 4
     Zookeeper version: 3.4.4-1386507, built on 09/17/2012 08:33 GMT
     Clients:
      /127.0.0.1:48581[0](queued=0,recved=1,sent=0)
    
     Latency min/avg/max: 0/0/0
     Received: 1
     Sent: 0
     Connections: 1
     Outstanding: 0
     Zxid: 0x100000000
     Mode: leader
     Node count: 4
     Zookeeper version: 3.4.4-1386507, built on 09/17/2012 08:33 GMT
     Clients:
      /127.0.0.1:60577[0](queued=0,recved=1,sent=0)
    
     Latency min/avg/max: 0/0/0
     Received: 1
     Sent: 0
     Connections: 1
     Outstanding: 0
     Zxid: 0x100000000
     Mode: follower
     Node count: 4
    
  4. Verify that ZooKeeper stops

     $ for i in 1 2 3; do cd ~/local/zookeeper-node${i}/bin; ./zkServer.sh stop; sleep 2; rm -rf /tmp/zookeeper-node${i}/version-2; done;
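
The three zoo.cfg files above differ only in `dataDir` and `clientPort`, and the tests that follow depend on knowing which node is currently the Leader. The following is a minimal sketch of two helpers for that, assuming the same single-host port layout as above. The function names `write_zoo_cfg` and `zk_mode` are hypothetical and not part of ZooKeeper or Jubatus.

```shell
# Sketch only: helper names are hypothetical, ports follow the layout used above.

# write_zoo_cfg NODE TOTAL: print the zoo.cfg contents for node NODE
# of a TOTAL-node ensemble running on localhost.
write_zoo_cfg() (
  node=$1
  total=$2
  printf 'tickTime=2000\ninitLimit=5\nsyncLimit=2\n'
  printf 'dataDir=/tmp/zookeeper-node%d\n' "$node"
  printf 'clientPort=%d\n' "$((2180 + node))"
  i=1
  while [ "$i" -le "$total" ]; do
    # Peer/election ports step by 2000 per node: 2888:3888, 4888:5888, ...
    printf 'server.%d=localhost:%d:%d\n' "$i" "$((2000 * i + 888))" "$((2000 * i + 1888))"
    i=$((i + 1))
  done
)

# zk_mode: read `stat` output on stdin and print the value of its "Mode:" line
# (for example leader or follower).
zk_mode() {
  awk '/^[[:space:]]*Mode:/ { print $2 }'
}
```

For example, `write_zoo_cfg 1 3 > zookeeper-node1/conf/zoo.cfg` would regenerate the first node's file, and `echo stat | nc localhost 2181 | zk_mode` would report the role of the node listening on port 2181.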
    

Jubatus behavior verification

jubakeeper

One ZooKeeper server specified (Follower)

  1. Start jubakeeper

     $ jubaclassifier_keeper --zookeeper localhost:2181
     I1018 13:31:30.795192 20936 keeper.cpp:50] running in port=9199
    
    1. Stop the Leader ... NG

       I1018 13:34:19.302273 20938 zk.cpp:267] zk connection expiration : type(-1) state(1)
      
    2. Stop the specified Follower ... OK

       I1018 13:35:48.148105 21144 zk.cpp:267] zk connection expiration : type(-1) state(1)
      
    3. Stop the Follower that was not specified ... OK

       No effect on jubakeeper
      

One ZooKeeper server specified (Leader)

  1. Start jubakeeper

     $ jubaclassifier_keeper --zookeeper localhost:2182
     I1018 13:38:24.767088 21564 keeper.cpp:50] running in port=9199
    
    1. Stop the Leader ... OK

       I1018 13:39:27.727032 21566 zk.cpp:267] zk connection expiration : type(-1) state(1)
      
    2. Stop a Follower ... OK

       No effect on jubakeeper

      1. Restart the stopped node

      2. Stop the Follower other than the restarted one ... OK

         No effect on jubakeeper
        

Multiple ZooKeeper servers specified

  1. Start jubakeeper

     $ jubaclassifier_keeper --zookeeper localhost:2181,localhost:2182,localhost:2183
     I1018 13:43:27.501405 21983 keeper.cpp:50] running in port=9199
    
    1. Stop the Leader ... NG

       I1018 13:45:45.778084 21985 zk.cpp:267] zk connection expiration : type(-1) state(1)
      
    2. Stop a Follower ... OK

       No effect on jubakeeper

      1. Restart the stopped node

      2. Stop the Follower other than the restarted one ... OK

         No effect on jubakeeper
        

jubaserver

One ZooKeeper server specified (Follower)

  1. Start jubaserver

     $ jubaclassifier --zookeeper localhost:2181 --name case1
     I1018 13:56:59.121166 22954 server_util.cpp:90] starting jubaclassifier0.3.2 RPC server at 10.7.226.237:9199 with timeout: 10
    
    1. Stop the Leader ... NG

       I1018 13:57:22.589130 22956 zk.cpp:267] zk connection expiration : type(-1) state(1)
      
    2. Stop the specified Follower ... OK

       I1018 13:58:12.304455 23154 zk.cpp:267] zk connection expiration : type(-1) state(1)
      
    3. Stop the Follower that was not specified ... OK

       No effect on jubaserver
      

One ZooKeeper server specified (Leader)

  1. Start jubaserver

     $ jubaclassifier --zookeeper localhost:2182 --name case2
     I1018 14:00:15.167892 23526 server_util.cpp:90] starting jubaclassifier0.3.2 RPC server at 10.7.226.237:9199 with timeout: 10
    
    1. Stop the Leader ... OK

       I1018 14:00:39.095746 23528 zk.cpp:267] zk connection expiration : type(-1) state(1)
      
    2. Stop a Follower ... OK

       No effect on jubaserver

      1. Restart the stopped node

      2. Stop the Follower other than the restarted one ... OK

         No effect on jubaserver
        

Multiple ZooKeeper servers specified

  1. Start jubaserver

     $ jubaclassifier --zookeeper localhost:2181,localhost:2182,localhost:2183 --name case3
     I1018 14:02:53.397080 24012 server_util.cpp:90] starting jubaclassifier0.3.2 RPC server at 10.7.226.237:9199 with timeout: 10

    1. Stop the Leader ... NG

       I1018 14:03:24.159762 24014 zk.cpp:267] zk connection expiration : type(-1) state(1)

    2. Stop a Follower ... OK

       No effect on jubaserver

      1. Restart the stopped node

      2. Stop the Follower other than the restarted one ... OK

         No effect on jubaserver

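Conclusions

  • jubakeeper and jubaserver behave identically.
  • When a single ZooKeeper server is specified
    • If the specified ZooKeeper server goes down, there is no other server to connect to, so treating this as a session expiry is fine.
    • If the specified server is a Follower and the Leader goes down, the ensemble merely has no elected Leader for the moment. As long as a majority of the nodes is still up, the specified server remains usable, so I think this need not be treated as a session expiry.
      • Is there any problem from a consistency standpoint?
  • When multiple ZooKeeper servers are specified
    • If the Leader goes down, the ensemble merely has no elected Leader for the moment. As long as a majority of the nodes is still up, ZooKeeper remains usable, so I think this need not be treated as a session expiry.
      • Is there any problem from a consistency standpoint?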
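
The conclusions above rest on the majority rule: an ensemble of N nodes stays available while at least floor(N/2) + 1 nodes are up. The following is a small sketch of that arithmetic. The function names `quorum_size` and `has_quorum` are hypothetical helpers for illustration, not ZooKeeper commands.

```shell
# Sketch only: hypothetical helpers illustrating the majority rule.

# quorum_size TOTAL: smallest number of nodes that forms a majority.
quorum_size() {
  echo "$(( $1 / 2 + 1 ))"
}

# has_quorum TOTAL ALIVE: succeed (exit 0) when ALIVE nodes still form a majority.
has_quorum() {
  [ "$2" -ge "$(quorum_size "$1")" ]
}
```

With the three-node ensemble used here, `quorum_size 3` prints 2. Stopping only the Leader leaves two nodes, so `has_quorum 3 2` succeeds and a new Leader can be elected. Stopping a second node leaves one, so `has_quorum 3 1` fails.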

Notes

  • type = -1 (ZOO_SESSION_EVENT)
  • state = 1 (ZOO_CONNECTING_STATE)
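
To read the `type(...) state(...)` pairs in the logs above without looking up each number, the following sketch decodes them into the constant names of the ZooKeeper C client. The numeric values are those defined by the C client library. The function names `zk_event_name`, `zk_state_name`, and `decode_zk_log` are hypothetical.

```shell
# Sketch only: hypothetical helpers; numeric codes are the ZooKeeper C client's.

# zk_event_name TYPE: name of a watcher event type code.
zk_event_name() {
  case "$1" in
    1)  echo ZOO_CREATED_EVENT ;;
    2)  echo ZOO_DELETED_EVENT ;;
    3)  echo ZOO_CHANGED_EVENT ;;
    4)  echo ZOO_CHILD_EVENT ;;
    -1) echo ZOO_SESSION_EVENT ;;
    -2) echo ZOO_NOTWATCHING_EVENT ;;
    *)  echo "UNKNOWN($1)" ;;
  esac
}

# zk_state_name STATE: name of a connection state code.
zk_state_name() {
  case "$1" in
    1)    echo ZOO_CONNECTING_STATE ;;
    2)    echo ZOO_ASSOCIATING_STATE ;;
    3)    echo ZOO_CONNECTED_STATE ;;
    -112) echo ZOO_EXPIRED_SESSION_STATE ;;
    -113) echo ZOO_AUTH_FAILED_STATE ;;
    *)    echo "UNKNOWN($1)" ;;
  esac
}

# decode_zk_log: read log lines on stdin and print "EVENT STATE" for every
# line that contains a "type(N) state(M)" pair; other lines are skipped.
decode_zk_log() {
  sed -n 's/.*type(\(-\{0,1\}[0-9][0-9]*\)) state(\(-\{0,1\}[0-9][0-9]*\)).*/\1 \2/p' |
    while read -r type state; do
      echo "$(zk_event_name "$type") $(zk_state_name "$state")"
    done
}
```

Piping the jubakeeper or jubaserver log through `decode_zk_log` turns the `type(-1) state(1)` lines into `ZOO_SESSION_EVENT ZOO_CONNECTING_STATE`.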

Verification with logging added to the Watcher

  1. Startup

    • type = -1 (ZOO_SESSION_EVENT)
    • state = 3 (ZOO_CONNECTED_STATE)
  2. Stop the Leader

    • type = -1 (ZOO_SESSION_EVENT)
    • state = 1 (ZOO_CONNECTING_STATE)
  3. Leader re-elected

    • type = -1 (ZOO_SESSION_EVENT)
    • state = 3 (ZOO_CONNECTED_STATE)
  4. Stop one of the two remaining nodes (quorum lost)

    • type = -1 (ZOO_SESSION_EVENT)

    • state = 1 (ZOO_CONNECTING_STATE)

        E1018 17:37:04.085659 32080 zk.cpp:119] /jubatus/actors/classifier/test/master_lock/lock_ failed in creation - connection loss
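
The trace above suggests that state 1 (ZOO_CONNECTING_STATE) is a transient condition that can resolve back to state 3, whereas the C client reports a truly expired session with the separate state ZOO_EXPIRED_SESSION_STATE (-112). The following sketch shows one way a watcher could tell the two apart. It is an illustration of the idea only, not the code in the linked commit, and the function name `session_action` and its result words are hypothetical.

```shell
# Sketch only: a hypothetical decision table, not the actual Jubatus watcher.

# session_action TYPE STATE: print what a watcher could do with a notification.
session_action() {
  if [ "$1" -ne -1 ]; then
    echo ignore                 # not a ZOO_SESSION_EVENT
    return
  fi
  case "$2" in
    3)    echo resume ;;        # ZOO_CONNECTED_STATE: session is usable
    1|2)  echo wait ;;          # CONNECTING / ASSOCIATING: client is still retrying
    -112) echo expire ;;        # ZOO_EXPIRED_SESSION_STATE: session is really gone
    *)    echo fail ;;          # e.g. ZOO_AUTH_FAILED_STATE (-113)
  esac
}
```

Replaying the four steps above with `for s in 3 1 3 1; do session_action -1 "$s"; done` prints `resume`, `wait`, `resume`, `wait`. Under this scheme the loss of quorum in step 4 would leave the process waiting rather than shutting down, while individual operations still fail with connection loss, as the last log line shows.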
      

Proposed fix

https://github.com/rimms/jubatus/commit/65a421fe10c0f70f56e2666168cefe1d232f8f94
