Apache ZooKeeper Official Documentation, Part 3. Getting Started: Coordinating Distributed Applications with ZooKeeper

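This section gets you started with ZooKeeper quickly. It is aimed at developers who want to try ZooKeeper out, and contains installation instructions for a single ZooKeeper server, a few commands to verify that it is running, and a simple programming example. Finally, as a convenience, there are a few sections on more complicated setups, such as running a replicated deployment and optimizing the transaction log. For complete instructions on commercial deployments, see the Administrator's Guide.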

1: Prerequisites

See System Requirements in the Administrator's Guide.

2: Download

Download a recent stable ZooKeeper release from one of the Apache mirrors.

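3: Standalone Operation

Setting up a ZooKeeper server in standalone mode is straightforward. The server is contained in a single JAR file, so installation consists of creating a configuration file. Once you have downloaded a stable ZooKeeper release, unpack it and cd to the ZooKeeper root directory.

To start ZooKeeper you need a configuration file. Here is a sample; create it as conf/zoo.cfg: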

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181

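You can edit conf/zoo.cfg with any text editor, and you can change the value of dataDir to point to a different directory. The meaning of each field is as follows:

  • tickTime: the basic time unit used by ZooKeeper, in milliseconds. It is used for heartbeats, and the minimum session timeout is twice the tickTime.
  • dataDir: the directory where the server stores data snapshots. Unless configured otherwise, the transaction log is stored in this directory as well.
  • clientPort: the port on which the server listens for client connections.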
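To make the field descriptions concrete, here is a minimal Python sketch that parses the three-line sample configuration and derives the minimum session timeout as twice tickTime. The helper names parse_zoo_cfg and min_session_timeout_ms are hypothetical, chosen for illustration only; they are not part of ZooKeeper.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines from a zoo.cfg-style string into a dict."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def min_session_timeout_ms(config):
    """Minimum session timeout is twice tickTime, as described above."""
    return 2 * int(config["tickTime"])


# The sample configuration from this section.
SAMPLE_CFG = """\
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
"""

if __name__ == "__main__":
    cfg = parse_zoo_cfg(SAMPLE_CFG)
    print(cfg["dataDir"], cfg["clientPort"], min_session_timeout_ms(cfg))
```

With tickTime=2000, the derived minimum session timeout is 4000 milliseconds.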

Once the configuration file is in place, you can start ZooKeeper with the following command:

bin/zkServer.sh start

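ZooKeeper logs messages using log4j. For more details, see the Logging section of the ZooKeeper Programmer's Guide.

You will see log messages on the console or in a log file, depending on the log4j configuration. The steps outlined in this section run ZooKeeper in standalone mode. There is no replication, so if the ZooKeeper process fails, the service goes down. This is fine for most development situations, but to run ZooKeeper in replicated mode, see Running Replicated ZooKeeper.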

4: Managing ZooKeeper Storage

For long-running production systems, ZooKeeper storage (dataDir and the logs) must be maintained regularly. See the maintenance section for more details.

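5: Connecting to ZooKeeper

Once ZooKeeper is running, you have several options for connecting to it. From the command line, for example:

$ bin/zkCli.sh -server 127.0.0.1:2181

Once you have connected, you should see something like: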

Connecting to localhost:2181
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
Welcome to ZooKeeper! JLine support is enabled
[zkshell: 0]

From the shell, type help to get a list of commands that can be executed from the client, as in:

[zkshell: 0] help
ZooKeeper host:port cmd args
get path [watch]
ls path [watch]
set path data [version]
delquota [-n|-b] path
quit
printwatches on|off
create path data acl
stat path [watch]
listquota path
history
setAcl path acl
getAcl path
sync path
redo cmdno
addauth scheme auth
delete path [version]
deleteall path
setquota -n|-b val path

From here, you can try a few simple commands to get a feel for this command-line interface. First, start by issuing a list command, ls:

[zkshell: 8] ls /

[zookeeper]

Next, create a new znode by running create /zk_test my_data. This creates a new znode and associates the string "my_data" with the node. You should see:

[zkshell: 9] create /zk_test my_data

Created /zk_test

Issue another ls / command to see what the directory looks like:

[zkshell: 11] ls /

[zookeeper, zk_test]

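Notice that the zk_test directory has now been created. Next, change the data associated with zk_test by issuing the set command, then read it back with get, as in: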

[zkshell: 14] set /zk_test junk
cZxid = 5
ctime = Fri Jun 05 13:57:06 PDT 2009
mZxid = 6
mtime = Fri Jun 05 14:01:52 PDT 2009
pZxid = 5
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0
dataLength = 4
numChildren = 0
[zkshell: 15] get /zk_test
junk
cZxid = 5
ctime = Fri Jun 05 13:57:06 PDT 2009
mZxid = 6
mtime = Fri Jun 05 14:01:52 PDT 2009
pZxid = 5
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0
dataLength = 4
numChildren = 0

(Notice that we ran get after set to verify that the data had indeed changed.) Finally, let's delete the node:

[zkshell: 16] delete /zk_test

[zkshell: 17] ls /

[zookeeper]

[zkshell: 18]

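6: Programming to ZooKeeper

ZooKeeper has C and Java bindings, which are functionally equivalent. The C bindings come in two variants: single-threaded and multi-threaded. These differ only in how the messaging loop is done. For more information, see the Programming Examples in the ZooKeeper Programmer's Guide, which contain sample code using the different APIs.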

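7: Running Replicated ZooKeeper

Running ZooKeeper in standalone mode is convenient for learning, development, and testing. In production, however, you should run ZooKeeper in replicated mode. A replicated group of servers in the same application is called a quorum, and in replicated mode all servers in the quorum have copies of the same configuration file.

Note:
For replicated mode, a minimum of three servers is required, and it is strongly recommended that you use an odd number of servers. If you have only two servers and one of them fails, there are not enough machines left to form a majority quorum. Two servers are inherently less stable than a single server, because there are two points of failure.

Replicated mode uses a conf/zoo.cfg file just as standalone mode does, but with a few differences. Here is an example: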
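The arithmetic behind this recommendation can be sketched in a few lines of Python. The sketch assumes the usual majority rule (the service stays available only while a strict majority of the configured servers is up); the function names majority and tolerated_failures are hypothetical and not part of ZooKeeper.

```python
def majority(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a strict majority remains up."""
    return ensemble_size - majority(ensemble_size)


if __name__ == "__main__":
    # Columns: ensemble size, majority needed, failures tolerated.
    for n in range(1, 7):
        print(n, majority(n), tolerated_failures(n))
```

Under this assumption, two servers tolerate no failures, just like one, and four servers tolerate only one failure, just like three. Adding a server to reach an even count adds a point of failure without adding fault tolerance, which is why odd sizes are preferred.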

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888

The new entry, initLimit, is a timeout ZooKeeper uses to limit the length of time the ZooKeeper servers in the quorum have to connect to a leader. The entry syncLimit limits how far out of date a server can be from a leader.

With both of these timeouts, you specify the unit of time using tickTime. In this example, the timeout for initLimit is 5 ticks at 2000 milliseconds a tick, or 10 seconds.
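As a worked version of that arithmetic, the following Python sketch converts a limit expressed in ticks into milliseconds. The helper name limit_ms is hypothetical; the values come from the example configuration.

```python
def limit_ms(ticks, tick_time_ms):
    """Convert a limit expressed in ticks into milliseconds."""
    return ticks * tick_time_ms


if __name__ == "__main__":
    tick_time = 2000  # tickTime from the example configuration
    print("initLimit:", limit_ms(5, tick_time), "ms")
    print("syncLimit:", limit_ms(2, tick_time), "ms")
```

By the same calculation, syncLimit=2 in the example corresponds to 4 seconds.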

The entries of the form server.X list the servers that make up the ZooKeeper service. When the server starts up, it knows which server it is by looking for the file myid in the data directory. That file contains the server number, in ASCII.
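A minimal Python sketch of creating and reading such a file is shown below. The helper names write_myid and read_myid are hypothetical, and the sketch writes to a temporary directory rather than a real dataDir. Writing a trailing newline after the number is an assumption that mirrors what a shell echo would produce.

```python
import os
import tempfile


def write_myid(data_dir, server_id):
    """Write the server number as ASCII text to <data_dir>/myid."""
    path = os.path.join(data_dir, "myid")
    with open(path, "w", encoding="ascii") as f:
        f.write(f"{server_id}\n")
    return path


def read_myid(data_dir):
    """Read the server number back from <data_dir>/myid."""
    with open(os.path.join(data_dir, "myid"), encoding="ascii") as f:
        return int(f.read().strip())


if __name__ == "__main__":
    # A temporary directory stands in for the server's dataDir.
    with tempfile.TemporaryDirectory() as data_dir:
        write_myid(data_dir, 1)
        print(read_myid(data_dir))
```

The number in each server's myid file should match the X of that server's server.X entry in the configuration.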

Finally, note the two port numbers after each server name: "2888" and "3888". Peers use the former port to connect to other peers. Such a connection is necessary so that peers can communicate, for example, to agree upon the order of updates. More specifically, a ZooKeeper server uses this port to connect followers to the leader. When a new leader arises, a follower opens a TCP connection to the leader using this port. Because the default leader election also uses TCP, we currently require another port for leader election. This is the second port in the server entry.
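To illustrate the layout of a server entry, here is a small Python sketch that splits a line of the form server.X=host:port:port into its parts. The ServerEntry type and parse_server_entry function are hypothetical names for illustration, and the sketch handles only the simple three-field form shown in the example.

```python
from typing import NamedTuple


class ServerEntry(NamedTuple):
    server_id: int
    host: str
    quorum_port: int    # first port: followers connect to the leader
    election_port: int  # second port: leader election


def parse_server_entry(line):
    """Parse a line such as 'server.1=zoo1:2888:3888'."""
    key, _, value = line.strip().partition("=")
    prefix, _, server_id = key.partition(".")
    if prefix != "server":
        raise ValueError(f"not a server entry: {line!r}")
    host, quorum_port, election_port = value.split(":")
    return ServerEntry(int(server_id), host, int(quorum_port), int(election_port))


if __name__ == "__main__":
    print(parse_server_entry("server.1=zoo1:2888:3888"))
```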

Note

If you want to test multiple servers on a single machine, specify the servername as localhost with unique quorum & leader election ports (i.e. 2888:3888, 2889:3889, 2890:3890 in the example above) for each server.X in that server’s config file. Of course separate dataDirs and distinct clientPorts are also necessary (in the above replicated example, running on a single localhost, you would still have three config files).

Please be aware that setting up multiple servers on a single machine will not create any redundancy. If something were to happen which caused the machine to die, all of the zookeeper servers would be offline. Full redundancy requires that each server have its own machine. It must be a completely separate physical server. Multiple virtual machines on the same physical host are still vulnerable to the complete failure of that host.
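For the single-machine test setup described in this note, the three config files could be generated with a sketch like the following. The function name single_host_configs, the /tmp/zookeeper data root, and the client ports 2181 to 2183 are assumptions for illustration; only the quorum and election port pairs follow the note.

```python
def single_host_configs(count=3, base_client_port=2181,
                        base_quorum_port=2888, base_election_port=3888,
                        data_root="/tmp/zookeeper"):
    """Return {server_id: config_text} for `count` servers on localhost."""
    # Every server lists all peers, each with unique quorum/election ports.
    server_lines = [
        f"server.{i}=localhost:{base_quorum_port + i - 1}:{base_election_port + i - 1}"
        for i in range(1, count + 1)
    ]
    configs = {}
    for i in range(1, count + 1):
        lines = [
            "tickTime=2000",
            f"dataDir={data_root}/server{i}",         # separate dataDir per server
            f"clientPort={base_client_port + i - 1}",  # distinct clientPort per server
            "initLimit=5",
            "syncLimit=2",
        ] + server_lines
        configs[i] = "\n".join(lines) + "\n"
    return configs


if __name__ == "__main__":
    for server_id, text in single_host_configs().items():
        print(f"# config for server {server_id}")
        print(text)
```

Each server would still need its own myid file in its own dataDir, and this layout is useful only for testing, since all three servers share the fate of the one machine.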

8: Other Optimizations

There are a couple of other configuration parameters that can greatly increase performance:

  • To get low latencies on updates it is important to have a dedicated transaction log directory. By default transaction logs are put in the same directory as the data snapshots and myid file. The dataLogDir parameter indicates a different directory to use for the transaction logs.
  • [tbd: what is the other config param?]

Last updated: 2017-05-22 09:01:47
