

《kafka中文手冊》- 構架設計(二)

4.6 Message Delivery Semantics 消息分發語義

Now that we understand a little about how producers and consumers work, let’s discuss the semantic guarantees Kafka provides between producer and consumer. Clearly there are multiple possible message delivery guarantees that could be provided:

現在我們已經大致理解了生產者和消費者是怎麼工作的, 接下來討論kafka在生產者和消費者之間提供的語義保障. 顯然, 可能提供的消息分發保障有以下幾種:

  • At most once—Messages may be lost but are never redelivered. 最多一次, 消息可能會丟失, 但是不會被重複分發
  • At least once—Messages are never lost but may be redelivered. 至少一次, 消息不會丟失, 但有可能會重複分發
  • Exactly once—this is what people actually want, each message is delivered once and only once. 有且僅有一次, 這是人們真正想要的, 每條消息隻會被分發一次
It’s worth noting that this breaks down into two problems: the durability guarantees for publishing a message and the guarantees when consuming a message.
值得注意的是, 這可以拆分成兩個問題: 發布消息時的持久性保障, 以及消費消息時的保障.

Many systems claim to provide “exactly once” delivery semantics, but it is important to read the fine print, most of these claims are misleading (i.e. they don’t translate to the case where consumers or producers can fail, cases where there are multiple consumer processes, or cases where data written to disk can be lost).

很多係統聲稱它們能提供"有且僅有一次"的分發語義, 但仔細閱讀就會發現, 這些聲明大多有誤導性(它們沒有考慮生產者或消費者可能失敗的情況, 存在多個消費者進程的情況, 或者寫入磁盤的數據可能丟失的情況).

Kafka’s semantics are straight-forward. When publishing a message we have a notion of the message being “committed” to the log. Once a published message is committed it will not be lost as long as one broker that replicates the partition to which this message was written remains “alive”. The definition of alive as well as a description of which types of failures we attempt to handle will be described in more detail in the next section. For now let’s assume a perfect, lossless broker and try to understand the guarantees to the producer and consumer. If a producer attempts to publish a message and experiences a network error it cannot be sure if this error happened before or after the message was committed. This is similar to the semantics of inserting into a database table with an autogenerated key

kafka提供的語義十分直接. 在發布消息時, 我們有一個消息被"提交"到日誌的概念. 一旦發布的消息被提交, 隻要還有一台複製了該消息所在分區的服務器存活, 這條消息就不會丟失. 關於"存活"的定義, 以及我們嘗試處理哪些類型的故障, 會在下一節詳細描述. 現在我們先假設有一個完美的, 不會丟失數據的服務器, 來理解對生產者和消費者的保障. 如果生產者在發布消息時碰到網絡異常, 它無法確定這個錯誤發生在消息提交之前還是之後. 這類似於往數據庫表裏插入使用自增主鍵的數據時的情形.

These are not the strongest possible semantics for publishers. Although we cannot be sure of what happened in the case of a network error, it is possible to allow the producer to generate a sort of “primary key” that makes retrying the produce request idempotent. This feature is not trivial for a replicated system because of course it must work even (or especially) in the case of a server failure. With this feature it would suffice for the producer to retry until it receives acknowledgement of a successfully committed message at which point we would guarantee the message had been published exactly once. We hope to add this in a future Kafka version.

對發布者來說, 這還不是最強的語義保障. 盡管我們無法確定網絡錯誤發生時到底出現了什麼情況, 但可以讓生產者生成某種"主鍵", 使得重試發布請求成為冪等操作. 對一個多副本的係統來說, 這個特性並不簡單, 因為它即使(或者說尤其)在服務器宕機的情況下也必須有效. 有了這個特性, 生產者隻需要不斷重試, 直到收到消息已成功提交的確認為止, 這時我們就可以保證消息有且僅被發布一次. 我們希望在未來的kafka版本中加入這個特性.

Not all use cases require such strong guarantees. For uses which are latency sensitive we allow the producer to specify the durability level it desires. If the producer specifies that it wants to wait on the message being committed this can take on the order of 10 ms. However the producer can also specify that it wants to perform the send completely asynchronously or that it wants to wait only until the leader (but not necessarily the followers) have the message.

並不是所有的場景都需要這麼強的保障. 對於延遲敏感的場景, 我們允許生產者指定它需要的持久性級別. 如果生產者指定要等待消息被提交, 這大約需要10毫秒的量級. 不過生產者也可以指定完全異步地發送, 或者隻等到主節點(而不一定包括備份節點)拿到這條消息為止.
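下面是一個示意性的Java生產者配置草圖(非官方示例, 僅用於說明acks參數和上述持久性級別的對應關係; broker地址和主題名均為假設):

  import java.util.Properties;
  import org.apache.kafka.clients.producer.KafkaProducer;
  import org.apache.kafka.clients.producer.ProducerRecord;

  public class AcksExample {
      public static void main(String[] args) {
          Properties props = new Properties();
          props.put("bootstrap.servers", "localhost:9092"); // 假設的broker地址
          props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
          props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
          // acks=0   : 完全異步發送, 不等待任何確認
          // acks=1   : 隻等待主分區寫入本地日誌
          // acks=all : 等待消息被提交(所有處於同步狀態的副本都收到)
          props.put("acks", "all");

          KafkaProducer<String, String> producer = new KafkaProducer<>(props);
          producer.send(new ProducerRecord<>("my-topic", "key", "value"));
          producer.close();
      }
  }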

Now let’s describe the semantics from the point-of-view of the consumer. All replicas have the exact same log with the same offsets. The consumer controls its position in this log. If the consumer never crashed it could just store this position in memory, but if the consumer fails and we want this topic partition to be taken over by another process the new process will need to choose an appropriate position from which to start processing. Let’s say the consumer reads some messages — it has several options for processing the messages and updating its position.

現在讓我們從消費者的角度來看對應的語義. 所有副本都有完全相同的日誌和相同的位移. 消費者自己控製它在日誌中的讀取位置. 如果消費者從不崩潰, 它隻需要把這個位置存在內存裏; 但如果消費者失敗了, 而我們希望這個主題分區被另一個進程接管, 新進程就需要選擇一個合適的位置開始處理. 假設消費者已經讀取了一些消息, 它在處理消息和更新位置上有幾種選擇.

  1. It can read the messages, then save its position in the log, and finally process the messages. In this case there is a possibility that the consumer process crashes after saving its position but before saving the output of its message processing. In this case the process that took over processing would start at the saved position even though a few messages prior to that position had not been processed. This corresponds to “at-most-once” semantics as in the case of a consumer failure messages may not be processed.
  2. 它可以先讀取消息, 再把位置保存到日誌中, 最後處理消息. 在這種情況下, 消費者有可能在保存了位置之後, 保存消息處理的輸出之前崩潰. 這時接管處理的進程會從已保存的位置開始, 即使這個位置之前還有一些消息沒有被處理. 這對應於"最多一次"的語義: 在消費者失敗的情況下, 消息可能不會被處理.
  3. It can read the messages, process the messages, and finally save its position. In this case there is a possibility that the consumer process crashes after processing messages but before saving its position. In this case when the new process takes over the first few messages it receives will already have been processed. This corresponds to the “at-least-once” semantics in the case of consumer failure. In many cases messages have a primary key and so the updates are idempotent (receiving the same message twice just overwrites a record with another copy of itself).
  4. 它可以先讀取消息, 再處理消息, 最後保存位置. 在這種情況下, 消費者有可能在處理完消息之後, 保存位置之前崩潰. 這時新接管的進程最初收到的一些消息其實已經被處理過了. 這對應於消費者失敗情況下的"至少一次"語義. 很多情況下消息帶有主鍵, 所以更新是冪等的(同一條消息收到兩次, 隻是用同一份內容把記錄再覆蓋一遍).

So what about exactly once semantics (i.e. the thing you actually want)? The limitation here is not actually a feature of the messaging system but rather the need to co-ordinate the consumer’s position with what is actually stored as output. The classic way of achieving this would be to introduce a two-phase commit between the storage for the consumer position and the storage of the consumers output. But this can be handled more simply and generally by simply letting the consumer store its offset in the same place as its output. This is better because many of the output systems a consumer might want to write to will not support a two-phase commit. As an example of this, our Hadoop ETL that populates data in HDFS stores its offsets in HDFS with the data it reads so that it is guaranteed that either data and offsets are both updated or neither is. We follow similar patterns for many other data systems which require these stronger semantics and for which the messages do not have a primary key to allow for deduplication.

那麼有且僅有一次的語義呢(也就是你真正想要的)? 這裏的限製其實不在於消息係統本身的特性, 而在於需要把消費者的位置和它實際輸出的存儲協調起來. 經典的做法是在消費者位置的存儲和消費者輸出的存儲之間引入兩階段提交. 但更簡單通用的做法是讓消費者把位移和輸出保存在同一個地方, 因為消費者想要寫入的很多輸出係統並不支持兩階段提交. 例如, 我們往HDFS灌數據的Hadoop ETL把位移和它讀取的數據一起保存到HDFS中, 這樣可以保證數據和位移要麼同時更新, 要麼都不更新. 對於很多需要這種強語義, 而消息又沒有主鍵可用來去重的數據係統, 我們採用類似的模式.

So effectively Kafka guarantees at-least-once delivery by default and allows the user to implement at most once delivery by disabling retries on the producer and committing its offset prior to processing a batch of messages. Exactly-once delivery requires co-operation with the destination storage system but Kafka provides the offset which makes implementing this straight-forward.

因此kafka默認提供至少一次的分發語義, 並允許用戶通過禁用生產者重試, 以及在處理一批消息之前先提交位移, 來實現最多一次的語義. 有且僅有一次的分發則需要與目標存儲係統配合, 但kafka提供的位移使得實現這一點非常直接.
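下面是一個示意性的Java消費者草圖(非官方示例, 主題名和分組名均為假設), 用提交位移的先後順序來對比"至少一次"和"最多一次"兩種語義:

  import java.util.Arrays;
  import java.util.Properties;
  import org.apache.kafka.clients.consumer.ConsumerRecord;
  import org.apache.kafka.clients.consumer.ConsumerRecords;
  import org.apache.kafka.clients.consumer.KafkaConsumer;

  public class DeliverySemanticsSketch {
      public static void main(String[] args) {
          Properties props = new Properties();
          props.put("bootstrap.servers", "localhost:9092");
          props.put("group.id", "demo-group");
          props.put("enable.auto.commit", "false"); // 手動控製位移提交的時機
          props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
          props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

          KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
          consumer.subscribe(Arrays.asList("my-topic"));
          while (true) {
              ConsumerRecords<String, String> records = consumer.poll(1000);
              // 至少一次: 先處理消息, 再提交位移; 若在處理後, 提交前崩潰, 重啟後這批消息會被重複處理
              for (ConsumerRecord<String, String> record : records) {
                  System.out.println(record.offset() + ": " + record.value()); // 假想的業務處理
              }
              consumer.commitSync();
              // 最多一次: 把順序反過來, 先commitSync()再處理; 若在提交後, 處理前崩潰, 這批消息就不會再被處理
          }
      }
  }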

4.7 Replication 複製

Kafka replicates the log for each topic’s partitions across a configurable number of servers (you can set this replication factor on a topic-by-topic basis). This allows automatic failover to these replicas when a server in the cluster fails so messages remain available in the presence of failures.

kafka把每個主題分區的日誌複製到數量可配置的服務器上(複製因子可以按主題逐個設置). 這使得集群中某台服務器宕機時可以自動故障轉移到這些副本上, 消息在出現故障時仍然可用.

Other messaging systems provide some replication-related features, but, in our (totally biased) opinion, this appears to be a tacked-on thing, not heavily used, and with large downsides: slaves are inactive, throughput is heavily impacted, it requires fiddly manual configuration, etc. Kafka is meant to be used with replication by default—in fact we implement un-replicated topics as replicated topics where the replication factor is one.

其他消息係統也提供一些和複製相關的特性, 但在我們(帶有偏見地)看來, 這些更像是附加的東西, 沒有被大量使用, 而且有很大的缺點: 備機不活躍, 吞吐量受到嚴重影響, 需要繁瑣的人工配置等等. kafka則默認就和複製一起使用, 實際上我們把未複製的主題實現為複製因子為1的複製主題.

The unit of replication is the topic partition. Under non-failure conditions, each partition in Kafka has a single leader and zero or more followers. The total number of replicas including the leader constitute the replication factor. All reads and writes go to the leader of the partition. Typically, there are many more partitions than brokers and the leaders are evenly distributed among brokers. The logs on the followers are identical to the leader’s log—all have the same offsets and messages in the same order (though, of course, at any given time the leader may have a few as-yet unreplicated messages at the end of its log).

複製的單位是主題分區. 在沒有故障的情況下, kafka中每個分區有一個主分區(leader)和零個或多個備份分區(follower), 包括主分區在內的副本總數就是複製因子. 所有讀寫都經過主分區. 通常分區數量比服務器多得多, 主分區會均勻分布在各個服務器上. 備份節點上的日誌和主節點的日誌完全相同, 擁有相同的位移和相同順序的消息(當然, 在任意給定時刻, 主節點日誌的尾部可能有一些還沒複製到備份節點的消息).

Followers consume messages from the leader just as a normal Kafka consumer would and apply them to their own log. Having the followers pull from the leader has the nice property of allowing the follower to naturally batch together log entries they are applying to their log.

備份節點就像普通的kafka消費者一樣從主節點拉取消息, 並應用到自己的日誌中. 讓備份節點從主節點拉取數據有個很好的特性: 備份節點可以很自然地把要應用到日誌的條目批量處理.

As with most distributed systems automatically handling failures requires having a precise definition of what it means for a node to be “alive”. For Kafka node liveness has two conditions

和大部分分布式係統一樣, 自動處理故障需要對節點"存活"有一個精確的定義. 對kafka來說, 節點存活有兩個條件:

  1. A node must be able to maintain its session with ZooKeeper (via ZooKeeper’s heartbeat mechanism) 節點必須能維持與ZooKeeper的會話(通過ZooKeeper的心跳機製)
  2. If it is a slave it must replicate the writes happening on the leader and not fall “too far” behind 如果是備份節點, 必須複製主節點上發生的寫入, 並且不能落後"太遠"
We refer to nodes satisfying these two conditions as being “in sync” to avoid the vagueness of “alive” or “failed”. The leader keeps track of the set of “in sync” nodes. If a follower dies, gets stuck, or falls behind, the leader will remove it from the list of in sync replicas. The determination of stuck and lagging replicas is controlled by the replica.lag.time.max.ms configuration.
我們把滿足這兩個條件的節點稱為"in sync"(同步中), 以避免"存活"或"失敗"這種模糊的說法. 主節點會持續跟蹤"同步中"的節點集合. 如果一個備份節點宕機, 卡住或者落後太多, 主節點會把它從同步副本集(ISR)中移除. 判定卡住或落後的依據由 replica.lag.time.max.ms 配置參數控製.
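一個示意性的broker端配置片段(server.properties, 數值僅作假設, 具體默認值請以官方文檔為準):

  # 備份節點落後主節點超過該時間(毫秒)即被移出ISR
  replica.lag.time.max.ms=10000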

In distributed systems terminology we only attempt to handle a “fail/recover” model of failures where nodes suddenly cease working and then later recover (perhaps without knowing that they have died). Kafka does not handle so-called “Byzantine” failures in which nodes produce arbitrary or malicious responses (perhaps due to bugs or foul play).

用分布式係統的術語來說, 我們隻嘗試處理"失敗/恢複"(fail/recover)這種故障模型, 即節點突然停止工作, 之後又恢複(可能並不知道自己宕機過). kafka不處理所謂的"拜占庭"故障, 即節點產生任意或惡意的響應(可能由於bug或惡意破壞).

A message is considered “committed” when all in sync replicas for that partition have applied it to their log. Only committed messages are ever given out to the consumer. This means that the consumer need not worry about potentially seeing a message that could be lost if the leader fails. Producers, on the other hand, have the option of either waiting for the message to be committed or not, depending on their preference for tradeoff between latency and durability. This preference is controlled by the acks setting that the producer uses.

一條消息隻有被該分區所有處於同步狀態的副本都應用到各自的日誌之後, 才算是"已提交". 隻有已提交的消息才會分發給消費者. 這意味著消費者不必擔心看到那些在主節點宕機時可能丟失的消息. 另一方面, 生產者可以根據自己在延遲和持久性之間的權衡, 選擇是否等待消息被提交. 這個偏好通過生產者的acks配置項來控製.

The guarantee that Kafka offers is that a committed message will not be lost, as long as there is at least one in sync replica alive, at all times.

kafka提供的保障是: 隻要任意時刻同步副本集中至少還有一個副本存活, 已提交的消息就不會丟失.

Kafka will remain available in the presence of node failures after a short fail-over period, but may not remain available in the presence of network partitions.

在節點發生故障後, kafka經過短暫的故障轉移時間仍然可用, 但在出現網絡分區時可能無法保持可用.

Replicated Logs: Quorums, ISRs, and State Machines (Oh my!) 複製日誌: Quorum, ISR和狀態機

At its heart a Kafka partition is a replicated log. The replicated log is one of the most basic primitives in distributed data systems, and there are many approaches for implementing one. A replicated log can be used by other systems as a primitive for implementing other distributed systems in the state-machine style.
kafka分區的核心是一個複製日誌. 複製日誌是分布式數據係統中最基礎的原語之一, 實現它的方式有很多. 複製日誌可以被其他係統用作原語, 以狀態機的風格實現其他分布式係統.

A replicated log models the process of coming into consensus on the order of a series of values (generally numbering the log entries 0, 1, 2, …). There are many ways to implement this, but the simplest and fastest is with a leader who chooses the ordering of values provided to it. As long as the leader remains alive, all followers need to only copy the values and ordering the leader chooses.

複製日誌模擬的是對一係列值的順序達成共識的過程(通常把日誌條目編號為0, 1, 2, ...). 實現的方式有很多, 但最簡單最快的是由一個主節點來決定提供給它的值的順序. 隻要主節點存活, 所有備份節點隻需要按照主節點選擇的順序拷貝這些值就可以了.

Of course if leaders didn’t fail we wouldn’t need followers! When the leader does die we need to choose a new leader from among the followers. But followers themselves may fall behind or crash so we must ensure we choose an up-to-date follower. The fundamental guarantee a log replication algorithm must provide is that if we tell the client a message is committed, and the leader fails, the new leader we elect must also have that message. This yields a tradeoff: if the leader waits for more followers to acknowledge a message before declaring it committed then there will be more potentially electable leaders.

當然如果主節點不宕機, 我們也不需要備份節點! 當主節點宕機時, 我們需要從備份節點中選出一個新的主節點. 但備份節點本身也可能落後或者崩潰, 所以必須確保選出的是數據最新的備份節點. 一個日誌複製算法必須提供的最基本保證是: 如果我們已經告訴客戶端某條消息提交成功, 這時主節點宕機, 新選出的主節點也必須擁有這條消息. 這裏有一個權衡: 如果主節點在宣告消息已提交之前等待更多備份節點的確認, 那麼可被選舉為主節點的候選也就更多.

If you choose the number of acknowledgements required and the number of logs that must be compared to elect a leader such that there is guaranteed to be an overlap, then this is called a Quorum.

如果你選擇的確認數量和選舉主節點時必須比較的日誌數量, 能夠保證兩者一定有重疊, 這就叫做Quorum(法定人數).

A common approach to this tradeoff is to use a majority vote for both the commit decision and the leader election. This is not what Kafka does, but let’s explore it anyway to understand the tradeoffs. Let’s say we have 2f+1 replicas. If f+1 replicas must receive a message prior to a commit being declared by the leader, and if we elect a new leader by electing the follower with the most complete log from at least f+1 replicas, then, with no more than f failures, the leader is guaranteed to have all committed messages. This is because among any f+1 replicas, there must be at least one replica that contains all committed messages. That replica’s log will be the most complete and therefore will be selected as the new leader. There are many remaining details that each algorithm must handle (such as precisely defined what makes a log more complete, ensuring log consistency during leader failure or changing the set of servers in the replica set) but we will ignore these for now.

實現這種權衡的一種常見方法是, 在提交決策和主節點選舉上都使用多數投票. 這不是kafka採用的做法, 但為了理解其中的權衡還是先解釋一下. 假設我們有2f+1個副本. 如果主節點在宣告提交之前必須有f+1個副本收到消息, 並且新主節點是從至少f+1個副本中選出日誌最完整的那個, 那麼在失敗數不超過f的情況下, 新主節點保證擁有所有已提交的消息. 這是因為在任意f+1個副本中, 至少有一個副本包含了全部已提交的消息, 這個副本的日誌最完整, 因此會被選為新的主節點. 每種算法還有很多細節需要處理(比如精確定義怎樣才算日誌更完整, 主節點宕機期間如何保證日誌一致性, 如何變更副本集中的服務器), 但我們暫時先忽略.

This majority vote approach has a very nice property: the latency is dependent on only the fastest servers. That is, if the replication factor is three, the latency is determined by the faster slave not the slower one.

There are a rich variety of algorithms in this family including ZooKeeper’s Zab, Raft, and Viewstamped Replication. The most similar academic publication we are aware of to Kafka’s actual implementation is PacificA from Microsoft.

多數投票的方式有一個很好的特性: 延遲隻取決於最快的那幾台服務器. 也就是說, 如果複製因子是3, 延遲由較快的那個備份節點決定, 而不是較慢的那個.

這個家族中有很多類似的算法, 包括ZooKeeper的Zab, Raft, 以及Viewstamped Replication. 據我們所知, 和kafka實際實現最相似的學術論文是微軟的PacificA.

The downside of majority vote is that it doesn’t take many failures to leave you with no electable leaders. To tolerate one failure requires three copies of the data, and to tolerate two failures requires five copies of the data. In our experience having only enough redundancy to tolerate a single failure is not enough for a practical system, but doing every write five times, with 5x the disk space requirements and 1/5th the throughput, is not very practical for large volume data problems. This is likely why quorum algorithms more commonly appear for shared cluster configuration such as ZooKeeper but are less common for primary data storage. For example in HDFS the namenode’s high-availability feature is built on a majority-vote-based journal, but this more expensive approach is not used for the data itself.

多數投票的缺點是, 不需要太多的節點失敗, 你就會失去可選舉的主節點. 容忍1個失敗需要3份數據副本, 容忍2個失敗需要5份數據副本. 以我們的經驗, 隻留足夠容忍單個失敗的冗餘, 對實際係統來說是不夠的; 但每次都寫5份, 需要5倍的磁盤空間, 卻隻有五分之一的吞吐量, 對大體量的數據問題來說並不實際. 這大概就是為什麼quorum算法更常出現在像ZooKeeper這樣的共享集群配置場景中, 而較少用於主要的數據存儲. 例如HDFS的namenode高可用特性是建立在基於多數投票的日誌(journal)之上的, 但這種代價較高的方式並沒有用於數據本身.

Kafka takes a slightly different approach to choosing its quorum set. Instead of majority vote, Kafka dynamically maintains a set of in-sync replicas (ISR) that are caught-up to the leader. Only members of this set are eligible for election as leader. A write to a Kafka partition is not considered committed until all in-sync replicas have received the write. This ISR set is persisted to ZooKeeper whenever it changes. Because of this, any replica in the ISR is eligible to be elected leader. This is an important factor for Kafka’s usage model where there are many partitions and ensuring leadership balance is important. With this ISR model and f+1 replicas, a Kafka topic can tolerate f failures without losing committed messages.

kafka在選擇quorum集合時採用了稍微不同的方式. kafka不採用多數投票, 而是動態維護一個能跟上主節點的同步副本集合(ISR), 隻有這個集合中的成員才有資格被選舉為主節點. 寫入kafka分區的數據, 隻有在所有同步副本都收到之後才算已提交. ISR集合每次變化都會持久化到ZooKeeper中, 因此ISR中的任何副本都有資格被選為主節點. 這對kafka這種擁有很多分區並且需要保證主節點負載均衡的使用模型來說非常重要. 使用ISR模型和f+1個副本, 一個kafka主題可以容忍f個節點失敗而不丟失已提交的消息.

For most use cases we hope to handle, we think this tradeoff is a reasonable one. In practice, to tolerate f failures, both the majority vote and the ISR approach will wait for the same number of replicas to acknowledge before committing a message (e.g. to survive one failure a majority quorum needs three replicas and one acknowledgement and the ISR approach requires two replicas and one acknowledgement). The ability to commit without the slowest servers is an advantage of the majority vote approach. However, we think it is ameliorated by allowing the client to choose whether they block on the message commit or not, and the additional throughput and disk space due to the lower required replication factor is worth it.

對於我們希望處理的大部分場景, 我們認為這種權衡是合理的. 在實踐中, 為了容忍f個節點失敗, 多數投票和ISR這兩種策略在提交消息前都需要等待同樣數量的副本確認(比如為了在1個節點失敗後仍能存活, 多數投票需要3個副本和1個確認, ISR方式需要2個副本和1個確認). 不需要等待最慢的服務器就能提交消息, 是多數投票方式的優點. 盡管如此, 我們認為通過讓客戶端選擇是否阻塞等待消息提交可以緩解這個問題, 而且更低的複製因子帶來的額外吞吐量和磁盤空間是值得的.

Another important design distinction is that Kafka does not require that crashed nodes recover with all their data intact. It is not uncommon for replication algorithms in this space to depend on the existence of “stable storage” that cannot be lost in any failure-recovery scenario without potential consistency violations. There are two primary problems with this assumption. First, disk errors are the most common problem we observe in real operation of persistent data systems and they often do not leave data intact. Secondly, even if this were not a problem, we do not want to require the use of fsync on every write for our consistency guarantees as this can reduce performance by two to three orders of magnitude. Our protocol for allowing a replica to rejoin the ISR ensures that before rejoining, it must fully re-sync again even if it lost unflushed data in its crash.

另一個重要的設計區別是, kafka並不要求宕機的節點恢複時所有數據都完好無損. 這類複製算法常常依賴於一種"穩定存儲", 即在任何故障恢複場景下數據都不會丟失, 否則就可能違反一致性. 這個假設有兩個主要問題. 第一, 磁盤錯誤是我們在實際運維持久化數據係統時觀察到的最常見問題, 而且出現磁盤錯誤後數據往往不再完整. 第二, 即使這不是問題, 我們也不希望為了一致性保證而要求每次寫入都調用fsync, 因為這會讓性能下降兩到三個數量級. 我們允許副本重新加入ISR的協議保證了: 副本在重新加入之前必須先完整地重新同步, 即使它在崩潰時丟失了尚未刷盤的數據.

Unclean leader election: What if they all die? 不完全的主節點選舉(unclean leader election): 如果副本全部宕機了怎麼辦?

Note that Kafka’s guarantee with respect to data loss is predicated on at least one replica remaining in sync. If all the nodes replicating a partition die, this guarantee no longer holds.
注意, kafka關於數據不丟失的保障, 是以至少有一個副本保持同步為前提的. 如果一個分區的所有副本節點都宕機了, 這個保障就不再成立.

However a practical system needs to do something reasonable when all the replicas die. If you are unlucky enough to have this occur, it is important to consider what will happen. There are two behaviors that could be implemented:

不過, 實際的係統在所有副本都宕機時還是需要做一些合理的事情. 如果你不幸碰到這種情況, 就需要仔細考慮將會發生什麼. 有兩種可以實現的行為:

  1. Wait for a replica in the ISR to come back to life and choose this replica as the leader (hopefully it still has all its data). 等待ISR中的某個副本恢複, 並選擇它作為主節點(希望它還保留著全部數據)
  2. Choose the first replica (not necessarily in the ISR) that comes back to life as the leader. 選擇第一個恢複的副本(不一定在ISR集合中)作為主節點

This is a simple tradeoff between availability and consistency. If we wait for replicas in the ISR, then we will remain unavailable as long as those replicas are down. If such replicas were destroyed or their data was lost, then we are permanently down. If, on the other hand, a non-in-sync replica comes back to life and we allow it to become leader, then its log becomes the source of truth even though it is not guaranteed to have every committed message. By default, Kafka chooses the second strategy and favor choosing a potentially inconsistent replica when all replicas in the ISR are dead. This behavior can be disabled using configuration property unclean.leader.election.enable, to support use cases where downtime is preferable to inconsistency.

這是在可用性和一致性之間的一個簡單權衡. 如果我們等待ISR中的副本恢複, 那麼在這些副本宕機期間我們會一直不可用; 如果這些副本已經損壞或數據已經丟失, 我們就永久不可用了. 反過來, 如果允許一個未同步的副本恢複後成為主節點, 那麼它的日誌就成為新的事實來源, 即使它不能保證擁有所有已提交的消息. 默認情況下, kafka選擇第二種策略, 即在ISR中的所有副本都宕機時, 寧可選擇一個可能不一致的副本. 如果更希望寧可停機也不要不一致, 可以通過配置 unclean.leader.election.enable 來禁用這種行為.

This dilemma is not specific to Kafka. It exists in any quorum-based scheme. For example in a majority voting scheme, if a majority of servers suffer a permanent failure, then you must either choose to lose 100% of your data or violate consistency by taking what remains on an existing server as your new source of truth.

這種困境不是kafka特有的, 它存在於任何基於quorum的方案中. 例如在多數投票方案裏, 如果大多數服務器永久性失效了, 你要麼選擇丟失全部數據, 要麼違背一致性, 把某台現存服務器上剩下的數據當作新的事實來源.

Availability and Durability Guarantees 可用性和可靠性保證

When writing to Kafka, producers can choose whether they wait for the message to be acknowledged by 0,1 or all (-1) replicas. Note that “acknowledgement by all replicas” does not guarantee that the full set of assigned replicas have received the message. By default, when acks=all, acknowledgement happens as soon as all the current in-sync replicas have received the message. For example, if a topic is configured with only two replicas and one fails (i.e., only one in sync replica remains), then writes that specify acks=all will succeed. However, these writes could be lost if the remaining replica also fails. Although this ensures maximum availability of the partition, this behavior may be undesirable to some users who prefer durability over availability. Therefore, we provide two topic-level configurations that can be used to prefer message durability over availability:
當往kafka寫數據時, 生產者可以選擇等待0個, 1個還是全部(-1)副本的確認. 注意"所有副本的確認"並不保證被分配的全部副本都收到了消息. 默認情況下, 當acks=all時, 隻要當前所有處於同步狀態的副本收到消息就會返回確認. 例如, 如果一個主題隻配置了兩個副本而其中一個失敗了(也就是隻剩一個同步副本), 指定acks=all的寫入仍然會成功. 但如果剩下的這個副本也失敗了, 這些寫入就會丟失. 雖然這樣保證了分區的最大可用性, 但對於更看重持久性而不是可用性的用戶來說, 這種行為可能不是他們想要的. 因此我們提供了兩個主題級別的配置, 用於在持久性和可用性之間優先保證持久性:
  1. Disable unclean leader election – if all replicas become unavailable, then the partition will remain unavailable until the most recent leader becomes available again. This effectively prefers unavailability over the risk of message loss. See the previous section on Unclean Leader Election for clarification.
  2. 禁用不完全的主節點選舉 – 如果所有副本都不可用, 分區會保持不可用, 直到最近的那個主節點重新可用為止. 這實際上是寧可不可用, 也不冒消息丟失的風險. 參見上一節關於不完全主節點選舉的說明.
  3. Specify a minimum ISR size – the partition will only accept writes if the size of the ISR is above a certain minimum, in order to prevent the loss of messages that were written to just a single replica, which subsequently becomes unavailable. This setting only takes effect if the producer uses acks=all and guarantees that the message will be acknowledged by at least this many in-sync replicas. This setting offers a trade-off between consistency and availability. A higher setting for minimum ISR size guarantees better consistency since the message is guaranteed to be written to more replicas which reduces the probability that it will be lost. However, it reduces availability since the partition will be unavailable for writes if the number of in-sync replicas drops below the minimum threshold.
  4. 指定最小的ISR大小 – 隻有當ISR的大小超過某個最小值時, 分區才接受寫入, 以防止消息隻寫到一個副本上, 而這個副本隨後又不可用導致消息丟失. 這個設置隻在生產者使用acks=all時才生效, 它保證消息至少被這麼多個同步副本確認. 這個設置提供了一致性和可用性之間的權衡: 最小ISR設置得越大, 一致性越好, 因為消息保證被寫到更多副本上, 丟失的可能性更小; 但它也降低了可用性, 因為一旦同步副本數低於這個最小閾值, 分區就無法寫入了.
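下面是一個示意性的命令行草圖(假設使用kafka自帶的kafka-topics.sh腳本, 主題名和參數值均為示例), 展示如何在創建主題時優先保證持久性:

  # 創建一個3副本的主題, 禁用不完全主節點選舉, 並要求ISR至少有2個副本才接受寫入
  bin/kafka-topics.sh --create --zookeeper localhost:2181 \
    --topic important-topic --partitions 1 --replication-factor 3 \
    --config unclean.leader.election.enable=false \
    --config min.insync.replicas=2

要真正得到"至少寫入2個同步副本"的保證, 還需要配合生產者端設置acks=all.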

Replica Management 複製管理

The above discussion on replicated logs really covers only a single log, i.e. one topic partition. However a Kafka cluster will manage hundreds or thousands of these partitions. We attempt to balance partitions within a cluster in a round-robin fashion to avoid clustering all partitions for high-volume topics on a small number of nodes. Likewise we try to balance leadership so that each node is the leader for a proportional share of its partitions.
上面關於複製日誌的討論其實隻涉及單個日誌, 也就是一個主題分區. 但kafka集群要管理成百上千個這樣的分區. 我們用輪詢的方式在集群內平衡分區, 避免把高流量主題的所有分區都集中在少數幾個節點上. 同樣, 我們也會平衡主節點, 使每個節點都按比例承擔它所持有分區中主分區的角色.

It is also important to optimize the leadership election process as that is the critical window of unavailability. A naive implementation of leader election would end up running an election per partition for all partitions a node hosted when that node failed. Instead, we elect one of the brokers as the “controller”. This controller detects failures at the broker level and is responsible for changing the leader of all affected partitions in a failed broker. The result is that we are able to batch together many of the required leadership change notifications which makes the election process far cheaper and faster for a large number of partitions. If the controller fails, one of the surviving brokers will become the new controller.

優化主節點選舉的過程也很重要, 因為這是決定不可用時間窗口的關鍵. 一種樸素的實現是, 當某個節點宕機時, 為它承載的所有分區逐個獨立地進行選舉. 而我們的做法是選舉其中一個broker作為"控製器"(controller). 控製器在broker級別檢測故障, 並負責為故障broker上所有受影響的分區變更主節點. 這樣做的好處是可以把許多主節點變更通知批量處理, 使得分區數量很大時選舉過程的代價更小, 速度更快. 如果控製器宕機了, 剩下存活的broker中會有一個成為新的控製器.

4.8 Log Compaction 日誌壓縮

Log compaction ensures that Kafka will always retain at least the last known value for each message key within the log of data for a single topic partition. It addresses use cases and scenarios such as restoring state after application crashes or system failure, or reloading caches after application restarts during operational maintenance. Let’s dive into these use cases in more detail and then describe how compaction works.
日誌壓縮保證kafka對單個主題分區的數據日誌, 總是至少保留每個消息鍵(key)最後已知的值. 它面向的使用場景包括: 應用崩潰或係統故障後恢複狀態, 或者運維期間應用重啟後重新加載緩存. 讓我們更深入地看看這些使用場景, 然後再描述壓縮是怎麼工作的.

So far we have described only the simpler approach to data retention where old log data is discarded after a fixed period of time or when the log reaches some predetermined size. This works well for temporal event data such as logging where each record stands alone. However an important class of data streams are the log of changes to keyed, mutable data (for example, the changes to a database table).

到目前為止, 我們隻介紹了比較簡單的數據保留方式: 丟棄超過一定時間的舊日誌, 或者在日誌達到預定大小後丟棄舊數據. 這種方式對於像日誌記錄這樣每條記錄相互獨立的臨時事件數據很有效. 但有一類重要的數據流, 是對帶鍵的可變數據的變更日誌(例如數據庫表的變更).

Let’s discuss a concrete example of such a stream. Say we have a topic containing user email addresses; every time a user updates their email address we send a message to this topic using their user id as the primary key. Now say we send the following messages over some time period for a user with id 123, each message corresponding to a change in email address (messages for other ids are omitted):

讓我們用一個具體的例子來討論這樣的數據流. 假設我們有一個保存用戶郵件地址的主題, 每次用戶更新郵件地址時, 我們就以用戶ID作為主鍵往這個主題發送一條消息. 現在假設在一段時間內, 我們為ID是123的用戶發送了下面這些消息, 每條消息對應一次郵件地址的變更(其他ID的消息省略):

        123 => bill@microsoft.com
                .
                .
                .
        123 => bill@gatesfoundation.org
                .
                .
                .
        123 => bill@gmail.com
Log compaction gives us a more granular retention mechanism so that we are guaranteed to retain at least the last update for each primary key (e.g. bill@gmail.com). By doing this we guarantee that the log contains a full snapshot of the final value for every key not just keys that changed recently. This means downstream consumers can restore their own state off this topic without us having to retain a complete log of all changes.
日誌壓縮給了我們更細粒度的保留機製, 保證至少保留每個主鍵的最後一次更新(例如 bill@gmail.com). 這樣我們就能保證日誌包含每個鍵最終值的完整快照, 而不僅僅是最近變更過的那些鍵. 這意味著下游消費者可以從這個主題恢複自己的狀態, 而我們不必保留所有變更的完整日誌.
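作為上面例子的一個示意性代碼草圖(非官方示例, 假設主題user-emails已配置為compact清理策略), 用同一個鍵多次發送更新:

  import java.util.Properties;
  import org.apache.kafka.clients.producer.KafkaProducer;
  import org.apache.kafka.clients.producer.ProducerRecord;

  public class KeyedUpdateExample {
      public static void main(String[] args) {
          Properties props = new Properties();
          props.put("bootstrap.servers", "localhost:9092");
          props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
          props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
          KafkaProducer<String, String> producer = new KafkaProducer<>(props);

          // 同一個用戶ID作為鍵, 多次更新郵箱; 壓縮後日誌至少會保留鍵"123"的最後一個值
          producer.send(new ProducerRecord<>("user-emails", "123", "bill@microsoft.com"));
          producer.send(new ProducerRecord<>("user-emails", "123", "bill@gatesfoundation.org"));
          producer.send(new ProducerRecord<>("user-emails", "123", "bill@gmail.com"));
          producer.close();
      }
  }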

Let’s start by looking at a few use cases where this is useful, then we’ll see how it can be used.

讓我們先看看這個功能在哪些場景下有用, 然後再看它可以怎麼用.

  1. Database change subscription. It is often necessary to have a data set in multiple data systems, and often one of these systems is a database of some kind (either a RDBMS or perhaps a new-fangled key-value store). For example you might have a database, a cache, a search cluster, and a Hadoop cluster. Each change to the database will need to be reflected in the cache, the search cluster, and eventually in Hadoop. In the case that one is only handling the real-time updates you only need recent log. But if you want to be able to reload the cache or restore a failed search node you may need a complete data set.
  2. 數據庫變更訂閱. 常常會有同一份數據出現在多個數據係統中的情況, 而且其中往往有一個是某種數據庫(RDBMS或者新潮的鍵值存儲). 比如你可能有一個數據庫, 一個緩存, 一個檢索集群, 再加一個hadoop集群. 數據庫的每次變更都需要反映到緩存和檢索集群中, 並最終進入hadoop. 如果隻處理實時更新, 你隻需要最近的日誌; 但如果要重新加載緩存, 或者恢複宕機的檢索節點, 就可能需要完整的數據集.
  3. Event sourcing. This is a style of application design which co-locates query processing with application design and uses a log of changes as the primary store for the application.
  4. 事件源. 這是一種應用程序設計風格,它將查詢處理與應用程序設計結合在一起,並使用日誌的更改作為應用程序的主存儲.
  5. Journaling for high-availability. A process that does local computation can be made fault-tolerant by logging out changes that it makes to its local state so another process can reload these changes and carry on if it should fail. A concrete example of this is handling counts, aggregations, and other “group by”-like processing in a stream query system. Samza, a real-time stream-processing framework, uses this feature for exactly this purpose.
  6. 高可用日誌. 一個進行本地計算的進程, 可以通過把它對本地狀態所做的變更記錄到日誌中來實現容錯, 這樣當它失敗時, 另一個進程可以重新加載這些變更並繼續處理. 一個具體的例子是在流查詢係統中做計數, 聚合以及其他類似"group by"的處理. 實時流處理框架Samza正是出於這個目的使用了這個特性.
In each of these cases one needs primarily to handle the real-time feed of changes, but occasionally, when a machine crashes or data needs to be re-loaded or re-processed, one needs to do a full load. Log compaction allows feeding both of these use cases off the same backing topic. This style of usage of a log is described in more detail in this blog post.
在這些場景中, 主要需要處理的是實時的變更流; 但偶爾在機器宕機, 或者數據需要重新加載, 重新處理時, 就需要做全量加載. 日誌壓縮讓同一個主題可以同時支持這兩類用例. 這種日誌使用風格的更多細節可以看這篇博客.

The general idea is quite simple. If we had infinite log retention, and we logged each change in the above cases, then we would have captured the state of the system at each time from when it first began. Using this complete log, we could restore to any point in time by replaying the first N records in the log. This hypothetical complete log is not very practical for systems that update a single record many times as the log will grow without bound even for a stable dataset. The simple log retention mechanism which throws away old updates will bound space but the log is no longer a way to restore the current state—now restoring from the beginning of the log no longer recreates the current state as old updates may not be captured at all.

基本思路很簡單. 如果我們可以無限保留日誌, 並記錄上述場景中的每一次變更, 那麼我們就捕獲了係統從一開始到每個時刻的狀態. 利用這份完整的日誌, 我們可以通過回放日誌最前面的N條記錄, 把狀態恢複到任意時間點. 但這種假設的完整日誌, 對於會對同一條記錄更新很多次的係統來說並不實際, 因為即使數據集的大小穩定, 日誌也會無限增長. 簡單丟棄舊更新的日誌保留機製雖然可以限制空間, 但這樣日誌就不再能用來恢複當前狀態了: 從日誌開頭重放不再能重建當前狀態, 因為舊的更新可能已經完全不在日誌裏了.

Log compaction is a mechanism to give finer-grained per-record retention, rather than the coarser-grained time-based retention. The idea is to selectively remove records where we have a more recent update with the same primary key. This way the log is guaranteed to have at least the last state for each key.

日誌壓縮是一種按記錄提供更細粒度保留的機製, 而不是按時間的粗粒度保留. 它的思路是: 對於同一個主鍵, 如果後面有更新的記錄, 就有選擇地刪除前面的舊記錄. 這樣日誌可以保證至少保留每個鍵的最後狀態.

This retention policy can be set per-topic, so a single cluster can have some topics where retention is enforced by size or time and other topics where retention is enforced by compaction.

這種保留策略可以按主題設置, 所以同一個集群裏, 一些主題可以按大小或時間保留, 另一些主題可以按壓縮策略保留.

This functionality is inspired by one of LinkedIn’s oldest and most successful pieces of infrastructure—a database changelog caching service called Databus. Unlike most log-structured storage systems Kafka is built for subscription and organizes data for fast linear reads and writes. Unlike Databus, Kafka acts as a source-of-truth store so it is useful even in situations where the upstream data source would not otherwise be replayable.

這項功能的靈感來自linkedin最古老也最成功的基礎設施之一: 一個叫做Databus的數據庫變更日誌緩存服務. 和大多數日誌結構的存儲係統不同, kafka是為訂閱而構建的, 並且以便於快速線性讀寫的方式組織數據. 和Databus不同, kafka是作為真實數據來源(source-of-truth)的存儲, 所以即使在上游數據源無法重放的情況下也很有用.

Log Compaction Basics 日誌壓縮基礎

Here is a high-level picture that shows the logical structure of a Kafka log with the offset for each message.
下面這張示意圖展示了kafka日誌的邏輯結構, 以及每條消息對應的位移.

The head of the log is identical to a traditional Kafka log. It has dense, sequential offsets and retains all messages. Log compaction adds an option for handling the tail of the log. The picture above shows a log with a compacted tail. Note that the messages in the tail of the log retain the original offset assigned when they were first written—that never changes. Note also that all offsets remain valid positions in the log, even if the message with that offset has been compacted away; in this case this position is indistinguishable from the next highest offset that does appear in the log. For example, in the picture above the offsets 36, 37, and 38 are all equivalent positions and a read beginning at any of these offsets would return a message set beginning with 38.

日誌的頭部(head)和傳統的kafka日誌一樣, 擁有密集的, 連續的位移, 並保留所有消息. 日誌壓縮增加了處理日誌尾部(tail)的選項. 上圖展示的就是一個尾部已被壓縮的日誌. 注意, 日誌尾部的消息保留了它們第一次寫入時分配的原始位移, 這個位移永遠不會變. 還要注意, 即使某個位移對應的消息已經被壓縮掉, 所有位移在日誌中仍然是合法的位置; 這種情況下, 這個位置和日誌中實際出現的下一個更高位移是沒有區別的. 例如在上圖中, 位移36, 37和38都是等價的位置, 從這三個位移中的任何一個開始讀取, 都會返回從38開始的消息集.

Compaction also allows for deletes. A message with a key and a null payload will be treated as a delete from the log. This delete marker will cause any prior message with that key to be removed (as would any new message with that key), but delete markers are special in that they will themselves be cleaned out of the log after a period of time to free up space. The point in time at which deletes are no longer retained is marked as the “delete retention point” in the above diagram.

壓縮也支持刪除. 帶有鍵而負載(payload)為null的消息, 會被當作從日誌中刪除該鍵的標記. 這個刪除標記會導致之前所有帶這個鍵的消息被移除(和帶這個鍵的新消息的效果一樣), 但刪除標記比較特殊: 它們自身也會在一段時間後從日誌中清除, 以釋放空間. 刪除標記不再被保留的時間點, 在上圖中標記為"刪除保留點"(delete retention point).
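下面是一個示意性的刪除標記(tombstone)寫入草圖(非官方示例, 沿用上面假設的user-emails主題): 發送一條鍵為"123", 值為null的消息, 表示刪除這個鍵.

  import java.util.Properties;
  import org.apache.kafka.clients.producer.KafkaProducer;
  import org.apache.kafka.clients.producer.ProducerRecord;

  public class TombstoneExample {
      public static void main(String[] args) {
          Properties props = new Properties();
          props.put("bootstrap.servers", "localhost:9092");
          props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
          props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
          KafkaProducer<String, String> producer = new KafkaProducer<>(props);

          // 值為null的消息是刪除標記: 壓縮後, 鍵"123"之前的消息都會被移除,
          // 刪除標記本身也會在delete.retention.ms之後被清理掉
          producer.send(new ProducerRecord<String, String>("user-emails", "123", null));
          producer.close();
      }
  }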

The compaction is done in the background by periodically recopying log segments. Cleaning does not block reads and can be throttled to use no more than a configurable amount of I/O throughput to avoid impacting producers and consumers. The actual process of compacting a log segment looks something like this:

壓縮是通過後台定期重新拷貝日誌段的方式完成的. 清理操作不會阻塞讀取, 並且可以限製其占用的I/O吞吐量不超過一個可配置的上限, 以避免影響生產者和消費者. 壓縮一個日誌段的實際過程大致如下:

 

What guarantees does log compaction provide? 日誌壓縮提供了什麼保障?

Log compaction guarantees the following: 日誌壓縮提供了如下保障:

  1. Any consumer that stays caught-up to within the head of the log will see every message that is written; these messages will have sequential offsets. The topic’s min.compaction.lag.ms can be used to guarantee the minimum length of time must pass after a message is written before it could be compacted. I.e. it provides a lower bound on how long each message will remain in the (uncompacted) head.
  2. 任何能跟上日誌頭部的消費者都會看到每一條寫入的消息, 這些消息擁有連續的位移. 主題的min.compaction.lag.ms參數可以保證消息寫入後必須經過指定的最短時間才會被壓縮, 也就是為每條消息在(未壓縮的)頭部中至少停留多久提供了一個下限.
  3. Ordering of messages is always maintained. Compaction will never re-order messages, just remove some.
  4. 消息順序永遠被保證, 壓縮不會重新排序, 隻會刪除一些
  5. The offset for a message never changes. It is the permanent identifier for a position in the log.
  6. 消息的位移永遠不會變, 這是消息在日誌中的永久性標誌
  7. Any consumer progressing from the start of the log will see at least the final state of all records in the order they were written. All delete markers for deleted records will be seen provided the consumer reaches the head of the log in a time period less than the topic’s delete.retention.ms setting (the default is 24 hours). This is important as delete marker removal happens concurrently with read, and thus it is important that we do not remove any delete marker prior to the consumer seeing it.
  8. 任何從日誌開頭讀取的消費者, 至少會按寫入順序看到所有記錄的最終狀態. 另外, 隻要消費者在小於主題delete.retention.ms設置(默認24小時)的時間內到達日誌頭部, 就能看到所有已刪除記錄的刪除標記. 這很重要, 因為刪除標記的清除和讀取是並發進行的, 所以必須保證不會在消費者看到某個刪除標記之前就把它移除.

Log Compaction Details 日誌壓縮細節

Log compaction is handled by the log cleaner, a pool of background threads that recopy log segment files, removing records whose key appears in the head of the log. Each compactor thread works as follows:
日誌壓縮由日誌整理器(log cleaner)負責, 它是一個後台線程池, 負責重新拷貝日誌段文件, 並刪除那些鍵在日誌頭部再次出現過的舊記錄. 每個壓縮線程按如下方式工作:
  1. It chooses the log that has the highest ratio of log head to log tail
  2. 它選擇日誌頭部相對日誌尾部比例最高的那個日誌
  3. It creates a succinct summary of the last offset for each key in the head of the log
  4. 它為日誌頭部中每個鍵的最後位移建立一份簡潔的摘要
  5. It recopies the log from beginning to end removing keys which have a later occurrence in the log. New, clean segments are swapped into the log immediately so the additional disk space required is just one additional log segment (not a fully copy of the log).
  6. 它把日誌從頭到尾重新拷貝一遍, 並刪除那些在日誌後面還會再次出現的鍵對應的舊記錄. 新的, 幹淨的日誌段會立刻替換進日誌中, 所以額外需要的磁盤空間隻是一個日誌段(而不是整份日誌的拷貝).
  7. The summary of the log head is essentially just a space-compact hash table. It uses exactly 24 bytes per entry. As a result with 8GB of cleaner buffer one cleaner iteration can clean around 366GB of log head (assuming 1k messages).
  8. 日誌頭部匯總實際上是一個空間緊湊的hash表, 使用24個字節一個條目的形式, 所以如果有8G的整理緩衝區, 則能迭代處理大約366G的日誌頭部(假設消息大小為1k)

Configuring The Log Cleaner 配置日誌整理器

The log cleaner is enabled by default. This will start the pool of cleaner threads. To enable log cleaning on a particular topic you can add the log-specific property
日誌整理器默認是開啟的, 它會啟動整理線程池. 如果要對某個特定主題開啟日誌整理功能, 可以添加該日誌特有的屬性:
  log.cleanup.policy=compact
This can be done either at topic creation time or using the alter topic command.
這個可以在主題創建時配置, 或是使用更改主題配置的命令
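一個示意性的命令行草圖(主題名為假設; 注意主題級別的屬性名通常是cleanup.policy, 對應broker級別的log.cleanup.policy):

  # 創建主題時指定壓縮清理策略
  bin/kafka-topics.sh --create --zookeeper localhost:2181 \
    --topic user-emails --partitions 1 --replication-factor 3 \
    --config cleanup.policy=compact

  # 或者對已有主題追加配置
  bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
    --entity-type topics --entity-name user-emails \
    --add-config cleanup.policy=compact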

The log cleaner can be configured to retain a minimum amount of the uncompacted “head” of the log. This is enabled by setting the compaction time lag.

日誌整理器可以配置保留一定量未壓縮的日誌"頭部". 這通過設置壓縮延遲時間來開啟:

  log.cleaner.min.compaction.lag.ms

This can be used to prevent messages newer than a minimum message age from being subject to compaction. If not set, all log segments are eligible for compaction except for the last segment, i.e. the one currently being written to. The active segment will not be compacted even if all of its messages are older than the minimum compaction time lag.

這可以用來防止比最小消息存留時間更新的消息被壓縮. 如果不設置, 除了最後一個日誌段(也就是當前正在寫入的那個)之外, 所有日誌段都可以被壓縮. 活躍段即使其中所有消息都已超過最小壓縮延遲時間, 也不會被壓縮.
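一個示意性的配置片段(數值僅作假設): 在broker端設置log.cleaner.min.compaction.lag.ms, 或者在主題級別覆蓋min.compaction.lag.ms, 讓最近寫入的消息至少在頭部保留一段時間再被壓縮.

  # broker端(server.properties), 單位毫秒
  log.cleaner.min.compaction.lag.ms=86400000

  # 主題級別覆蓋(假設主題已存在)
  bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
    --entity-type topics --entity-name user-emails \
    --add-config min.compaction.lag.ms=86400000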

Further cleaner configurations are described here.

更多的整理器的配置可以參考這裏

4.9 Quotas 配額

Starting in 0.9, the Kafka cluster has the ability to enforce quotas on produce and fetch requests. Quotas are basically byte-rate thresholds defined per group of clients sharing a quota.

從0.9版本開始, kafka集群可以對生產和拉取請求實施配額限制. 配額基本上就是按共享同一份配額的客戶端分組定義的字節速率閾值.

Why are quotas necessary? 配額有必要麼?

It is possible for producers and consumers to produce/consume very high volumes of data and thus monopolize broker resources, cause network saturation and generally DOS other clients and the brokers themselves. Having quotas protects against these issues and is all the more important in large multi-tenant clusters where a small set of badly behaved clients can degrade user experience for the well behaved ones. In fact, when running Kafka as a service this even makes it possible to enforce API limits according to an agreed upon contract.

生產者和消費者有可能生產/消費非常大量的數據, 從而獨占broker資源, 造成網絡飽和, 並對其他客戶端和broker本身形成類似DOS的影響. 有了配額就可以防範這些問題. 這在大型多租戶集群中尤其重要, 因為一小部分行為不良的客戶端就可能拖垮其他正常客戶端的體驗. 實際上, 當把kafka作為一種服務來運營時, 甚至可以根據約定的契約對API使用量進行強製限製.

Client groups 客戶端分組

The identity of Kafka clients is the user principal which represents an authenticated user in a secure cluster. In a cluster that supports unauthenticated clients, user principal is a grouping of unauthenticated users chosen by the broker using a configurable PrincipalBuilder. Client-id is a logical grouping of clients with a meaningful name chosen by the client application. The tuple (user, client-id) defines a secure logical group of clients that share both user principal and client-id.
在安全集群中, kafka客戶端的身份就是代表已認證用戶的用戶主體(user principal). 在支持未認證客戶端的集群中, 用戶主體是broker通過可配置的PrincipalBuilder選出的一組未認證用戶. Client-id是客戶端的一種邏輯分組, 由客戶端應用選擇一個有意義的名稱. 二元組(user, client-id)定義了共享同一個用戶主體和client-id的客戶端安全邏輯分組.

Quotas can be applied to (user, client-id), user or client-id groups. For a given connection, the most specific quota matching the connection is applied. All connections of a quota group share the quota configured for the group. For example, if (user=”test-user”, client-id=”test-client”) has a produce quota of 10MB/sec, this is shared across all producer instances of user “test-user” with the client-id “test-client”.

配額可以應用於(user, client-id)組合, 也可以應用於user或client-id分組. 對一個給定的連接, 會使用與它最匹配的那個配額. 同一個配額分組的所有連接共享為該分組配置的配額. 例如, 如果(user=”test-user”, client-id=”test-client”)配置了10MB/s的生產配額, 這個配額會被用戶是"test-user"且client-id是"test-client"的所有生產者實例共享.

Quota Configuration 限額配置

Quota configuration may be defined for (user, client-id), user and client-id groups. It is possible to override the default quota at any of the quota levels that needs a higher (or even lower) quota. The mechanism is similar to the per-topic log config overrides. User and (user, client-id) quota overrides are written to ZooKeeper under /config/users and client-id quota overrides are written under /config/clients. These overrides are read by all brokers and are effective immediately. This lets us change quotas without having to do a rolling restart of the entire cluster. See here for details. Default quotas for each group may also be updated dynamically using the same mechanism.

配額配置可以針對(user, client-id), user以及client-id分組來定義. 如果需要更高(或更低)的配額, 可以在任意一個配額級別覆蓋默認配額, 其機製類似於按主題覆蓋日誌配置. user和(user, client-id)的配額覆蓋會寫入ZooKeeper的/config/users下, client-id的配額覆蓋寫入/config/clients下. 這些覆蓋會被所有broker讀取並立即生效, 這讓我們無需滾動重啟整個集群就能修改配額. 詳見這裏. 每個分組的默認配額也可以用同樣的機製動態修改.
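一個示意性的配額配置命令草圖(假設使用kafka自帶的kafka-configs.sh腳本, 用戶名, client-id和數值均為示例):

  # 為(user=test-user, client-id=test-client)配置10MB/s的生產配額和20MB/s的消費配額
  bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
    --add-config 'producer_byte_rate=10485760,consumer_byte_rate=20971520' \
    --entity-type users --entity-name test-user \
    --entity-type clients --entity-name test-client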

The order of precedence for quota configuration is:

限額的配置順序如下:

  1. /config/users/<user>/clients/<client-id>
  2. /config/users/<user>/clients/<default>
  3. /config/users/<user>
  4. /config/users/<default>/clients/<client-id>
  5. /config/users/<default>/clients/<default>
  6. /config/users/<default>
  7. /config/clients/<client-id>
  8. /config/clients/<default>

Broker properties (quota.producer.default, quota.consumer.default) can also be used to set defaults for client-id groups. These properties are being deprecated and will be removed in a later release. Default quotas for client-id can be set in Zookeeper similar to the othe
