Notes: Ceph: A Scalable, High-Performance Distributed File System

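A landmark paper on Ceph. Ceph is a very popular storage system these days. Unlike HDFS, which mainly targets big-data applications, Ceph aims to be a general-purpose storage solution that supports object storage, block storage, and a file system equally well. Many OpenStack private clouds now build their storage on Ceph, and Ceph itself grew out of this paper.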

Abstract
It states Ceph's mission clearly:
We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability.
along with the key methods and techniques:
Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clusters of unreliable object storage devices (OSDs).
We leverage device intelligence by distributing data replication, failure detection and recovery to semi-autonomous OSDs running a specialized local object file system.
A dynamic distributed metadata cluster provides extremely efficient metadata management and seamlessly adapts to a wide range of general purpose and scientific computing file system workloads.
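
The idea of replacing allocation tables with a placement function can be illustrated with a small sketch. The code below is not CRUSH: it uses rendezvous (highest-random-weight) hashing as a much simpler stand-in, and the names `place`, `osds`, and `replicas` are hypothetical. It only aims to show that any client can compute where an object lives from the object name and the list of devices, with no lookup table involved.

```python
import hashlib
from typing import List


def _score(object_name: str, osd: str) -> int:
    """Deterministic pseudo-random score for an (object, OSD) pair."""
    digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name: str, osds: List[str], replicas: int = 3) -> List[str]:
    """Pick `replicas` OSDs for an object by ranking all OSDs on their score.

    Assumption: rendezvous hashing as a stand-in for CRUSH, not the real algorithm.
    """
    ranked = sorted(osds, key=lambda osd: _score(object_name, osd), reverse=True)
    return ranked[:replicas]


if __name__ == "__main__":
    cluster = [f"osd.{i}" for i in range(8)]
    # Every client that knows the cluster list computes the same answer.
    print(place("example-object", cluster))
```

Because each OSD's score is independent of the others, removing a device that does not hold a given object leaves that object's placement unchanged, which is the kind of property a dynamic cluster needs. Real CRUSH additionally handles device weights and failure domains, which this sketch ignores.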
Then comes performance:
Performance measurements under a variety of workloads show that Ceph has excellent I/O performance and scalable metadata management, supporting more than 250,000 metadata operations per second.

Introduction:
It first goes over the problems with NFS and traditional OSD-based designs.
Then it introduces Ceph:
We present Ceph, a distributed file system that provides excellent performance and reliability while promising unparalleled scalability.
This sentence is key: Our architecture is based on the assumption that systems at the petabyte scale are inherently dynamic: large systems are inevitably built incrementally, node failures are the norm rather than the exception, and the quality and character of workloads are constantly shifting over time.
Ceph's architecture is as follows:

System overview:
Ceph has three parts:
the client, each instance of which exposes a near-POSIX file system interface to a host or process;
a cluster of OSDs, which collectively stores all data and metadata;
A metadata server cluster, which manages the namespace (file names and directories) while coordinating security, consistency and coherence (see Figure 1).
As shown in the figure:
[screenshot: Figure 1 of the paper]

Main approaches:
Decoupled Data and Metadata
Dynamic Distributed Metadata Management
Reliable Autonomic Distributed Object Storage
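
As a rough illustration of the first point: according to the paper, file data is striped across objects whose names are derived from the file's inode number and the stripe number, so a client can name the objects of a file itself instead of asking the metadata server for a block list. The sketch below assumes a fixed object size and one object per stripe unit. `OBJECT_SIZE`, the `object_name` format, and `objects_for_range` are hypothetical and are not Ceph's actual naming or striping scheme.

```python
from typing import List, Tuple

OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size in bytes (illustrative only)


def object_name(inode: int, stripe_no: int) -> str:
    """Hypothetical naming: inode number and stripe number, both in hex."""
    return f"{inode:x}.{stripe_no:08x}"


def objects_for_range(inode: int, offset: int, length: int) -> List[Tuple[str, int, int]]:
    """Map a byte range of a file to (object name, offset in object, byte count)."""
    pieces = []
    end = offset + length
    while offset < end:
        stripe_no = offset // OBJECT_SIZE
        within = offset % OBJECT_SIZE
        # Read up to the end of this object or the end of the request.
        count = min(OBJECT_SIZE - within, end - offset)
        pieces.append((object_name(inode, stripe_no), within, count))
        offset += count
    return pieces


if __name__ == "__main__":
    MiB = 1024 * 1024
    # A 4 MiB read starting at offset 6 MiB spans two objects.
    for piece in objects_for_range(0x1234, 6 * MiB, 4 * MiB):
        print(piece)
```

With names computed this way, the metadata server only needs to hand out the inode and layout information; locating each named object is then left to the placement function.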

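The later sections describe how each part is implemented. There are no particularly deep formulas or theory, so most readers should be able to follow them, and they make for interesting reading.
Link to the original paper:
https://www.ece.eng.wayne.edu/~sjiang/ECE7650-winter-15/topic5B-S.pdf
If the download fails, try searching for the paper on Baidu Scholar.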

Last updated: 2017-08-28 11:02:52
