MOM series: zero copy, what you need to know (Part 1)
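I recently prepared two articles that introduce zero copy, a key technique in MOM (message-oriented middleware), at both the physical and the logical level.
In file-storage-based MOMs such as Kafka and ActiveMQ, and in the journal design and implementation of others such as HornetQ and Kestrel, zero copy shows its power everywhere. I have therefore prepared this series of articles, hoping to lift the veil of mystery from zero copy, and I hope you will enjoy it.
This article focuses on the fundamentals of zero copy. First, a guided reading of an English text explains the underlying principles: why zero copy can improve the performance of some IO-intensive applications, and why it reduces context switches from 4 to 2 and data copies from 4 to 3 (note: only one of those copies consumes CPU cycles). Second, it briefly introduces the design and implementation of zero copy in the Java world, especially in Netty. Finally, a few pieces of extended reading broaden the view and show some of the technical research and results on zero copy from peers abroad. OK, let's begin.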
Zero copy View
Below, taking data transfer as an example, we analyze the traditional approach and the zero-copy approach:


The read() call causes a context switch (see Figure 2) from user mode to kernel mode. Internally a sys_read() (or equivalent) is issued to read the data from the file. The first copy (see Figure 1) is performed by the direct memory access (DMA) engine, which reads file contents from the disk and stores them into a kernel address space buffer.
The requested amount of data is copied from the read buffer into the user buffer (the second copy, this time performed by the CPU), and the read() call returns. The return from the call causes another context switch from kernel back to user mode. Now the data is stored in the user address space buffer.
The send() socket call causes a context switch from user mode to kernel mode. A third copy is performed to put the data into a kernel address space buffer again. This time, though, the data is put into a different buffer, one that is associated with the destination socket.
The send() system call returns, creating the fourth context switch. Independently and asynchronously, a fourth copy happens as the DMA engine passes the data from the kernel buffer to the protocol engine.
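The four steps above correspond roughly to the following Java sketch of the traditional approach. The class name TraditionalCopy, the method name copy, and the 8192-byte buffer size are illustrative assumptions, not taken from the original text. The destination is a plain OutputStream so that a caller could pass in a socket's output stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper illustrating the traditional read/send loop.
class TraditionalCopy {

    // Copies the file at 'source' to 'out' through a user-space buffer
    // and returns the number of bytes copied.
    static long copy(Path source, OutputStream out) throws IOException {
        byte[] userBuffer = new byte[8192]; // user-space buffer; the size is an arbitrary choice
        long total = 0;
        try (InputStream in = Files.newInputStream(source)) {
            int n;
            // read(): data moves from the kernel read buffer into userBuffer
            while ((n = in.read(userBuffer)) != -1) {
                // write(): data moves from userBuffer back into a kernel buffer
                // (the socket buffer when 'out' belongs to a socket)
                out.write(userBuffer, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

With a connected java.net.Socket named socket (also an assumption), a call might look like TraditionalCopy.copy(path, socket.getOutputStream()). Every pass through the loop moves the same bytes across the user/kernel boundary twice.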
Use of the intermediate kernel buffer (rather than a direct transfer of the data into the user buffer) might seem inefficient. But intermediate kernel buffers were introduced into the process to improve performance. Using the intermediate buffer on the read side allows the kernel buffer to act as a "readahead cache" when the application hasn't asked for as much data as the kernel buffer holds. This significantly improves performance when the requested data amount is less than the kernel buffer size. The intermediate buffer on the write side allows the write to complete asynchronously.
Unfortunately, this approach itself can become a performance bottleneck if the size of the data requested is considerably larger than the kernel buffer size. The data gets copied multiple times among the disk, kernel buffer, and user buffer before it is finally delivered to the application.
Zero copy improves performance by eliminating these redundant data copies.
Zero copy approach


The transferTo() method causes the file contents to be copied into a read buffer by the DMA engine. Then the data is copied by the kernel into the kernel buffer associated with the output socket.
The third copy happens as the DMA engine passes the data from the kernel socket buffers to the protocol engine.
This is an improvement: we've reduced the number of context switches from four to two and reduced the number of data copies from four to three (only one of which involves the CPU). But this does not yet get us to our goal of zero copy. We can further reduce the data duplication done by the kernel if the underlying network interface card supports gather operations. In Linux kernels 2.4 and later, the socket buffer descriptor was modified to accommodate this requirement. This approach not only reduces multiple context switches but also eliminates the duplicated data copies that require CPU involvement. The user-side usage still remains the same, but the intrinsics have changed:
The transferTo() method causes the file contents to be copied into a kernel buffer by the DMA engine.
No data is copied into the socket buffer. Instead, only descriptors with information about the location and length of the data are appended to the socket buffer. The DMA engine passes data directly from the kernel buffer to the protocol engine, thus eliminating the remaining final CPU copy.
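On the user side, the transferTo() call described above is available in Java as FileChannel.transferTo(). The following is a minimal sketch, assuming a blocking target channel; the class name ZeroCopyTransfer and the method name send are illustrative. Whether the kernel really avoids the CPU copy depends on the operating system, the JVM, and the target channel type. For targets other than socket or file channels the JDK may fall back to an ordinary buffered copy.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper illustrating file-to-channel transfer with transferTo().
class ZeroCopyTransfer {

    // Sends the whole file at 'source' to 'target' and returns the bytes transferred.
    static long send(Path source, WritableByteChannel target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            long position = 0;
            long size = in.size();
            // transferTo() may move fewer bytes than requested, so loop until done.
            while (position < size) {
                long sent = in.transferTo(position, size - position, target);
                if (sent <= 0) {
                    // Assumes a blocking target: no progress means nothing more can be sent.
                    break;
                }
                position += sent;
            }
            return position;
        }
    }
}
```

With a connected java.nio.channels.SocketChannel named socketChannel (an assumption), the call would be ZeroCopyTransfer.send(path, socketChannel). No user-space byte array appears anywhere in this code, which is the visible difference from the traditional read/send loop.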
Zero copy in Java
Zero copy readings
1. https://zeromq.org/blog:zero-copy
2. https://www.mellanox.com/pdf/whitepapers/SDP_Whitepaper.pdf
3. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.58.8050&rep=rep1&type=pdf
4. https://www.cse.ohio-state.edu/~subramon/Tech-Reports/ftp08-tr.pdf
5. https://cscjournals.org/csc/manuscript/Journals/IJCSS/volume6/Issue4/IJCSS-756.pdf
6. https://www.info.kochi-tech.ac.jp/yama/papers/ispdc05_active.pdf
Last updated: 2017-04-03 12:55:16