阿裏雲RDS金融數據庫(三節點版) - 性能篇
標簽
PostgreSQL , MySQL , 三節點版 , 金融數據庫 , Raft , 分布式共享存儲版
背景
終於到了性能篇,三節點同時滿足了企業對數據庫的可用性、可靠性的要求,那麼性能如何呢?
提到性能測試,我有幾點一定要說明一下,很多朋友對性能的理解可能有偏差,那麼如何評判性能的好壞呢?
1、首先要明確測試的環境,包括數據庫主機(主要包括CPU、內存、網卡,如果你的數據庫能用上FPGA、GPU的計算能力,還得算上他們,例如PostgreSQL就可以利用GPU和FPGA進行計算加速。)、數據存儲(主要是各個塊大小、Queue deep的連續、隨機IOPS能力,連續、隨機讀寫帶寬)、測試機和數據庫主機的網絡(帶寬和RT),測試機的硬件指標。
2、明確參照物,沒有參照物我們無法評判性能的好與壞。例如三節點、兩節點、單節點的對比。
3、明確測試的benchmark,例如對於OLTP場景,可以按工業標準TPC-C進行測試。或者用戶可以自己建模進行測試。而對於MySQL的測試,大家喜歡用sysbench以及sysbench中自帶的那些test case。
4、測試中,注意數據庫驅動、緩存預熱、測試客戶端、連接數、測試數據量、測試數據的分布。。。等對測試結果帶來的幹擾。我曾經就有遇到過有朋友同樣的硬件環境,因為各種原因,測試出來的結果大相庭徑的。
例如測試客戶端,開啟了DEBUG日誌輸出,導致測試TPS下降嚴重。同樣的測試CASE,用JAVA寫的和用C寫的,測試結果也有一定的差異。
5、數據庫、存儲、OS、防火牆的優化對測試結果的影響也巨大。如果要對比測試結果,那麼環境務必保持一致,包括這些配置。
在考慮到以上因素的情況下,與參照物進行對比(例如pg 9.6和pg 10比較,pg和mysql 比較, 三節點和單節點、2節點的比較等),評判性能的好壞才有價值。
性能評測case設計
相信大家會比較關心三節點和單節點、雙節點的性能對比,為了更加貼近現實場景我們來看看架構的區別。
單節這裏不多說,沒有負擔性能肯定是首當其衝的。
我們從雙節點開始說。
雙節點可以部署在同一機房,也可以部署在同城異地機房。
當雙節點部署在同城異地機房時,RT一定是打折扣的,所以對於小事務性能一定會下降明顯,但是獲得的好處是抵禦機房級故障。
同一機房可以獲得良好的RT,但是無法抵禦機房級故障,也無法同時做到可用性和可靠性,在滿足可用性時,可靠性就必須打折扣(因為不能使用完全同步複製)。
對於三節點,部署非常有彈性,我們可以選擇同機房+同城機房的部署方法。可以抵禦機房級故障,同時還有極好的RT。做到了性能、可靠性兼得。
采樣同機房+異地機房的部署方法,不僅能抵禦機房故障,還能抵禦城市災難。
對於三節點,還可以選擇3機房的部分方法,犧牲一定的RT,可以抵禦城市級災難。
根據以上論證,不難發現,性能和部署有關,部署和保護級別有關。
三節點的部署極為靈活,根據主備節點的分布,抵禦的風險級別不一樣,當然RT也不一樣。
我們可以得出這樣的預期。
性能怎麼樣呢?
三節點 vs 2節點 vs 單節點
阿裏雲RDS目前提供了單節點、雙節點、三節點幾種形態的產品。
1、單節點,主打經濟實用,雖然是單節點,但是數據存儲依舊是有多份的,備份和時間點恢複一個都不少。
2、雙節點,具備更高的可用性,主機故障帶來的故障時間更短,但是采用的是異步(同步可降級模式)可靠性在定義的SLA範圍內(無法做到0丟失)。
3、三節點,具備可用性的同時,還具備可靠性(數據0丟失),金融級的最佳選擇。
下麵分別對比三種形態的性能,給用戶一個參考,用戶有個直觀的認識。
測試環境
1、單節點配置:
32C,80萬IOPS,512G內存,10GB網絡。
2、雙節點配置:
32C,80萬IOPS,512G內存,同機房10GB網絡。同步複製模式(等待超過1秒自動降級為異步模式)。
3、三節點配置(同機房+同城機房版本):
32C,80萬IOPS,512G內存,同機房10GB網絡,同城機房間網絡帶寬未知。同步模式(隻是1個備庫響應COMMIT RECORD ACK)。
都不使用分組提交,WAL級別都為replica。
以PostgreSQL 單節點、雙節點、三節點為例,對比各個模式的性能。
隻讀事務
預期:
隻讀事務,不管是幾節點,性能是一樣的。
-- 構造1000萬記錄,按PK查詢。
create table test1(id int primary key, info text, crt_time timestamp);
insert into test1 select generate_series(1,10000000), md5(random()::text), now();
-- 測試腳本如下,隨機按PK查詢。
vi test1.sql
\set id random(1,10000000)
select * from test1 where id=:id;
-- 並發測試
pgbench -M prepared -n -r -P 1 -f ./test1.sql -c 64 -j 64 -T 120
測試結果
單節點:
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 120630026
latency average = 0.063 ms
latency stddev = 0.031 ms
tps = 1005132.560523 (including connections establishing)
tps = 1016279.347885 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.001 \set id random(1,10000000)
0.062 select * from test1 where id=:id;
三節點:
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 125610106
latency average = 0.061 ms
latency stddev = 0.015 ms
tps = 1046622.166617 (including connections establishing)
tps = 1046771.049111 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.001 \set id random(1,10000000)
0.060 select * from test1 where id=:id;
隻寫小事務
預期:
隻寫的小事務,保護級別越高,RT越高,RT在整個事務中的占比越高,性能越差。
-- 構造表,UPSERT操作,按PK,有則更新,無則插入。
create table test2(id int primary key, info text, crt_time timestamp);
-- 測試腳本如下,ID範圍1到10億,有則更新,無則插入。
vi test2.sql
\set id random(1,1000000000)
insert into test2 values (:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
-- 並發測試
pgbench -M prepared -n -r -P 1 -f ./test2.sql -c 64 -j 64 -T 120
測試結果
單節點:
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 24987162
latency average = 0.307 ms
latency stddev = 0.383 ms
tps = 208177.704141 (including connections establishing)
tps = 208197.846664 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.001 \set id random(1,1000000000)
0.306 insert into test2 values (:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
三節點:
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 17516730
latency average = 0.439 ms
latency stddev = 8.756 ms
tps = 145723.501673 (including connections establishing)
tps = 145743.685418 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.001 \set id random(1,1000000000)
0.438 insert into test2 values (:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
讀寫混合事務(讀多寫少)
預期:
讀多寫少的混合事務,保護級別越高,RT越高,但RT在整個事務中的占比與事務本身的耗時有關,整個事務的時間越短,性能越差。
-- 構造讀請求表,構造1000萬記錄,按PK查詢。
create table test3(id int primary key, info text, crt_time timestamp);
insert into test3 select generate_series(1,10000000), md5(random()::text), now();
-- 構造表,UPSERT操作,按PK,有則更新,無則插入。
create table test4(id int primary key, info text, crt_time timestamp);
-- 測試腳本如下,10個隻讀,一筆寫操作,ID範圍1到10億,有則更新,無則插入。
vi test3.sql
\set id1 random(1,10000000)
\set id2 random(1,1000000000)
select * from test3 where id=:id1;
select * from test3 where id=:id1+1000;
select * from test3 where id=:id1+5000;
select * from test3 where id=:id1+10000;
select * from test3 where id=:id1+100;
select * from test3 where id=:id1-1000;
select * from test3 where id=:id1-5000;
select * from test3 where id=:id1-10000;
select * from test3 where id=:id1+800;
select * from test3 where id=:id1-800;
insert into test4 values (:id2, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
-- 並發測試
pgbench -M prepared -n -r -P 1 -f ./test3.sql -c 64 -j 64 -T 120
測試結果
單節點:
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 7497290
latency average = 1.024 ms
latency stddev = 0.294 ms
tps = 62469.713487 (including connections establishing)
tps = 62476.551706 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.001 \set id1 random(1,10000000)
0.001 \set id2 random(1,1000000000)
0.080 select * from test3 where id=:id1;
0.074 select * from test3 where id=:id1+1000;
0.072 select * from test3 where id=:id1+5000;
0.071 select * from test3 where id=:id1+10000;
0.069 select * from test3 where id=:id1+100;
0.069 select * from test3 where id=:id1-1000;
0.068 select * from test3 where id=:id1-5000;
0.068 select * from test3 where id=:id1-10000;
0.067 select * from test3 where id=:id1+800;
0.066 select * from test3 where id=:id1-800;
0.320 insert into test4 values (:id2, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
三節點:
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 6113773
latency average = 1.256 ms
latency stddev = 8.362 ms
tps = 50936.622922 (including connections establishing)
tps = 50943.325789 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.001 \set id1 random(1,10000000)
0.001 \set id2 random(1,1000000000)
0.085 select * from test3 where id=:id1;
0.078 select * from test3 where id=:id1+1000;
0.075 select * from test3 where id=:id1+5000;
0.074 select * from test3 where id=:id1+10000;
0.072 select * from test3 where id=:id1+100;
0.072 select * from test3 where id=:id1-1000;
0.071 select * from test3 where id=:id1-5000;
0.070 select * from test3 where id=:id1-10000;
0.069 select * from test3 where id=:id1+800;
0.068 select * from test3 where id=:id1-800;
0.519 insert into test4 values (:id2, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
讀寫混合事務(讀少寫多)
預期:
讀少寫多的混合事務,保護級別越高,RT越高,但RT在整個事務中的占比與事務本身的耗時有關,整個事務的時間越短,性能越差。
-- 測試腳本如下,10個寫,一個讀。
vi test4.sql
\set id1 random(1,10000000)
\set id2 random(1,1000000000)
select * from test3 where id=:id1;
insert into test4 values (:id2, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
insert into test4 values (:id2+100, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
insert into test4 values (:id2+1000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
insert into test4 values (:id2+2000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
insert into test4 values (:id2+5000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
insert into test4 values (:id2+10000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
insert into test4 values (:id2-100, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
insert into test4 values (:id2-1000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
insert into test4 values (:id2-5000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
insert into test4 values (:id2-9000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
-- 並發測試
pgbench -M prepared -n -r -P 1 -f ./test4.sql -c 64 -j 64 -T 120
測試結果
單節點:
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 2519640
latency average = 3.048 ms
latency stddev = 1.192 ms
tps = 20994.303252 (including connections establishing)
tps = 20996.499448 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.001 \set id1 random(1,10000000)
0.000 \set id2 random(1,1000000000)
0.078 select * from test3 where id=:id1;
0.301 insert into test4 values (:id2, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.297 insert into test4 values (:id2+100, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.296 insert into test4 values (:id2+1000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.296 insert into test4 values (:id2+2000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.296 insert into test4 values (:id2+5000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.297 insert into test4 values (:id2+10000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.295 insert into test4 values (:id2-100, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.296 insert into test4 values (:id2-1000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.297 insert into test4 values (:id2-5000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.297 insert into test4 values (:id2-9000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
三節點:
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 1638278
latency average = 4.687 ms
latency stddev = 30.096 ms
tps = 13649.890904 (including connections establishing)
tps = 13651.483514 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.002 \set id1 random(1,10000000)
0.000 \set id2 random(1,1000000000)
0.079 select * from test3 where id=:id1;
0.490 insert into test4 values (:id2, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.464 insert into test4 values (:id2+100, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.463 insert into test4 values (:id2+1000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.454 insert into test4 values (:id2+2000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.448 insert into test4 values (:id2+5000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.461 insert into test4 values (:id2+10000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.455 insert into test4 values (:id2-100, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.452 insert into test4 values (:id2-1000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.466 insert into test4 values (:id2-5000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
0.453 insert into test4 values (:id2-9000, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
性能對比報表如圖:
TPS | 隻讀 | 隻寫 | 讀多寫少 | 讀少寫多 |
---|---|---|---|---|
單節點(tps) | 1016279 | 208197 | 62476 | 20996 |
三節點(tps) | 1046771 | 145743 | 50943 | 13651 |
響應時間方差 | 隻讀 | 隻寫 | 讀多寫少 | 讀少寫多 |
---|---|---|---|---|
單節點(ms) | 0.031 | 0.383 | 0.294 | 1.192 |
三節點(ms) | 0.015 | 8.756 | 8.362 | 30.096 |
複製層麵 - MySQL和PostgreSQL的差異
複製機製決定了兩種產品的差異。
PostgreSQL,通過WAL的物理式複製同步備庫。產生多少WAL就複製多少WAL,不需要等待事務結束才開始複製。因此備庫與主庫的WAL延遲與事務大小無關,僅僅與網絡帶寬和網絡RT有關。每次事務結束時(不論事務大小),僅僅等待COMMIT RECORD ACK即可(commit record是固定大小的,非常小),所以不管事務多大,延遲都是等效的。
MySQL,通過binlog進行複製同步備庫。主庫上沒有結束的事務,binlog不會發給備庫,因此備庫的延遲和事務大小直接相關。事務越大(指產生影響的ROW越多的事務),產生的BINLOG越多,事務提交的RT越高,延遲越嚴重。MySQL業務應盡量避免大事務。
小結
從測試結果不難發現,三節點在寫事務層麵相比單節點要低一些(但是在接受範圍內),但給用戶換來的是高可用和高可靠兼得的效果。畢竟性能可以通過拆庫,優化等很多手段來彌補。金融級用戶看中的更多是可靠性和可用性,阿裏雲RDS三節點版本是很好的選擇。
三節點的性能影響主要來自事務提交後,等待WAL或binlog發送給備庫,收到ACK需要多久。PostgreSQL和MySQL的差異如上所述。
不同類型的場景,預期和實際測試效果一致。
雖然沒有測試兩節點,但是可以根據前麵介紹的架構以及數據庫的複製原理,推論出它的結果。
1、如果兩節點為異步複製配置,那麼性能應該和單節點相當。
2、如果兩節點為同步(帶自動降級功能)複製配置,那麼性能和三節點相當。
隻讀事務
隻讀事務,不管是幾節點,性能是一樣的。
隻寫小事務
隻寫的小事務,保護級別越高,RT越高,RT在整個事務中的占比越高,性能越差。
讀寫混合事務(讀多寫少)
讀多寫少的混合事務,保護級別越高,RT越高,但RT在整個事務中的占比與事務本身的耗時有關,整個事務的時間越短,性能越差。
讀寫混合事務(讀少寫多)
讀少寫多的混合事務,保護級別越高,RT越高,但RT在整個事務中的占比與事務本身的耗時有關,整個事務的時間越短,性能越差。
係列文章
《阿裏雲RDS金融數據庫(三節點版) - 案例篇》
阿裏雲RDS金融數據庫(三節點版)
阿裏雲RDS金融數據庫 - PostgreSQL三節點版(敬請期待)
最後更新:2017-07-14 18:02:24