閱讀115 返回首頁    go 技術社區[雲棲]


Deepgreen的TPC-H測試步驟(簡略版)

本文僅簡略介紹Deepgreen數據庫在TPC-H測試中需要注意的幾個步驟,為朋友們以後做測試防雷,詳細步驟參見德哥帖子

1.上傳文件並解壓縮:


ChrisdeMacBook-Pro:tpch chris$ scp master.zip tpc-h-tool.zip dgadmin@192.168.100.107:/home/dgadmin
dgadmin@192.168.100.107's password:
master.zip                                                                                                                                       100%   21KB   3.6MB/s   00:00
tpc-h-tool.zip

unzip xxx

2.設置Makefile參數:


[dgadmin@linux1 ~]$ cd tpch_2_17_0/
[dgadmin@linux1 tpch_2_17_0]$ ls
dbgen  dev-tools  ref_data
[dgadmin@linux1 tpch_2_17_0]$ cd dbgen/
[dgadmin@linux1 dbgen]$ cp makefile.suite Makefile
[dgadmin@linux1 dbgen]$ vim Makefile

主要修改CC、Database、Machine、Workload四行:

################
## CHANGE NAME OF ANSI COMPILER HERE
################
CC      = gcc
# Current values for DATABASE are: INFORMIX, DB2, TDAT (Teradata)
#                                  SQLSERVER, SYBASE, ORACLE, VECTORWISE
# Current values for MACHINE are:  ATT, DOS, HP, IBM, ICL, MVS,
#                                  SGI, SUN, U2200, VMS, LINUX, WIN32
# Current values for WORKLOAD are:  TPCH
DATABASE= ORACLE
MACHINE = LINUX
WORKLOAD = TPCH

3.編譯:make


4.生成測試數據:


[dgadmin@linux1 dbgen]$ ./dbgen -s 1 -f
TPC-H Population Generator (Version 2.17.0)
Copyright Transaction Processing Performance Council 1994 - 2010

5.將測試數據轉換為postgresql識別的格式,刪除末尾的分隔符|


[dgadmin@linux1 dbgen]$ for i in `ls *.tbl`; do sed 's/|$//' $i > ${i/tbl/csv}; done

6.把包含csv文件的目錄,軟鏈接到/tmp/dss-data。tpch-pg腳本中一會要用到這個目錄


[dgadmin@linux1 dbgen]$ pwd
/home/dgadmin/tpch_2_17_0/dbgen
[dgadmin@linux1 dbgen]$ ln -s /home/dgadmin/tpch_2_17_0/dbgen /tmp/dss-data

7.切換到腳本文件master目錄下:


[dgadmin@linux1 ~]$ cd pg_tpch-master/
[dgadmin@linux1 pg_tpch-master]$ ls
drop-caches.sh  dss  LICENSE  process.php  README.md  tpch.sh
[dgadmin@linux1 pg_tpch-master]$ cd dss/
[dgadmin@linux1 dss]$ ls
templates  tpch-alter.sql  tpch-create.sql  tpch-index.sql  tpch-load.sql  tpch-pkeys.sql

8.修改tpch-load.sql文件,適應dg語法:


##########COPY命令格式有問題,為了獲得更好的效果,使用列存儲,修改如下舉例##########
) with (APPENDONLY=true,BLOCKSIZE=2097152,ORIENTATION=COLUMN,COMPRESSTYPE=lz4,CHECKSUM=true,OIDS=false);
備注:greenplum的壓縮類型為:QuickLZ

單獨用一個小文件測試數據是否可用:


psql -d tech
COPY region FROM '/tmp/dss-data/region.csv' WITH csv DELIMITER '|’;

9.把master下的所有文件複製到tpch的dbgen目錄下:


[dgadmin@linux1 ~]$ cp -r pg_tpch-master/* tpch_2_17_0/dbgen/
10.在dbgen文件夾下的dss目錄下創建文件夾:
[dgadmin@linux1 dbgen]$ mkdir dss/queries

10.生成測試sql:


[dgadmin@linux1 dbgen]$ 
for q in `seq 1 22`
 do
     DSS_QUERY=dss/templates ./qgen $q >> dss/queries/$q.sql
     sed 's/^select/explain select/' dss/queries/$q.sql > dss/queries/$q.explain.sql
     cat dss/queries/$q.sql >> dss/queries/$q.explain.sql;
 done

11.連接並創建測試數據庫tpch


[dgadmin@linux1 dbgen]$ psql -d postgres -p 15432
psql (8.2.15)
Type "help" for help.

postgres=# create database tpch owner=dgadmin;
CREATE DATABASE

12.設置幾個數據庫參數:


##########設置幾個參數##########
gpconfig -c enable_nestloop -v off
gpconfig -c work_mem -v 256MB
gpstop -u


13.執行測試:


nohup ./tpch.sh ./results tpch dgadmin &



最後更新:2017-06-11 14:33:03

  上一篇:go  Deepgreen(Greenplum) 模板數據庫template0和template1 探討
  下一篇:go  淺析分布式係統中的 Linearizability