Greenplum TPC-H測試
請參考如下
https://www.tpc.org/information/current_specifications.asp
https://github.com/digoal/pg_tpch
下載tpch
git clone https://github.com/digoal/pg_tpch.git
安裝tpch
$ cd gp_tpch
需要在Makefile中設置幾個參數:
################
## CHANGE NAME OF ANSI COMPILER HERE
################
CC = gcc
# Current values for DATABASE are: INFORMIX, DB2, TDAT (Teradata)
# SQLSERVER, SYBASE, ORACLE, VECTORWISE
# Current values for MACHINE are: ATT, DOS, HP, IBM, ICL, MVS,
# SGI, SUN, U2200, VMS, LINUX, WIN32
# Current values for WORKLOAD are: TPCH
DATABASE=ORACLE
MACHINE = LINUX
WORKLOAD = TPCH
安裝
$make
輸出dbgen用法:
$./dbgen -h
TPC-H Population Generator (Version 2.17.0 build 0)
Copyright Transaction Processing Performance Council 1994 - 2010
USAGE:
dbgen [-{vf}][-T {pcsoPSOL}]
[-s <scale>][-C <procs>][-S <step>]
dbgen [-v] [-O m] [-s <scale>] [-U <updates>]
Basic Options
===========================
-C <n> -- separate data set into <n> chunks (requires -S, default: 1)
-f -- force. Overwrite existing files
-h -- display this message
-q -- enable QUIET mode
-s <n> -- set Scale Factor (SF) to <n> (default: 1)
-S <n> -- build the <n>th step of the data/update set (used with -C or -U)
-U <n> -- generate <n> update sets
-v -- enable VERBOSE mode
Advanced Options
===========================
-b <s> -- load distributions for <s> (default: dists.dss)
-d <n> -- split deletes between <n> files (requires -U)
-i <n> -- split inserts between <n> files (requires -U)
-T c -- generate cutomers ONLY
-T l -- generate nation/region ONLY
-T L -- generate lineitem ONLY
-T n -- generate nation ONLY
-T o -- generate orders/lineitem ONLY
-T O -- generate orders ONLY
-T p -- generate parts/partsupp ONLY
-T P -- generate parts ONLY
-T r -- generate region ONLY
-T s -- generate suppliers ONLY
-T S -- generate partsupp ONLY
To generate the SF=1 (1GB), validation database population, use:
dbgen -vf -s 1
To generate updates for a SF=1 (1GB), use:
dbgen -v -U 1 -s 1
使用dbgen產生一些測試數據, -s 表示scale(單位為GB),根據需要測試的數據量指定:
$./dbgen -s 100 -f
$ll -h *.tbl
-rw-r--r-- 1 digoal users 12G Nov 19 13:35 customer.tbl
-rw-r--r-- 1 digoal users 6.6G Nov 19 13:39 lineitem.tbl
-rw-r--r-- 1 digoal users 2.2K Nov 19 12:48 nation.tbl
-rw-r--r-- 1 digoal users 1.5G Nov 19 13:39 orders.tbl
-rw-r--r-- 1 digoal users 1.2G Nov 19 12:48 partsupp.tbl
-rw-r--r-- 1 digoal users 233M Nov 19 12:48 part.tbl
-rw-r--r-- 1 digoal users 389 Nov 19 12:48 region.tbl
-rw-r--r-- 1 digoal users 686M Nov 19 13:30 supplier.tbl
生成的測試數據格式如下:
$head -n 10 customer.tbl
1|Customer#000000001|IVhzIApeRb ot,c,E|15|25-989-741-2988|711.56|BUILDING|to the even, regular platelets. regular, ironic epitaphs nag e|
2|Customer#000000002|XSTf4,NCwDVaWNe6tEgvwfmRchLXak|13|23-768-687-3665|121.65|AUTOMOBILE|l accounts. blithely ironic theodolites integrate boldly: caref|
3|Customer#000000003|MG9kdTD2WBHm|1|11-719-748-3364|7498.12|AUTOMOBILE| deposits eat slyly ironic, even instructions. express foxes detect slyly. blithely even accounts abov|
4|Customer#000000004|XxVSJsLAGtn|4|14-128-190-5944|2866.83|MACHINERY| requests. final, regular ideas sleep final accou|
5|Customer#000000005|KvpyuHCplrB84WgAiGV6sYpZq7Tj|3|13-750-942-6364|794.47|HOUSEHOLD|n accounts will have to unwind. foxes cajole accor|
6|Customer#000000006|sKZz0CsnMD7mp4Xd0YrBvx,LREYKUWAh yVn|20|30-114-968-4951|7638.57|AUTOMOBILE|tions. even deposits boost according to the slyly bold packages. final accounts cajole requests. furious|
7|Customer#000000007|TcGe5gaZNgVePxU5kRrvXBfkasDTea|18|28-190-982-9759|9561.95|AUTOMOBILE|ainst the ironic, express theodolites. express, even pinto beans among the exp|
8|Customer#000000008|I0B10bB0AymmC, 0PrRYBCP1yGJ8xcBPmWhl5|17|27-147-574-9335|6819.74|BUILDING|among the slyly regular theodolites kindle blithely courts. carefully even theodolites haggle slyly along the ide|
9|Customer#000000009|xKiAFTjUsCuxfeleNqefumTrjS|8|18-338-906-3675|8324.07|FURNITURE|r theodolites according to the requests wake thinly excuses: pending requests haggle furiousl|
10|Customer#000000010|6LrEaV6KR6PLVcgl2ArL Q3rqzLzcT1 v2|5|15-741-346-9870|2753.54|HOUSEHOLD|es regular deposits haggle. fur|
將測試數據轉換為postgresql識別的格式,刪除末尾的分隔符|。
$for i in `ls *.tbl`; do sed 's/|$//' $i > ${i/tbl/csv}; done
$ll -rth *.csv
-rw-r--r-- 1 digoal users 12G Dec 1 13:32 customer.csv
-rw-r--r-- 1 digoal users 6.5G Dec 1 13:33 lineitem.csv
-rw-r--r-- 1 digoal users 2.2K Dec 1 13:33 nation.csv
-rw-r--r-- 1 digoal users 1.5G Dec 1 13:34 orders.csv
-rw-r--r-- 1 digoal users 1.2G Dec 1 13:34 partsupp.csv
-rw-r--r-- 1 digoal users 231M Dec 1 13:34 part.csv
-rw-r--r-- 1 digoal users 384 Dec 1 13:34 region.csv
-rw-r--r-- 1 digoal users 681M Dec 1 13:34 supplier.csv
把包含csv文件的目錄,軟鏈接到/tmp/dss-data。
$ ln -s `pwd` /tmp/dss-data
生成測試SQL , SF= dbgen指定的scale值
SF=?
mkdir dss/queries
for q in `seq 1 22`
do
DSS_QUERY=dss/templates ./qgen -s $SF $q > dss/queries/$q.sql
sed 's/^select/explain select/' dss/queries/$q.sql > dss/queries/$q.explain.sql
done
在greenplum數據庫中創建數據庫和用戶(也可以不創建,隻要有就可以)
psql
postgres=# create role digoal login;
CREATE ROLE
postgres=# create database postgres template template0 owner digoal;
CREATE DATABASE
配置pg_hba.conf
$ vi $MASTER_DATA_DIRECTORY/pg_hba.conf
host all all 127.0.0.1/32 trust
$ gpstop -u
設置幾個參數:
gpconfig -c enable_nestloop -v off
gpconfig -c work_mem -v 256MB
gpstop -u
測試,使用digoal用戶連接到postgres數據庫,結果輸出到./results目錄:
自動創建表,加載數據。詳見tpch.sh腳本
$ ./tpch.sh ./results ip port tpch-db tpch-user password {row|column}
如果要測試redshift可以這樣
Redshift (copy by ssh):
// add manifest file to s3 first
// manifest file must in $S3/${table}.manifest
$ ./tpch.sh ./results ip port tpch-db tpch-user password redshift S3 EC2_ID EC2_KEY
可以在results中查看一些信息,例如每個SQL的explain,時間。
結束後,可以使用以下方法生成CSV報告。
$ php process.php ./results output.csv
測試結果
2015-12-08 10:40:23 [1449542423] : running TPC-H benchmark
2015-12-08 10:40:23 [1449542423] : running queries defined in TPC-H benchmark
2015-12-08 10:40:23 [1449542423] : running query 1
2015-12-08 10:40:43 [1449542443] : query 1 finished OK (10 seconds)
2015-12-08 10:40:43 [1449542443] : running query 2
2015-12-08 10:40:47 [1449542447] : query 2 finished OK (2 seconds)
2015-12-08 10:40:47 [1449542447] : running query 3
2015-12-08 10:40:54 [1449542454] : query 3 finished OK (4 seconds)
2015-12-08 10:40:54 [1449542454] : running query 4
2015-12-08 10:40:56 [1449542456] : query 4 finished OK (1 seconds)
2015-12-08 10:40:56 [1449542456] : running query 5
2015-12-08 10:40:58 [1449542458] : query 5 finished OK (1 seconds)
2015-12-08 10:40:58 [1449542458] : running query 6
2015-12-08 10:40:59 [1449542459] : query 6 finished OK (1 seconds)
2015-12-08 10:40:59 [1449542459] : running query 7
2015-12-08 10:41:10 [1449542470] : query 7 finished OK (6 seconds)
2015-12-08 10:41:10 [1449542470] : running query 8
2015-12-08 10:41:18 [1449542478] : query 8 finished OK (4 seconds)
2015-12-08 10:41:18 [1449542478] : running query 9
2015-12-08 10:41:54 [1449542514] : query 9 finished OK (18 seconds)
2015-12-08 10:41:54 [1449542514] : running query 10
2015-12-08 10:41:55 [1449542515] : query 10 finished OK (1 seconds)
2015-12-08 10:41:55 [1449542515] : running query 11
2015-12-08 10:41:57 [1449542517] : query 11 finished OK (1 seconds)
2015-12-08 10:41:57 [1449542517] : running query 12
2015-12-08 10:41:59 [1449542519] : query 12 finished OK (1 seconds)
2015-12-08 10:41:59 [1449542519] : running query 13
2015-12-08 10:42:04 [1449542524] : query 13 finished OK (3 seconds)
2015-12-08 10:42:04 [1449542524] : running query 14
2015-12-08 10:42:05 [1449542525] : query 14 finished OK (1 seconds)
2015-12-08 10:42:05 [1449542525] : running query 15
2015-12-08 10:42:07 [1449542527] : query 15 finished OK (1 seconds)
2015-12-08 10:42:07 [1449542527] : running query 16
2015-12-08 10:42:10 [1449542530] : query 16 finished OK (2 seconds)
2015-12-08 10:42:10 [1449542530] : running query 17
2015-12-08 10:42:26 [1449542546] : query 17 finished OK (8 seconds)
2015-12-08 10:42:26 [1449542546] : running query 18
2015-12-08 10:42:46 [1449542566] : query 18 finished OK (10 seconds)
2015-12-08 10:42:46 [1449542566] : running query 19
2015-12-08 10:42:49 [1449542569] : query 19 finished OK (2 seconds)
2015-12-08 10:42:49 [1449542569] : running query 20
2015-12-08 10:42:51 [1449542571] : query 20 finished OK (1 seconds)
2015-12-08 10:42:51 [1449542571] : running query 21
2015-12-08 10:43:06 [1449542586] : query 21 finished OK (8 seconds)
2015-12-08 10:43:06 [1449542586] : running query 22
2015-12-08 10:43:10 [1449542590] : query 22 finished OK (2 seconds)
2015-12-08 10:43:10 [1449542590] : finished TPC-H benchmark
最後更新:2017-04-01 13:37:08