289 阿裏雲技術社區[雲棲]

新零售空間數據庫實踐一例 - PostGIS 點麵疊加視覺判斷輸出

標簽

PostgreSQL , 點麵視覺輸出 , subquery , nestloop join , 空間索引 , gist

背景

在新零售、快遞等行業，有大量的點數據（例如包裹位置、快遞員位置、倉庫位置等），同時有大量的麵數據（如小區，商圈，寫字樓等）。

如何判斷實時的正在配送的包裹落在哪個麵呢？並且將之聯係起來。

這個從視覺角度來思考，非常簡單。

例如有一個地圖，將其劃分為若幹個麵（例如前麵提到的小區）。

然後有一些小點，這些是POINT數據。

我們從圖上一眼就能看出每個點落在哪個小區（麵）裏麵。

在數據庫中，這可能是兩份數據（一份為點，一份為麵）。輸出的實際上是點+麵(ID)的數據。

怎麼做到高效的輸出呢？

DEMO

搜索某些訂單，當前處於哪個麵。這是非常典型的點麵判斷需求。

接下來的例子，有25萬個麵，查詢若幹筆訂單屬於哪個麵。

1、創建、生成靜態的麵數據（通常麵的數據是靜態的，例如小區，商圈，大樓，倉庫覆蓋範圍等）

postgres=# create table t2(id int, pos box);  
CREATE TABLE  
  
-- 在(0,0)到(500,500)的平麵上，劃分成251001個正方形的小麵。  
  
postgres=# do language plpgsql $$    
declare  
x int;  
y int;  
begin  
for x in 0..500 loop  
for y in 0..500 loop   
  insert into t2 values (x+y, box(point(x,y),point(x+1,y+1)));  
end loop;  
end loop;  
end;  
$$;  
DO  
  
postgres=# select count(*) from t2;  
 count    
--------  
 251001  
(1 row)

創建空間索引

postgres=# create index idx_t2 on t2 using gist(pos);
CREATE INDEX

2、創建、生成點的數據。

postgres=# create table t1(id int, pos point);  
CREATE TABLE

-- 在(0,0),(500,500)的平麵上，生成10000個隨機的點  
  
postgres=# insert into t1 select id, point(random()*500, random()*500) from generate_series(1,10000) t(id);  
INSERT 0 10000

3、查詢每個點，屬於哪個麵。

方法1，JOIN

postgres=# explain analyze select t1.*,t2.* from t1 join t2 on (t2.pos @> box(t1.pos, t1.pos));  
                                                       QUERY PLAN                                                         
------------------------------------------------------------------------------------------------------------------------  
 Nested Loop  (cost=0.29..73322.20 rows=2510010 width=56) (actual time=0.094..1191.076 rows=10000 loops=1)  
   ->  Seq Scan on t1  (cost=0.00..116.00 rows=10000 width=20) (actual time=0.020..1.047 rows=10000 loops=1)  
   ->  Index Scan using idx_t2 on t2  (cost=0.29..4.81 rows=251 width=36) (actual time=0.039..0.118 rows=1 loops=10000)  
         Index Cond: (pos @> box(t1.pos, t1.pos))  
 Planning time: 0.102 ms  
 Execution time: 1191.619 ms  
(6 rows)

方法2，SUBQUERY

postgres=# explain analyze select t1.*, (select t2 from t2 where t2.pos @> box(t1.pos,t1.pos) limit 1) from t1;  
                                                            QUERY PLAN                                                              
----------------------------------------------------------------------------------------------------------------------------------  
 Seq Scan on t1  (cost=0.00..13706.74 rows=10000 width=52) (actual time=0.077..427.466 rows=10000 loops=1)  
   SubPlan 1  
     ->  Limit  (cost=0.29..1.36 rows=1 width=60) (actual time=0.042..0.042 rows=1 loops=10000)  
           ->  Index Scan using idx_t2 on t2  (cost=0.29..269.88 rows=251 width=60) (actual time=0.042..0.042 rows=1 loops=10000)  
                 Index Cond: (pos @> box(t1.pos, t1.pos))  
 Planning time: 0.080 ms  
 Execution time: 427.942 ms  
(7 rows)

如果是1000筆訂單，返回差不多40毫秒

postgres=# explain (analyze,verbose,timing,costs,buffers) select t1.*, (select t2 from t2 where t2.pos @> box(t1.pos,t1.pos) limit 1) from t1 limit 1000;
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..1370.67 rows=1000 width=52) (actual time=0.069..39.754 rows=1000 loops=1)
   Output: t1.id, t1.pos, ((SubPlan 1))
   Buffers: shared hit=3002
   ->  Seq Scan on public.t1  (cost=0.00..13706.74 rows=10000 width=52) (actual time=0.069..39.658 rows=1000 loops=1)
         Output: t1.id, t1.pos, (SubPlan 1)
         Buffers: shared hit=3002
         SubPlan 1
           ->  Limit  (cost=0.29..1.36 rows=1 width=60) (actual time=0.039..0.039 rows=1 loops=1000)
                 Output: t2.*
                 Buffers: shared hit=3000
                 ->  Index Scan using idx_t2 on public.t2  (cost=0.29..269.88 rows=251 width=60) (actual time=0.039..0.039 rows=1 loops=1000)
                       Output: t2.*
                       Index Cond: (t2.pos @> box(t1.pos, t1.pos))
                       Buffers: shared hit=3000
 Planning time: 0.066 ms
 Execution time: 39.830 ms
(16 rows)

因為@>暫時不支持hash join，因此subquery更優一些。

本文沒有用到PostGIS空間數據庫插件，而是使用了內置的平麵幾何類型，用於演示。

真實場景請使用PostGIS。

https://postgis.net/

例如

select  t1.*, (select t2 from t2 where ST_Within(t1.geom, t2.geom) limit 1) from t1;

小結

點麵判斷在GIS信息崛起的今天，在越來越多的企業中成為了非常常見的需求，比如文中提到的。

PostgreSQL在空間數據庫領域有非常豐富的應用，從科研、軍工、商業到民用，無處不在。

結合空間索引，BRIN索引實現空間數據的高效率檢索是很輕鬆的事情。

最後更新：2017-09-18 00:04:29

新零售空間數據庫實踐一例 - PostGIS 點麵疊加視覺判斷輸出

標簽

背景

DEMO

小結

上一篇：海量用戶實時定位和圈人 - 團圓社會公益係統(位置尋人\圈人)

下一篇： Greenplum ao表和heap表垃圾回收的細微差別

相關內容

熱門內容

最新內容

新零售空間數據庫實踐一例 - PostGIS 點麵疊加視覺判斷輸出

標簽

背景

DEMO

小結

上一篇： 海量用戶實時定位和圈人 - 團圓社會公益係統(位置尋人\圈人)

下一篇： Greenplum ao表和heap表垃圾回收的細微差別

相關內容

熱門內容

最新內容

上一篇：海量用戶實時定位和圈人 - 團圓社會公益係統(位置尋人\圈人)