PostgreSQL 中生成隨機漢字
標簽
PostgreSQL , 字符集 , 漢字編碼 , 隨機漢字 , chr , ascii , unicode
背景
漢字unicode編碼範圍
php中utf-8編碼下用正則表達式匹配漢字的最終正確表達式 ——
/[\x{4e00}-\x{9fa5}]/u
\u4e00-\u9fa5 (中文)
\x3130-\x318F (韓文)
\xAC00-\xD7A3 (韓文)
\u0800-\u4e00 (日文)
4e00 對應 19968
9fa5 對應 40869
一共 20901 個漢字
例子1
- 注意,為了保證能輸出所有的漢字,確保數據庫的編碼為UTF8,否則會報類似錯誤。
db=# select '\u9fa5'::text;
ERROR: Unicode escape values cannot be used for code point values above 007F when the server encoding is not UTF8 at or near "'\u9fa5"
LINE 1: select '\u9fa5'::text;
^
例子
set standard_conforming_strings =off;
set escape_string_warning=off;
postgres=# select '\u9fa5'::text;
text
------
龥
(1 row)
例子2
除了unicode的寫法,PostgreSQL還提供了兩個函數,支持數值的寫法。
Function | Return Type | Description | Example | Result |
---|---|---|---|---|
ascii(string) | int | ASCII code of the first character of the argument. For UTF8 returns the Unicode code point of the character. For other multibyte encodings, the argument must be an ASCII character. | ascii('x') | 120 |
chr(int) | text | Character with the given code. For UTF8 the argument is treated as a Unicode code point. For other multibyte encodings the argument must designate an ASCII character. The NULL (0) character is not allowed because text data types cannot store such bytes. | chr(65) | A |
postgres=# select chr(19968);
chr
-----
一
(1 row)
輸出所有漢字
例子
do language plpgsql $$
declare
res text := '';
begin
for i in 19968..40869 loop
res := res||chr(i);
end loop;
raise notice '%', res;
end;
$$;
NOTICE: 一丁丂七丄丅丆萬丈三上下丌不與丏丐醜丒專且丕世丗丘丙業叢東絲.............................
輸出隨機漢字
例子
創建函數
create or replace function gen_hanzi(int) returns text as $$
declare
res text;
begin
if $1 >=1 then
select string_agg(chr(19968+(random()*20901)::int), '') into res from generate_series(1,$1);
return res;
end if;
return null;
end;
$$ language plpgsql strict;
使用函數,生成隨機漢字
postgres=# select gen_hanzi(10);
gen_hanzi
----------------------
析埲錀噝穎燈嬪閸醿厧
(1 row)
postgres=# select gen_hanzi(10);
gen_hanzi
----------------------
仫哸擡襖批梹繜嚪隸別
(1 row)
參考
https://baike.baidu.com/item/Unicode
https://en.wikipedia.org/wiki/Unicode
《PostgreSQL 轉義、UNICODE、與SQL注入》
https://www.postgresql.org/docs/9.6/static/functions-string.html
最後更新:2017-04-26 17:31:25