Integrating Kerberos with the HDFS Service in an E-MapReduce Cluster
This article describes how to integrate Kerberos with the HDFS service in an E-MapReduce cluster.
Prerequisites:
- An E-MapReduce cluster has been created; this article uses HDFS on a non-HA cluster as the example.
- The HDFS service runs under the hdfs account.
- The HDFS package is installed under /usr/lib/hadoop-current, with its configuration in /etc/emr/hadoop-conf/.
I. Install and Configure Kerberos
1. Install Kerberos
Run on the master node:
sudo yum install krb5-server krb5-devel krb5-workstation -y
Run on the slave nodes:
sudo yum install krb5-devel krb5-workstation -y
2. Configure Kerberos
Modify the following configuration files on the master node:
a) /etc/krb5.conf
[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log
[libdefaults]
 default_realm = EMR.COM
 dns_lookup_realm = false
 dns_lookup_kdc = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true
[realms]
 EMR.COM = {
  kdc = emr-header-1
  admin_server = emr-header-1
 }
[domain_realm]
 .emr.com = EMR.COM
 emr.com = EMR.COM
b) /var/kerberos/krb5kdc/kdc.conf
[kdcdefaults]
 kdc_ports = 88
 kdc_tcp_ports = 88
[realms]
 EMR.COM = {
  #master_key_type = aes256-cts
  acl_file = /var/kerberos/krb5kdc/kadm5.acl
  dict_file = /usr/share/dict/words
  admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
  supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
 }
c) /var/kerberos/krb5kdc/kadm5.acl
*/admin@EMR.COM *
Configure the slave nodes
Simply copy the /etc/krb5.conf file modified on the master node to the same location on every slave node, for example:
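A minimal sketch of that copy, assuming root SSH access from the master node; the worker hostname is illustrative, so replace emr-worker-1 with your actual node names:
#distribute the Kerberos client configuration to a slave node
scp /etc/krb5.conf root@emr-worker-1:/etc/krb5.conf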
3. Create the Kerberos database
Run on the master node:
sudo kdb5_util create -r EMR.COM -s
Note:
If the command appears to hang at "Loading random data" (it can take a while), open another terminal and run some CPU- or disk-intensive operations to speed up entropy collection.
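For example (an illustrative way to generate activity; any command that keeps the CPU and disks busy helps feed the entropy pool):
#run in a second terminal while kdb5_util is waiting for random data
sudo find / -type f -exec cat {} \; > /dev/null 2>&1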
4. Start Kerberos
Run on the master node:
sudo service krb5kdc start
sudo service kadmin start
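Optionally, the two services can also be set to start on boot. This step is not part of the original procedure and assumes a CentOS-style init system where chkconfig is available:
#enable automatic start of the KDC and kadmin services
sudo chkconfig krb5kdc on
sudo chkconfig kadmin on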
5. Create a kadmin administrator account
Run as the root account on the master node:
$kadmin.local
#once inside kadmin.local, continue with:
$addprinc root/admin
#enter a password and remember it; it is needed whenever kadmin is run later
Afterwards, the kadmin command can be used on any node of the cluster to manage the Kerberos database (for example, adding principals), as sketched below.
Note: kadmin.local can only be run on the machine hosting the kadmin server (i.e. the master node) and only with root privileges; in every other case use kadmin.
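For example, to list the principals created so far from any node (this assumes the root/admin principal and password from step 5):
#query the Kerberos database remotely; enter the root/admin password when prompted
$kadmin -p root/admin -q "listprincs"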
II. Integrate Kerberos with the HDFS Service
1. Create keytab files
Create a keytab file on every node of the cluster. It is used for authentication between the HDFS daemons (NameNode, DataNode, etc.) and prevents unauthorized nodes from joining the cluster.
All HDFS daemons in an E-MapReduce cluster are started under the hdfs account, so they share the same keytab configuration.
Run the following commands on each machine in the cluster. The master node is used as the example; perform the same steps on the other nodes.
$sudo su hdfs
$hostname
emr-header-1.cluster-xxxx
#the hostname is needed below
$sudo kadmin
#enter the password; once inside kadmin, run:
# the principals use the hostname obtained above, i.e. emr-header-1.cluster-xxxx
$kadmin: addprinc -randkey hdfs/emr-header-1.cluster-xxxx@EMR.COM
$kadmin: addprinc -randkey HTTP/emr-header-1.cluster-xxxx@EMR.COM
$kadmin: xst -k hdfs-unmerged.keytab hdfs/emr-header-1.cluster-xxxx@EMR.COM
$kadmin: xst -k http.keytab HTTP/emr-header-1.cluster-xxxx@EMR.COM
$kadmin: exit
#merge http.keytab and hdfs-unmerged.keytab
$sudo ktutil
#once inside ktutil, run:
$ktutil: rkt hdfs-unmerged.keytab
$ktutil: rkt http.keytab
$ktutil: wkt hdfs.keytab
$ktutil: exit
#copy hdfs.keytab to /etc/emr/hadoop-conf
$sudo cp hdfs.keytab /etc/emr/hadoop-conf
$sudo chown hdfs:hadoop /etc/emr/hadoop-conf/hdfs.keytab
$sudo chmod 400 /etc/emr/hadoop-conf/hdfs.keytab
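As a sanity check (not part of the original steps), the merged keytab can be inspected; it should contain entries for both the hdfs/ and HTTP/ principals of the current host:
#list the principals, key versions and encryption types stored in the keytab
$klist -ket /etc/emr/hadoop-conf/hdfs.keytab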
2. Modify the HDFS service configuration
Integrating Kerberos with the HDFS service requires changes to core-site.xml and hdfs-site.xml, as follows.
Note: the changes must be made on every node of the cluster.
a) core-site.xml
Path: /etc/emr/hadoop-conf/core-site.xml
Operate as the hadoop account (sudo su hadoop).
Add the following configuration items:
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value> <!-- A value of "simple" would disable security. -->
</property>
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
b) hdfs-site.xml
Path: /etc/emr/hadoop-conf/hdfs-site.xml
Operate as the hadoop account (sudo su hadoop).
Add the following configuration items:
<!-- General HDFS security config -->
<property>
<name>dfs.block.access.token.enable</name>
<value>true</value>
</property>
<!-- NameNode security config -->
<property>
<name>dfs.namenode.keytab.file</name>
<value>/etc/emr/hadoop-conf/hdfs.keytab</value> <!-- path to the HDFS keytab -->
</property>
<property>
<name>dfs.namenode.kerberos.principal</name>
<value>hdfs/_HOST@EMR.COM</value>
</property>
<property>
<name>dfs.namenode.kerberos.internal.spnego.principal</name>
<value>HTTP/_HOST@EMR.COM</value>
</property>
<!-- Secondary NameNode security config -->
<property>
<name>dfs.secondary.namenode.keytab.file</name>
<value>/etc/emr/hadoop-conf/hdfs.keytab</value> <!-- path to the HDFS keytab -->
</property>
<property>
<name>dfs.secondary.namenode.kerberos.principal</name>
<value>hdfs/_HOST@EMR.COM</value>
</property>
<property>
<name>dfs.secondary.namenode.kerberos.internal.spnego.principal</name>
<value>HTTP/_HOST@EMR.COM</value>
</property>
<!-- DataNode security config -->
<property>
<name>dfs.datanode.data.dir.perm</name>
<value>700</value>
</property>
<property>
<name>dfs.datanode.keytab.file</name>
<value>/etc/emr/hadoop-conf/hdfs.keytab</value> <!-- path to the HDFS keytab -->
</property>
<property>
<name>dfs.datanode.kerberos.principal</name>
<value>hdfs/_HOST@EMR.COM</value>
</property>
<!-- DataNode SASL config -->
<property>
<name>dfs.http.policy</name>
<value>HTTPS_ONLY</value>
</property>
<property>
<name>dfs.data.transfer.protection</name>
<value>integrity</value>
</property>
<property>
<name>dfs.web.authentication.kerberos.principal</name>
<value>HTTP/_HOST@EMR.COM</value>
</property>
<property>
<name>dfs.web.authentication.kerberos.keytab</name>
<value>/etc/emr/hadoop-conf/hdfs.keytab</value> <!-- path to the HTTP keytab -->
</property>
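After editing, one quick way to confirm that a value is picked up is to query it with hdfs getconf (assuming the hdfs command on the node reads its configuration from /etc/emr/hadoop-conf):
#should print hdfs/_HOST@EMR.COM
$hdfs getconf -confKey dfs.namenode.kerberos.principal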
3. Generate keystore files
HDFS uses HTTPS to transfer data, which requires keystore-related configuration.
Run on the master node:
$sudo su hadoop
#generate the CA files
$openssl req -new -x509 -keyout ca-key -out ca-cert -days 1000
Still on the master node, repeat the commands below to generate a keystore/truststore pair for every node of the cluster.
Note: each time you repeat them for another node, change some of the file names in the commands (to avoid overwriting earlier output); these names are marked below with angle brackets (<>).
# Example: generating the keystore/truststore for the master node
$keytool -keystore <keystore> -alias localhost -validity 1000 -genkey
Enter keystore password:
Re-enter new password:
What is your first and last name?
[Unknown]: emr-header-1 #Note: differs per node, e.g. emr-worker-1
What is the name of your organizational unit?
[Unknown]: EMR
What is the name of your organization?
[Unknown]: EMR
What is the name of your City or Locality?
[Unknown]: EMR
What is the name of your State or Province?
[Unknown]: EMR
What is the two-letter country code for this unit?
[Unknown]: EMR
Is CN=emr-header-1, OU=EMR, O=EMR, L=EMR, ST=EMR, C=EMR correct?
Enter key password for <localhost>
(RETURN if same as keystore password):
$keytool -keystore <truststore> -alias CARoot -import -file ca-cert
$keytool -keystore <keystore> -alias localhost -certreq -file <cert-file>
$openssl x509 -req -CA ca-cert -CAkey ca-key -in <cert-file> -out <cert-signed> -days 1000 -CAcreateserial -passin pass:AMRtest1234
$keytool -keystore <keystore> -alias CARoot -import -file ca-cert
$keytool -keystore <keystore> -alias localhost -import -file <cert-signed>
After the commands above finish, the new files <keystore> and <truststore> will be in the current directory. Copy (scp) them to the /etc/emr/hadoop-conf/ directory of the corresponding machine.
#the master node does not need scp; just cp the files locally
$cp keystore /etc/emr/hadoop-conf
$cp truststore /etc/emr/hadoop-conf
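For the other nodes the files are copied over the network instead; a sketch, assuming node-specific file names such as keystore-worker-1 were chosen in the commands above and that root SSH access is available (file names and hostname are illustrative):
#copy a worker node's keystore/truststore into place
$scp keystore-worker-1 root@emr-worker-1:/etc/emr/hadoop-conf/keystore
$scp truststore-worker-1 root@emr-worker-1:/etc/emr/hadoop-conf/truststore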
4. Configure SSL
Run on the master node:
$sudo su hadoop
$cp /etc/emr/hadoop-conf/ssl-server.xml.example /etc/emr/hadoop-conf/ssl-server.xml
Modify (do not simply overwrite) the values of the relevant configuration keys in the ssl-server.xml file.
Note:
Replace the passwords in the configuration with the ones you chose when generating the keystore/truststore above.
<property>
<name>ssl.server.truststore.location</name>
<value>/etc/emr/hadoop-conf/truststore</value>
<description>Truststore to be used by NN and DN. Must be specified.
</description>
</property>
<property>
<name>ssl.server.truststore.password</name>
<value>YOUR_TRUSTSTORE_PASSWD</value>
<description>Optional. Default value is "".
</description>
</property>
<property>
<name>ssl.server.keystore.location</name>
<value>/etc/emr/hadoop-conf/keystore</value>
<description>Keystore to be used by NN and DN. Must be specified.
</description>
</property>
<property>
<name>ssl.server.keystore.password</name>
<value>YOUR_KEYSTORE_PASSWD</value>
<description>Must be specified.
</description>
</property>
<property>
<name>ssl.server.keystore.keypassword</name>
<value>YOUR_KEYSTORE_PASSWD</value>
<description>Must be specified.
</description>
</property>
Finally, scp this ssl-server.xml file from the master node to the /etc/emr/hadoop-conf directory on all other nodes, then verify the deployment as sketched below.
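Before restarting HDFS, you can check on each node that the password recorded in ssl-server.xml actually opens the deployed keystore (a sanity check; enter the keystore password when prompted):
#list the certificates stored in the deployed keystore
$keytool -list -keystore /etc/emr/hadoop-conf/keystore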
5. Restart the HDFS service
Run on the master node:
$sudo su hdfs
#stop the HDFS service across the cluster
$/usr/lib/hadoop-current/sbin/stop-dfs.sh
#start the NameNode
$/usr/lib/hadoop-current/sbin/hadoop-daemon.sh start namenode
Run on the slave nodes:
#start the DataNode
$sudo su hdfs
$/usr/lib/hadoop-current/sbin/hadoop-daemon.sh start datanode
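To confirm the daemons came up, the running Java processes can be listed on each node (jps ships with the JDK; NameNode should appear on the master and DataNode on each slave):
#list running Hadoop JVM processes
$jps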
6. Verify HDFS
Run on the master node:
$useradd testkb
$sudo su testkb
$hadoop fs -ls /
17/05/09 12:04:19 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
ls: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "emr-header-1.cluster-xxxx/10.26.6.62"; destination host is: "emr-header-1.cluster-xxxx":9000;
The error above shows that Kerberos authentication on the HDFS service has taken effect. Continue with:
#exit from the testkb account back to root, then run
# add a principal for testkb
$kadmin.local
$kadmin.local: addprinc testkb
Switch back to the testkb account:
$sudo su testkb
$hadoop fs -ls /
17/05/09 12:04:19 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
ls: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "emr-header-1.cluster-xxxx/10.26.6.62"; destination host is: "emr-header-1.cluster-xxxx":9000;
#obtain a TGT for testkb
$kinit testkb
#the verification now succeeds
$hadoop fs -ls /
drwxr-xr-x - hadoop hadoop 0 2017-05-09 10:12 /apps
drwxr-xr-x - hadoop hadoop 0 2017-05-09 11:57 /spark-history
drwxrwxrwx - hadoop hadoop 0 2017-05-09 10:12 /tmp
drwxr-xr-x - hadoop hadoop 0 2017-05-09 10:14 /usr
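After kinit, the ticket cache can be inspected to see the TGT that made the listing succeed, and discarded again when done:
#show the cached Kerberos tickets for testkb
$klist
#optionally drop the credentials
$kdestroy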