Setting Up a Gateway to Submit Jobs to an E-MapReduce Cluster
Gateway
Some customers need to set up their own Gateway to submit jobs to an E-MapReduce cluster. At present, E-MapReduce does not support purchasing a Gateway from the product page; the plan is to offer Gateways for purchase directly in the product later, with the Hadoop environment already prepared for users. The steps below assume a CentOS 7.2 Gateway machine.
Network
First, make sure the Gateway machine is in the security group of the corresponding EMR cluster, so that the Gateway node can reach the EMR cluster. For how to configure a machine's security group, see the ECS security group documentation.
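Before running the deploy scripts below, it is worth confirming that the Gateway can actually reach the master node. A minimal sketch (the IP is a placeholder, and 9000/8032 are only the Hadoop defaults for the NameNode RPC and ResourceManager ports; your cluster may use others):
masterip=10.27.227.223   # replace with your cluster master's IP
ping -c 3 $masterip
# Probe the default NameNode (9000) and ResourceManager (8032) ports
# using bash's built-in /dev/tcp, so no extra tools are needed.
for port in 9000 8032; do
    timeout 5 bash -c "cat < /dev/null > /dev/tcp/$masterip/$port" \
        && echo "port $port reachable" \
        || echo "port $port NOT reachable"
done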
Environment
- EMR-3.1.1
Copy the following script to the Gateway machine and run it.
Example: sh deploy.sh 10.27.227.223 /root/master_password_file
#!/usr/bin/bash
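# Deploy script for an EMR-3.1.1 gateway: copies the Hadoop/Hive/Spark
# packages, their configuration, the environment script, and the cluster
# host entries from the master node to this machine.
# Note: the JDK path linked near the end is version-specific; adjust it to
# match the java-1.8.0-openjdk build actually installed on the gateway.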
if [ $# != 2 ]
then
echo "Usage: $0 master_ip master_password_file"
exit 1;
fi
masterip=$1
masterpwdfile=$2
if ! type sshpass >/dev/null 2>&1; then
yum install -y sshpass
fi
if ! type java >/dev/null 2>&1; then
yum install -y java-1.8.0-openjdk
fi
mkdir -p /opt/apps
mkdir -p /etc/emr
echo "Start to copy package from $masterip to local gateway(/opt/apps)"
echo " -copying hadoop-2.7.2"
sshpass -f $masterpwdfile scp -r root@$masterip:/opt/apps/hadoop-2.7.2 /opt/apps/
echo " -copying hive-2.0.1"
sshpass -f $masterpwdfile scp -r root@$masterip:/opt/apps/apache-hive-2.0.1-bin /opt/apps/
echo " -copying spark-2.1.1"
sshpass -f $masterpwdfile scp -r root@$masterip:/opt/apps/spark-2.1.1-bin-hadoop2.7 /opt/apps/
echo "Start to link /usr/lib/\${app}-current to /opt/apps/\${app}"
if [ -L /usr/lib/hadoop-current ]
then
unlink /usr/lib/hadoop-current
fi
ln -s /opt/apps/hadoop-2.7.2 /usr/lib/hadoop-current
if [ -L /usr/lib/hive-current ]
then
unlink /usr/lib/hive-current
fi
ln -s /opt/apps/apache-hive-2.0.1-bin /usr/lib/hive-current
if [ -L /usr/lib/spark-current ]
then
unlink /usr/lib/spark-current
fi
ln -s /opt/apps/spark-2.1.1-bin-hadoop2.7 /usr/lib/spark-current
echo "Start to copy conf from $masterip to local gateway($confhome)"
sshpass -f $masterpwdfile scp -r root@$masterip:/etc/emr/hadoop-conf-2.7.2 /etc/emr/hadoop-conf-2.7.2
if [ -L /etc/emr/hadoop-conf ]
then
unlink /etc/emr/hadoop-conf
fi
ln -s /etc/emr/hadoop-conf-2.7.2 /etc/emr/hadoop-conf
sshpass -f $masterpwdfile scp -r root@$masterip:/etc/emr/hive-conf-2.0.1 /etc/emr/hive-conf-2.0.1
if [ -L /etc/emr/hive-conf ]
then
unlink /etc/emr/hive-conf
fi
ln -s /etc/emr/hive-conf-2.0.1 /etc/emr/hive-conf
sshpass -f $masterpwdfile scp -r root@$masterip:/etc/emr/spark-conf /etc/emr/spark-conf-2.1.1
if [ -L /etc/emr/spark-conf ]
then
unlink /etc/emr/spark-conf
fi
ln -s /etc/emr/spark-conf-2.1.1 /etc/emr/spark-conf
echo "Start to copy environment from $masterip to local gateway(/etc/profile.d)"
sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/hadoop.sh /etc/profile.d/
if [ -L /usr/lib/jvm/java ]
then
unlink /usr/lib/jvm/java
fi
ln -s /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-3.b12.el7_3.x86_64/jre /usr/lib/jvm/java
echo "Start to copy host info from $masterip to local gateway(/etc/hosts)"
sshpass -f $masterpwdfile scp root@$masterip:/etc/hosts /etc/hosts_bak
cat /etc/hosts_bak | grep emr | grep cluster >>/etc/hosts
if ! id hadoop >& /dev/null
then
useradd hadoop
fi
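Note that the script installs hadoop.sh into /etc/profile.d/, which is only sourced by new login shells. To pick up the environment in the current shell and sanity-check the installation, a minimal sketch:
# Load the environment copied from the master without re-logging in.
source /etc/profile.d/hadoop.sh
# Verify that the hadoop client and JAVA_HOME resolve correctly.
hadoop version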
- EMR-3.2.0
Copy the following script to the Gateway machine and run it.
Example: sh deploy.sh 10.27.227.223 /root/master_password_file
#!/usr/bin/bash
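# Deploy script for an EMR-3.2.0 gateway: copies the Hadoop/Hive/Spark
# packages (from the ecm package tree), their configuration, the per-service
# environment scripts, and the cluster host entries from the master node.
# Note: the JDK path linked near the end is version-specific; adjust it to
# match the java-1.8.0-openjdk build actually installed on the gateway.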
if [ $# != 2 ]
then
echo "Usage: $0 master_ip master_password_file"
exit 1;
fi
masterip=$1
masterpwdfile=$2
if ! type sshpass >/dev/null 2>&1; then
yum install -y sshpass
fi
if ! type java >/dev/null 2>&1; then
yum install -y java-1.8.0-openjdk
fi
mkdir -p /opt/apps
mkdir -p /etc/ecm
echo "Start to copy package from $masterip to local gateway(/opt/apps)"
echo " -copying hadoop-2.7.2"
sshpass -f $masterpwdfile scp -r root@$masterip:/opt/apps/ecm/service/hadoop/2.7.2/package/hadoop-2.7.2 /opt/apps/
echo " -copying hive-2.0.1"
sshpass -f $masterpwdfile scp -r root@$masterip:/opt/apps/ecm/service/hive/2.0.1/package/apache-hive-2.0.1-bin /opt/apps/
echo " -copying spark-2.1.1"
sshpass -f $masterpwdfile scp -r root@$masterip:/opt/apps/ecm/service/spark/2.1.1/package/spark-2.1.1-bin-hadoop2.7 /opt/apps/
echo "Start to link /usr/lib/\${app}-current to /opt/apps/\${app}"
if [ -L /usr/lib/hadoop-current ]
then
unlink /usr/lib/hadoop-current
fi
ln -s /opt/apps/hadoop-2.7.2 /usr/lib/hadoop-current
if [ -L /usr/lib/hive-current ]
then
unlink /usr/lib/hive-current
fi
ln -s /opt/apps/apache-hive-2.0.1-bin /usr/lib/hive-current
if [ -L /usr/lib/spark-current ]
then
unlink /usr/lib/spark-current
fi
ln -s /opt/apps/spark-2.1.1-bin-hadoop2.7 /usr/lib/spark-current
echo "Start to copy conf from $masterip to local gateway($confhome)"
sshpass -f $masterpwdfile scp -r root@$masterip:/etc/ecm/hadoop-conf-2.7.2 /etc/ecm/hadoop-conf-2.7.2
if [ -L /etc/ecm/hadoop-conf ]
then
unlink /etc/ecm/hadoop-conf
fi
ln -s /etc/ecm/hadoop-conf-2.7.2 /etc/ecm/hadoop-conf
sshpass -f $masterpwdfile scp -r root@$masterip:/etc/ecm/hive-conf-2.0.1 /etc/ecm/hive-conf-2.0.1
if [ -L /etc/ecm/hive-conf ]
then
unlink /etc/ecm/hive-conf
fi
ln -s /etc/ecm/hive-conf-2.0.1 /etc/ecm/hive-conf
sshpass -f $masterpwdfile scp -r root@$masterip:/etc/ecm/spark-conf /etc/ecm/spark-conf-2.1.1
if [ -L /etc/ecm/spark-conf ]
then
unlink /etc/ecm/spark-conf
fi
ln -s /etc/ecm/spark-conf-2.1.1 /etc/ecm/spark-conf
echo "Start to copy environment from $masterip to local gateway(/etc/profile.d)"
sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/hdfs.sh /etc/profile.d/
sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/yarn.sh /etc/profile.d/
sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/hive.sh /etc/profile.d/
sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/spark.sh /etc/profile.d/
if [ -L /usr/lib/jvm/java ]
then
unlink /usr/lib/jvm/java
fi
echo "" >>/etc/profile.d/hdfs.sh
echo export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-3.b12.el7_3.x86_64/jre >>/etc/profile.d/hdfs.sh
ln -s /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-3.b12.el7_3.x86_64/jre /usr/lib/jvm/java
echo "Start to copy host info from $masterip to local gateway(/etc/hosts)"
sshpass -f $masterpwdfile scp root@$masterip:/etc/hosts /etc/hosts_bak
cat /etc/hosts_bak | grep emr | grep cluster >>/etc/hosts
if ! id hadoop >& /dev/null
then
useradd hadoop
fi
Once the script has finished, the Gateway configuration is complete.
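Before testing, it can help to confirm that the client configuration actually points at the cluster. A minimal sketch (both commands only read cluster state):
# Pick up the environment scripts installed by the deploy script.
source /etc/profile
# List the HDFS root and the YARN NodeManagers as seen from the gateway.
hadoop fs -ls /
yarn node -list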
Testing
Switch to the hadoop account, as shown below.
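The deploy script creates the hadoop user if it does not already exist; switching with a login shell also sources the environment scripts under /etc/profile.d:
# "-" starts a login shell, so /etc/profile.d/*.sh are sourced.
su - hadoop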
- Hive
[hadoop@iZ23bc05hrvZ ~]$ hive
hive> show databases;
OK
default
Time taken: 1.124 seconds, Fetched: 1 row(s)
hive> create database school;
OK
Time taken: 0.362 seconds
hive>
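Hive can also be run non-interactively from the Gateway, which is convenient for scripted jobs; for example, with the standard -e flag (the school database is the one created in the session above):
# Run one-off statements without entering the interactive shell.
hive -e "show databases;"
hive -e "use school; show tables;"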
- Run a Hadoop job
[hadoop@iZ23bc05hrvZ ~]$ hadoop jar /usr/lib/hadoop-current/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar pi 10 10
Number of Maps = 10
Samples per Map = 10
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
File Input Format Counters
        Bytes Read=1180
File Output Format Counters
        Bytes Written=97
Job Finished in 29.798 seconds
Estimated value of Pi is 3.20000000000000000000
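Because the job runs on the cluster's YARN, you can confirm from the Gateway that it was really submitted there; for example:
# List finished YARN applications as seen from the gateway client.
yarn application -list -appStates FINISHED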
- Run a Spark job
[hadoop@iZ23bc05hrvZ ~]$ spark-submit --class org.apache.spark.examples.JavaWordCount --master yarn-client ./sparkbench-4.0-SNAPSHOT-MR2-spark1.4-jar-with-dependencies.jar /path/Input /path/Output
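The jar in the command above is the user's own application jar. To verify Spark-on-YARN connectivity without any input data, the examples jar bundled with Spark works as well; a minimal sketch (the jar path assumes the spark-2.1.1-bin-hadoop2.7 layout copied by the deploy script):
# SparkPi needs no input, so it is a convenient end-to-end test.
spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn --deploy-mode client \
    /usr/lib/spark-current/examples/jars/spark-examples_2.11-2.1.1.jar 10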
Last updated: 2017-06-15 00:32:17