1. Virtual Machine Setup
I build the big data cluster on local virtual machines here; if budget allows, a cloud platform works just as well.
In VMware's Virtual Network Editor, configure the VMnet8 virtual network adapter as follows:
Subnet: 192.168.88.0
Gateway: 192.168.88.2
Download the CentOS 7.6 ISO, create a new virtual machine, and choose the standard, fully automated installation. Once the install finishes, make three full clones of the system and name them node1, node2, and node3. Give node1 4 GB of memory and node2/node3 2 GB each.
Power on node1, change its hostname, and set a static IP of 192.168.88.101; node2 and node3 get 192.168.88.102 and 192.168.88.103 respectively:
hostnamectl set-hostname node1
vim /etc/sysconfig/network-scripts/ifcfg-ens33   # set IPADDR="192.168.88.101"
systemctl restart network
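For reference, a minimal static-IP version of ifcfg-ens33 for node1 might look like the sketch below. The netmask, gateway, and DNS values are assumptions derived from the 192.168.88.0/24 network and the 192.168.88.2 gateway configured above; keep whatever UUID/HWADDR lines your installer generated.

# /etc/sysconfig/network-scripts/ifcfg-ens33 (sketch, node1)
TYPE="Ethernet"
NAME="ens33"
DEVICE="ens33"
ONBOOT="yes"
BOOTPROTO="static"
IPADDR="192.168.88.101"
NETMASK="255.255.255.0"
GATEWAY="192.168.88.2"
DNS1="192.168.88.2"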
On the Windows host, edit the hosts file and add the following entries:
192.168.88.101 node1
192.168.88.102 node2
192.168.88.103 node3
On each of the three Linux machines, add the same entries to /etc/hosts:
192.168.88.101 node1
192.168.88.102 node2
192.168.88.103 node3
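An optional sanity check (not part of the original steps): once the hosts entries are in place, each node should be able to resolve the others by name.

# run from any node; each hostname should resolve to its 192.168.88.x address
ping -c 3 node1
ping -c 3 node2
ping -c 3 node3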
Configure passwordless SSH login (run these on every node):
ssh-keygen -t rsa -b 4096
ssh-copy-id node1
ssh-copy-id node2
ssh-copy-id node3
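A quick way to confirm the key exchange worked (optional check):

# should print the remote hostname without prompting for a password
ssh node2 hostname
ssh node3 hostname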
Create a regular hadoop user, then repeat the SSH steps above as that user:
useradd hadoop
passwd hadoop
su - hadoop
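As the hadoop user, the key exchange is the same as before; a sketch of the repeated steps (run on every node while logged in as hadoop):

# run as the hadoop user on node1, node2 and node3
ssh-keygen -t rsa -b 4096
ssh-copy-id node1
ssh-copy-id node2
ssh-copy-id node3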
Download the JDK (8u351 here) and install it under /export/server:
mkdir -p /export/server
tar -zxvf jdk-8u351-linux-x64.tar.gz -C /export/server
ln -s /export/server/jdk1.8.0_351 /export/server/jdk

vim /etc/profile   # append the two export lines below
export JAVA_HOME=/export/server/jdk
export PATH=$PATH:$JAVA_HOME/bin

source /etc/profile
rm -f /usr/bin/java
ln -s /export/server/jdk/bin/java /usr/bin/java

java -version
javac -version
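The JDK needs to exist at the same path on node2 and node3 as well, since hadoop-env.sh below points every node at /export/server/jdk. Either repeat the install on each node, or copy it over from node1; a sketch of the copy approach (assuming the same directory layout on all nodes):

# copy the unpacked JDK from node1 and recreate the symlink on node2/node3
for host in node2 node3; do
  ssh $host "mkdir -p /export/server"
  scp -r /export/server/jdk1.8.0_351 $host:/export/server/
  ssh $host "ln -s /export/server/jdk1.8.0_351 /export/server/jdk"
done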
To keep the firewall from interfering with the big data services, disable it, and disable SELinux as well:
systemctl stop firewalld
systemctl disable firewalld

vim /etc/sysconfig/selinux   # set the line below
SELINUX=disabled
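Note that SELINUX=disabled only takes effect after a reboot; to switch SELinux off for the current session you can additionally run:

setenforce 0    # put SELinux into permissive mode immediately
getenforce      # should now report Permissive (or Disabled after a reboot)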
Set every server's time zone to China Standard Time (UTC+8) and keep the clocks in sync:
yum install -y ntp

rm -f /etc/localtime
sudo ln -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime

ntpdate -u ntp.aliyun.com

systemctl start ntpd
systemctl enable ntpd
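An optional check that the time zone change and NTP sync took effect:

date                    # should show CST (UTC+8)
timedatectl             # time zone line should read Asia/Shanghai
systemctl status ntpd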
2. Hadoop Deployment
Upload the Hadoop 3.3.4 tarball to node1, extract it to /export/server, and create a convenience symlink:

tar -zxvf hadoop-3.3.4.tar.gz -C /export/server
ln -s /export/server/hadoop-3.3.4 /export/server/hadoop
Switch to the Hadoop configuration directory and list all three DataNode hosts in the workers file:

cd /export/server/hadoop/etc/hadoop
vim workers   # the file should contain the three lines below
node1
node2
node3
In the same directory, append the following to hadoop-env.sh:

export JAVA_HOME=/export/server/jdk
export HADOOP_HOME=/export/server/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_LOG_DIR=$HADOOP_HOME/logs
Configure core-site.xml (NameNode address and I/O buffer size):

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node1:8020</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
</configuration>
Configure hdfs-site.xml (storage directories and HDFS defaults):

<configuration>
  <property>
    <name>dfs.datanode.data.dir.perm</name>
    <value>700</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/nn</value>
  </property>
  <property>
    <name>dfs.namenode.hosts</name>
    <value>node1,node2,node3</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>268435456</value>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>100</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/dn</value>
  </property>
</configuration>
Create the storage directories; /data/nn is only needed on node1, /data/dn on all three nodes:

# on node1
mkdir -p /data/nn
mkdir -p /data/dn
# on node2 and node3
mkdir -p /data/dn
Copy the Hadoop installation from node1 to node2 and node3, then recreate the symlink on each of them:

cd /export/server
scp -r hadoop-3.3.4 node2:`pwd`/
scp -r hadoop-3.3.4 node3:`pwd`/
# on node2 and node3
ln -s /export/server/hadoop-3.3.4 /export/server/hadoop
On all three nodes, append Hadoop to the PATH in /etc/profile and re-source it:

export HADOOP_HOME=/export/server/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Hand the data and install directories over to the hadoop user (on every node), then format the NameNode and start HDFS as that user on node1:

chown -R hadoop:hadoop /data
chown -R hadoop:hadoop /export

su - hadoop
hadoop namenode -format
start-dfs.sh

# to shut the cluster down later
stop-dfs.sh
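To verify that HDFS came up, a few optional checks; the process names and the 9870 web UI port mentioned below are Hadoop 3.x defaults, not something configured above:

# on node1: jps should list NameNode, DataNode and (typically) SecondaryNameNode
# on node2 and node3: jps should list DataNode
jps

# cluster report and a quick smoke test
hdfs dfsadmin -report
hdfs dfs -mkdir /test
hdfs dfs -ls /

The NameNode web UI should also be reachable from the Windows host at http://node1:9870, which is why the hosts entries were added there earlier.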