Big Data Development: Hadoop Distributed Cluster Setup

Contents: Environment Preparation · Hadoop Configuration · Starting the Hadoop Cluster · Hadoop Client Node

Environment Preparation

- JDK 1.8
- Hadoop 3.x
- Three servers

The master node runs three processes: NameNode, SecondaryNameNode, and ResourceManager. Each worker node runs two processes: DataNode and NodeManager. The steps below walk through the setup.

Environment configuration

# Edit the hosts file on all three servers
[root@hadoop01 ~]# vim /etc/hosts
[root@hadoop02 ~]# vim /etc/hosts
[root@hadoop03 ~]# vim /etc/hosts

# Add the following entries, adjusting the IPs and hostnames to match your own servers
192.168.52.100 hadoop01
192.168.52.101 hadoop02
192.168.52.102 hadoop03
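As a quick sanity check (an added step, not in the original), each hostname should now resolve from any of the three machines:

# Verify the /etc/hosts mappings resolve correctly
ping -c 1 hadoop01
ping -c 1 hadoop02
ping -c 1 hadoop03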
# Synchronize the server clocks
[root@hadoop01 ~]# yum install ntpdate
[root@hadoop01 ~]# ntpdate -u ntp.sjtu.edu.cn
5 Mar 09:38:26 ntpdate[1746]: step time server 17.253.84.125 offset 1.068029 sec
[root@hadoop02 ~]# yum install ntpdate
[root@hadoop02 ~]# ntpdate -u ntp.sjtu.edu.cn
5 Mar 09:38:26 ntpdate[1746]: step time server 17.253.84.125 offset 1.068029 sec
[root@hadoop03 ~]# yum install ntpdate
[root@hadoop03 ~]# ntpdate -u ntp.sjtu.edu.cn
5 Mar 09:38:26 ntpdate[1746]: step time server 17.253.84.125 offset 1.068029 sec

# Schedule periodic synchronization
[root@hadoop01 ~]# vim /etc/crontab
[root@hadoop02 ~]# vim /etc/crontab
[root@hadoop03 ~]# vim /etc/crontab

# Add the following line to /etc/crontab
* * * * * root /usr/sbin/ntpdate -u ntp.sjtu.edu.cn

Passwordless SSH from the master node to the worker nodes

# Copy the master node's authorized_keys file to the other two worker nodes
[root@hadoop01 ~]# scp ~/.ssh/authorized_keys hadoop02:~/
The authenticity of host 'hadoop02 (192.168.52.101)' can't be established.
ECDSA key fingerprint is SHA256:sc01Vk7PIabS9viczEKdgVfwzIYVHA1xib77Q8tczk.
ECDSA key fingerprint is MD5:ea:15:4e:5f:b0:83:4f:75:ed:1d:2f:02:c4:fa:04:3f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop02,192.168.52.101' (ECDSA) to the list of known hosts.
root@hadoop02's password:
authorized_keys
[root@hadoop01 ~]# scp ~/.ssh/authorized_keys hadoop03:~/
The authenticity of host 'hadoop03 (192.168.52.102)' can't be established.
ECDSA key fingerprint is SHA256:sc01Vk7PIabS9viczEKdgVfwzIYVHA1xib77Q8tczk.
ECDSA key fingerprint is MD5:ea:15:4e:5f:b0:83:4f:75:ed:1d:2f:02:c4:fa:04:3f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop03,192.168.52.102' (ECDSA) to the list of known hosts.
authorized_keys

## If there is no authorized_keys yet, it can be generated as follows
[root@hadoop01 ~]# ssh-keygen -t rsa
[root@hadoop01 ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

# Run on the other two worker nodes
[root@hadoop02 ~]# cat ~/authorized_keys >> ~/.ssh/authorized_keys
[root@hadoop03 ~]# cat ~/authorized_keys >> ~/.ssh/authorized_keys

# After this step, the master node can log in to both worker nodes without a password
[root@hadoop01 ~]# ssh hadoop02
Last login: Mon Mar 4 16:41:58 2024 from fe80::32d8:512f:316e:a311%ens33
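To verify passwordless login against both workers in one go, a small loop with BatchMode (which makes ssh fail instead of falling back to a password prompt) works well; this check is an addition to the original steps:

# Each iteration should print the worker's hostname without any password prompt
for h in hadoop02 hadoop03; do
  ssh -o BatchMode=yes "$h" hostname && echo "$h: passwordless OK"
done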
The environment configuration is now complete.
Hadoop Configuration
# Extract the archive
[root@hadoop01 soft]# tar -zxvf hadoop-3.2.0.tar.gz
# Edit the configuration files
[root@hadoop01 hadoop]# vim hadoop-env.sh
export JAVA_HOME=/home/soft/jdk1.8
export HADOOP_LOG_DIR=/home/hadoop_repo/logs/hadoop
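It is worth confirming that the JAVA_HOME set above really points at a JDK before continuing; a minimal check, assuming the /home/soft/jdk1.8 layout used here:

# Should print the Java 1.8 version banner if the path is correct
/home/soft/jdk1.8/bin/java -version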
# core-site.xml
[root@hadoop01 hadoop]# vim core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop01:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop_repo/data</value>
  </property>
</configuration>

# hdfs-site.xml
[root@hadoop01 hadoop]# vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop01:50090</value>
  </property>
</configuration>
# mapred-site.xml
[root@hadoop01 hadoop]# vim mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
# yarn-site.xml
[root@hadoop01 hadoop]# vim yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <!-- Have MapReduce use the shuffle auxiliary service -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Environment variables inherited by containers -->
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop01</value>
  </property>
</configuration>
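A malformed XML file will make the daemons die at startup with a parse error, so validating the four edited files first can save a restart cycle. One way to do this, assuming xmllint (from libxml2) is installed:

# Prints "<file> OK" for each well-formed file, an error message otherwise
for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
  xmllint --noout "$f" && echo "$f OK"
done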
# workers: list the worker nodes
[root@hadoop01 hadoop]# vim workers
hadoop02
hadoop03

Hadoop script changes

## start-dfs.sh
[root@hadoop01 sbin]# vim start-dfs.sh
# Add at the top of the file
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

## stop-dfs.sh
[root@hadoop01 sbin]# vim stop-dfs.sh
# Add at the top of the file
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

## start-yarn.sh
[root@hadoop01 sbin]# vim start-yarn.sh
# Add at the top of the file
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root

## stop-yarn.sh
[root@hadoop01 sbin]# vim stop-yarn.sh
# Add at the top of the file
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
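As an alternative to patching all four scripts, Hadoop 3.x also picks these *_USER variables up from hadoop-env.sh, which leaves the stock scripts untouched; a sketch with the same values as above:

# In etc/hadoop/hadoop-env.sh (equivalent to the script edits above)
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root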
Worker node configuration

# Copy the configured Hadoop directory to the other two machines
[root@hadoop01 soft]# scp -rq hadoop-3.2.0 hadoop02:/home/soft/
[root@hadoop01 soft]# scp -rq hadoop-3.2.0 hadoop03:/home/soft/

## Format the NameNode on the master node
[root@hadoop01 hadoop-3.2.0]# bin/hdfs namenode -format
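The format step writes the initial HDFS metadata under hadoop.tmp.dir; with /home/hadoop_repo/data configured above, the NameNode metadata lands in its dfs/name subdirectory by default. A quick added check that the format succeeded:

# The VERSION file records the clusterID created by the format
cat /home/hadoop_repo/data/dfs/name/current/VERSION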
Starting the Hadoop Cluster
# Start all daemons
[root@hadoop01 sbin]# start-all.sh
Starting namenodes on [hadoop01]
Last login: Tue Mar 5 11:11:55 CST 2024 from 192.168.52.1 on pts/0
Starting datanodes
Last login: Tue Mar 5 11:16:32 CST 2024 on pts/0
Starting secondary namenodes [hadoop01]
Last login: Tue Mar 5 11:16:35 CST 2024 on pts/0
Starting resourcemanager
Last login: Tue Mar 5 11:16:40 CST 2024 on pts/0
Starting nodemanagers
Last login: Tue Mar 5 11:16:47 CST 2024 on pts/0
You have new mail in /var/spool/mail/root

# Check the processes
[root@hadoop01 sbin]# jps    # master node
1765 SecondaryNameNode
2007 ResourceManager
2329 Jps
1500 NameNode

[root@hadoop03 ~]# jps    # worker node
1361 NodeManager
1252 DataNode
1477 Jps
You have new mail in /var/spool/mail/root

[root@hadoop02 ~]# jps    # worker node
1513 Jps
1418 NodeManager
1308 DataNode
Startup complete.
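Beyond checking jps, a short end-to-end smoke test exercises both HDFS and YARN. The sketch below assumes the examples jar that ships with Hadoop 3.2.0 in its usual location under share/hadoop/mapreduce:

cd /home/soft/hadoop-3.2.0
# Write a file into HDFS and read it back
bin/hdfs dfs -mkdir -p /test
echo "hello hadoop" | bin/hdfs dfs -put - /test/hello.txt
bin/hdfs dfs -cat /test/hello.txt
# Run the bundled wordcount example on /test via YARN and print the result
bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar wordcount /test /test-out
bin/hdfs dfs -cat /test-out/part-r-00000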
Official documentation: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html
Hadoop Client Node

In real work it is not advisable to operate the cluster by logging in to the cluster nodes directly. Instead, install Hadoop on a business machine and operate the cluster from there; such a machine is called a Hadoop client node.

The client node only needs a basic Java environment and a Hadoop installation to be usable. Do not start any Hadoop processes on it, or it effectively becomes one of the cluster machines.
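A minimal sketch of setting up such a client node, assuming it already has JDK 1.8 installed, uses the same /home/soft layout, and has the cluster hostnames in its /etc/hosts; the hostname hadoop-client is hypothetical:

# Copy the fully configured Hadoop directory from the master to the client machine
[root@hadoop01 soft]# scp -rq hadoop-3.2.0 hadoop-client:/home/soft/

# On the client: put Hadoop on the PATH (e.g. in /etc/profile); do NOT start any daemons
export HADOOP_HOME=/home/soft/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin

# The copied core-site.xml points at hdfs://hadoop01:9000, so commands hit the cluster
hdfs dfs -ls /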