1. Install the JDK
	1) Download jdk-8u65-linux-x64.tar.gz
	2) Create the /soft directory
		$>sudo mkdir /soft
		$>sudo chown grj:grj /soft
	3) Extract the archive
		$>tar -xzvf jdk-8u65-linux-x64.tar.gz -C /soft
	4) Create a symbolic link (the tarball extracts to jdk1.8.0_65)
		$>ln -s /soft/jdk1.8.0_65 /soft/jdk
	5) Verify that the JDK installed successfully
		$>cd /soft/jdk/bin
		$>./java -version
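	  If the link and extraction are correct, the output should look roughly like this (a sample; the exact build strings may differ):
		java version "1.8.0_65"
		Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
		Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)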
	6) Configure environment variables on CentOS
	  a) Edit /etc/profile
		$>sudo nano /etc/profile
		...
		export JAVA_HOME=/soft/jdk
		export PATH=$PATH:$JAVA_HOME/bin
	  b) Make the environment variables take effect immediately
		$>source /etc/profile
	  c) Change to any directory and test that everything is OK
		$>cd ~
		$>java -version
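	  A quick way to confirm the shell now resolves the right binary (paths assume the layout above):
		$>echo $JAVA_HOME      # should print /soft/jdk
		$>which java           # should print /soft/jdk/bin/java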
2. Install Hadoop (must be installed on every host in the cluster)
	1) Download hadoop-2.7.3.tar.gz
	2) Extract the archive
		$>tar -xzvf hadoop-2.7.3.tar.gz -C /soft
	3) Create a symbolic link
		$>ln -s /soft/hadoop-2.7.3 /soft/hadoop
	4) Verify that Hadoop installed successfully
		$>cd /soft/hadoop/bin
		$>./hadoop version
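	  A successful install prints something like the following (abbreviated sample; the revision and checksum lines vary by build):
		Hadoop 2.7.3
		Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r ...
		...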
	5) Configure the Hadoop environment variables
		$>sudo nano /etc/profile
		...
		export JAVA_HOME=/soft/jdk
		export PATH=$PATH:$JAVA_HOME/bin
		export HADOOP_HOME=/soft/hadoop
		export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
	6) Make them take effect
		$>source /etc/profile
3. Cluster machine configuration (setting hostnames)
	1) On every host, set this file to that host's name as listed in the hosts file; the change takes effect after a reboot
		/etc/hostname
		s201
	2) /etc/hosts (name resolution is checked in the sketch after this section)
		127.0.0.1 localhost
		192.168.24.201 s201
		192.168.24.202 s202
		192.168.24.203 s203
		192.168.24.204 s204
	3) On every host, set the IP in this file to match that host's entry in the hosts file
		/etc/sysconfig/network-scripts/ifcfg-exxxxx
		...
		IPADDR=..
		Restart the network service:
		$>sudo service network restart
	4) Edit /etc/resolv.conf; set the same nameserver on all hosts
		nameserver 192.168.24.2
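	  With the steps above done on every host, a quick check from any of them confirms that names and addresses agree (a sketch):
		$>hostname             # should print this machine's own name, e.g. s201
		$>ping -c 1 s202       # should resolve to 192.168.24.202 and get a reply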
4. Prepare SSH for the fully distributed cluster (so the user on s201 can log in to the other cluster hosts without a password)
	1) Generate a key pair on host s201
		$>ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
	2) Copy s201's public key file id_rsa.pub to hosts s201 ~ s204 (including s201 itself, so it can ssh to itself),
	   placing it at /home/grj/.ssh/authorized_keys (the original mixes the users centos and grj; grj is used consistently here)
		$>scp id_rsa.pub grj@s201:/home/grj/.ssh/authorized_keys
		$>scp id_rsa.pub grj@s202:/home/grj/.ssh/authorized_keys
		$>scp id_rsa.pub grj@s203:/home/grj/.ssh/authorized_keys
		$>scp id_rsa.pub grj@s204:/home/grj/.ssh/authorized_keys
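	  Note: sshd ignores authorized_keys when its permissions are too loose, and scp overwrites any keys already in the file. ssh-copy-id appends instead, creates ~/.ssh if needed, and fixes the modes, so it is often the safer choice (a sketch, same user as above):
		$>ssh-copy-id grj@s202
		$>ssh s202 hostname    # should print s202 without asking for a password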
5. Configure fully distributed mode
	1) Create the configuration directory (after the copy below, delete or rename the original configuration folder /soft/hadoop/etc/hadoop, so the symbolic link created next does not clash with its name)
		$>cp -r /soft/hadoop/etc/hadoop /soft/hadoop/etc/full
	2) Create a symbolic link
		$>ln -s /soft/hadoop/etc/full /soft/hadoop/etc/hadoop
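	  The link can be checked with ls (sample output; owner and timestamp will differ):
		$>ls -ld /soft/hadoop/etc/hadoop
		lrwxrwxrwx 1 grj grj 21 ... /soft/hadoop/etc/hadoop -> /soft/hadoop/etc/full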
	3) Edit the configuration files (under ${hadoop_home}/etc/full/)
		[core-site.xml]
		<?xml version="1.0" encoding="UTF-8"?>
		<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
		<configuration>
				<property>
						<name>fs.defaultFS</name>
						<value>hdfs://s201/</value>
				</property>
		</configuration>
		[hdfs-site.xml]
		<?xml version="1.0" encoding="UTF-8"?>
		<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
		<configuration>
				<property>
						<name>dfs.replication</name>
						<value>3</value>
				</property>
		</configuration>
		
		[mapred-site.xml]
			unchanged (but see the note below)
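		Note: Hadoop 2.7.3 ships only mapred-site.xml.template; with mapred-site.xml absent, MapReduce jobs run in the local runner rather than on YARN. To submit jobs to YARN, one would copy the template to mapred-site.xml and add this standard property (not part of the original walkthrough):
			<property>
					<name>mapreduce.framework.name</name>
					<value>yarn</value>
			</property>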
		
		[yarn-site.xml]
		<?xml version="1.0"?>
		<configuration>
				<property>
						<name>yarn.resourcemanager.hostname</name>
						<value>s201</value>
				</property>
				<property>
						<name>yarn.nodemanager.aux-services</name>
						<value>mapreduce_shuffle</value>
				</property>
		</configuration>
	4) Edit the slaves file (this file lists the hostnames of all data nodes)
		[/soft/hadoop/etc/full/slaves]
		s202
		s203
		s204
	5) Edit the Hadoop environment file [/soft/hadoop/etc/full/hadoop-env.sh]
		...
		export JAVA_HOME=/soft/jdk
		...
	6) Distribute the configuration (again using user grj, as above)
		$>cd /soft/hadoop/
		$>scp -r etc grj@s202:/soft/hadoop
		$>scp -r etc grj@s203:/soft/hadoop
		$>scp -r etc grj@s204:/soft/hadoop
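	  With more hosts, the repeated scp calls get tedious; a loop does the same thing (a sketch, assuming the same user and paths as above):
		$>for h in s202 s203 s204; do scp -r /soft/hadoop/etc grj@$h:/soft/hadoop; done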
	7) Format the file system (run this on the name node s201 only)
		$>hdfs namenode -format
		(hadoop namenode -format also works, but is deprecated in Hadoop 2.x)
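	  A successful format ends with a log line similar to the following (the path depends on hadoop.tmp.dir, which defaults to /tmp/hadoop-${user}):
		Storage directory /tmp/hadoop-grj/dfs/name has been successfully formatted.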
	
	8) Start the Hadoop processes
		$>start-all.sh
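	  (In 2.x, start-all.sh is deprecated and simply calls start-dfs.sh and start-yarn.sh in turn; running those two directly is equivalent.) Once the daemons are up, jps from the JDK should show roughly the following processes (PIDs will differ; the absolute path is used over ssh because a non-interactive shell may not source /etc/profile):
		$>jps                              # on s201, the master
		xxxx NameNode
		xxxx SecondaryNameNode
		xxxx ResourceManager
		$>ssh s202 /soft/jdk/bin/jps       # on each worker
		xxxx DataNode
		xxxx NodeManager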
Source: https://www.cnblogs.com/grj0011/p/11697535.html