Big Data: Pseudo-Distributed Deployment, the Ultimate Edition

------------------------------Software versions------------------------------

RHEL6.8          hadoop2.8.1              apache-maven-3.3.9

findbugs-1.3.9   protobuf-2.5.0.tar.gz    jdk-8u45

------------------------------Software versions------------------------------

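1.Hadoop
Broad sense:  the Hadoop-centered ecosystem (hadoop, flume, ...)
Narrow sense: Apache Hadoop itself, hadoop.apache.org

2.Hadoop (storage + computation + resource and job scheduling)
hadoop1.x
   HDFS      storage
   MapReduce computation + resource and job scheduling

hadoop2.x (what enterprises currently run)
   HDFS      storage
   MapReduce computation
   YARN      resource and job scheduling platform; compute frameworks all run on YARN

hadoop3.x ???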

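  • Erasure Coding (EC): a new HDFS feature in Hadoop 3 that reduces storage overhead compared with plain replication.

  • YARN: ships YARN Timeline Service v.2 so that users and developers can test it and give feedback.

  • Optimized Hadoop shell scripts

  • Refactored Hadoop client jars

  • Support for opportunistic containers

  • MapReduce task-level native optimization

  • Support for more than two NameNodes

  • Default ports of several services changed

  • Support for filesystem connectors

  • Intra-DataNode balancer added

  • Reworked daemon and task heap management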

2.Maven deployment
2.1 Unpack

[root@hadoop1 softwore]# pwd
/opt/softwore
[root@hadoop1 softwore]# ls
apache-maven-3.3.9-bin.zip hadoop-2.8.1-src.tar.gz jdk-8u45-linux-x64.gz
findbugs-1.3.9.zip         hadoop-2.8.1.tar.gz     protobuf-2.5.0.tar.gz
[root@hadoop1 softwore]# unzip apache-maven-3.3.9-bin.zip


2.2 Configure the Maven directory
2.3 Inspect the configuration file and unpack the prepared repository archive
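
Sections 2.2 and 2.3 list no commands. A minimal sketch of the environment setup, assuming Maven lives at the path that mvn --version reports in 3.5 (/opt/software/apache-maven-3.3.9). MVN_HOME is the variable name that the PATH line in 4.3 refers to; the location of the unpacked directory on your host may differ:

```shell
# Assumed install location; adjust to wherever apache-maven-3.3.9 was unpacked.
export MVN_HOME=/opt/software/apache-maven-3.3.9
export PATH="$MVN_HOME/bin:$PATH"

# Show which mvn the shell now resolves, or say that none was found.
command -v mvn || echo "mvn not found on PATH"
```

Appending the two export lines to /etc/profile and running source /etc/profile would make the setting persistent, in the same way JAVA_HOME is handled in 3.4.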


3.Compiling Hadoop
3.1 Unpack
3.2 Inspect pom.xml
3.3 Read BUILDING.txt
Requirements (software needed for the build):

* Unix System
* JDK 1.7+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code), must be 3.0 or newer on Mac
* Zlib devel (if compiling native code)
* openssl devel (if compiling native hadoop-pipes and to get the best HDFS encryption performance)
* Linux FUSE (Filesystem in Userspace) version 2.6 or above (if compiling fuse_dfs)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)
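
Before starting the build it can help to confirm that the tools from this list are on PATH. A minimal sketch; check_tool is a hypothetical helper name, and the tool list assumes a native build (cmake) with findbugs enabled, so trim it to match the profiles you use:

```shell
# Report whether a build tool is on PATH; returns non-zero when it is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Count the missing tools instead of stopping at the first one.
missing=0
for tool in java mvn protoc cmake findbugs; do
    check_tool "$tool" || missing=$((missing + 1))
done
echo "$missing required tool(s) missing"
```

This only checks presence, not versions; sections 3.4 to 3.7 below check each version by hand.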

3.4 JDK deployment

[root@hadoop1 softwore]# tar -zxvf jdk-8u45-linux-x64.gz -C /usr/java
[root@hadoop1 softwore]# ls -ld /usr/java/*
drwxr-xr-x 8 root root 4096 Dec 15 2016 /usr/java/djdk1.7.0_79
drwxr-xr-x 8 uucp 143 4096 Apr 11 2015 /usr/java/jdk1.8.0_45
drwxr-xr-x 8 uucp 143 4096 Oct 7 2015 /usr/java/jdk1.8.0_65
[root@hadoop1 softwore]# vim /etc/profile

export JAVA_HOME=/usr/java/jdk1.8.0_45
export JRE_HOME=/usr/java/jdk1.8.0_45/jre

[root@hadoop1 softwore]# source /etc/profile
[root@hadoop1 softwore]# java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)

3.5 MAVEN
[root@hadoop000 hadoop-2.8.1-src]# mvn --version
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-11T00:41:47+08:00)
Maven home: /opt/software/apache-maven-3.3.9
Java version: 1.8.0_45, vendor: Oracle Corporation
Java home: /usr/java/jdk1.8.0_45/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "2.6.32-431.el6.x86_64", arch: "amd64", family: "unix"
[root@hadoop000 hadoop-2.8.1-src]#

3.6 FINDBUGS
[root@hadoop000 findbugs-1.3.9]# findbugs -version
1.3.9
[root@hadoop000 findbugs-1.3.9]#

3.7 PROTOBUF
[root@hadoop000 local]# protoc --version
libprotoc 2.5.0
[root@hadoop000 local]#

3.8 OTHER

3.9 Compile
mvn clean package -Pdist,native -DskipTests -Dtar

3.10 Interpretation


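4.Hadoop deployment
Standalone deployment          no daemon processes
Pseudo-distributed deployment  daemons running + 1 node    development
Cluster deployment             daemons running + n nodes   development/production

Downloaded packages: with src in the name           source package, contains no jars   small
                     without src, or with bin       prebuilt components                large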

4.1 Unpack
tar -xzvf hadoop-2.8.1.tar.gz
chown -R root:root hadoop-2.8.1
4.2 Walk through the unpacked files
[root@hadoop000 hadoop-2.8.1]# ll
total 148
drwxrwxr-x. 2 root root 4096 Jun 2 2017 bin
drwxrwxr-x. 3 root root 4096 Jun 2 2017 etc
drwxrwxr-x. 2 root root 4096 Jun 2 2017 include
drwxrwxr-x. 3 root root 4096 Jun 2 2017 lib
drwxrwxr-x. 2 root root 4096 Jun 2 2017 libexec
-rw-rw-r--. 1 root root 99253 Jun 2 2017 LICENSE.txt
-rw-rw-r--. 1 root root 15915 Jun 2 2017 NOTICE.txt
-rw-r--r--. 1 root root 1366 Jun 2 2017 README.txt
drwxrwxr-x. 2 root root 4096 Jun 2 2017 sbin
drwxrwxr-x. 4 root root 4096 Jun 2 2017 share
[root@hadoop000 hadoop-2.8.1]#
bin    shell scripts for running commands
etc    configuration files
lib    libraries
sbin   scripts that start and stop Hadoop
share  jars


[root@hadoop000 hadoop-2.8.1]# rm -f bin/*.cmd
[root@hadoop000 hadoop-2.8.1]# rm -f sbin/*.cmd
[root@hadoop000 hadoop-2.8.1]#
[root@hadoop000 hadoop-2.8.1]# ll bin
total 348
-rwxrwxr-x. 1 root root 139387 Jun 2 2017 container-executor
-rwxrwxr-x. 1 root root  6514 Jun 2 2017 hadoop
-rwxrwxr-x. 1 root root 12330 Jun 2 2017 hdfs
-rwxrwxr-x. 1 root root  6237 Jun 2 2017 mapred
-rwxrwxr-x. 1 root root  1776 Jun 2 2017 rcc
-rwxrwxr-x. 1 root root 156812 Jun 2 2017 test-container-executor
-rwxrwxr-x. 1 root root 14416 Jun 2 2017 yarn
[root@hadoop000 hadoop-2.8.1]#

4.3 Configure environment variables
[root@hadoop000 ~]# vi /etc/profile
export HADOOP_HOME=/opt/software/hadoop-2.8.1
export PATH=$HADOOP_HOME/bin:$PROTOC_HOME/bin:$FINDBUGS_HOME/bin:$MVN_HOME/bin:$JAVA_HOME/bin:$PATH
[root@hadoop000 ~]# source /etc/profile

[root@hadoop000 ~]# which hadoop
/opt/software/hadoop-2.8.1/bin/hadoop

4.4 Configure core-site.xml and hdfs-site.xml
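etc/hadoop/core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

etc/hadoop/hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>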

4.5 Passwordless ssh to localhost
[root@hadoop000 ~]# rm -rf .ssh

[root@hadoop000 ~]# ssh-keygen
[root@hadoop000 ~]# cd .ssh
[root@hadoop000 .ssh]# ll
total 8
-rw-------. 1 root root 1671 May 13 21:47 id_rsa
-rw-r--r--. 1 root root 396 May 13 21:47 id_rsa.pub
[root@hadoop000 .ssh]# cat id_rsa.pub >> authorized_keys
[root@hadoop000 .ssh]#

[root@hadoop000 ~]# ssh localhost date
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is ec:85:86:32:22:94:d1:a9:f2:0b:c5:12:3f:ba:e2:61.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Sun May 13 21:49:14 CST 2018
[root@hadoop000 ~]#
[root@hadoop000 ~]#
[root@hadoop000 ~]# ssh localhost date
Sun May 13 21:49:17 CST 2018
[root@hadoop000 ~]#

4.6 Format the filesystem:

 $ bin/hdfs namenode -format

4.7 Configure JAVA_HOME
[root@hadoop000 hadoop]# vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_45
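
With the configuration files and JAVA_HOME in place, the values from 4.4 can be read back before any daemon is started. A minimal sketch using hdfs getconf -confKey; check_conf is a hypothetical helper name, and it assumes $HADOOP_HOME/bin is on PATH as set up in 4.3:

```shell
# Compare one effective Hadoop configuration value with the value we expect.
# check_conf is a hypothetical helper, not part of Hadoop.
check_conf() {
    key=$1
    expected=$2
    actual=$(hdfs getconf -confKey "$key")
    if [ "$actual" = "$expected" ]; then
        echo "ok: $key = $actual"
    else
        echo "mismatch: $key = $actual (expected $expected)"
        return 1
    fi
}

# Usage, once hdfs is on PATH:
#   check_conf fs.defaultFS hdfs://localhost:9000
#   check_conf dfs.replication 1
```

A mismatch usually means the edited file is not the one under the etc/hadoop directory that the hdfs command actually reads.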

4.8 Start NameNode daemon and DataNode daemon:
[root@hadoop000 hadoop-2.8.1]# sbin/start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-root-namenode-hadoop000.out
localhost: starting datanode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-root-datanode-hadoop000.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
RSA key fingerprint is ec:85:86:32:22:94:d1:a9:f2:0b:c5:12:3f:ba:e2:61.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-root-secondarynamenode-hadoop000.out
[root@hadoop000 hadoop-2.8.1]#
[root@hadoop000 hadoop-2.8.1]#
[root@hadoop000 hadoop-2.8.1]#
[root@hadoop000 hadoop-2.8.1]#
[root@hadoop000 hadoop-2.8.1]# jps
16243 Jps
15943 DataNode
5127 Launcher
16139 SecondaryNameNode
15853 NameNode
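
The jps listing above is checked by eye for NameNode, DataNode and SecondaryNameNode. A minimal sketch that does the same check in a script; missing_daemons is a hypothetical helper name, and it matches the whole second column so that SecondaryNameNode is not mistaken for NameNode:

```shell
# Print each expected HDFS daemon that is absent from jps-style input on stdin.
missing_daemons() {
    # Keep only the process-name column of the jps output.
    running=$(awk '{print $2}')
    for daemon in NameNode DataNode SecondaryNameNode; do
        # -x matches the whole line, so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$running" | grep -qx "$daemon"; then
            echo "$daemon"
        fi
    done
}

# Usage (assumes jps is on PATH):
#   jps | missing_daemons
```

Empty output means all three daemons are up; any name printed points to the matching log file under the logs directory shown in the start-dfs.sh output.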

[root@hadoop000 ~]# hdfs dfs -put jepson.log /
[root@hadoop000 ~]#
[root@hadoop000 ~]#
[root@hadoop000 ~]# hdfs dfs -ls /
Found 1 items
-rw-r--r--  3 root supergroup         6 2018-05-13 21:57 /jepson.log
[root@hadoop000 ~]#
[root@hadoop000 ~]# hdfs dfs -cat /jepson.log
A
5
6
[root@hadoop000 ~]#
[root@hadoop000 ~]#
[root@hadoop000 ~]# cat jepson.log
A
5
6
[root@hadoop000 ~]#
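
The put and cat commands above confirm the round trip by comparing the two outputs by eye. A minimal sketch that automates the comparison; verify_upload is a hypothetical helper name, and it assumes hdfs is on PATH and the NameNode is running:

```shell
# Compare a local file with its copy in HDFS; returns 0 when the bytes match.
# verify_upload is a hypothetical helper, not part of Hadoop.
verify_upload() {
    local_file=$1
    hdfs_path=$2
    # cmp reads the HDFS copy from stdin (-) and stays silent with -s.
    hdfs dfs -cat "$hdfs_path" | cmp -s - "$local_file"
}

# Usage:
#   verify_upload jepson.log /jepson.log && echo "contents match"
```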


Homework for 05/13:
1. Compile Hadoop
2. Deploy HDFS in pseudo-distributed mode
3. Write up both of the above as blog posts


