Big Data Updates, May 2015

Recent updates:
Hadoop 2.7 released.
Hortonworks HDP 2.2.4.2 released.
Ambari 2.0 released.
Cloudera Enterprise 5.4 released.
Hive 1.2.0 released, with support for Hive on Spark.

HDP 2.2/HDP 2.2.4/Ambari 2.0/Ambari 2.0.1

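1. HDP supports heterogeneous storage, mainly in the form of SSD support;
2. Hive begins to support ACID transactions, a big step toward enterprise use cases;
3. HDP supports Spark 1.2.1;
4. HDP supports CPU resource isolation and scheduling through DominantResourceCalculator;
5. Ambari supports Blueprints, which gives better support for management and operations through the REST API;
6. Ambari supports Stacks, which define a set of integrated components;
7. Ambari 2.0 supports Rolling Upgrades of the HDP 2.2 platform;
8. Ambari 2.0 supports installing and configuring Apache Ranger;
9. Ambari 2.0 begins to integrate Ambari Alerts;
10. Ambari 2.0 begins to integrate Ambari Metrics, replacing the earlier Ganglia;
11. Ambari 2.0 begins to support User Views, which give operators a better interface, including Tez View, Capacity Scheduler View, Hive View, Pig View, and Files View;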

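Starting with HDP 2.2, the deployment layout has changed. The new layout is described below:

Directory structure
Starting with HDP 2.2, the directory structure of an installed HDP has changed. Hadoop was previously installed under /usr/lib; it is now installed under /usr/hdp, with the following structure: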

{code}
/usr/hdp/2.2.0.0-2041
├── /usr/hdp/2.2.0.0-2041/hadoop
│ ├── /usr/hdp/2.2.0.0-2041/hadoop/bin
│ ├── /usr/hdp/2.2.0.0-2041/hadoop/conf -> /etc/hadoop/conf
│ ├── /usr/hdp/2.2.0.0-2041/hadoop/lib
│ │ └── /usr/hdp/2.2.0.0-2041/hadoop/lib/native
│ ├── /usr/hdp/2.2.0.0-2041/hadoop/libexec
│ ├── /usr/hdp/2.2.0.0-2041/hadoop/man
│ └── /usr/hdp/2.2.0.0-2041/hadoop/sbin
├── /usr/hdp/2.2.0.0-2041/hadoop-hdfs
│ ├── /usr/hdp/2.2.0.0-2041/hadoop-hdfs/bin
│ ├── /usr/hdp/2.2.0.0-2041/hadoop-hdfs/lib
│ ├── /usr/hdp/2.2.0.0-2041/hadoop-hdfs/sbin
│ └── /usr/hdp/2.2.0.0-2041/hadoop-hdfs/webapps
├── /usr/hdp/2.2.0.0-2041/hbase
│ ├── /usr/hdp/2.2.0.0-2041/hbase/bin
│ ├── /usr/hdp/2.2.0.0-2041/hbase/conf -> /etc/hbase/conf
│ ├── /usr/hdp/2.2.0.0-2041/hbase/doc
│ ├── /usr/hdp/2.2.0.0-2041/hbase/include
│ └── /usr/hdp/2.2.0.0-2041/hbase/lib
└── /usr/hdp/2.2.0.0-2041/zookeeper
  ├── /usr/hdp/2.2.0.0-2041/zookeeper/bin
  ├── /usr/hdp/2.2.0.0-2041/zookeeper/conf -> /etc/zookeeper/conf
  ├── /usr/hdp/2.2.0.0-2041/zookeeper/doc
  ├── /usr/hdp/2.2.0.0-2041/zookeeper/lib
  └── /usr/hdp/2.2.0.0-2041/zookeeper/man
{code}
{code}
/usr/hdp/2.2.3.0-2611
├── /usr/hdp/2.2.3.0-2611/hadoop
│ ├── /usr/hdp/2.2.3.0-2611/hadoop/bin
│ ├── /usr/hdp/2.2.3.0-2611/hadoop/conf -> /etc/hadoop/conf
│ ├── /usr/hdp/2.2.3.0-2611/hadoop/lib
│ │ └── /usr/hdp/2.2.3.0-2611/hadoop/lib/native
│ ├── /usr/hdp/2.2.3.0-2611/hadoop/libexec
│ ├── /usr/hdp/2.2.3.0-2611/hadoop/man
│ └── /usr/hdp/2.2.3.0-2611/hadoop/sbin
├── /usr/hdp/2.2.3.0-2611/hadoop-hdfs
│ ├── /usr/hdp/2.2.3.0-2611/hadoop-hdfs/bin
│ ├── /usr/hdp/2.2.3.0-2611/hadoop-hdfs/lib
│ ├── /usr/hdp/2.2.3.0-2611/hadoop-hdfs/sbin
│ └── /usr/hdp/2.2.3.0-2611/hadoop-hdfs/webapps
├── /usr/hdp/2.2.3.0-2611/hbase
│ ├── /usr/hdp/2.2.3.0-2611/hbase/bin
│ ├── /usr/hdp/2.2.3.0-2611/hbase/conf -> /etc/hbase/conf
│ ├── /usr/hdp/2.2.3.0-2611/hbase/doc
│ ├── /usr/hdp/2.2.3.0-2611/hbase/include
│ └── /usr/hdp/2.2.3.0-2611/hbase/lib
└── /usr/hdp/2.2.3.0-2611/zookeeper
  ├── /usr/hdp/2.2.3.0-2611/zookeeper/bin
  ├── /usr/hdp/2.2.3.0-2611/zookeeper/conf -> /etc/zookeeper/conf
  ├── /usr/hdp/2.2.3.0-2611/zookeeper/doc
  ├── /usr/hdp/2.2.3.0-2611/zookeeper/lib
  └── /usr/hdp/2.2.3.0-2611/zookeeper/man
{code}
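Since every component now sits under a versioned directory, it can be handy to recover the version from a path. The following is a minimal sketch, not part of HDP itself: the function name hdp_version_of is hypothetical, and it assumes only the /usr/hdp/<version>/... layout shown in the listings above.

```shell
# Hypothetical helper: print the first path component after /usr/hdp/.
# For paths under a versioned install this is the version, e.g. 2.2.3.0-2611.
hdp_version_of() {
  case "$1" in
    /usr/hdp/?*) ;;          # accept only paths below /usr/hdp
    *) return 1 ;;           # anything else (e.g. /usr/lib/hadoop) is rejected
  esac
  # Reuse the function's own positional parameter to avoid a global variable.
  set -- "${1#/usr/hdp/}"    # strip the /usr/hdp/ prefix
  printf '%s\n' "${1%%/*}"   # keep everything before the next slash
}
```

For example, `hdp_version_of /usr/hdp/2.2.3.0-2611/hadoop/bin` prints the version directory name, while a pre-2.2 path such as /usr/lib/hadoop makes the function return non-zero.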

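Managing the active version
Starting with HDP 2.2, the hdp-select tool is available for managing the active version. hdp-select is installed by default; you can run the hdp-select command to verify that it is installed.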

> hdp-select
> hdp-select versions

Management commands are also supported, for example:

> hdp-select set hadoop-hdfs-datanode 2.2.3.0-2600
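Because `hdp-select set` takes a version string, a script may want to check the string's shape before passing it along. The sketch below is an illustration only: the function name is_hdp_version is hypothetical, and the pattern simply assumes versions look like the ones in this post (four dot-separated numbers, a dash, and a build number).

```shell
# Hypothetical check: succeed only if $1 looks like an HDP version string
# such as 2.2.3.0-2611 (assumed format: N.N.N.N-BUILD, digits only).
is_hdp_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+-[0-9]+$'
}

# Intended use (left commented out so the sketch runs without an HDP install):
#   v=2.2.3.0-2600
#   is_hdp_version "$v" && hdp-select set hadoop-hdfs-datanode "$v"
```

The check only validates the format; whether that version is actually installed is still best confirmed with `hdp-select versions`.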

Installed libraries, tools, and scripts

Before HDP 2.2, installed libraries were placed under /usr/lib; they are now placed under /usr/hdp/current:

/usr/hdp/current/hadoop-hdfs-namenode/
/usr/hdp/current/hadoop-yarn-resourcemanager
/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar

Daemon Scripts

/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh
/usr/hdp/current/hadoop-yarn-resourcemanager/sbin/yarn-daemon.sh
/usr/hdp/current/hadoop-yarn-nodemanager/sbin/yarn-daemon.sh
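To show how these scripts are typically invoked, here is a small dry-run sketch that only prints a start command rather than running it. The function name daemon_start_cmd is hypothetical; the `--config <dir> start <daemon>` form is the usual Hadoop 2.x daemon-script invocation, and using /etc/hadoop/conf as the config directory is an assumption based on the configuration path listed in this post.

```shell
# Hypothetical dry-run helper: print (do not execute) the start command
# for one of the daemons whose scripts are listed above.
daemon_start_cmd() {
  case "$1" in
    namenode)
      printf '%s --config %s start %s\n' \
        /usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh \
        /etc/hadoop/conf "$1" ;;
    resourcemanager|nodemanager)
      printf '%s --config %s start %s\n' \
        "/usr/hdp/current/hadoop-yarn-$1/sbin/yarn-daemon.sh" \
        /etc/hadoop/conf "$1" ;;
    *)
      echo "unknown daemon: $1" >&2
      return 1 ;;
  esac
}
```

Printing the command first makes it easy to review the path and the service account it should run under before executing it on a real node.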

Configuration files

/etc/hadoop/conf

Bin Scripts

/usr/bin/hadoop -> /usr/hdp/current/hadoop-client/bin/hadoop
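The symlink chain behind /usr/bin/hadoop can be illustrated with a throwaway copy of the layout. This sketch builds fake directories under a temporary root (it does not touch a real install) and assumes that /usr/hdp/current/hadoop-client is itself a symlink into the active versioned directory; it also relies on `readlink -f`, which is available in GNU coreutils.

```shell
# Build a scratch copy of the layout under a temp dir (not a real HDP install).
# pwd -P canonicalizes the temp path so later comparisons are stable.
demo_root=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$demo_root/usr/hdp/2.2.3.0-2611/hadoop/bin" \
         "$demo_root/usr/hdp/current" \
         "$demo_root/usr/bin"
touch "$demo_root/usr/hdp/2.2.3.0-2611/hadoop/bin/hadoop"

# Assumed: current/hadoop-client points at the active version's hadoop directory.
ln -s "$demo_root/usr/hdp/2.2.3.0-2611/hadoop" "$demo_root/usr/hdp/current/hadoop-client"
# As listed above: /usr/bin/hadoop points into /usr/hdp/current.
ln -s "$demo_root/usr/hdp/current/hadoop-client/bin/hadoop" "$demo_root/usr/bin/hadoop"

# Follow every link in the chain to the real file in the versioned directory.
resolved=$(readlink -f "$demo_root/usr/bin/hadoop")
printf '%s\n' "${resolved#"$demo_root"}"

rm -rf "$demo_root"
```

On a real node, `readlink -f /usr/bin/hadoop` is a quick way to see which installed version the client command currently resolves to.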