Installing Hive and Integrating It with HBase

Download Hive

su - hadoop
wget http://apache.fayea.com/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz
tar -zxvf apache-hive-1.2.1-bin.tar.gz
mv apache-hive-1.2.1-bin hive

Environment configuration

Append the following to /etc/profile:

export HIVE_HOME=/home/hadoop/hive
export PATH=$HIVE_HOME/bin:$PATH
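
If this step is rerun, the same two lines end up in the profile twice. Here is a minimal sketch of an idempotent append. The helper name append_once is hypothetical and not part of Hive or the original steps:

```shell
# Hypothetical helper: add a line to a file only if that exact line is not there yet.
append_once() {
  # $1 = exact line to add, $2 = target file
  grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Example usage (as root), followed by a reload of the profile:
#   append_once 'export HIVE_HOME=/home/hadoop/hive' /etc/profile
#   append_once 'export PATH=$HIVE_HOME/bin:$PATH' /etc/profile
#   source /etc/profile
```

The single quotes keep $HIVE_HOME and $PATH unexpanded, so they are evaluated at login time rather than when the line is written.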

Install MySQL

yum install gcc gcc-c++ cmake ncurses-devel -y



groupadd mysql
useradd -g mysql mysql
wget http://dev.mysql.com/get/downloads/mysql/mysql-5.6.25.tar.gz
tar zxvf mysql-5.6.25.tar.gz
cd mysql-5.6.25
cmake \
-DCMAKE_INSTALL_PREFIX=/data/mysql \
-DMYSQL_UNIX_ADDR=/data/mysql/mysql.sock \
-DDEFAULT_CHARSET=utf8 \
-DDEFAULT_COLLATION=utf8_general_ci \
-DWITH_INNOBASE_STORAGE_ENGINE=1 \
-DWITH_ARCHIVE_STORAGE_ENGINE=1 \
-DWITH_BLACKHOLE_STORAGE_ENGINE=1 \
-DMYSQL_DATADIR=/data/mysql/data \
-DMYSQL_TCP_PORT=3306 \
-DENABLE_DOWNLOADS=1

make && make install

chmod +w /data/mysql/
chown -R mysql:mysql /data/mysql/
ln -s /data/mysql/lib/libmysqlclient.so.18 /usr/lib/libmysqlclient.so.18
ln -s /data/mysql/mysql.sock /tmp/mysql.sock

cp /data/mysql/support-files/my-default.cnf /etc/my.cnf
cp /data/mysql/support-files/mysql.server /etc/init.d/mysqld
/data/mysql/scripts/mysql_install_db --user=mysql --defaults-file=/etc/my.cnf --basedir=/data/mysql --datadir=/data/mysql/data


# Create the hive database (run inside the mysql client once mysqld is running)
create database hive;

Configure Hive

Create hive-site.xml in Hive's conf directory:

<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://HADOOP-MASTER-153:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
</configuration>
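
A single unclosed tag in hive-site.xml is enough to stop Hive from starting, so a quick sanity check before launching can save time. The sketch below only counts opening and closing tags of one name. It is not a real XML parser, and the function name check_tag_balance is hypothetical:

```shell
# Hypothetical sanity check: opening and closing tags of a given name must be equal in number.
check_tag_balance() {
  # $1 = config file, $2 = tag name without angle brackets (e.g. value)
  opens=$( { grep -o "<$2>" "$1" || true; } | wc -l | tr -d ' ')
  closes=$( { grep -o "</$2>" "$1" || true; } | wc -l | tr -d ' ')
  [ "$opens" -eq "$closes" ]
}

# Example usage:
#   for t in configuration property name value description; do
#     check_tag_balance "$HIVE_HOME/conf/hive-site.xml" "$t" || echo "unbalanced <$t>"
#   done
```

If a proper XML tool is installed, it is the better choice. This check only catches the most common copy-and-paste slip.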

Because MySQL is used as the metastore, the MySQL JDBC driver must be added to Hive's lib directory.

Download the JDBC driver into the hive/lib directory:

wget http://central.maven.org/maven2/mysql/mysql-connector-java/5.1.37/mysql-connector-java-5.1.37.jar
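
If the jar lands in the wrong directory, Hive cannot load com.mysql.jdbc.Driver and fails when it connects to the metastore. A small check that the driver sits where Hive looks for it might look like this. The function name has_mysql_driver is hypothetical:

```shell
# Hypothetical check: succeed if a MySQL connector jar exists in the given lib directory.
has_mysql_driver() {
  # $1 = Hive lib directory
  for jar in "$1"/mysql-connector-java-*.jar; do
    [ -f "$jar" ] && return 0
  done
  return 1
}

# Example usage:
#   has_mysql_driver /home/hadoop/hive/lib || echo "MySQL JDBC driver missing"
```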

Client configuration

scp -r hive/ hadoop@hadoop-slave:/home/hadoop
[hadoop@hadoop-slave conf]$ vi hive-site.xml
<configuration>
<property>
<name>hive.metastore.uris</name>
<value>thrift://hadoop-master:9083</value>
</property>
</configuration>


[hadoop@hadoop-master ~]$ hive --service metastore &
[hadoop@hadoop-master ~]$ jps
10288 RunJar # the new metastore process
9365 NameNode
9670 SecondaryNameNode
11096 Jps
9944 NodeManager
9838 ResourceManager
9471 DataNode






# Start hive
$ hive


Logging initialized using configuration in jar:file:/home/hadoop/hive/lib/hive-common-1.2.1.jar!/hive-log4j.properties
hive>
hive> show databases;
OK
default
Time taken: 1.436 seconds, Fetched: 1 row(s)
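
The jps listing above is the only sign that the metastore came up, so it can help to check it in a script before starting clients. This is a rough sketch with a hypothetical function name, metastore_in_jps. Note that RunJar is the generic launcher class for anything started through hadoop jar, so a match shows that some such process is running, not necessarily the metastore:

```shell
# Hypothetical check: read jps output on stdin and succeed if a RunJar process is listed.
metastore_in_jps() {
  awk '$2 == "RunJar" { found = 1 } END { exit !found }'
}

# Example usage:
#   jps | metastore_in_jps && echo "RunJar process found"
```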

Integrate with HBase

CREATE  EXTERNAL  TABLE hbaseusers(key string, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name")
TBLPROPERTIES ("hbase.table.name" = "users");

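When creating an external table over an existing HBase table, the column family mapping must not contain spaces; otherwise Hive reports a nonexistent column family (Column Family n is not defined in hbase table nginx).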
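
Stray spaces in the mapping are easy to miss by eye, for example after a comma. Assuming the mapping string is kept in a shell variable before it is pasted into the DDL, a check could be sketched like this. The function name mapping_has_no_spaces is hypothetical:

```shell
# Hypothetical check: succeed only if the hbase.columns.mapping value contains no whitespace.
mapping_has_no_spaces() {
  # $1 = mapping string, e.g. ":key,info:name"
  case "$1" in
    *[[:space:]]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Example usage:
#   mapping_has_no_spaces ":key,info:name" || echo "remove the spaces from the mapping"
```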


Error:
hive> CREATE EXTERNAL TABLE hbaseusers(key string, value string)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name")
> TBLPROPERTIES ("hbase.table.name" = "users");
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.)

Start the metastore in debug mode:
Run hive --service metastore -hiveconf hive.root.logger=DEBUG,console

Error:

16/01/24 17:00:07 [main]: ERROR DataNucleus.Datastore: An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes

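After checking the relevant documentation, I found that the MySQL character set needs to be set to latin1. With utf8 a character can take up to 3 bytes, so the long indexed VARCHAR columns in the metastore schema exceed the 767-byte key limit; with latin1 each character takes 1 byte.
Add the following to /etc/my.cnf: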

character-set-server = latin1
collation-server = latin1_general_ci

Restart MySQL.

If a hive database was created earlier, drop it and create it again.

hive>  CREATE  EXTERNAL  TABLE hbaseusers(key string, value string)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name")
> TBLPROPERTIES ("hbase.table.name" = "users");
OK
Time taken: 1.617 seconds

OK
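
The drop-and-recreate step mentioned above could be scripted as follows. This is a sketch under two assumptions: the helper name recreate_metastore_sql is hypothetical, and the character set is stated explicitly on the database instead of relying only on the server default from my.cnf. Dropping the database discards all existing Hive metadata, so this only makes sense on a fresh setup:

```shell
# Hypothetical helper: print the SQL that rebuilds the metastore database with latin1.
# WARNING: the DROP statement discards any existing Hive metadata.
recreate_metastore_sql() {
  # $1 = database name (the steps above use "hive")
  printf 'DROP DATABASE IF EXISTS %s;\n' "$1"
  printf 'CREATE DATABASE %s DEFAULT CHARACTER SET latin1;\n' "$1"
}

# Example usage (prompts for the MySQL root password):
#   recreate_metastore_sql hive | /data/mysql/bin/mysql -u root -p
```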

References

http://my.oschina.net/u/204498/blog/522772
http://yanliu.org/2015/08/13/Hadoop%E9%9B%86%E7%BE%A4%E4%B9%8BHive%E5%AE%89%E8%A3%85%E9%85%8D%E7%BD%AE/