메뉴 건너뛰기

Cloudera, BigData, Semantic IoT, Hadoop, NoSQL

Cloudera CDH/CDP 및 Hadoop EcoSystem, Semantic IoT등의 개발/운영 기술을 정리합니다. gooper@gooper.com로 문의 주세요.


1. hive table(file을 바라보고 있으며 hbase table(아래의 hbase_mytable)에 값을 넣기 위한 src table) 을  external로  table 생성

CREATE EXTERNAL TABLE IF NOT EXISTS external_file
     (
     FOO STRING,
     BAR STRING
     )
     COMMENT 'TEST TABLE OF EMP_IP_TABLE'
     ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
     STORED AS TEXTFILE LOCATION '/data';

 

* /data에 들어 있는 파일 내용

hadoop@bigdata-host:~/hive/conf$ hadoop fs -cat /data/external_file.txt
a,b
a1,b1
a2,b2
a3,b3

2.hive table( hbase  table을 바라보는 테이블)생성

CREATE EXTERNAL TABLE hbase_mytable(table_id string, foo string, bar string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:foo,cf:bar")
TBLPROPERTIES("hbase.table.name" = "mytable");

 

3. hive기동시 아래와 같이 jar를 포함해준다.

adoop@bigdata-host:~/hive/bin$ hive --auxpath /home/hadoop/hive/lib/hbase-0.94.6.1.jar,/home/hadoop/hive/lib/zookeeper-3.4.3.jar,/home/hadoop/hive/lib/hive-hbase-handler-0.11.0.jar,/home/hadoop/hive/lib/guava-11.0.2.jar,/home/hadoop/hive/lib/hive-contrib-0.11.0.jar -hiveconf hbase.master=localhost:60000

4. hive에 들어가서.. table을 생성한 후 hbase table에 입력 실행결과......

hive> insert into table hbase_mytable select foo, foo, bar from external_file;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201404111158_0008, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201404111158_0008
Kill Command = /home/hadoop/hadoop-1.2.1/libexec/../bin/hadoop job  -kill job_201404111158_0008
Hadoop job information for Stage-0: number of mappers: 1; number of reducers: 0
2014-04-11 13:56:49,322 Stage-0 map = 0%,  reduce = 0%
2014-04-11 13:56:55,482 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.74 sec
2014-04-11 13:56:56,500 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.74 sec
2014-04-11 13:56:57,518 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.74 sec
2014-04-11 13:56:58,541 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.74 sec
2014-04-11 13:56:59,561 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.74 sec
2014-04-11 13:57:00,587 Stage-0 map = 100%,  reduce = 100%, Cumulative CPU 1.74 sec
MapReduce Total cumulative CPU time: 1 seconds 740 msec
Ended Job = job_201404111158_0008
4 Rows loaded to hbase_mytable
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 1.74 sec   HDFS Read: 220 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 740 msec
OK
Time taken: 34.565 seconds

5. hbase_mytable의 값  확인(기존에 있던 값하고 새로 추가된 값이 같이 보인다.)

hive> select * from hbase_mytable;                                           
OK
2.5 1.3 NULL
a a b
a1 a1 b1
a2 a2 b2
a3 a3 b3
second 3 NULL
third NULL 3.14159
Time taken: 1.315 seconds, Fetched: 7 row(s)

6. hbase shell에서 확인

hbase(main):001:0> scan 'mytable'
ROW                        COLUMN+CELL                                                               
 2.5                       column=cf:foo, timestamp=1397112248576, value=1.3                         
 a                         column=cf:bar, timestamp=1397192214568, value=b                           
 a                         column=cf:foo, timestamp=1397192214568, value=a                           
 a1                        column=cf:bar, timestamp=1397192214568, value=b1                          
 a1                        column=cf:foo, timestamp=1397192214568, value=a1                          
 a2                        column=cf:bar, timestamp=1397192214568, value=b2                          
 a2                        column=cf:foo, timestamp=1397192214568, value=a2                          
 a3                        column=cf:bar, timestamp=1397192214568, value=b3                          
 a3                        column=cf:foo, timestamp=1397192214568, value=a3                          
 first                     column=cf:message, timestamp=1397109873612, value=hellp Hbase             
 second                    column=cf:foo, timestamp=1397112803662, value=3                           
 second2                   column=cf:foo2, timestamp=1397112883691, value=3                          
 third                     column=cf:bar, timestamp=1397109940598, value=3.14159                     
9 row(s) in 1.8090 seconds

 

 

번호 제목 날짜 조회 수
47 Hive Query Examples from test code (2 of 2) 2014.03.26 11464
46 insert hbase by hive ... error occured after 5 hours..HMaster가 뜨지 않는 장애에 대한 복구 방법 2014.04.29 7236
45 hive 2.0.1 설치및 mariadb로 metastore 설정 2016.06.03 5292
44 Hive 사용법 및 쿼리 샘플코드 2013.03.07 3079
43 Hive+mysql 설치 및 환경구축하기 file 2013.03.07 2797
42 banana pi에 hive 0.13.1+mysql(metastore)설치 file 2014.09.09 2485
41 Hive java connection 설정 file 2013.04.01 2168
40 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable to deserialize reduce input key from...오류해결방법 2015.06.16 1968
» hive에서 생성된 external table에서 hbase의 table에 값 insert하기 2014.04.11 1819
38 index생성, 삭제, 활용 2014.04.25 1765
37 FAILED: IllegalStateException Variable substitution depth too large: 40 오류발생시 조치사항 2014.08.19 1595
36 upsert구현방법(년-월-일 파티션을 기준으로) 및 테스트 script file 2018.07.03 1573
35 hiverserver2기동시 connection refused가 발생하는 경우 조치방법 2014.05.22 1558
34 json 값 다루기 2014.04.17 1325
33 schema설정없이 hive를 최초에 실행했을때 발생하는 오류메세지및 처리방법 2016.09.25 1299
32 external partition table생성및 data확인 2014.04.03 1297
31 sequence한 번호 생성방법 2014.04.25 1252
30 json으로 존재하는 데이터 parsing하기 2019.03.25 1249
29 Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D오류발생시 조치사항 2016.06.03 1246
28 json serde사용법 2014.04.17 1196
위로