Inserting values into an HBase table from an external table created in Hive


1. Create the Hive table as an external table (it reads from a file and serves as the source table for loading values into the HBase table, hbase_mytable, defined below)

CREATE EXTERNAL TABLE IF NOT EXISTS external_file
     (
     FOO STRING,
     BAR STRING
     )
     COMMENT 'TEST TABLE OF EMP_IP_TABLE'
     ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
     STORED AS TEXTFILE LOCATION '/data';

 

* Contents of the file under /data

hadoop@bigdata-host:~/hive/conf$ hadoop fs -cat /data/external_file.txt
a,b
a1,b1
a2,b2
a3,b3
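
As a quick sanity check, selecting from the table should return the four rows above, split on the comma delimiter (the output below is what that mapping would be expected to produce):

hive> select * from external_file;
OK
a    b
a1   b1
a2   b2
a3   b3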

2. Create the Hive table that maps to the HBase table

CREATE EXTERNAL TABLE hbase_mytable(table_id string, foo string, bar string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:foo,cf:bar")
TBLPROPERTIES("hbase.table.name" = "mytable");
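
Because this table is declared EXTERNAL, the HBase table mytable with column family cf must already exist before the DDL runs. If it does not, it can be created in the hbase shell first, e.g.:

hbase(main):001:0> create 'mytable', 'cf'

(Declaring the Hive table without EXTERNAL would instead have Hive create and own the underlying HBase table itself.)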

 

3. When starting Hive, include the required jars as shown below.

hadoop@bigdata-host:~/hive/bin$ hive --auxpath /home/hadoop/hive/lib/hbase-0.94.6.1.jar,/home/hadoop/hive/lib/zookeeper-3.4.3.jar,/home/hadoop/hive/lib/hive-hbase-handler-0.11.0.jar,/home/hadoop/hive/lib/guava-11.0.2.jar,/home/hadoop/hive/lib/hive-contrib-0.11.0.jar -hiveconf hbase.master=localhost:60000
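
Rather than passing --auxpath on every invocation, the same jars can be registered once in hive-site.xml via the hive.aux.jars.path property; a sketch using the same paths as above:

<property>
  <name>hive.aux.jars.path</name>
  <value>file:///home/hadoop/hive/lib/hbase-0.94.6.1.jar,file:///home/hadoop/hive/lib/zookeeper-3.4.3.jar,file:///home/hadoop/hive/lib/hive-hbase-handler-0.11.0.jar,file:///home/hadoop/hive/lib/guava-11.0.2.jar,file:///home/hadoop/hive/lib/hive-contrib-0.11.0.jar</value>
</property>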

4. In the Hive CLI, run the insert from the source table into the HBase table. Execution result:

hive> insert into table hbase_mytable select foo, foo, bar from external_file;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201404111158_0008, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201404111158_0008
Kill Command = /home/hadoop/hadoop-1.2.1/libexec/../bin/hadoop job  -kill job_201404111158_0008
Hadoop job information for Stage-0: number of mappers: 1; number of reducers: 0
2014-04-11 13:56:49,322 Stage-0 map = 0%,  reduce = 0%
2014-04-11 13:56:55,482 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.74 sec
2014-04-11 13:56:56,500 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.74 sec
2014-04-11 13:56:57,518 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.74 sec
2014-04-11 13:56:58,541 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.74 sec
2014-04-11 13:56:59,561 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.74 sec
2014-04-11 13:57:00,587 Stage-0 map = 100%,  reduce = 100%, Cumulative CPU 1.74 sec
MapReduce Total cumulative CPU time: 1 seconds 740 msec
Ended Job = job_201404111158_0008
4 Rows loaded to hbase_mytable
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 1.74 sec   HDFS Read: 220 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 740 msec
OK
Time taken: 34.565 seconds
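
Note that foo is selected twice: the first copy feeds table_id, which is mapped to the HBase row key (:key). Row keys are unique in HBase, so source rows sharing the same foo value collapse into a single HBase row, and re-running an insert overwrites existing cells rather than appending rows. A hypothetical rerun that makes this visible (upper() is just an arbitrary transformation for illustration):

hive> insert into table hbase_mytable select foo, foo, upper(bar) from external_file;

Afterwards the same four row keys would carry cf:bar values B, B1, B2, B3 instead of four new rows appearing.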

5. Check the values in hbase_mytable (rows that existed before and the newly inserted rows appear together).

hive> select * from hbase_mytable;                                           
OK
2.5 1.3 NULL
a a b
a1 a1 b1
a2 a2 b2
a3 a3 b3
second 3 NULL
third NULL 3.14159
Time taken: 1.315 seconds, Fetched: 7 row(s)
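
Because table_id is mapped to :key, a single row can also be looked up from Hive with a key predicate, e.g.:

hive> select * from hbase_mytable where table_id = 'a1';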

6. Verify in the hbase shell

hbase(main):001:0> scan 'mytable'
ROW                        COLUMN+CELL                                                               
 2.5                       column=cf:foo, timestamp=1397112248576, value=1.3                         
 a                         column=cf:bar, timestamp=1397192214568, value=b                           
 a                         column=cf:foo, timestamp=1397192214568, value=a                           
 a1                        column=cf:bar, timestamp=1397192214568, value=b1                          
 a1                        column=cf:foo, timestamp=1397192214568, value=a1                          
 a2                        column=cf:bar, timestamp=1397192214568, value=b2                          
 a2                        column=cf:foo, timestamp=1397192214568, value=a2                          
 a3                        column=cf:bar, timestamp=1397192214568, value=b3                          
 a3                        column=cf:foo, timestamp=1397192214568, value=a3                          
 first                     column=cf:message, timestamp=1397109873612, value=hellp Hbase             
 second                    column=cf:foo, timestamp=1397112803662, value=3                           
 second2                   column=cf:foo2, timestamp=1397112883691, value=3                          
 third                     column=cf:bar, timestamp=1397109940598, value=3.14159                     
9 row(s) in 1.8090 seconds
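
A single row can be fetched with get instead of scanning the whole table, e.g.:

hbase(main):002:0> get 'mytable', 'a1'

This should return the cf:foo and cf:bar cells written by the insert above.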

 

 
