Cloudera CDH/CDP 및 Hadoop EcoSystem, Semantic IoT등의 개발/운영 기술을 정리합니다. gooper@gooper.com로 문의 주세요.
1. 다운로드
http://archive.apache.org/dist/sqoop/1.99.5/ 에서 hadoop2는 hadoop200이 붙은 tar.gz파일을 다운로드한다.
(sqoop-1.99.5-bin-hadoop200.tar.gz)
2. /usr/local로 이동후 압축풀기
mv sqoop-1.99.5-bin-hadoop200.tar.gz /usr/local
tar xvfz sqoop-1.99.5-bin-hadoop200.tar.gz
3. link 생성
ln -s sqoop-1.99.5-bin-hadoop200 sqoop
4. catalina.properties 설정
hadoop1 혹은 hadoop2이 어디에 설치되어있던 상관없이 hadoop library파일과 configuration파일이 사용가능한 곳에 있어야 한다.
vi /usr/local/sqoop/server/conf/catalina.properties
내용중 common.loader내용을 현재 상황에 맞게 jar파일위치를 지정해준다.
common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar,/usr/local/hadoop/share/usr/local/common/*.jar,/usr/local/hadoop/share/usr/local/hdfs/*.jar,/usr/local/hadoop/share/usr/local/mapreduce/*.jar,/usr/local/hadoop/share/usr/local/yarn/*.jar,/usr/local/hive/lib/*.jar,/usr/local/hadoop/share/usr/local/common/lib/*.jar
5. jdbc driver설정
mkdir /usr/local/sqoop/lib
cp postgresql-9.3-1103.jdbc4.jar /usr/local/sqoop/lib
6. server설정 확인
sqoop2-tool verify하면 "Caused by: java.sql.SQLNonTransientConnectionException: No current connection."메세지가 나올수 있는데
무시하고 진행하면 된다.(Verification was successful.라는 메세지가 보이므로..)
--->정상메세지
Verification was successful.
Tool class org.apache.sqoop.tools.tool.VerifyTool has finished correctly
-- 오류메세지
sqoop2-tool verify
Sqoop home directory: /usr/lib/sqoop
Setting SQOOP_HTTP_PORT: 12000
Setting SQOOP_ADMIN_PORT: 12001
Using CATALINA_OPTS:
Adding to CATALINA_OPTS: -Dsqoop.http.port=12000 -Dsqoop.admin.port=12001
Mar 14, 2015 11:02:53 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
WARNING: Problem with directory [/usr/lib/sqoop/lib], exists: [false], isDirectory: [false], canRead: [false]
Sqoop tool executor:
Version: 1.99.5
Revision: 9665c01f674d69d41a6fcfffb2c0b94590f70f59
Compiled on Wed Feb 18 09:42:27 PST 2015 by vbasavaraj
Running tool: class org.apache.sqoop.tools.tool.VerifyTool
2015-03-14 23:02:54,477 INFO [main] core.SqoopServer (SqoopServer.java:initialize(51)) - Booting up Sqoop server
2015-03-14 23:02:54,493 INFO [main] core.PropertiesConfigurationProvider (PropertiesConfigurationProvider.java:initialize(96)) - Starting config file poller thread
log4j: Parsing for [root] with value=[WARN, file].
log4j: Level token is [WARN].
log4j: Category root set to WARN
log4j: Parsing appender named "file".
log4j: Parsing layout options for "file".
log4j: Setting property [conversionPattern] to [%d{ISO8601} %-5p %c{2} [%l] %m%n].
log4j: End of parsing for "file".
log4j: Setting property [file] to [@LOGDIR@/sqoop.log].
log4j: Setting property [maxBackupIndex] to [5].
log4j: Setting property [maxFileSize] to [25MB].
log4j: setFile called: @LOGDIR@/sqoop.log, true
log4j: setFile ended
log4j: Parsed "file" options.
log4j: Parsing for [org.apache.sqoop] with value=[DEBUG].
log4j: Level token is [DEBUG].
log4j: Category org.apache.sqoop set to DEBUG
log4j: Handling log4j.additivity.org.apache.sqoop=[null]
log4j: Parsing for [org.apache.derby] with value=[INFO].
log4j: Level token is [INFO].
log4j: Category org.apache.derby set to INFO
log4j: Handling log4j.additivity.org.apache.derby=[null]
log4j: Finished configuring.
log4j: Could not find root logger information. Is this OK?
log4j: Parsing for [default] with value=[INFO,defaultAppender].
log4j: Level token is [INFO].
log4j: Category default set to INFO
log4j: Parsing appender named "defaultAppender".
log4j: Parsing layout options for "defaultAppender".
log4j: Setting property [conversionPattern] to [%d %-5p %c: %m%n].
log4j: End of parsing for "defaultAppender".
log4j: Setting property [file] to [@LOGDIR@/default.audit].
log4j: setFile called: @LOGDIR@/default.audit, true
log4j: setFile ended
log4j: Parsed "defaultAppender" options.
log4j: Handling log4j.additivity.default=[null]
log4j: Finished configuring.
Exception in thread "PurgeThread" org.apache.sqoop.common.SqoopException: JDBCREPO_0009:Failed to finalize transaction
at org.apache.sqoop.repository.JdbcRepositoryTransaction.close(JdbcRepositoryTransaction.java:115)
at org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:109)
at org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:61)
at org.apache.sqoop.repository.JdbcRepository.purgeSubmissions(JdbcRepository.java:589)
at org.apache.sqoop.driver.JobManager$PurgeThread.run(JobManager.java:648)
Caused by: java.sql.SQLNonTransientConnectionException: No current connection.
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.noCurrentConnection(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedConnection.checkIfClosed(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedConnection.setupContextStack(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedConnection.commit(Unknown Source)
at org.apache.commons.dbcp.DelegatingConnection.commit(DelegatingConnection.java:334)
at org.apache.commons.dbcp.DelegatingConnection.commit(DelegatingConnection.java:334)
at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.commit(PoolingDataSource.java:211)
at org.apache.sqoop.repository.JdbcRepositoryTransaction.close(JdbcRepositoryTransaction.java:112)
... 4 more
Caused by: java.sql.SQLException: No current connection.
at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
... 15 more
Verification was successful.
Tool class org.apache.sqoop.tools.tool.VerifyTool has finished correctly.
hduser@slave:/usr/lib/sqoop/bin$ hadoop version
Hadoop 2.4.1
Subversion http://svn.apache.org/repos/asf/hadoop/common -r 1604318
Compiled by jenkins on 2014-06-21T05:43Z
Compiled with protoc 2.5.0
From source with checksum bb7ac0a3c73dc131f4844b873c74b630
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.4.1.jar
7. server기등(실패)
bin>sqoop.sh server start
Sqoop home directory: /usr/local/sqoop
Setting SQOOP_HTTP_PORT: 12000
Setting SQOOP_ADMIN_PORT: 12001
Using CATALINA_OPTS:
Adding to CATALINA_OPTS: -Dsqoop.http.port=12000 -Dsqoop.admin.port=12001
Using CATALINA_BASE: /usr/local/sqoop/server
Using CATALINA_HOME: /usr/local/sqoop/server
Using CATALINA_TMPDIR: /usr/local/sqoop/server/temp
Using JRE_HOME: /usr/java/latest
Using CLASSPATH: /usr/local/sqoop/server/bin/bootstrap.jar
(서버stop : bin>sqoop.sh -server stop)
* default로 12000, 12001을 사용하는데 필요시 server/bin/setenv.sh에서
SQOOP_HTTP_PORT, SQOOP_ADMIN_PORT에서 변경가능함
=============아래와 같은 오류(로그폴더 bin>@LOGDIR@)가 발생하면서 sqoop2 server가 기동되지 않는다.=====>
*org.apache.sqoop.common.SqoopException: CONN_0007:Connector registration
failed*
* at
org.apache.sqoop.connector.ConnectorManager.registerConnectors(ConnectorManager.java:236)*
* at
org.apache.sqoop.connector.ConnectorManager.initialize(ConnectorManager.java:197)*
* at
org.apache.sqoop.connector.ConnectorManager.initialize(ConnectorManager.java:145)*
*Caused by: org.apache.sqoop.common.SqoopException: JDBCREPO_0013:Connector
metadata changed - upgrade may be required - Connector:
generic-jdbc-connector given:
connector-generic-jdbc-connector:-1:org.apache.sqoop.connector.jdbc.GenericJdbcConnector,
Connection: Forms:
form-connection:-1:[input-connection.jdbcDriver:-1:STRING:128,
input-connection.connectionString:-1:STRING:128,
input-connection.username:-1:STRING:40,
input-connection.password:-1:STRING:40,
input-connection.jdbcProperties:-1:MAP]Job type: EXPORTForms:
form-table:-1:[input-table.schemaName:-1:STRING:50,
input-table.tableName:-1:STRING:2000, input-table.sql:-1:STRING:50,
input-table.columns:-1:STRING:50]Job type: IMPORTForms:
form-table:-1:[input-table.schemaName:-1:STRING:50,
input-table.tableName:-1:STRING:50, input-table.sql:-1:STRING:2000,
input-table.columns:-1:STRING:50, input-table.partitionColumn:-1:STRING:50,
input-table.partitionColumnNull:-1:BOOLEAN,
input-table.boundaryQuery:-1:STRING:50] found:
connector-generic-jdbc-connector:1:org.apache.sqoop.connector.jdbc.GenericJdbcConnector,
Connection: Forms:
form-connection:1:[input-connection.jdbcDriver:1:STRING:128,
input-connection.connectionString:2:STRING:128,
input-connection.username:3:STRING:40,
input-connection.password:4:STRING:40,
input-connection.jdbcProperties:5:MAP]Job type: EXPORTForms:
form-table:2:[input-table.schemaName:6:STRING:50,
input-table.tableName:7:STRING:2000, input-table.sql:8:STRING:50,
input-table.columns:9:STRING:50, input-table.stageTableName:10:STRING:2000,
input-table.clearStageTable:11:BOOLEAN]Job type: IMPORTForms:
form-table:3:[input-table.schemaName:12:STRING:50,
input-table.tableName:13:STRING:50, input-table.sql:14:STRING:2000,
input-table.columns:15:STRING:50, input-table.partitionColumn:16:STRING:50,
input-table.partitionColumnNull:17:BOOLEAN,
input-table.boundaryQuery:18:STRING:50]*
* at
org.apache.sqoop.repository.JdbcRepository$3.doIt(JdbcRepository.java:195)*
그래서 sqoop.properties파일에
org.apache.sqoop.connector.autoupgrade=true
org.apache.sqoop.framework.autoupgrade=true
를 설정하여도 마찬가지로 기동되지 않는다.....===> failed
8. client기동
bin>sqoop.sh client를 실행하면 된다.
sqoop:000>show connector하면 오류메세지가 나온다.
============> 그지 같은 SQOOP2 <===================
---------sqoop.properties---------------------
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
#
# Sqoop configuration file used by the built in configuration
# provider: org.apache.sqoop.core.PropertiesConfigurationProvider.
# This file must reside in the system configuration directory
# which is specified by the system property "sqoop.config.dir"
# and must be called sqoop.properties.
#
# NOTE: Tokens specified in this file that are marked by a
# leading and trailing '@' characters should be replaced by
# their appropriate values. For example, the token @LOGDIR@
# should be replaced appropriately.
#
# The following tokens are used in this configuration file:
#
# LOGDIR
# The absolute path to the directory where system genearated
# log files will be kept.
#
# BASEDIR
# The absolute path to the directory where Sqoop 2 is installed
#
#
# Logging Configuration
# Any property that starts with the prefix
# org.apache.sqoop.log4j is parsed out by the configuration
# system and passed to the log4j subsystem. This allows you
# to specify log4j configuration properties from within the
# Sqoop configuration.
#
org.apache.sqoop.log4j.appender.file=org.apache.log4j.RollingFileAppender
org.apache.sqoop.log4j.appender.file.File=@LOGDIR@/sqoop.log
org.apache.sqoop.log4j.appender.file.MaxFileSize=25MB
org.apache.sqoop.log4j.appender.file.MaxBackupIndex=5
org.apache.sqoop.log4j.appender.file.layout=org.apache.log4j.PatternLayout
org.apache.sqoop.log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} [%l] %m%n
org.apache.sqoop.log4j.debug=true
org.apache.sqoop.log4j.rootCategory=WARN, file
org.apache.sqoop.log4j.category.org.apache.sqoop=DEBUG
org.apache.sqoop.log4j.category.org.apache.derby=INFO
#
# Audit Loggers Configuration
# Multiple audit loggers could be given here. To specify an
# audit logger, you should at least add org.apache.sqoop.
# auditlogger.[LoggerName].class. You could also provide
# more configuration options by using org.apache.sqoop.
# auditlogger.[LoggerName] prefix, then all these options
# are parsed to the logger class.
#
org.apache.sqoop.auditlogger.default.class=org.apache.sqoop.audit.FileAuditLogger
org.apache.sqoop.auditlogger.default.file=@LOGDIR@/default.audit
#
# Repository configuration
# The Repository subsystem provides the special prefix which
# is "org.apache.sqoop.repository.sysprop". Any property that
# is specified with this prefix is parsed out and set as a
# system property. For example, if the built in Derby repository
# is being used, the sysprop prefixed properties can be used
# to affect Derby configuration at startup time by setting
# the appropriate system properties.
#
# Repository provider
org.apache.sqoop.repository.provider=org.apache.sqoop.repository.JdbcRepositoryProvider
# Repository upgrade
# If set to true, it will not upgrade the sqoop respository schema, by default it will iniate the upgrade on server start-up
#org.apache.sqoop.repository.schema.immutable=false
org.apache.sqoop.repository.schema.immutable=true
# JDBC repository provider configuration
#org.apache.sqoop.repository.jdbc.handler=org.apache.sqoop.repository.derby.DerbyRepositoryHandler
#org.apache.sqoop.repository.jdbc.transaction.isolation=READ_COMMITTED
#org.apache.sqoop.repository.jdbc.maximum.connections=10
#org.apache.sqoop.repository.jdbc.url=jdbc:derby:@BASEDIR@/repository/db;create=true
#org.apache.sqoop.repository.jdbc.driver=org.apache.derby.jdbc.EmbeddedDriver
#org.apache.sqoop.repository.jdbc.user=sa
#org.apache.sqoop.repository.jdbc.password=
# JDBC repository provider configuration
org.apache.sqoop.repository.jdbc.handler=org.apache.sqoop.repository.postgresql.PostgresqlRepositoryHandler
org.apache.sqoop.repository.jdbc.transaction.isolation=READ_COMMITTED
org.apache.sqoop.repository.jdbc.maximum.connections=10
org.apache.sqoop.repository.jdbc.url=jdbc:postgresql://postgres.dbserver.com:5432/sqoopdb
org.apache.sqoop.repository.jdbc.create.schema=true
org.apache.sqoop.repository.jdbc.driver=org.postgresql.Driver
#org.apache.sqoop.repository.jdbc.driver=org.apache.postgresql.jdbc.EmbeddedDriver
org.apache.sqoop.repository.jdbc.user=sqoop
org.apache.sqoop.repository.jdbc.password=sqoop
# System properties for embedded Derby configuration
org.apache.sqoop.repository.sysprop.derby.stream.error.file=@LOGDIR@/derbyrepo.log
#
# Sqoop Connector configuration
# If set to true will initiate Connectors config upgrade during server startup
#
org.apache.sqoop.connector.autoupgrade=true
#
# Sqoop Driver configuration
# If set to true will initiate the Driver config upgrade during server startup
#
org.apache.sqoop.driver.autoupgrade=true
# Sleeping period for reloading configuration file (once a minute)
org.apache.sqoop.core.configuration.provider.properties.sleep=60000
#
# Submission engine configuration
#
# Submission engine class
org.apache.sqoop.submission.engine=org.apache.sqoop.submission.mapreduce.MapreduceSubmissionEngine
# Number of milliseconds, submissions created before this limit will be removed, default is one day
#org.apache.sqoop.submission.purge.threshold=
# Number of milliseconds for purge thread to sleep, by default one day
#org.apache.sqoop.submission.purge.sleep=
# Number of milliseconds for update thread to sleep, by default 5 minutes
#org.apache.sqoop.submission.update.sleep=
#
# Configuration for Mapreduce submission engine (applicable if it's configured)
#
# Hadoop configuration directory
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/local/hadoop/etc/hadoop/
#
# Execution engine configuration
#
org.apache.sqoop.execution.engine=org.apache.sqoop.execution.mapreduce.MapreduceExecutionEngine
#
# Authentication configuration
#
#org.apache.sqoop.security.authentication.type=SIMPLE
#org.apache.sqoop.security.authentication.handler=org.apache.sqoop.security.authentication.SimpleAuthenticationHandler
#org.apache.sqoop.security.authentication.anonymous=true
#org.apache.sqoop.security.authentication.type=KERBEROS
#org.apache.sqoop.security.authentication.handler=org.apache.sqoop.security.authentication.KerberosAuthenticationHandler
#org.apache.sqoop.security.authentication.kerberos.principal=sqoop/_HOST@NOVALOCAL
#org.apache.sqoop.security.authentication.kerberos.keytab=/home/kerberos/sqoop.keytab
#org.apache.sqoop.security.authentication.kerberos.http.principal=HTTP/_HOST@NOVALOCAL
#org.apache.sqoop.security.authentication.kerberos.http.keytab=/home/kerberos/sqoop.keytab
#org.apache.sqoop.security.authentication.enable.doAs=true
#org.apache.sqoop.security.authentication.proxyuser.#USER#.users=*
#org.apache.sqoop.security.authentication.proxyuser.#USER#.groups=*
#org.apache.sqoop.security.authentication.proxyuser.#USER#.hosts=*
#
# Authorization configuration
#
#org.apache.sqoop.security.authorization.handler=org.apache.sqoop.security.authorization.DefaultAuthorizationHandler
#org.apache.sqoop.security.authorization.access_controller=org.apache.sqoop.security.authorization.DefaultAuthorizationAccessController
#org.apache.sqoop.security.authorization.validator=org.apache.sqoop.security.authorization.DefaultAuthorizationValidator
#org.apache.sqoop.security.authorization.authentication_provider=org.apache.sqoop.security.authorization.DefaultAuthenticationProvider
#org.apache.sqoop.security.authorization.server_name=SqoopServer1
# External connectors load path
# "/path/to/external/connectors/": Add all the connector JARs in the specified folder
#
org.apache.sqoop.connector.external.loadpath=