Notes on developing and operating Cloudera CDH/CDP, the Hadoop ecosystem, Semantic IoT, and related technologies. Send questions to gooper@gooper.com.
1. Download (Elasticsearch)
a. Elasticsearch => https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.3.0/elasticsearch-2.3.0.tar.gz
b. ES-Hadoop => https://www.elastic.co/thank-you?url=http://download.elastic.co/hadoop/elasticsearch-hadoop-2.2.0.zip
2. Extract the archive
tar xvfz elasticsearch-2.3.0.tar.gz
3. Create a symbolic link
ln -s elasticsearch-2.3.0 elasticsearch
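Steps 1 to 3 can be combined into one small script. This is a minimal sketch, assuming bash, wget, and that the download URL from step 1 is still reachable; the ES_VERSION variable and the function names are illustrative and do not come from the original notes.

```shell
#!/usr/bin/env bash
# Sketch of steps 1-3: download, extract, and link one Elasticsearch release.
# ES_VERSION and the function names are illustrative assumptions.
ES_VERSION="${ES_VERSION:-2.3.0}"

# Build the tarball URL from step 1 for a given version.
es_tarball_url() {
  local version="$1"
  echo "https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/${version}/elasticsearch-${version}.tar.gz"
}

# Download the tarball if it is not already present, extract it,
# and (re)create the version-independent "elasticsearch" link.
install_es() {
  local version="$1"
  local tarball="elasticsearch-${version}.tar.gz"
  [ -f "$tarball" ] || wget "$(es_tarball_url "$version")" || return 1
  tar xzf "$tarball" || return 1
  ln -sfn "elasticsearch-${version}" elasticsearch
}

# Usage (performs the download): install_es "$ES_VERSION"
```

Keeping the version in one variable means a later upgrade only needs the link to be repointed.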
4. Edit config/elasticsearch.yml
-bash-4.1# cat elasticsearch.yml
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please see the documentation for further information on configuration options:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration.html>
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
# cluster.name: my-application
cluster.name: iot
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
# node.name: node-1
node.name: node-1
node.master: true
node.data: false
#
# Add custom attributes to the node:
#
# node.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
# path.data: /path/to/data
path.data: /data/elasticsearch/data
#
# Path to log files:
#
# path.logs: /path/to/logs
path.logs: /logs/elasticsearch/logs
#
# ----------------------------------- Index ------------------------------------
index.number_of_shards: 5
index.number_of_replicas: 1
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
# bootstrap.mlockall: true
#
# Make sure that the `ES_HEAP_SIZE` environment variable is set to about half the memory
# available on the system and that the owner of the process is allowed to use this limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
# network.host: 192.168.0.1
network.host: xxx.xxx.xxx.43
#
# Set a custom port for HTTP:
#
# http.port: 9200
http.port: 9200
transport.tcp.port: 9300
transport.tcp.compress: true
http.enabled: true
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html>
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
# discovery.zen.ping.unicast.hosts: ["host1", "host2"]
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["gsda1:9300", "gsda1:9301", "gsda1:9302"]
action.auto_create_index: true
index.mapper.dynamic: true
#
# Prevent the "split brain" by configuring the majority of nodes (total number of nodes / 2 + 1):
#
# discovery.zen.minimum_master_nodes: 3
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery.html>
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
# gateway.recover_after_nodes: 3
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-gateway.html>
#
# ---------------------------------- Various -----------------------------------
#
# Disable starting multiple nodes on a single system:
#
# node.max_local_storage_nodes: 1
#
# Require explicit names when deleting indices:
#
# action.destructive_requires_name: true
# ---------------------------------- Marvel Exporter -----------------------------------
marvel.agent.exporter.es.hosts: ["gsda1:9200", "gsda2:9200", "gsda3:9200", "gsda4:9200", "gsda5:9200"]
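The split-brain comment in the configuration above gives the rule (number of nodes / 2) + 1 for discovery.zen.minimum_master_nodes; the count that matters is the number of master-eligible nodes. A small sketch of that arithmetic follows; the function name is illustrative.

```shell
#!/usr/bin/env bash
# Quorum for discovery.zen.minimum_master_nodes:
# (master-eligible nodes / 2) + 1, using integer division.
minimum_master_nodes() {
  local master_eligible="$1"
  echo $(( master_eligible / 2 + 1 ))
}

minimum_master_nodes 1   # this cluster: one master-eligible node -> prints 1
minimum_master_nodes 3   # three master-eligible nodes -> prints 2
```

With the single master configured in this guide the value is 1, so the setting only becomes relevant once more master-eligible nodes are added.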
5. Copy the installation to each server with scp.
scp -r -P 22 elasticsearch-2.3.0 root@gsda2:$HOME
scp -r -P 22 elasticsearch-2.3.0 root@gsda3:$HOME
scp -r -P 22 elasticsearch-2.3.0 root@gsda4:$HOME
scp -r -P 22 elasticsearch-2.3.0 root@gsda5:$HOME
6. Log in to each of the five servers and create the symbolic link.
ln -s elasticsearch-2.3.0 elasticsearch
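Steps 5 and 6 repeat the same two commands per server, so they can be looped. The sketch below assumes bash and passwordless SSH as root to the hosts named in this guide; HOSTS, ES_DIR, DRY_RUN, and the function names are illustrative. It copies into the remote home directory by leaving the path after the colon empty.

```shell
#!/usr/bin/env bash
# Sketch of steps 5-6: copy the installation to each remaining server
# and create the link there. HOSTS, ES_DIR, and DRY_RUN are assumptions.
HOSTS="${HOSTS:-gsda2 gsda3 gsda4 gsda5}"
ES_DIR="${ES_DIR:-elasticsearch-2.3.0}"

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

deploy_all() {
  local host
  for host in $HOSTS; do
    # Step 5: copy the directory into the remote home directory.
    run scp -r -P 22 "$ES_DIR" "root@${host}:" || return 1
    # Step 6: create the version-independent link on the remote side.
    run ssh -p 22 "root@${host}" "ln -sfn $ES_DIR elasticsearch" || return 1
  done
}

# Usage: DRY_RUN=1 deploy_all   # print the commands before running them
```

Running it once with DRY_RUN=1 shows exactly which commands would be sent to which host.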
7. Configure one node as the master and the remaining four as data nodes.
cluster.name: iot (use the same value on every server)
network.host: XXX.XXX.XXX.XXX (use each server's own IP address)
node.name: node-1 (use a unique value on each server)
* On the master server, set
node.master: true
node.data: false
and on the remaining servers (data nodes), set
node.master: false
node.data: true
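The per-server values from step 7 differ only in node name, IP address, and role, so they can be generated. This is a sketch under assumptions: the function name is illustrative and the 192.0.2.x address is a documentation placeholder, not an address from the original cluster.

```shell
#!/usr/bin/env bash
# Print the per-node settings from step 7 for one server.
# Arguments: node name, IP address, role ("master" or "data").
node_settings() {
  local name="$1" ip="$2" role="$3" is_master is_data
  case "$role" in
    master) is_master=true;  is_data=false ;;
    data)   is_master=false; is_data=true  ;;
    *) echo "unknown role: $role" >&2; return 1 ;;
  esac
  cat <<EOF
cluster.name: iot
node.name: ${name}
node.master: ${is_master}
node.data: ${is_data}
network.host: ${ip}
EOF
}

# Example: settings for a data node (placeholder IP address).
node_settings node-2 192.0.2.44 data
```

The output is meant to replace the matching lines in each server's config/elasticsearch.yml rather than be appended, since duplicate keys in the file are best avoided.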
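8. Start Elasticsearch (start it on each server separately; here it was run as root)
==> bin/elasticsearch -d : starts it as a daemon. To run it in the foreground on the console, omit -d.
==> Running it as root produces the error shown below.
In that case, starting it with "elasticsearch -d -Des.insecure.allow.root=true" gets past the error.
--------------Error output------------------------------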
-bash-4.1# ./elasticsearch -d
-bash-4.1# Exception in thread "main" java.lang.RuntimeException: don't run elasticsearch as root.
at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:93)
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:144)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:270)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:35)
Refer to the log for complete error details.
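An alternative to the allow-root flag is to run Elasticsearch under a dedicated account. The sketch below only prints the commands so they can be reviewed first. It makes several assumptions that are not in the original notes: an account named elastic, and an installation copied to /opt/elasticsearch-2.3.0, because root's home directory is normally not readable by other accounts. The data and log paths are the ones from step 4.

```shell
#!/usr/bin/env bash
# Sketch: commands for running Elasticsearch under a dedicated account.
# ES_USER and ES_HOME are illustrative assumptions; the data and log
# directories are path.data and path.logs from step 4.
ES_USER="${ES_USER:-elastic}"
ES_HOME="${ES_HOME:-/opt/elasticsearch-2.3.0}"

# Print the commands; run them as root after reviewing them.
es_user_commands() {
  cat <<EOF
useradd -m ${ES_USER}
chown -R ${ES_USER} ${ES_HOME} /data/elasticsearch/data /logs/elasticsearch/logs
su - ${ES_USER} -c '${ES_HOME}/bin/elasticsearch -d'
EOF
}

es_user_commands
```

The chown step matters because the data and log directories must be writable by whichever account starts the node.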
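9. Check the cluster and node information.
a. Node information (can be checked on each node): http://gsda1:9200/
b. Cluster health (queried here through the master node; any node with HTTP enabled can answer it): http://gsda1:9200/_cluster/health?pretty=true
c. Node details (can be checked on each node): http://gsda1:9200/_nodes?pretty=true
d. Node settings (can be checked on each node): http://gsda1:9200/_nodes/settings?pretty=true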
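The health check in step 9 can also be scripted, for example to confirm that all five nodes have joined. This is a minimal sketch assuming bash, curl, and sed; health_field and cluster_has_nodes are illustrative names, and the sed expression only handles flat fields whose values contain no spaces, such as those in the health response.

```shell
#!/usr/bin/env bash
# Extract one flat field from a cluster health response read on stdin.
# Usage: health_field <name> < health.json
health_field() {
  sed -n 's/.*"'"$1"'" *: *"\{0,1\}\([^,"} ]*\)"\{0,1\}.*/\1/p'
}

# Succeeds only when the cluster reports the expected number of nodes.
# The default URL is the master node used throughout this guide.
cluster_has_nodes() {
  local expected="$1" url="${2:-http://gsda1:9200}" health nodes
  health="$(curl -s "$url/_cluster/health")" || return 1
  nodes="$(printf '%s\n' "$health" | health_field number_of_nodes)"
  [ "$nodes" = "$expected" ]
}

# Offline demonstration with an abbreviated health response.
sample='{ "cluster_name" : "iot", "status" : "green", "number_of_nodes" : 0, "number_of_data_nodes" : 0 }'
printf '%s\n' "$sample" | health_field number_of_nodes   # prints 0
```

Usage against the live cluster would be `cluster_has_nodes 5 && echo "all nodes joined"`.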
10. Test
a. Create an index: -bash-4.1# curl -XPUT 'http://gsda1:9200/blog' ==> {"acknowledged":true}
b. Check the settings of the created index:
-bash-4.1# curl -XGET 'http://gsda1:9200/blog/_settings?pretty=true'
{
"blog" : {
"settings" : {
"index" : {
"creation_date" : "1459833926158",
"number_of_shards" : "5",
"number_of_replicas" : "1",
"uuid" : "XoppVt7VQPGgYHP_J4y3fQ",
"version" : {
"created" : "2030099"
}
}
}
}
}
c. Delete the index: -bash-4.1# curl -XDELETE 'http://gsda1:9200/blog' ==> {"acknowledged":true}
* Deletion removes the index physically and immediately; it cannot be recovered.
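The index in step 10 picks up its shard and replica counts from the defaults. They can also be passed in the create request itself. The sketch below assumes bash and curl; the function names and the ES_URL variable are illustrative, and the index name and counts are simply the values used in this guide.

```shell
#!/usr/bin/env bash
# Build the settings body for creating an index with explicit
# shard and replica counts.
index_settings_json() {
  printf '{"settings" : {"number_of_shards" : %d, "number_of_replicas" : %d}}\n' "$1" "$2"
}

# Create the index; ES_URL is an assumed variable that defaults to the
# master node from this guide.
create_index() {
  local name="$1" shards="$2" replicas="$3"
  curl -s -XPUT "${ES_URL:-http://gsda1:9200}/${name}" \
    -d "$(index_settings_json "$shards" "$replicas")"
}

index_settings_json 5 1
# Usage against the cluster: create_index blog 5 1
```

Setting the counts per index avoids depending on node-level defaults, which is useful because the number of shards cannot be changed after the index is created.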
11. Automatic document indexing (requires index.mapper.dynamic: true in elasticsearch.yml, which is the default)
a. Add documents (written and registered as JSON)
curl -XPOST 'http://gsda1:9200/blog/article/1' -d '{
"article_id" : 1,
"title" : "This is a Title1",
"content" : "This is a Content1"
}'
curl -XPOST 'http://gsda1:9200/blog/article/2' -d '{
"article_id" : 2,
"title" : "This is a Title2",
"content" : "This is a Content2"
}'
===> {"_index":"blog","_type":"article","_id":"2","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"created":true}
b. Get a document (by specifying its id)
curl -XGET 'http://gsda1:9200/blog/article/1'
====>
{"_index":"blog","_type":"article","_id":"1","_version":1,"found":true,"_source":{
"article_id" : 1,
"title" : "This is a Title1",
"content" : "This is a Content1"
}}
c. Check the index and type mapping
curl -XGET 'http://gsda1:9200/blog/_mapping?pretty=true'
==>
{
"blog" : {
"mappings" : {
"article" : {
"properties" : {
"article_id" : {
"type" : "long"
},
"content" : {
"type" : "string"
},
"title" : {
"type" : "string"
}
}
}
}
}
}
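The indexing and retrieval requests in step 11 can be wrapped in small helpers. This is a sketch assuming bash and curl; the function names and the ES_URL variable are illustrative, and the JSON builder does no escaping, so titles and contents must not contain double quotes.

```shell
#!/usr/bin/env bash
# Build the JSON body used in step 11.
# No escaping is done: values must not contain double quotes.
article_json() {
  printf '{"article_id" : %d, "title" : "%s", "content" : "%s"}\n' "$1" "$2" "$3"
}

# Index one article under /blog/article/<id>.
# ES_URL is an assumed variable that defaults to the master node.
index_article() {
  local id="$1"
  curl -s -XPOST "${ES_URL:-http://gsda1:9200}/blog/article/${id}" \
    -d "$(article_json "$id" "$2" "$3")"
}

# Fetch one article by id.
get_article() {
  curl -s -XGET "${ES_URL:-http://gsda1:9200}/blog/article/$1"
}

article_json 3 "This is a Title3" "This is a Content3"
# Usage against the cluster:
#   index_article 3 "This is a Title3" "This is a Content3"
#   get_article 3
```

Using the article id as the document id, as the guide does, means that posting the same id again overwrites the document and raises its _version.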
To tell every node explicitly where the master is, set
discovery.zen.ping.unicast.hosts: ["gsda1:9300", "gsda1:9301", "gsda1:9302"] # location of the master server
in elasticsearch.yml (apply this on all servers).
------------------------------Cluster health output when number_of_nodes and number_of_data_nodes are 0 even though discovery.zen.ping.unicast.hosts is set and all servers have been started-------------------------
{ "cluster_name" : "iot", "status" : "green", "timed_out" : false, "number_of_nodes" : 0, "number_of_data_nodes" : 0, "active_primary_shards" : 0, "active_shards" : 0, "relocating_shards" : 0, "initializing_shards" : 0, "unassigned_shards" : 0, "delayed_unassigned_shards" : 0, "number_of_pending_tasks" : 0, "number_of_in_flight_fetch" : 0, "task_max_waiting_in_queue_millis" : 0, "active_shards_percent_as_number" : 100.0 }
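When the node counts stay at 0, one thing worth checking from each server is whether the transport port of the master can be reached at all. The sketch below assumes bash (for its /dev/tcp redirection) and the coreutils timeout command; UNICAST_HOSTS mirrors the entries from elasticsearch.yml, and the function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: check that each discovery.zen.ping.unicast.hosts entry accepts
# TCP connections. UNICAST_HOSTS mirrors the list in elasticsearch.yml.
UNICAST_HOSTS="${UNICAST_HOSTS:-gsda1:9300 gsda1:9301 gsda1:9302}"

# Succeeds if a TCP connection to host:port opens within 3 seconds.
port_open() {
  local host="${1%%:*}" port="${1##*:}"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Report the state of every configured entry.
check_unicast_hosts() {
  local entry
  for entry in $UNICAST_HOSTS; do
    if port_open "$entry"; then
      echo "$entry reachable"
    else
      echo "$entry NOT reachable"
    fi
  done
}

# Usage: check_unicast_hosts
```

Only ports with a running node answer, so with one node on gsda1 it is normal for 9301 and 9302 to be reported as not reachable. If 9300 is not reachable either, a firewall or the network.host binding is a likely place to look.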