Cloudera, BigData, Semantic IoT, Hadoop, NoSQL

This blog collects development and operations notes on Cloudera CDH/CDP, the Hadoop ecosystem, Semantic IoT, and related technologies. Send questions to gooper@gooper.com.

1. Download (Elasticsearch)

 a. Elasticsearch => https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.3.0/elasticsearch-2.3.0.tar.gz


2. Extract the archive

   tar xvfz elasticsearch-2.3.0.tar.gz

3. Create a symbolic link

  ln -s elasticsearch-2.3.0 elasticsearch
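Steps 1 to 3 can be collected into a small script. The following is a minimal sketch, assuming wget and tar are installed; the function names es_tarball_url, link_es, and install_es are made up for illustration, and the URL is simply the one from step 1 with the version factored out.

```shell
#!/bin/sh
# Sketch of steps 1-3. Function names are illustrative, not part of Elasticsearch.

# Build the download URL from step 1 for a given version.
es_tarball_url() {
    version="$1"
    echo "https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/${version}/elasticsearch-${version}.tar.gz"
}

# Point the version-neutral name "elasticsearch" at the extracted directory.
# -f and -n together replace an existing link instead of creating a link inside it.
link_es() {
    version="$1"
    ln -sfn "elasticsearch-${version}" elasticsearch
}

# Download, extract, and link in the current directory; stops at the first failure.
install_es() {
    version="$1"
    wget "$(es_tarball_url "$version")" &&
    tar xvfz "elasticsearch-${version}.tar.gz" &&
    link_es "$version"
}
```

Calling `install_es 2.3.0` in the target directory would then perform all three steps. Keeping the version in one argument means a later upgrade only has to re-point the link.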

4. Edit config/elasticsearch.yml

-bash-4.1# cat elasticsearch.yml 

# ======================== Elasticsearch Configuration =========================


# NOTE: Elasticsearch comes with reasonable defaults for most settings.

#       Before you set out to tweak and tune the configuration, make sure you

#       understand what are you trying to accomplish and the consequences.


# The primary way of configuring a node is via this file. This template lists

# the most important settings you may want to configure for a production cluster.


# Please see the documentation for further information on configuration options:

# <http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration.html>


# ---------------------------------- Cluster -----------------------------------


# Use a descriptive name for your cluster:


# cluster.name: my-application

cluster.name: iot


# ------------------------------------ Node ------------------------------------


# Use a descriptive name for the node:


# node.name: node-1

node.name: node-1

node.master: true

node.data: false


# Add custom attributes to the node:


# node.rack: r1


# ----------------------------------- Paths ------------------------------------


# Path to directory where to store the data (separate multiple locations by comma):


# path.data: /path/to/data

path.data: /data/elasticsearch/data


# Path to log files:


# path.logs: /path/to/logs

path.logs: /logs/elasticsearch/logs


# ----------------------------------- Index ------------------------------------

index.number_of_shards: 5

index.number_of_replicas: 1

# ----------------------------------- Memory -----------------------------------


# Lock the memory on startup:


# bootstrap.mlockall: true


# Make sure that the `ES_HEAP_SIZE` environment variable is set to about half the memory

# available on the system and that the owner of the process is allowed to use this limit.


# Elasticsearch performs poorly when the system is swapping the memory.


# ---------------------------------- Network -----------------------------------


# Set the bind address to a specific IP (IPv4 or IPv6):


# network.host:

network.host: xxx.xxx.xxx.43 


# Set a custom port for HTTP:


# http.port: 9200

http.port: 9200

transport.tcp.port: 9300

transport.tcp.compress: true

http.enabled: true


# For more information, see the documentation at:

# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html>


# --------------------------------- Discovery ----------------------------------


# Pass an initial list of hosts to perform discovery when new node is started:

# The default list of hosts is ["127.0.0.1", "[::1]"]


# discovery.zen.ping.unicast.hosts: ["host1", "host2"]

discovery.zen.ping.multicast.enabled: false

discovery.zen.ping.unicast.hosts: ["gsda1:9300", "gsda1:9301", "gsda1:9302"]

action.auto_create_index: true

index.mapper.dynamic: true


# Prevent the "split brain" by configuring the majority of nodes (total number of nodes / 2 + 1):


# discovery.zen.minimum_master_nodes: 3


# For more information, see the documentation at:

# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery.html>


# ---------------------------------- Gateway -----------------------------------


# Block initial recovery after a full cluster restart until N nodes are started:


# gateway.recover_after_nodes: 3


# For more information, see the documentation at:

# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-gateway.html>


# ---------------------------------- Various -----------------------------------


# Disable starting multiple nodes on a single system:


# node.max_local_storage_nodes: 1


# Require explicit names when deleting indices:


# action.destructive_requires_name: true

# ---------------------------------- Marvel Exporter -----------------------------------

marvel.agent.exporter.es.hosts: ["gsda1:9200", "gsda2:9200", "gsda3:9200", "gsda4:9200", "gsda5:9200"]
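The commented-out discovery.zen.minimum_master_nodes setting in the template above comes with a formula in its comment (number of nodes / 2 + 1). As a rough sketch, the arithmetic looks like this in shell; min_master_nodes is a made-up helper name, and the division is integer division.

```shell
# Quorum from the template comment: (number of nodes / 2) + 1, with integer division.
min_master_nodes() {
    echo $(( $1 / 2 + 1 ))
}

min_master_nodes 3   # prints 2
min_master_nodes 1   # prints 1
```

With the layout in this post (a single node with node.master: true), the formula gives 1, so a value of 3 would only make sense with more master-eligible nodes.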

5. Copy the directory to each server with scp.

scp -r -P 22 elasticsearch-2.3.0 root@gsda2:$HOME

scp -r -P 22 elasticsearch-2.3.0 root@gsda3:$HOME

scp -r -P 22 elasticsearch-2.3.0 root@gsda4:$HOME

scp -r -P 22 elasticsearch-2.3.0 root@gsda5:$HOME

6. Log in to each of the five servers and create the link.

ln -s elasticsearch-2.3.0 elasticsearch
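Steps 5 and 6 repeat the same two commands per host, so they can be generated in a loop. This is only a sketch: deploy_cmds is a hypothetical helper that prints the commands rather than running them, it assumes root SSH access on port 22 as in the scp lines above, and it leaves the remote path empty so the copy lands in the remote user's home directory instead of relying on the local $HOME.

```shell
# Print the copy and link commands for every remote host (dry run).
# Host names follow the post; pipe the output to sh to actually execute it.
deploy_cmds() {
    release="elasticsearch-2.3.0"
    for host in gsda2 gsda3 gsda4 gsda5; do
        echo "scp -r -P 22 ${release} root@${host}:"
        echo "ssh -p 22 root@${host} 'ln -s ${release} elasticsearch'"
    done
}

deploy_cmds
```

Reviewing the printed commands first and then running `deploy_cmds | sh` keeps a typo in a host name from going unnoticed.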

7. Configure one node as the master and the other four as data nodes.

cluster.name: iot (same value on every server)

network.host: XXX.XXX.XXX.XXX (each server's own IP)

node.name: node-1 (a unique value on each server)

* On the master server, set

node.master: true

node.data: false

and on the remaining servers (data nodes), set

node.master: false

node.data: true
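The per-server differences in step 7 come down to three values: node name, IP, and role. A small generator can make that explicit. This is a sketch only; node_settings is a made-up function, and the address 192.0.2.12 is a documentation placeholder, not one of the real server IPs.

```shell
# Emit the per-node settings from step 7.
# Usage: node_settings <node-name> <ip> <master|data>
node_settings() {
    name="$1"; ip="$2"; role="$3"
    if [ "$role" = "master" ]; then
        master=true; data=false
    else
        master=false; data=true
    fi
    cat <<EOF
cluster.name: iot
node.name: ${name}
network.host: ${ip}
node.master: ${master}
node.data: ${data}
EOF
}

# Example: settings for a data node (placeholder IP).
node_settings node-2 192.0.2.12 data
```

The output is meant to be merged by hand into each server's config/elasticsearch.yml in place of the corresponding lines.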

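8. Start Elasticsearch (start it on each server individually; here it was run as root)

 ==> bin/elasticsearch -d  : starts it as a daemon. Drop -d to run it in the foreground on the console.

 ==> Oops: running it as root fails with the error below.

   In that case, starting it with "elasticsearch -d -Des.insecure.allow.root=true" gets around the check.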


-bash-4.1# ./elasticsearch -d

-bash-4.1# Exception in thread "main" java.lang.RuntimeException: don't run elasticsearch as root.

        at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:93)

        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:144)

        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:270)

        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:35)

Refer to the log for complete error details.
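An alternative to overriding the root check is to launch Elasticsearch under an unprivileged account. The sketch below is not what the post did; it assumes a dedicated user named elastic (a hypothetical name) that already exists and owns the install, data, and log directories. es_start_command only prints the command it would use, based on the numeric user id passed in.

```shell
# Choose how to launch Elasticsearch depending on who is running the script.
# "elastic" is a hypothetical dedicated account; create it beforehand (for example
# with useradd) and give it ownership of the install, data, and log directories.
es_start_command() {
    uid="$1"
    if [ "$uid" -eq 0 ]; then
        # root: hand off to the unprivileged account instead of overriding the check
        echo "su elastic -c 'bin/elasticsearch -d'"
    else
        echo "bin/elasticsearch -d"
    fi
}

# Show the command for the current user.
es_start_command "$(id -u)"
```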

9. Check the cluster and node information.

  a. Node info (available on every node) : http://gsda1:9200/

  b. Cluster health (queried here through the master node) : http://gsda1:9200/_cluster/health?pretty=true

  c. Node details (available on every node) : http://gsda1:9200/_nodes?pretty=true

  d. Node settings (available on every node) : http://gsda1:9200/_nodes/settings?pretty=true
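The four URLs in step 9 can also be fetched from the command line. The following is a minimal sketch, assuming curl is installed and the HTTP port is 9200 as configured in step 4; es_endpoints and check_node are made-up helper names.

```shell
# List the four check URLs from step 9 for a given host.
es_endpoints() {
    host="$1"
    for path in "" "_cluster/health?pretty=true" "_nodes?pretty=true" "_nodes/settings?pretty=true"; do
        echo "http://${host}:9200/${path}"
    done
}

# Fetch each URL in turn; -s hides curl's progress meter.
check_node() {
    es_endpoints "$1" | while read -r url; do
        echo "== ${url}"
        curl -s "$url"
        echo
    done
}
```

Running `check_node gsda1` prints each response under its URL, which makes it easy to repeat the same check against gsda2 through gsda5.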

10. TEST

  a. Create an index : -bash-4.1# curl -XPUT 'http://gsda1:9200/blog' ==> {"acknowledged":true}

  b. Check the settings of the created index : 

-bash-4.1# curl -XGET 'http://gsda1:9200/blog/_settings?pretty=true'


{
  "blog" : {
    "settings" : {
      "index" : {
        "creation_date" : "1459833926158",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "XoppVt7VQPGgYHP_J4y3fQ",
        "version" : {
          "created" : "2030099"
        }
      }
    }
  }
}

  c. Delete the index : -bash-4.1# curl -XDELETE 'http://gsda1:9200/blog'     ==> {"acknowledged":true}

     * Deletion removes the data physically and immediately, so it cannot be recovered.
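The index in this test picks up its shard and replica counts from the defaults. They can also be passed explicitly in the body of the create request. This is a sketch under that assumption; index_settings_json and create_index are made-up helpers, and the host is the one used throughout the post.

```shell
# Build the request body for creating an index with explicit shard settings.
index_settings_json() {
    shards="$1"; replicas="$2"
    printf '{"settings":{"number_of_shards":%s,"number_of_replicas":%s}}\n' "$shards" "$replicas"
}

# Create the index on gsda1: create_index <name> <shards> <replicas>
create_index() {
    curl -XPUT "http://gsda1:9200/$1" -d "$(index_settings_json "$2" "$3")"
}
```

For example, `create_index blog 5 1` would request the same values shown in the settings output above.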

11. Automatic document indexing (index.mapper.dynamic: true must be set in elasticsearch.yml)

 . Adding a document, example 1 (the body is written as JSON and posted)

curl -XPOST 'http://gsda1:9200/blog/article/2' -d '{
"article_id" : 2,
"title" : "This is a Title2",
"content" : "This is a Content2"
}'

===> {"_index":"blog","_type":"article","_id":"2","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"created":true}

 . Adding a document, example 2

curl -XPOST 'http://gsda1:9200/blog/article/1' -d '{
"article_id" : 1,
"title" : "This is a Title1",
"content" : "This is a Content1"
}'

===> {"_index":"blog","_type":"article","_id":"1","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"created":true}

curl -XPOST 'http://gsda1:9200/blog/article/2' -d '{
"article_id" : 2,
"title" : "This is a Title2",
"content" : "This is a Content2"
}'

===> {"_index":"blog","_type":"article","_id":"2","_version":1,"created":true}

 . Fetching a document (by id)

curl -XGET 'http://gsda1:9200/blog/article/1'

{
"article_id" : 1,
"title" : "This is a Title1",
"content" : "This is a Content1"
}
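The documents above all follow one pattern, so several can be indexed in a loop. The following is a sketch, assuming the same blog/article index and type on gsda1; article_json and index_articles are made-up helper names.

```shell
# Build a document body in the same shape as the examples above.
article_json() {
    id="$1"
    printf '{"article_id":%s,"title":"This is a Title%s","content":"This is a Content%s"}\n' "$id" "$id" "$id"
}

# Index articles 1..N, using the article_id as the document id.
index_articles() {
    n="$1"
    i=1
    while [ "$i" -le "$n" ]; do
        curl -XPOST "http://gsda1:9200/blog/article/${i}" -d "$(article_json "$i")"
        echo
        i=$((i + 1))
    done
}
```

`index_articles 2` would post the same two documents as the manual commands. Posting an id that already exists updates that document rather than creating a second one.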

 . Checking the index and type mapping

 curl -XGET 'http://gsda1:9200/blog/_mapping?pretty=true'

{
  "blog" : {
    "mappings" : {
      "article" : {
        "properties" : {
          "article_id" : {
            "type" : "long"
          },
          "content" : {
            "type" : "string"
          },
          "title" : {
            "type" : "string"
          }
        }
      }
    }
  }
}


* Note: http://gsda1:9200/_cluster/health?pretty=true sometimes reports 0 for both number_of_nodes and number_of_data_nodes.
In that case, set the following in elasticsearch.yml
discovery.zen.ping.multicast.enabled: false

discovery.zen.ping.unicast.hosts: ["gsda1:9300", "gsda1:9301", "gsda1:9302"]    # (location of the master server)

so that every node is told explicitly where the master is (apply this on all servers).

------ Output when number_of_nodes and number_of_data_nodes are still 0 even though discovery.zen.ping.unicast.hosts is set and all servers have been started ------

{
  "cluster_name" : "iot",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 0,
  "number_of_data_nodes" : 0,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
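Since the status can read "green" while no node has joined, it can help to check number_of_nodes directly. The sketch below parses the pretty-printed health output with sed; health_number and nodes_joined are made-up helpers, and the parsing is deliberately simple (it is not a general JSON parser).

```shell
# Extract a numeric field from pretty-printed _cluster/health output read on stdin.
health_number() {
    field="$1"
    sed -n "s/.*\"${field}\" *: *\([0-9][0-9]*\).*/\1/p"
}

# Succeed only when at least one node has joined the cluster.
nodes_joined() {
    count=$(health_number number_of_nodes)
    [ -n "$count" ] && [ "$count" -gt 0 ]
}
```

A possible use is `curl -s 'http://gsda1:9200/_cluster/health?pretty=true' | nodes_joined || echo "no nodes joined; check discovery.zen.ping.unicast.hosts"`.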