# Catatan Seekor Elasticsearch

Elasticsearch is a Lucene-based search engine designed for distributed full-text search and analytics.

## Fundamental

### Basic Concepts

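* **Index**: A collection of documents, loosely comparable to a table in a relational database
* **Document**: A JSON object stored in an index
* **Shard**: A partition of an index (itself a Lucene index) that lets data be spread across nodes for horizontal scaling
* **Node**: A single running Elasticsearch instance
* **Cluster**: A collection of nodes that share the same cluster name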
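
To see how these concepts map onto a running cluster, the `_cat` APIs print nodes and shard placement as plain tables. A minimal sketch, assuming the `my_index` index used throughout these notes already exists:

```json
// List the nodes that make up the cluster
GET /_cat/nodes?v

// Show every shard of my_index: primary (p) or replica (r), its state, and the node holding it
GET /_cat/shards/my_index?v
```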

### Index Management

```json
// Create Index
PUT /my_index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "standard"
      },
      "content": {
        "type": "text",
        "analyzer": "standard"
      },
      "created_at": {
        "type": "date"
      }
    }
  }
}

// Get Index Info
GET /my_index

// Delete Index
DELETE /my_index
```

### Document Operations

```json
// Index Document
PUT /my_index/_doc/1
{
  "title": "Elasticsearch Guide",
  "content": "This is a comprehensive guide to Elasticsearch",
  "created_at": "2024-01-01T00:00:00Z"
}

// Get Document
GET /my_index/_doc/1

// Update Document
POST /my_index/_update/1
{
  "doc": {
    "content": "Updated content for Elasticsearch guide"
  }
}

// Delete Document
DELETE /my_index/_doc/1
```
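
When several of these operations need to be sent together, they can be batched into one `_bulk` request. The sketch below reuses the document from the examples above; the body is newline-delimited JSON, so each action and each document must sit on a single line and the body must end with a newline.

```json
// Bulk request: every action line is followed by its document (delete takes none)
POST /_bulk
{ "index": { "_index": "my_index", "_id": "1" } }
{ "title": "Elasticsearch Guide", "content": "This is a comprehensive guide to Elasticsearch", "created_at": "2024-01-01T00:00:00Z" }
{ "update": { "_index": "my_index", "_id": "1" } }
{ "doc": { "content": "Updated content for Elasticsearch guide" } }
{ "delete": { "_index": "my_index", "_id": "1" } }
```

The response reports a result per action, so a failure in one item does not abort the others.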

## Search Queries

### Basic Search

```json
// Simple Match Query
GET /my_index/_search
{
  "query": {
    "match": {
      "content": "elasticsearch guide"
    }
  }
}

// Multi-field Search
GET /my_index/_search
{
  "query": {
    "multi_match": {
      "query": "search engine",
      "fields": ["title", "content"]
    }
  }
}
```

### Advanced Queries

```json
// Boolean Query
GET /my_index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "content": "elasticsearch" } }
      ],
      "should": [
        { "match": { "title": "guide" } }
      ],
      "must_not": [
        { "match": { "content": "deprecated" } }
      ]
    }
  }
}

// Range Query
GET /my_index/_search
{
  "query": {
    "range": {
      "created_at": {
        "gte": "2024-01-01",
        "lte": "2024-12-31"
      }
    }
  }
}

// Wildcard Query
GET /my_index/_search
{
  "query": {
    "wildcard": {
      "title": "elastic*"
    }
  }
}
```
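
The bool query above also accepts a `filter` clause. Clauses placed there must match but do not contribute to the relevance score, which makes them a natural home for yes/no conditions such as date ranges. A sketch combining the match and range queries shown above:

```json
// Bool query with filter: the range narrows the results without affecting scoring
GET /my_index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "content": "elasticsearch" } }
      ],
      "filter": [
        { "range": { "created_at": { "gte": "2024-01-01", "lte": "2024-12-31" } } }
      ]
    }
  }
}
```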

### Aggregations

```json
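// Terms Aggregation
// (assumes documents carry a string field "category" with a ".keyword" sub-field,
// as created by default dynamic mapping)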
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": {
        "field": "category.keyword"
      }
    }
  }
}

// Date Histogram
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "posts_over_time": {
      "date_histogram": {
        "field": "created_at",
        "calendar_interval": "month"
      }
    }
  }
}

// Nested Aggregations
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": {
        "field": "category.keyword"
      },
      "aggs": {
        "avg_rating": {
          "avg": {
            "field": "rating"
          }
        }
      }
    }
  }
}
```

## Mapping & Analysis

### Field Types

```json
// Text vs Keyword (request body for creating an index, e.g. PUT /my_index)
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",        // Full-text search
        "fields": {
          "keyword": {         // Exact match & aggregations
            "type": "keyword"
          }
        }
      },
      "rating": {
        "type": "float"
      },
      "tags": {
        "type": "keyword"
      },
      "created_at": {
        "type": "date"
      }
    }
  }
}
```
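
The difference shows up at query time. A sketch, assuming an index created with the mapping above and containing the example document titled "Elasticsearch Guide": a `match` on `title` is analyzed and matches individual words, while a `term` on `title.keyword` only matches the exact, case-sensitive string.

```json
// Full-text: matches because the analyzed title contains the token "guide"
GET /my_index/_search
{
  "query": {
    "match": { "title": "guide" }
  }
}

// Exact match: the whole value must equal "Elasticsearch Guide"
GET /my_index/_search
{
  "query": {
    "term": { "title.keyword": "Elasticsearch Guide" }
  }
}
```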

### Analyzers

```json
// Custom Analyzer (request body for creating an index, e.g. PUT /my_index)
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop", "snowball"]
        }
      }
    }
  }
}
```
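
Before committing an analyzer to an index, the same tokenizer and filter chain can be tried out with the `_analyze` API, which returns the tokens it would produce. A minimal sketch; the sample text is arbitrary:

```json
// Try out the tokenizer and filters from the custom analyzer without creating an index
POST /_analyze
{
  "tokenizer": "standard",
  "filter": ["lowercase", "stop", "snowball"],
  "text": "The Quick Foxes are Jumping"
}
```

Once the analyzer is defined on an index, `POST /my_index/_analyze` with `"analyzer": "my_custom_analyzer"` exercises it by name.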

## Performance & Optimization

### Index Settings

```json
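// Favor indexing throughput (e.g. during an initial bulk load):
// no replicas and a longer refresh interval reduce indexing overhead.
// Replicas are still needed afterwards for redundancy and search capacity.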
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0,
      "refresh_interval": "30s"
    }
  }
}
```
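
`refresh_interval` and `number_of_replicas` are dynamic settings and can be changed on a live index, whereas `number_of_shards` is fixed when the index is created. A sketch, assuming the values above were applied to `my_index` for a heavy indexing phase and are now being returned to more typical ones:

```json
// Restore near-real-time refresh and add a replica once heavy indexing is done
PUT /my_index/_settings
{
  "index": {
    "refresh_interval": "1s",
    "number_of_replicas": 1
  }
}
```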

### Query Optimization

```json
// Use Source Filtering
GET /my_index/_search
{
  "_source": ["title", "created_at"],
  "query": {
    "match": {
      "content": "elasticsearch"
    }
  }
}

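// Pagination with Search After
// "search_after" holds the sort values of the last hit on the previous page.
// Note: recent Elasticsearch versions restrict sorting on _id by default,
// so a dedicated keyword field may be needed as the tiebreaker instead.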
GET /my_index/_search
{
  "size": 10,
  "query": {
    "match_all": {}
  },
  "sort": [
    {"created_at": "desc"},
    {"_id": "desc"}
  ],
  "search_after": [1640995200000, "last_document_id"]
}
```
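
In recent Elasticsearch versions, `search_after` is usually paired with a point in time (PIT) so that every page sees the same snapshot of the index. The sketch below assumes a version that supports the PIT API; `<pit_id>` is a placeholder for the `id` returned by the first request.

```json
// 1. Open a point in time; the response contains an "id"
POST /my_index/_pit?keep_alive=1m

// 2. Search against the PIT (no index in the path, the PIT already identifies it)
GET /_search
{
  "size": 10,
  "query": { "match_all": {} },
  "pit": { "id": "<pit_id>", "keep_alive": "1m" },
  "sort": [
    { "created_at": "desc" }
  ]
}
```

PIT searches add an implicit `_shard_doc` tiebreaker to the sort, so each hit's `sort` array can be passed as `search_after` on the next request without sorting on `_id`.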

## Monitoring & Management

### Cluster Health

```json
// Check Cluster Health
GET /_cluster/health

// Check Node Stats
GET /_nodes/stats

// Check Index Stats
GET /my_index/_stats
```
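
When health is yellow or red, two further requests help narrow down the cause. A sketch, again assuming the `my_index` index:

```json
// Health for a single index, broken down per shard
GET /_cluster/health/my_index?level=shards

// Explain why a shard is unassigned
// (without a body this picks an arbitrary unassigned shard, and returns an error if there is none)
GET /_cluster/allocation/explain
```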

### Reindex API

```json
// Reindex Documents
POST /_reindex
{
  "source": {
    "index": "old_index"
  },
  "dest": {
    "index": "new_index"
  }
}
```
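
After a reindex, clients can be switched to the new index without downtime if they query through an alias. A sketch, assuming a hypothetical alias named `my_alias` that currently points at `old_index`:

```json
// Atomically move the alias from old_index to new_index
POST /_aliases
{
  "actions": [
    { "remove": { "index": "old_index", "alias": "my_alias" } },
    { "add": { "index": "new_index", "alias": "my_alias" } }
  ]
}
```

Both actions are applied in one step, so searches against `my_alias` never see a moment where the alias points at neither index.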

## Best Practices

### Index Design

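* Choose the shard count from the expected data size (as a rule of thumb, roughly 25-50 GB per shard)
* Use aliases for zero-downtime reindexing
* Set the refresh interval to match how quickly new documents must become searchable
* Use the bulk API when indexing, updating, or deleting many documents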
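
To check an existing index against the shard-size guideline, `_cat/indices` can report primary shard count and storage side by side. A minimal sketch, assuming the `my_index` index:

```json
// Primary shard count, replica count, document count, and storage for one index
GET /_cat/indices/my_index?v&h=index,pri,rep,docs.count,store.size,pri.store.size
```

Dividing `pri.store.size` by `pri` gives the average primary shard size.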

### Query Optimization

* Use source filtering to reduce network transfer
* Use `search_after` for deep pagination
* Avoid leading wildcards (e.g. `*search`), which force a scan over many terms
* Use appropriate analyzers for your use case

### Monitoring

* Monitor cluster health regularly
* Set up alerts for cluster status changes (yellow or red)
* Monitor query performance
* Maintain indices regularly (for example, force-merging indices that no longer receive writes)
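
For query performance in particular, two built-in tools are the profile option and the search slow log. A sketch, assuming the `my_index` index; the 10-second threshold is an arbitrary illustrative value:

```json
// Return a timing breakdown of how the query was executed on each shard
GET /my_index/_search
{
  "profile": true,
  "query": {
    "match": { "content": "elasticsearch" }
  }
}

// Log any query phase on this index that takes longer than the threshold
PUT /my_index/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s"
}
```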

## Tools & Integrations

### Kibana

* Web interface for Elasticsearch
* Visualization and dashboard creation
* Index management
* Query development

### Logstash

* Data processing pipeline
* Input/output plugins
* Filtering and transformation
* Integration with Elasticsearch

### Beats

* Lightweight data shippers
* Filebeat, Metricbeat, etc.
* Direct integration with Elasticsearch
* Minimal resource usage
