Elasticsearch – Distributed Search and Analytics Engine
What is Elasticsearch?
Elasticsearch is a distributed, RESTful search and analytics engine built on Apache Lucene. Developed by Elastic NV, it enables real-time searching, analysis, and visualization of large volumes of data with remarkable speed and scalability. From full-text search to log analytics, security intelligence to business analytics, Elasticsearch powers diverse use cases across industries.
As the heart of the Elastic Stack (formerly the ELK Stack), Elasticsearch works alongside Kibana for visualization and Logstash and Beats for data ingestion, forming a complete platform for observability and search. Its distributed architecture handles sharding, replication, and cluster coordination automatically, which lets a cluster scale horizontally to petabyte-scale deployments.
Organizations worldwide rely on Elasticsearch for application search, website search, enterprise search, logging and log analytics, security analytics, business analytics, and geospatial data analysis. Well-tuned clusters can return results in milliseconds even across billions of documents, which makes Elasticsearch a good fit for applications that need real-time insights.
Key Features and Capabilities
Distributed Architecture
Elasticsearch distributes data across multiple nodes in a cluster, automatically handling sharding and replication. This architecture provides horizontal scalability, high availability, and fault tolerance. Adding nodes increases capacity without application changes.
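To make the sharding and replication arithmetic concrete, here is a minimal shell sketch. The function names are hypothetical helpers, not part of Elasticsearch; the only facts assumed are that each primary shard has the configured number of replica copies and that a replica is never allocated to the same node as its primary.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating shard arithmetic; not an Elasticsearch API.

# Total shard copies stored in the cluster for one index:
# every primary plus its replicas.
total_shard_copies() {
  local primaries=$1 replicas=$2
  echo $(( primaries * (1 + replicas) ))
}

# Smallest node count on which every copy can be assigned, since copies of
# the same shard are never placed on the same node.
min_nodes_for_all_copies() {
  local replicas=$1
  echo $(( replicas + 1 ))
}

total_shard_copies 3 2        # 3 primaries, 2 replicas each
min_nodes_for_all_copies 2
```

With 3 primaries and 2 replicas, the cluster stores 9 shard copies and needs at least 3 data nodes before every replica can be assigned.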
Full-Text Search
Built on Apache Lucene, Elasticsearch provides sophisticated full-text search with relevance scoring, phrase matching, fuzzy matching, stemming, and multi-language support. Analyzers customize how text is processed for indexing and searching.
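One way to see what an analyzer does to text is the `_analyze` endpoint. The sketch below is a minimal example, assuming an unauthenticated development node at `http://localhost:9200`; `analyze_body` and `analyze` are hypothetical helper names, and the body builder does no JSON escaping, so it only suits simple text.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; assumes a dev node without security at ES_URL.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the JSON body for the _analyze API (no escaping: plain text only).
analyze_body() {
  local analyzer=$1 text=$2
  printf '{"analyzer":"%s","text":"%s"}\n' "$analyzer" "$text"
}

# Send the body to the cluster and print the tokens it produces.
analyze() {
  curl -s -X POST "$ES_URL/_analyze?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(analyze_body "$1" "$2")"
}

# Print the request body; run `analyze english "Searching distributed indices"`
# against a live node to see the resulting tokens.
analyze_body english "Searching distributed indices"
```

Comparing the output for the `standard` and `english` analyzers on the same text shows how stemming and stop-word removal change what gets indexed.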
Real-Time Analytics
Aggregations enable real-time analytics on indexed data. Metric aggregations calculate statistics, bucket aggregations group documents, and pipeline aggregations perform calculations on the results of other aggregations. Many aggregations return in milliseconds, though latency depends on data volume, field cardinality, and shard count.
RESTful API
All Elasticsearch operations use a simple REST API with JSON documents. This interface enables easy integration with any programming language and straightforward debugging with standard HTTP tools.
Schema-Free
Dynamic mapping automatically detects and creates field mappings when new data arrives. Explicit mappings provide fine-grained control over field types, analyzers, and indexing options when needed.
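As a sketch of the explicit end of that spectrum, the request body below sets `dynamic` to `strict`, which makes Elasticsearch reject documents containing fields that are not declared in the mapping. The index name `strict-index` and the helper names are placeholders chosen for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a create-index body with a strict explicit mapping.
strict_mapping_body() {
  cat <<'JSON'
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "title": { "type": "text" },
      "tags":  { "type": "keyword" }
    }
  }
}
JSON
}

# Create the index (assumes an unauthenticated dev node on localhost:9200).
create_strict_index() {
  strict_mapping_body | curl -s -X PUT "localhost:9200/strict-index?pretty" \
    -H 'Content-Type: application/json' --data-binary @-
}

strict_mapping_body
```

Leaving `dynamic` at its default instead lets new fields be added automatically, which is convenient for exploration but can let mapping mistakes slip into an index unnoticed.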
Near Real-Time
By default each index refreshes once per second (the index.refresh_interval setting), so newly indexed documents become searchable within about one second. This near real-time behavior suits applications that need new data to be searchable almost immediately.
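The one-second figure comes from the index refresh interval, which can be tuned per index. The following is a minimal sketch, assuming a dev node on `localhost:9200` and the `my-index` name used elsewhere in this guide; `refresh_settings_body` and `set_refresh_interval` are hypothetical helpers.

```shell
#!/usr/bin/env bash
# Hypothetical helper: settings body that changes how often an index refreshes.
refresh_settings_body() {
  local interval=$1          # e.g. "30s", or "-1" to disable periodic refresh
  printf '{"index":{"refresh_interval":"%s"}}\n' "$interval"
}

# Apply the interval to an index (assumes an unauthenticated dev node).
set_refresh_interval() {
  local index=$1 interval=$2
  curl -s -X PUT "localhost:9200/$index/_settings?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(refresh_settings_body "$interval")"
}

# Print the body that `set_refresh_interval my-index 30s` would send.
refresh_settings_body 30s
```

A longer interval such as `30s` trades search freshness for indexing throughput during heavy ingestion. For a single write that must be visible right away, the index API also accepts `?refresh=wait_for`.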
System Requirements
Hardware Requirements
Elasticsearch distributions bundle a compatible OpenJDK, so no separate Java installation is needed; if you supply your own JVM, Elasticsearch 8.x requires Java 17 or later. Minimum specs are 2 CPU cores and 4 GB RAM, though production deployments typically need 16-64 GB RAM per node. SSD storage is strongly recommended for performance.
Supported Platforms
Elasticsearch runs on Linux (recommended for production), Windows, and macOS. Docker and Kubernetes deployments are fully supported with official images and operators.
Installation Guide
Installing on Ubuntu/Debian
# Import GPG key
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
# Add repository
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
# Install Elasticsearch
sudo apt update
sudo apt install elasticsearch
# Start Elasticsearch
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch
sudo systemctl start elasticsearch
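# Reset the elastic user's password (8.x packages enable security by default)
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic
# Verify installation over HTTPS with the auto-generated CA (prompts for the password)
sudo curl --cacert /etc/elasticsearch/certs/http_ca.crt -u elastic "https://localhost:9200/?pretty"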
Installing on RHEL/CentOS
# Import GPG key
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
# Create repo file
sudo tee /etc/yum.repos.d/elasticsearch.repo << 'EOF'
[elasticsearch]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
# Install Elasticsearch
sudo dnf install elasticsearch
# Start service
sudo systemctl enable elasticsearch
sudo systemctl start elasticsearch
Installing with Docker
# Single node (development)
docker run -d \
  --name elasticsearch \
  -p 9200:9200 \
  -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  -v elasticsearch-data:/usr/share/elasticsearch/data \
  docker.elastic.co/elasticsearch/elasticsearch:8.11.0
# Docker Compose (with Kibana)
version: '3.8'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms2g -Xmx2g"
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.0
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - 5601:5601
    depends_on:
      - elasticsearch
volumes:
  elasticsearch-data:
Installing on Kubernetes
# Using ECK (Elastic Cloud on Kubernetes)
kubectl create -f https://download.elastic.co/downloads/eck/2.10.0/crds.yaml
kubectl apply -f https://download.elastic.co/downloads/eck/2.10.0/operator.yaml
# Deploy Elasticsearch cluster
cat <
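The manifest for this step is cut off in the text above, so the following is only a minimal sketch of what an ECK `Elasticsearch` resource can look like, modeled on the ECK quickstart. The cluster name `quickstart`, the single-node `nodeSets` entry, and the version are assumptions for illustration, not the original manifest.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print a single-node Elasticsearch resource for ECK.
# Name, node count, and version are illustrative assumptions.
eck_manifest() {
  cat <<'YAML'
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 8.11.0
  nodeSets:
  - name: default
    count: 1
    config:
      node.store.allow_mmap: false
YAML
}

# Review the manifest, then apply it with: eck_manifest | kubectl apply -f -
eck_manifest
```

Printing the manifest before piping it to `kubectl apply -f -` makes it easy to review or adjust the node count and version first.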
Essential Operations
Index Management
# Create index
curl -X PUT "localhost:9200/my-index?pretty" -H 'Content-Type: application/json' -d'
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 2
},
"mappings": {
"properties": {
"title": { "type": "text" },
"content": { "type": "text", "analyzer": "english" },
"timestamp": { "type": "date" },
"views": { "type": "integer" },
"tags": { "type": "keyword" }
}
}
}'
# List indices
curl -X GET "localhost:9200/_cat/indices?v"
# Get index mapping
curl -X GET "localhost:9200/my-index/_mapping?pretty"
# Delete index
curl -X DELETE "localhost:9200/my-index?pretty"
# Index settings
curl -X PUT "localhost:9200/my-index/_settings?pretty" -H 'Content-Type: application/json' -d'
{
"index": {
"number_of_replicas": 1
}
}'
Document Operations
# Index document (auto-generate ID)
curl -X POST "localhost:9200/my-index/_doc?pretty" -H 'Content-Type: application/json' -d'
{
"title": "Getting Started with Elasticsearch",
"content": "Elasticsearch is a distributed search engine...",
"timestamp": "2024-01-15T10:30:00",
"views": 1500,
"tags": ["search", "database", "tutorial"]
}'
# Index document with ID
curl -X PUT "localhost:9200/my-index/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
"title": "Advanced Elasticsearch",
"content": "Learn advanced features...",
"timestamp": "2024-01-16T14:00:00",
"views": 850,
"tags": ["advanced", "elasticsearch"]
}'
# Get document
curl -X GET "localhost:9200/my-index/_doc/1?pretty"
# Update document
curl -X POST "localhost:9200/my-index/_update/1?pretty" -H 'Content-Type: application/json' -d'
{
"doc": {
"views": 900
}
}'
# Delete document
curl -X DELETE "localhost:9200/my-index/_doc/1?pretty"
# Bulk operations
curl -X POST "localhost:9200/_bulk?pretty" -H 'Content-Type: application/json' -d'
{ "index": { "_index": "my-index" } }
{ "title": "Document 1", "content": "Content 1" }
{ "index": { "_index": "my-index" } }
{ "title": "Document 2", "content": "Content 2" }
{ "delete": { "_index": "my-index", "_id": "old-doc" } }
'
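The bulk body is newline-delimited JSON: each action line is followed by its document line (delete actions have none), and the body must end with a newline. A minimal sketch for generating such a body from a list of titles follows; `bulk_index_lines` is a hypothetical helper and does no JSON escaping.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit one action line and one document line per title.
# Titles are inserted verbatim, so they must not contain quotes or backslashes.
bulk_index_lines() {
  local index=$1 title
  shift
  for title in "$@"; do
    printf '{ "index": { "_index": "%s" } }\n' "$index"
    printf '{ "title": "%s" }\n' "$title"
  done
}

# Print the body; to send it, pipe into curl with --data-binary so the
# newlines are preserved:
#   bulk_index_lines my-index "Document 1" "Document 2" |
#     curl -s -X POST "localhost:9200/_bulk?pretty" \
#       -H 'Content-Type: application/x-ndjson' --data-binary @-
bulk_index_lines my-index "Document 1" "Document 2"
```

Generating the body this way keeps the action and document lines paired, which is the most common source of errors when bulk requests are assembled by hand.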
Search Queries
Basic Search
# Match all
curl -X GET "localhost:9200/my-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match_all": {}
}
}'
# Match query (full-text search)
curl -X GET "localhost:9200/my-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"content": "elasticsearch tutorial"
}
}
}'
# Match phrase
curl -X GET "localhost:9200/my-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match_phrase": {
"content": "distributed search"
}
}
}'
# Multi-match (search multiple fields)
curl -X GET "localhost:9200/my-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"multi_match": {
"query": "elasticsearch",
"fields": ["title^2", "content"]
}
}
}'
# Term query (exact match)
curl -X GET "localhost:9200/my-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"term": {
"tags": "tutorial"
}
}
}'
Compound Queries
# Bool query
curl -X GET "localhost:9200/my-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must": [
{ "match": { "content": "elasticsearch" } }
],
"filter": [
{ "term": { "tags": "tutorial" } },
{ "range": { "views": { "gte": 1000 } } }
],
"should": [
{ "match": { "title": "getting started" } }
],
"must_not": [
{ "term": { "tags": "deprecated" } }
]
}
}
}'
# Range query
curl -X GET "localhost:9200/my-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"range": {
"timestamp": {
"gte": "2024-01-01",
"lte": "2024-12-31"
}
}
}
}'
Aggregations
Metric and Bucket Aggregations
# Basic aggregations
curl -X GET "localhost:9200/my-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"total_views": { "sum": { "field": "views" } },
"avg_views": { "avg": { "field": "views" } },
"max_views": { "max": { "field": "views" } },
"view_stats": { "stats": { "field": "views" } }
}
}'
# Terms aggregation (group by)
curl -X GET "localhost:9200/my-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"tags_count": {
"terms": {
"field": "tags",
"size": 10
}
}
}
}'
# Date histogram
curl -X GET "localhost:9200/my-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"posts_over_time": {
"date_histogram": {
"field": "timestamp",
"calendar_interval": "month"
},
"aggs": {
"total_views": { "sum": { "field": "views" } }
}
}
}
}'
# Nested aggregations
curl -X GET "localhost:9200/my-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"tags": {
"terms": { "field": "tags" },
"aggs": {
"avg_views": { "avg": { "field": "views" } }
}
}
}
}'
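The examples above cover metric and bucket aggregations. A pipeline aggregation, which works on the output of other aggregations, can be sketched as follows. It reuses the `timestamp` and `views` fields from the mapping above; the aggregation name `busiest_month` and the helper names are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: search body with a max_bucket pipeline aggregation that
# picks the month with the highest summed views.
pipeline_agg_body() {
  cat <<'JSON'
{
  "size": 0,
  "aggs": {
    "posts_over_time": {
      "date_histogram": { "field": "timestamp", "calendar_interval": "month" },
      "aggs": {
        "total_views": { "sum": { "field": "views" } }
      }
    },
    "busiest_month": {
      "max_bucket": { "buckets_path": "posts_over_time>total_views" }
    }
  }
}
JSON
}

# Run it (assumes an unauthenticated dev node on localhost:9200).
run_pipeline_agg() {
  pipeline_agg_body | curl -s -X GET "localhost:9200/my-index/_search?pretty" \
    -H 'Content-Type: application/json' --data-binary @-
}

pipeline_agg_body
```

The `buckets_path` value names the histogram and then the metric inside it, separated by `>`, so the pipeline aggregation sits next to the histogram rather than inside it.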
Cluster Management
Cluster Commands
# Cluster health
curl -X GET "localhost:9200/_cluster/health?pretty"
# Cluster stats
curl -X GET "localhost:9200/_cluster/stats?pretty"
# Node information
curl -X GET "localhost:9200/_nodes?pretty"
# Node stats
curl -X GET "localhost:9200/_nodes/stats?pretty"
# Shard allocation
curl -X GET "localhost:9200/_cat/shards?v"
# Pending tasks
curl -X GET "localhost:9200/_cluster/pending_tasks?pretty"
# Cluster settings
curl -X GET "localhost:9200/_cluster/settings?pretty"
# Update cluster settings
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
"persistent": {
"cluster.routing.allocation.enable": "all"
}
}'
Index Lifecycle Management
ILM Policy
# Create ILM policy
curl -X PUT "localhost:9200/_ilm/policy/logs-policy?pretty" -H 'Content-Type: application/json' -d'
{
"policy": {
"phases": {
"hot": {
"actions": {
"rollover": {
"max_size": "50GB",
"max_age": "7d"
}
}
},
"warm": {
"min_age": "30d",
"actions": {
"shrink": { "number_of_shards": 1 },
"forcemerge": { "max_num_segments": 1 }
}
},
"cold": {
"min_age": "90d",
"actions": {
"freeze": {}
}
},
"delete": {
"min_age": "365d",
"actions": {
"delete": {}
}
}
}
}
}'
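A policy only takes effect once an index references it. One common way is an index template that sets `index.lifecycle.name`; the sketch below assumes a hypothetical `applogs-*` index pattern, a template named `applogs-template`, and a write alias `applogs`, none of which come from the original text.

```shell
#!/usr/bin/env bash
# Hypothetical helper: index template body that attaches logs-policy to
# every new index matching applogs-*.
ilm_template_body() {
  cat <<'JSON'
{
  "index_patterns": ["applogs-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "logs-policy",
      "index.lifecycle.rollover_alias": "applogs"
    }
  }
}
JSON
}

# Register the template (assumes an unauthenticated dev node).
put_ilm_template() {
  ilm_template_body | curl -s -X PUT \
    "localhost:9200/_index_template/applogs-template?pretty" \
    -H 'Content-Type: application/json' --data-binary @-
}

ilm_template_body
```

Because the policy uses rollover, the first index (for example `applogs-000001`) also has to be created with `applogs` as its write alias before rollover can take place.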
Configuration
elasticsearch.yml
# Cluster settings
cluster.name: my-cluster
node.name: node-1
node.roles: [ master, data, ingest ]
# Network settings
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
# Discovery
discovery.seed_hosts: ["node1", "node2", "node3"]
cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]
# Paths
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
# Memory
bootstrap.memory_lock: true
# Security
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
Best Practices
Performance Optimization
1. Size shards appropriately (10-50 GB each)
2. Use SSDs for data storage
3. Give the JVM heap no more than 50% of available RAM, and keep it at or below about 31 GB so compressed object pointers stay enabled
4. Disable swapping
5. Use bulk API for indexing
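6. Optimize mappings (for example, disable indexing or doc values on fields you never search or aggregate; disabling _source saves disk space but breaks update and reindex operations)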
7. Use filter context for non-scoring queries
8. Implement index lifecycle management
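The heap guideline in item 3 can be turned into a small calculation. This is a sketch only: `recommended_heap_gb` and `heap_jvm_options` are hypothetical helpers that apply the 50% rule and the 31 GB cap from the list above, and recent Elasticsearch versions size the heap automatically when no value is set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: half of total RAM in whole GB, capped at 31 GB.
recommended_heap_gb() {
  local total_ram_gb=$1
  local heap=$(( total_ram_gb / 2 ))
  if [ "$heap" -gt 31 ]; then
    heap=31
  fi
  if [ "$heap" -lt 1 ]; then
    heap=1
  fi
  echo "$heap"
}

# Print matching JVM flags, e.g. for a file under
# /etc/elasticsearch/jvm.options.d/ on package installs.
heap_jvm_options() {
  local heap
  heap=$(recommended_heap_gb "$1")
  printf '%s\n' "-Xms${heap}g" "-Xmx${heap}g"
}

heap_jvm_options 64
```

Setting `-Xms` and `-Xmx` to the same value avoids heap resizing at runtime, and the remaining RAM is left to the operating system's filesystem cache, which Lucene relies on heavily.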
Conclusion
Elasticsearch provides a powerful foundation for search and analytics at any scale. Its distributed architecture, real-time capabilities, and rich query language make it ideal for diverse use cases from application search to observability.
As part of the Elastic Stack, Elasticsearch integrates seamlessly with visualization, ingestion, and analysis tools, providing a complete platform for extracting insights from data.