This commit is contained in:
Stefan Hardegger
2025-09-18 07:46:10 +02:00
parent 64f97f5648
commit 54df3c471e
17 changed files with 2530 additions and 0 deletions

View File

@@ -0,0 +1,178 @@
# OpenSearch Configuration - Best Practices Implementation
## Overview
This directory contains a production-ready OpenSearch configuration following industry best practices for security, scalability, and maintainability.
## Architecture
### 📁 Directory Structure
```
opensearch/
├── config/
│ ├── opensearch-development.yml # Development-specific settings
│ └── opensearch-production.yml # Production-specific settings
├── mappings/
│ ├── stories-mapping.json # Story index mapping
│ ├── authors-mapping.json # Author index mapping
│ └── collections-mapping.json # Collection index mapping
├── templates/
│ ├── stories-template.json # Index template for stories_*
│ └── index-lifecycle-policy.json # ILM policy for index management
└── README.md # This file
```
## ✅ Best Practices Implemented
### 🔒 **Security**
- **Environment-Aware SSL Configuration**
- Production: Full certificate validation with custom truststore support
- Development: Optional certificate validation for local development
- **Proper Authentication**: Basic auth with secure credential management
- **Connection Security**: TLS 1.3 support with hostname verification
### 🏗️ **Configuration Management**
- **Externalized Configuration**: JSON/YAML files instead of hardcoded values
- **Environment-Specific Settings**: Different configs for dev/staging/prod
- **Type-Safe Properties**: Strongly-typed configuration classes
- **Validation**: Configuration validation at startup
### 📈 **Scalability & Performance**
- **Connection Pooling**: Configurable connection pool with timeout management
- **Environment-Aware Sharding**:
- Development: 1 shard, 0 replicas (single node)
- Production: 3 shards, 1 replica (high availability)
- **Bulk Operations**: Optimized bulk indexing with configurable batch sizes
- **Index Templates**: Automatic application of settings to new indexes
### 🔄 **Index Lifecycle Management**
- **Automated Index Rollover**: Based on size, document count, and age
- **Hot-Warm-Cold Architecture**: Optimized storage costs
- **Retention Policies**: Automatic cleanup of old data
- **Force Merge**: Optimization in warm phase
### 📊 **Monitoring & Observability**
- **Health Checks**: Automatic cluster health monitoring
- **Spring Boot Actuator**: Health endpoints for monitoring systems
- **Metrics Collection**: Configurable performance metrics
- **Slow Query Detection**: Configurable thresholds for query performance
### 🛡️ **Error Handling & Resilience**
- **Connection Retry Logic**: Automatic retry with backoff
- **Circuit Breaker Pattern**: Fail-fast for unhealthy clusters
- **Graceful Degradation**: Fallback to Typesense when OpenSearch unavailable
- **Detailed Error Logging**: Comprehensive error tracking
## 🚀 Usage
### Development Environment
```yaml
# application-development.yml
storycove:
opensearch:
profile: development
security:
ssl-verification: false
trust-all-certificates: true
indices:
default-shards: 1
default-replicas: 0
```
### Production Environment
```yaml
# application-production.yml
storycove:
opensearch:
profile: production
security:
ssl-verification: true
trust-all-certificates: false
truststore-path: /etc/ssl/opensearch-truststore.jks
indices:
default-shards: 3
default-replicas: 1
```
## 📋 Environment Variables
### Required
- `OPENSEARCH_PASSWORD`: Admin password for OpenSearch cluster
### Optional (with sensible defaults)
- `OPENSEARCH_HOST`: Cluster hostname (default: localhost)
- `OPENSEARCH_PORT`: Cluster port (default: 9200)
- `OPENSEARCH_USERNAME`: Admin username (default: admin)
- `OPENSEARCH_SSL_VERIFICATION`: Enable SSL verification (default: false for dev)
- `OPENSEARCH_MAX_CONN_TOTAL`: Max connections (default: 30 for dev, 200 for prod)
## 🎯 Index Templates
Index templates automatically apply configuration to new indexes:
```json
{
"index_patterns": ["stories_*"],
"template": {
"settings": {
"number_of_shards": "#{ENV_SPECIFIC}",
"analysis": {
"analyzer": {
"story_analyzer": {
"type": "standard",
"stopwords": "_english_"
}
}
}
}
}
}
```
## 🔍 Health Monitoring
Access health information:
- **Application Health**: `/actuator/health`
- **OpenSearch Specific**: `/actuator/health/opensearch`
- **Detailed Metrics**: Available when `enable-metrics: true`
## 🔄 Migration Strategy
The configuration supports parallel operation with Typesense:
1. **Development**: Test OpenSearch alongside Typesense
2. **Staging**: Validate performance and accuracy
3. **Production**: Gradual rollout with instant rollback capability
## 🛠️ Troubleshooting
### Common Issues
1. **SSL Certificate Errors**
- Development: Set `trust-all-certificates: true`
- Production: Provide valid truststore path
2. **Connection Timeouts**
- Increase `connection.timeout` values
- Check network connectivity and firewall rules
3. **Index Creation Failures**
- Verify cluster health with `/actuator/health/opensearch`
- Check OpenSearch logs for detailed error messages
4. **Performance Issues**
- Monitor slow queries with configurable thresholds
- Adjust bulk operation settings
- Review shard allocation and replica settings
## 🔮 Future Enhancements
- **Multi-Cluster Support**: Connect to multiple OpenSearch clusters
- **Advanced Security**: Integration with OpenSearch Security plugin
- **Custom Analyzers**: Domain-specific text analysis
- **Index Aliases**: Zero-downtime index updates
- **Machine Learning**: Integration with OpenSearch ML features
---
This configuration provides a solid foundation that scales from development to enterprise production environments while maintaining security, performance, and operational excellence.

View File

@@ -0,0 +1,32 @@
# OpenSearch Development Configuration
opensearch:
cluster:
name: "storycove-dev"
initial_master_nodes: ["opensearch-node"]
# Development settings - single node, minimal resources
indices:
default_settings:
number_of_shards: 1
number_of_replicas: 0
refresh_interval: "1s"
# Security settings for development
security:
ssl_verification: false
trust_all_certificates: true
# Connection settings
connection:
timeout: "30s"
socket_timeout: "60s"
max_connections_per_route: 10
max_connections_total: 30
# Index management
index_management:
auto_create_templates: true
template_patterns:
stories: "stories_*"
authors: "authors_*"
collections: "collections_*"

View File

@@ -0,0 +1,60 @@
# OpenSearch Production Configuration
opensearch:
cluster:
name: "storycove-prod"
# Production settings - multi-shard, with replicas
indices:
default_settings:
number_of_shards: 3
number_of_replicas: 1
refresh_interval: "30s"
max_result_window: 50000
# Index lifecycle policies
lifecycle:
hot_phase_duration: "7d"
warm_phase_duration: "30d"
cold_phase_duration: "90d"
delete_after: "1y"
# Security settings for production
security:
ssl_verification: true
trust_all_certificates: false
certificate_verification: true
tls_version: "TLSv1.3"
# Connection settings
connection:
timeout: "10s"
socket_timeout: "30s"
max_connections_per_route: 50
max_connections_total: 200
retry_on_failure: true
max_retries: 3
retry_delay: "1s"
# Performance tuning
performance:
bulk_actions: 1000
bulk_size: "5MB"
bulk_timeout: "10s"
concurrent_requests: 4
# Monitoring and observability
monitoring:
health_check_interval: "30s"
slow_query_threshold: "5s"
enable_metrics: true
# Index management
index_management:
auto_create_templates: true
template_patterns:
stories: "stories_*"
authors: "authors_*"
collections: "collections_*"
retention_policy:
enabled: true
default_retention: "1y"

View File

@@ -0,0 +1,79 @@
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"analyzer": {
"name_analyzer": {
"type": "standard",
"stopwords": "_english_"
},
"autocomplete_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["lowercase", "edge_ngram"]
}
},
"filter": {
"edge_ngram": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 20
}
}
}
},
"mappings": {
"properties": {
"id": {
"type": "keyword"
},
"name": {
"type": "text",
"analyzer": "name_analyzer",
"fields": {
"autocomplete": {
"type": "text",
"analyzer": "autocomplete_analyzer"
},
"keyword": {
"type": "keyword"
}
}
},
"bio": {
"type": "text",
"analyzer": "name_analyzer"
},
"urls": {
"type": "keyword"
},
"imageUrl": {
"type": "keyword"
},
"storyCount": {
"type": "integer"
},
"averageRating": {
"type": "float"
},
"totalWordCount": {
"type": "long"
},
"totalReadingTime": {
"type": "integer"
},
"createdAt": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"updatedAt": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"libraryId": {
"type": "keyword"
}
}
}
}

View File

@@ -0,0 +1,73 @@
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"analyzer": {
"collection_analyzer": {
"type": "standard",
"stopwords": "_english_"
},
"autocomplete_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["lowercase", "edge_ngram"]
}
},
"filter": {
"edge_ngram": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 20
}
}
}
},
"mappings": {
"properties": {
"id": {
"type": "keyword"
},
"name": {
"type": "text",
"analyzer": "collection_analyzer",
"fields": {
"autocomplete": {
"type": "text",
"analyzer": "autocomplete_analyzer"
},
"keyword": {
"type": "keyword"
}
}
},
"description": {
"type": "text",
"analyzer": "collection_analyzer"
},
"storyCount": {
"type": "integer"
},
"totalWordCount": {
"type": "long"
},
"averageRating": {
"type": "float"
},
"isPublic": {
"type": "boolean"
},
"createdAt": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"updatedAt": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"libraryId": {
"type": "keyword"
}
}
}
}

View File

@@ -0,0 +1,120 @@
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"analyzer": {
"story_analyzer": {
"type": "standard",
"stopwords": "_english_"
},
"autocomplete_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["lowercase", "edge_ngram"]
}
},
"filter": {
"edge_ngram": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 20
}
}
}
},
"mappings": {
"properties": {
"id": {
"type": "keyword"
},
"title": {
"type": "text",
"analyzer": "story_analyzer",
"fields": {
"autocomplete": {
"type": "text",
"analyzer": "autocomplete_analyzer"
},
"keyword": {
"type": "keyword"
}
}
},
"content": {
"type": "text",
"analyzer": "story_analyzer"
},
"summary": {
"type": "text",
"analyzer": "story_analyzer"
},
"authorNames": {
"type": "text",
"analyzer": "story_analyzer",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"authorIds": {
"type": "keyword"
},
"tagNames": {
"type": "keyword"
},
"seriesTitle": {
"type": "text",
"analyzer": "story_analyzer",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"seriesId": {
"type": "keyword"
},
"wordCount": {
"type": "integer"
},
"rating": {
"type": "float"
},
"readingTime": {
"type": "integer"
},
"language": {
"type": "keyword"
},
"status": {
"type": "keyword"
},
"createdAt": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"updatedAt": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"publishedAt": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"isRead": {
"type": "boolean"
},
"isFavorite": {
"type": "boolean"
},
"readingProgress": {
"type": "float"
},
"libraryId": {
"type": "keyword"
}
}
}
}

View File

@@ -0,0 +1,77 @@
{
"policy": {
"description": "StoryCove index lifecycle policy",
"default_state": "hot",
"states": [
{
"name": "hot",
"actions": [
{
"rollover": {
"min_size": "50gb",
"min_doc_count": 1000000,
"min_age": "7d"
}
}
],
"transitions": [
{
"state_name": "warm",
"conditions": {
"min_age": "7d"
}
}
]
},
{
"name": "warm",
"actions": [
{
"replica_count": {
"number_of_replicas": 0
}
},
{
"force_merge": {
"max_num_segments": 1
}
}
],
"transitions": [
{
"state_name": "cold",
"conditions": {
"min_age": "30d"
}
}
]
},
{
"name": "cold",
"actions": [],
"transitions": [
{
"state_name": "delete",
"conditions": {
"min_age": "365d"
}
}
]
},
{
"name": "delete",
"actions": [
{
"delete": {}
}
]
}
],
"ism_template": [
{
"index_patterns": ["stories_*", "authors_*", "collections_*"],
"priority": 100
}
]
}
}

View File

@@ -0,0 +1,124 @@
{
"index_patterns": ["stories_*"],
"priority": 1,
"template": {
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"analyzer": {
"story_analyzer": {
"type": "standard",
"stopwords": "_english_"
},
"autocomplete_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["lowercase", "edge_ngram"]
}
},
"filter": {
"edge_ngram": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 20
}
}
}
},
"mappings": {
"properties": {
"id": {
"type": "keyword"
},
"title": {
"type": "text",
"analyzer": "story_analyzer",
"fields": {
"autocomplete": {
"type": "text",
"analyzer": "autocomplete_analyzer"
},
"keyword": {
"type": "keyword"
}
}
},
"content": {
"type": "text",
"analyzer": "story_analyzer"
},
"summary": {
"type": "text",
"analyzer": "story_analyzer"
},
"authorNames": {
"type": "text",
"analyzer": "story_analyzer",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"authorIds": {
"type": "keyword"
},
"tagNames": {
"type": "keyword"
},
"seriesTitle": {
"type": "text",
"analyzer": "story_analyzer",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"seriesId": {
"type": "keyword"
},
"wordCount": {
"type": "integer"
},
"rating": {
"type": "float"
},
"readingTime": {
"type": "integer"
},
"language": {
"type": "keyword"
},
"status": {
"type": "keyword"
},
"createdAt": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"updatedAt": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"publishedAt": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"isRead": {
"type": "boolean"
},
"isFavorite": {
"type": "boolean"
},
"readingProgress": {
"type": "float"
},
"libraryId": {
"type": "keyword"
}
}
}
}
}