storycove/OPENSEARCH_MIGRATION_SPECIFICATION.md
Stefan Hardegger 54df3c471e phase 1
2025-09-18 07:46:10 +02:00

StoryCove Search Migration Specification: Typesense to OpenSearch

Executive Summary

This document specifies the migration from Typesense to OpenSearch for the StoryCove application. The migration runs both engines in parallel: Typesense stays in service while traffic moves gradually to OpenSearch, so there is no downtime and the change can be rolled back if needed.

Migration Goals:

  • Solve random query reliability issues
  • Improve complex filtering performance
  • Maintain feature parity during transition
  • Zero downtime migration
  • Improved developer experience

Current State Analysis

Typesense Implementation Overview

Service Architecture:

  • TypesenseService.java (~2000 lines) - Primary search service
  • 3 search indexes: Stories, Authors, Collections
  • Multi-library support with dynamic collection names
  • Integration with Spring Boot backend

Core Functionality:

  1. Full-text Search: Stories, Authors with complex query building
  2. Random Story Selection: _rand() function with fallback logic
  3. Advanced Filtering: 15+ filter conditions with boolean logic
  4. Faceting: Tag aggregations and counts
  5. Autocomplete: Search suggestions with typeahead
  6. CRUD Operations: Index/update/delete for all entity types

Current Issues Identified:

  • _rand() function unreliability requiring complex fallback logic
  • Complex filter query building with escaping issues
  • Limited aggregation capabilities
  • Inconsistent API behavior across query patterns
  • Multi-collection management complexity

Data Models and Schema

Story Index Fields:

// Core fields
UUID id, String title, String description, String sourceUrl
Integer wordCount, Integer rating, Integer volume
Boolean isRead, LocalDateTime lastReadAt, Integer readingPosition

// Relationships
UUID authorId, String authorName
UUID seriesId, String seriesName
List<String> tagNames

// Metadata
LocalDateTime createdAt, LocalDateTime updatedAt
String coverPath, String sourceDomain

Author Index Fields:

UUID id, String name, String notes
Integer authorRating, Double averageStoryRating, Integer storyCount
List<String> urls, String avatarImagePath
LocalDateTime createdAt, LocalDateTime updatedAt

Collection Index Fields:

UUID id, String name, String description
List<String> tagNames, Boolean archived
LocalDateTime createdAt, LocalDateTime updatedAt
Integer storyCount, Integer currentPosition

API Endpoints Current State

Search Endpoints Analysis:

USED by Frontend (Must Implement):

  • GET /api/stories/search - Main story search with complex filtering (CRITICAL)
  • GET /api/stories/random - Random story selection with filters (CRITICAL)
  • GET /api/authors/search-typesense - Author search (HIGH)
  • GET /api/tags/autocomplete - Tag suggestions (MEDIUM)
  • POST /api/stories/reindex-typesense - Admin reindex operations (MEDIUM)
  • POST /api/authors/reindex-typesense - Admin reindex operations (MEDIUM)
  • POST /api/stories/recreate-typesense-collection - Admin recreate (MEDIUM)
  • POST /api/authors/recreate-typesense-collection - Admin recreate (MEDIUM)

UNUSED by Frontend (Skip Implementation):

  • GET /api/stories/search/suggestions - Not used by frontend
  • GET /api/authors/search - Superseded by typesense version
  • GET /api/series/search - Not used by frontend
  • GET /api/tags/search - Superseded by autocomplete
  • POST /api/search/reindex - Not used by frontend
  • GET /api/search/health - Not used by frontend

Scope Reduction: 6 of the 14 endpoints (~40%) do not need to be implemented

Search Parameters (Stories):

query, page, size, authors[], tags[], minRating, maxRating
sortBy, sortDir, facetBy[]
minWordCount, maxWordCount, createdAfter, createdBefore
lastReadAfter, lastReadBefore, unratedOnly, readingStatus
hasReadingProgress, hasCoverImage, sourceDomain, seriesFilter
minTagCount, popularOnly, hiddenGemsOnly

Target OpenSearch Architecture

Service Layer Design

New Components:

OpenSearchService.java        - Primary search service (mirrors TypesenseService API)
OpenSearchConfig.java         - Configuration and client setup
SearchMigrationService.java   - Handles parallel operation during migration
SearchServiceAdapter.java     - Abstraction layer for service switching

Index Strategy:

  • Single-node deployment for development/small installations
  • Index-per-library approach: stories-{libraryId}, authors-{libraryId}, collections-{libraryId}
  • Index templates for consistent mapping across libraries
  • Aliases for easy switching and zero-downtime updates
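
The index-per-library naming above could be kept in one place so that services and index templates agree on it. The following is a minimal sketch, assuming library IDs are available as strings; `LibraryIndexNames` and its methods are hypothetical names, not existing StoryCove code.

```java
import java.util.Locale;

/** Hypothetical helper (not part of StoryCove): builds per-library index names. */
final class LibraryIndexNames {
    private LibraryIndexNames() {}

    /** Returns "<entity>-<libraryId>" in lowercase, since OpenSearch index names must be lowercase. */
    static String indexFor(String entity, String libraryId) {
        if (entity == null || entity.isBlank() || libraryId == null || libraryId.isBlank()) {
            throw new IllegalArgumentException("entity and libraryId are required");
        }
        return (entity.trim() + "-" + libraryId.trim()).toLowerCase(Locale.ROOT);
    }

    /** Returns the pattern an index template could match, e.g. "stories-*". */
    static String templatePattern(String entity) {
        return entity.trim().toLowerCase(Locale.ROOT) + "-*";
    }
}
```

With this sketch, `LibraryIndexNames.indexFor("stories", "default")` yields `stories-default`, and `templatePattern("stories")` yields the `stories-*` pattern a template would cover.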

OpenSearch Index Mappings

Stories Index Mapping:

{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "story_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop", "snowball"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "id": {"type": "keyword"},
      "title": {
        "type": "text",
        "analyzer": "story_analyzer",
        "fields": {"keyword": {"type": "keyword"}}
      },
      "description": {
        "type": "text",
        "analyzer": "story_analyzer"
      },
      "authorName": {
        "type": "text",
        "analyzer": "story_analyzer",
        "fields": {"keyword": {"type": "keyword"}}
      },
      "seriesName": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword"}}
      },
      "tagNames": {"type": "keyword"},
      "wordCount": {"type": "integer"},
      "rating": {"type": "integer"},
      "volume": {"type": "integer"},
      "isRead": {"type": "boolean"},
      "readingPosition": {"type": "integer"},
      "lastReadAt": {"type": "date"},
      "createdAt": {"type": "date"},
      "updatedAt": {"type": "date"},
      "coverPath": {"type": "keyword"},
      "sourceUrl": {"type": "keyword"},
      "sourceDomain": {"type": "keyword"}
    }
  }
}

Authors Index Mapping:

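{
  "settings": {
    "analysis": {
      "analyzer": {
        "story_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop", "snowball"]
        }
      }
    }
  },
  "mappings": {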
    "properties": {
      "id": {"type": "keyword"},
      "name": {
        "type": "text",
        "analyzer": "story_analyzer",
        "fields": {"keyword": {"type": "keyword"}}
      },
      "notes": {"type": "text"},
      "authorRating": {"type": "integer"},
      "averageStoryRating": {"type": "float"},
      "storyCount": {"type": "integer"},
      "urls": {"type": "keyword"},
      "avatarImagePath": {"type": "keyword"},
      "createdAt": {"type": "date"},
      "updatedAt": {"type": "date"}
    }
  }
}

Collections Index Mapping:

{
  "mappings": {
    "properties": {
      "id": {"type": "keyword"},
      "name": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword"}}
      },
      "description": {"type": "text"},
      "tagNames": {"type": "keyword"},
      "archived": {"type": "boolean"},
      "storyCount": {"type": "integer"},
      "currentPosition": {"type": "integer"},
      "createdAt": {"type": "date"},
      "updatedAt": {"type": "date"}
    }
  }
}

Query Translation Strategy

Random Story Queries:

// Typesense (problematic)
String sortBy = seed != null ? "_rand(" + seed + ")" : "_rand()";

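// OpenSearch (reliable)
RandomScoreFunctionBuilder randomFunction = ScoreFunctionBuilders.randomFunction();
if (seed != null) {
    // A seeded random_score should name a field; _seq_no avoids loading fielddata on _id
    randomFunction.seed(seed.longValue()).setField("_seq_no");
}
// REPLACE makes the ordering depend only on the random score, not on relevance
QueryBuilder randomQuery = QueryBuilders.functionScoreQuery(
    QueryBuilders.boolQuery().must(filters),
    randomFunction
).boostMode(CombineFunction.REPLACE);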

Complex Filtering:

// Build bool query with multiple filter conditions
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery()
    .must(QueryBuilders.multiMatchQuery(query, "title", "description", "authorName"))
    .filter(QueryBuilders.termsQuery("tagNames", tags))
    .filter(QueryBuilders.rangeQuery("wordCount").gte(minWords).lte(maxWords))
    .filter(QueryBuilders.rangeQuery("rating").gte(minRating).lte(maxRating));

Faceting/Aggregations:

// Tags aggregation
AggregationBuilder tagsAgg = AggregationBuilders
    .terms("tags")
    .field("tagNames")
    .size(100);

// Rating ranges (each bucket includes "from" and excludes "to")
AggregationBuilder ratingRanges = AggregationBuilders
    .range("rating_ranges")
    .field("rating")
    .addRange("unrated", 0, 1)
    .addRange("low", 1, 4)
    .addRange("high", 4, 6);
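
Range aggregation buckets are half-open: a bucket includes its `from` value and excludes its `to` value. The plain-Java sketch below mimics that rule so bucket bounds can be checked for gaps before they go into a query. `RangeBuckets` is a hypothetical stand-in, not an OpenSearch class, and unlike a real aggregation it reports only the first matching bucket.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for range-aggregation bucketing: [from, to) per bucket. */
final class RangeBuckets {
    private final Map<String, int[]> bounds = new LinkedHashMap<>();

    /** Registers a bucket that includes {@code from} and excludes {@code to}. */
    RangeBuckets addRange(String name, int from, int to) {
        bounds.put(name, new int[] {from, to});
        return this;
    }

    /** Returns the first bucket containing the value, or empty if it falls in a gap. */
    Optional<String> bucketFor(int value) {
        for (Map.Entry<String, int[]> entry : bounds.entrySet()) {
            int[] range = entry.getValue();
            if (value >= range[0] && value < range[1]) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }
}
```

For example, `new RangeBuckets().addRange("a", 1, 3).addRange("b", 4, 6).bucketFor(3)` is empty, because 3 is excluded from the first bucket and lies below the second.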

Revised Implementation Phases (Scope Reduced by 40%)

Phase 1: Infrastructure Setup (Week 1)

Objectives:

  • Add OpenSearch to Docker Compose
  • Create basic OpenSearch service
  • Establish index templates and mappings
  • Focus: Only stories, authors, and tags indexes (skip series, collections)

Deliverables:

  1. Docker Compose Updates:
opensearch:
  image: opensearchproject/opensearch:2.11.0
  environment:
    - discovery.type=single-node
    - DISABLE_SECURITY_PLUGIN=true
    - OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx1g
  ports:
    - "9200:9200"
  volumes:
    - opensearch_data:/usr/share/opensearch/data
  2. OpenSearchConfig.java:
@Configuration
@ConditionalOnProperty(name = "storycove.opensearch.enabled", havingValue = "true")
public class OpenSearchConfig {
    @Bean
    public OpenSearchClient openSearchClient() {
        // Client configuration
    }
}
  3. Basic Index Creation:
    • Create index templates for stories, authors, collections
    • Implement index creation with proper mappings
    • Add health check endpoint

Success Criteria:

  • OpenSearch container starts successfully
  • Basic connectivity established
  • Index templates created and validated

Phase 2: Core Service Implementation (Week 2)

Objectives:

  • Implement OpenSearchService with core functionality
  • Create service abstraction layer
  • Implement basic search operations
  • Focus: Only critical endpoints (stories search, random, authors)

Deliverables:

  1. OpenSearchService.java - Core service implementing:

    • indexStory(), updateStory(), deleteStory()
    • searchStories() with basic query support (CRITICAL)
    • getRandomStoryId() with reliable seed support (CRITICAL)
    • indexAuthor(), updateAuthor(), deleteAuthor()
    • searchAuthors() for authors page (HIGH)
    • bulkIndexStories(), bulkIndexAuthors() for initial data loading
  2. SearchServiceAdapter.java - Abstraction layer:

@Service
public class SearchServiceAdapter {
    @Autowired(required = false)
    private TypesenseService typesenseService;

    @Autowired(required = false)
    private OpenSearchService openSearchService;

    @Value("${storycove.search.provider:typesense}")
    private String searchProvider;

    public SearchResultDto<StorySearchDto> searchStories(...) {
        return "opensearch".equals(searchProvider)
            ? openSearchService.searchStories(...)
            : typesenseService.searchStories(...);
    }
}
  3. Basic Query Implementation:
    • Full-text search across title/description/author
    • Basic filtering (tags, rating, word count)
    • Pagination and sorting
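
Because both services are injected with `required = false`, the adapter can end up calling a service that was never created if the configured provider and the available beans disagree. A minimal sketch of a guard, assuming the same `"opensearch"` provider value as above; `ProviderSelector` is a hypothetical name:

```java
/** Hypothetical guard for choosing between two optional search backends. */
final class ProviderSelector {
    private ProviderSelector() {}

    /** Picks the OpenSearch service only for provider "opensearch"; anything else means Typesense. */
    static <S> S select(String provider, S typesense, S openSearch) {
        S chosen = "opensearch".equals(provider) ? openSearch : typesense;
        if (chosen == null) {
            // Fail with a clear message instead of a NullPointerException deep in a search call.
            throw new IllegalStateException(
                "Search provider '" + provider + "' is selected but its service is not available");
        }
        return chosen;
    }
}
```

Failing fast here turns a misconfiguration into one readable error rather than a failure on the first search request.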

Success Criteria:

  • Basic search functionality working
  • Service abstraction layer functional
  • Can switch between Typesense and OpenSearch via configuration

Phase 3: Advanced Features Implementation (Week 3)

Objectives:

  • Implement complex filtering (all 15+ filter types)
  • Add random story functionality
  • Implement faceting/aggregations
  • Add autocomplete/suggestions

Deliverables:

  1. Complex Query Builder:

    • All filter conditions from original implementation
    • Date range filtering with proper timezone handling
    • Boolean logic for reading status, cover image, series filters
  2. Random Story Implementation:

public Optional<UUID> getRandomStoryId(String searchQuery, List<String> tags, Long seed, ...) {
    BoolQueryBuilder baseQuery = buildFilterQuery(searchQuery, tags, ...);

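    RandomScoreFunctionBuilder randomFunction = ScoreFunctionBuilders.randomFunction();
    if (seed != null) {
        // A seeded random_score should name a field; _seq_no avoids loading fielddata on _id
        randomFunction.seed(seed.longValue()).setField("_seq_no");
    }
    // REPLACE makes the ordering depend only on the random score, not on relevance
    QueryBuilder randomQuery = QueryBuilders.functionScoreQuery(baseQuery, randomFunction)
        .boostMode(CombineFunction.REPLACE);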

    SearchRequest request = new SearchRequest("stories-" + getCurrentLibraryId())
        .source(new SearchSourceBuilder()
            .query(randomQuery)
            .size(1)
            .fetchSource(new String[]{"id"}, null));

    // Execute and return result
}
  3. Faceting Implementation:

    • Tag aggregations with counts
    • Rating range aggregations
    • Author aggregations
    • Custom facet builders
  4. Autocomplete Service:

    • Suggest-based implementation using completion fields
    • Prefix matching for story titles and author names
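
For the date range filtering deliverable, one way to handle time zones is to convert every filter bound to a UTC instant before it reaches the query, since `date` fields treat ISO-8601 values without an offset as UTC by default. The sketch below assumes the caller knows which zone a `LocalDateTime` bound is expressed in; `DateRangeBounds` is a hypothetical helper.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: turns a local filter bound into a UTC ISO-8601 instant string. */
final class DateRangeBounds {
    private DateRangeBounds() {}

    static String toUtcIso(LocalDateTime local, ZoneId zone) {
        // Interpret the wall-clock value in the given zone, then render it as a UTC instant.
        return DateTimeFormatter.ISO_INSTANT.format(local.atZone(zone).toInstant());
    }
}
```

A bound of midnight on 2025-01-17 at offset +01:00 becomes `2025-01-16T23:00:00Z`, which can then be passed to a range filter such as the `createdAfter` condition.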

Success Criteria:

  • All filter conditions working correctly
  • Random story selection reliable with seed support
  • Faceting returns accurate counts
  • Autocomplete responsive and accurate

Phase 4: Data Migration & Parallel Operation (Week 4)

Objectives:

  • Implement bulk data migration from database
  • Enable parallel operation (write to both systems)
  • Comprehensive testing of OpenSearch functionality

Deliverables:

  1. Migration Service:
@Service
public class SearchMigrationService {
    public void performFullMigration() {
        // Migrate all libraries
        List<Library> libraries = libraryService.findAll();
        for (Library library : libraries) {
            migrateLibraryData(library);
        }
    }

    private void migrateLibraryData(Library library) {
        // Create indexes for library
        // Bulk load stories, authors, collections
        // Verify data integrity
    }
}
  2. Dual-Write Implementation:

    • Modify all entity update operations to write to both systems
    • Add configuration flag for dual-write mode
    • Error handling for partial failures
  3. Data Validation Tools:

    • Compare search result counts between systems
    • Validate random story selection consistency
    • Check faceting accuracy
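
The dual-write deliverable has to decide what happens when one system accepts a write and the other rejects it. One possible policy, sketched below under the assumption that Typesense remains authoritative during the migration, is to let primary failures propagate and to record secondary failures for a later re-sync. `DualWriter` is a hypothetical class and is not thread-safe as written.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical dual-write wrapper; not thread-safe, for illustration only. */
final class DualWriter<T> {
    private final Consumer<T> primary;    // authoritative search system, e.g. Typesense
    private final Consumer<T> secondary;  // system being migrated to, e.g. OpenSearch
    private final List<T> pendingResync = new ArrayList<>();

    DualWriter(Consumer<T> primary, Consumer<T> secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    /** Writes to the primary first; a secondary failure is recorded instead of propagated. */
    void write(T document) {
        primary.accept(document); // a primary failure propagates to the caller
        try {
            secondary.accept(document);
        } catch (RuntimeException e) {
            pendingResync.add(document); // partial failure: remember it for a later re-sync
        }
    }

    /** Returns a read-only snapshot of documents that still need to reach the secondary. */
    List<T> pendingResync() {
        return Collections.unmodifiableList(new ArrayList<>(pendingResync));
    }
}
```

The recorded documents give the validation tools a concrete list to replay, rather than relying on a full reindex after every partial failure.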

Success Criteria:

  • Complete data migration with 100% accuracy
  • Dual-write operations working without errors
  • Search result parity between systems verified

Phase 5: API Integration & Testing (Week 5)

Objectives:

  • Update controller endpoints to use OpenSearch
  • Comprehensive integration testing
  • Performance testing and optimization

Deliverables:

  1. Controller Updates:

    • Modify controllers to use SearchServiceAdapter
    • Add migration controls for gradual rollout
    • Implement A/B testing capability
  2. Integration Tests:

@SpringBootTest
@TestMethodOrder(OrderAnnotation.class)
public class OpenSearchIntegrationTest {
    @Test
    @Order(1)
    void testBasicSearch() {
        // Test basic story search functionality
    }

    @Test
    @Order(2)
    void testComplexFiltering() {
        // Test all 15+ filter conditions
    }

    @Test
    @Order(3)
    void testRandomStory() {
        // Test random story with and without seed
    }

    @Test
    @Order(4)
    void testFaceting() {
        // Test aggregation accuracy
    }
}
  3. Performance Testing:
    • Load testing with realistic data volumes
    • Query performance benchmarking
    • Memory usage monitoring

Success Criteria:

  • All integration tests passing
  • Performance meets or exceeds Typesense baseline
  • Memory usage within acceptable limits (< 2GB)

Phase 6: Production Rollout & Monitoring (Week 6)

Objectives:

  • Production deployment with feature flags
  • Gradual user migration with monitoring
  • Rollback capability testing

Deliverables:

  1. Feature Flag Implementation:
@Component
public class SearchFeatureFlags {
    @Value("${storycove.search.opensearch.enabled:false}")
    private boolean openSearchEnabled;

    @Value("${storycove.search.opensearch.percentage:0}")
    private int rolloutPercentage;

    public boolean shouldUseOpenSearch(String userId) {
        if (!openSearchEnabled) return false;
        // floorMod keeps the bucket in 0..99 even when hashCode() is negative
        return Math.floorMod(userId.hashCode(), 100) < rolloutPercentage;
    }
}
  2. Monitoring & Alerting:

    • Query performance metrics
    • Error rate monitoring
    • Search result accuracy validation
    • User experience metrics
  3. Rollback Procedures:

    • Ability to roll back to Typesense immediately
    • Data consistency verification
    • Performance rollback triggers
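
A percentage rollout needs a stable bucket in the range 0..99 for each user. In Java the `%` operator keeps the sign of the dividend and `String.hashCode()` can be negative, so `Math.floorMod` is the safer way to derive that bucket. A minimal sketch with hypothetical names:

```java
/** Hypothetical percentage-rollout helper; names are illustrative. */
final class RolloutBucket {
    private RolloutBucket() {}

    /** Maps a user ID to a stable bucket in the range 0..99. */
    static int bucketOf(String userId) {
        // String.hashCode() may be negative; floorMod always yields a non-negative remainder here.
        return Math.floorMod(userId.hashCode(), 100);
    }

    /** True when the user's bucket falls below the configured rollout percentage. */
    static boolean inRollout(String userId, int rolloutPercentage) {
        return bucketOf(userId) < rolloutPercentage;
    }
}
```

With this rule a rollout percentage of 0 selects no users and 100 selects all of them, and a given user stays in the same bucket as the percentage is raised.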

Success Criteria:

  • Successful production deployment
  • Zero user-facing issues during rollout
  • Monitoring showing improved performance
  • Rollback procedures validated

Phase 7: Cleanup & Documentation (Week 7)

Objectives:

  • Remove Typesense dependencies
  • Update documentation
  • Performance optimization

Deliverables:

  1. Code Cleanup:

    • Remove TypesenseService and related classes
    • Clean up Docker Compose configuration
    • Remove unused dependencies
  2. Documentation Updates:

    • Update deployment documentation
    • Search API documentation
    • Troubleshooting guides
  3. Performance Tuning:

    • Index optimization
    • Query performance tuning
    • Resource allocation optimization

Success Criteria:

  • Typesense completely removed
  • Documentation up to date
  • Optimized performance in production

Data Migration Strategy

Pre-Migration Validation

Data Integrity Checks:

  1. Count validation: Ensure all stories/authors/collections are present
  2. Field validation: Verify all required fields are populated
  3. Relationship validation: Check author-story and series-story relationships
  4. Library separation: Ensure proper multi-library data isolation

Migration Process:

  1. Index Creation:
// Create indexes with proper mappings for each library
for (Library library : libraries) {
    String storiesIndex = "stories-" + library.getId();
    createIndexWithMapping(storiesIndex, getStoriesMapping());
    createIndexWithMapping("authors-" + library.getId(), getAuthorsMapping());
    createIndexWithMapping("collections-" + library.getId(), getCollectionsMapping());
}
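  2. Bulk Data Loading:
// Index in batches to bound the size of each bulk request.
// Note: findByLibraryId still loads every story into memory; page the query for large libraries.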
int batchSize = 1000;
List<Story> allStories = storyService.findByLibraryId(libraryId);

for (int i = 0; i < allStories.size(); i += batchSize) {
    List<Story> batch = allStories.subList(i, Math.min(i + batchSize, allStories.size()));
    List<StoryDocument> documents = batch.stream()
        .map(this::convertToSearchDocument)
        .collect(Collectors.toList());

    bulkIndexStories(documents, "stories-" + libraryId);
}
  3. Post-Migration Validation:
    • Count comparison between database and OpenSearch
    • Spot-check random records for field accuracy
    • Test search functionality with known queries
    • Verify faceting counts match expected values
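
The count comparison in the post-migration validation step can be sketched as a pure function over two maps of document counts, keyed by index name. How the counts are obtained (database queries, index count requests) is left out; `CountValidation` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical post-migration check: compares per-index document counts. */
final class CountValidation {
    private CountValidation() {}

    /** Returns one message per index whose counts differ; an empty list means parity. */
    static List<String> mismatches(Map<String, Long> databaseCounts, Map<String, Long> indexCounts) {
        List<String> problems = new ArrayList<>();
        // Union of both key sets, sorted so the report order is stable.
        TreeSet<String> names = new TreeSet<>(databaseCounts.keySet());
        names.addAll(indexCounts.keySet());
        for (String name : names) {
            long expected = databaseCounts.getOrDefault(name, 0L);
            long actual = indexCounts.getOrDefault(name, 0L);
            if (expected != actual) {
                problems.add(name + ": database=" + expected + ", index=" + actual);
            }
        }
        return problems;
    }
}
```

An index that is missing on either side is treated as having zero documents, so it shows up as a mismatch rather than being skipped.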

Rollback Strategy

Immediate Rollback Triggers:

  • Search error rate > 1%
  • Query performance degradation > 50%
  • Data inconsistency detected
  • Memory usage > 4GB sustained
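
The triggers above can be evaluated mechanically once the metrics are collected. The sketch below assumes the caller supplies values already averaged over a monitoring window, so "sustained" is not modeled here; `RollbackTriggers` and its parameter names are hypothetical.

```java
/** Hypothetical evaluation of the rollback triggers; thresholds come from the list above. */
final class RollbackTriggers {
    static final double MAX_ERROR_RATE = 0.01;                     // search error rate > 1%
    static final double MAX_LATENCY_INCREASE = 0.50;               // degradation > 50% over baseline
    static final long MAX_MEMORY_BYTES = 4L * 1024 * 1024 * 1024;  // memory usage > 4GB

    private RollbackTriggers() {}

    static boolean shouldRollBack(long failedRequests, long totalRequests,
                                  double baselineLatencyMs, double currentLatencyMs,
                                  long memoryBytes, boolean inconsistencyDetected) {
        double errorRate = totalRequests == 0 ? 0.0 : (double) failedRequests / totalRequests;
        double latencyIncrease = baselineLatencyMs <= 0 ? 0.0
            : (currentLatencyMs - baselineLatencyMs) / baselineLatencyMs;
        // Any single trigger is enough; thresholds are strict ("greater than").
        return errorRate > MAX_ERROR_RATE
            || latencyIncrease > MAX_LATENCY_INCREASE
            || memoryBytes > MAX_MEMORY_BYTES
            || inconsistencyDetected;
    }
}
```

Keeping the thresholds as named constants makes it easy to review them against the list and to adjust them after the first rollout stage.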

Rollback Process:

  1. Update feature flag to disable OpenSearch
  2. Verify Typesense still operational
  3. Clear OpenSearch indexes to free resources
  4. Investigate and document issues

Data Consistency During Rollback:

  • Continue dual-write during investigation
  • Re-sync any missed updates to OpenSearch
  • Validate data integrity before retry

Testing Strategy

Unit Tests

OpenSearchService Unit Tests:

@ExtendWith(MockitoExtension.class)
class OpenSearchServiceTest {
    @Mock private OpenSearchClient client;
    @InjectMocks private OpenSearchService service;

    @Test
    void testSearchStoriesBasicQuery() {
        // Mock OpenSearch response
        // Test basic search functionality
    }

    @Test
    void testComplexFilterQuery() {
        // Test complex boolean query building
    }

    @Test
    void testRandomStorySelection() {
        // Test random query with seed
    }
}

Query Builder Tests:

  • Test all 15+ filter conditions
  • Validate query structure and parameters
  • Test edge cases and null handling

Integration Tests

Full Search Integration:

@SpringBootTest
@Testcontainers
class OpenSearchIntegrationTest {
    @Container
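    // Class name assumed from the opensearch-testcontainers library, which spells it OpensearchContainer
    static OpensearchContainer opensearch = new OpensearchContainer("opensearchproject/opensearch:2.11.0");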

    @Test
    void testEndToEndStorySearch() {
        // Insert test data
        // Perform search via controller
        // Validate results
    }
}

Performance Tests

Load Testing Scenarios:

  1. Concurrent Search Load:

    • 50 concurrent users performing searches
    • Mixed query complexity
    • Duration: 10 minutes
  2. Bulk Indexing Performance:

    • Index 10,000 stories in batches
    • Measure throughput and memory usage
  3. Random Query Performance:

    • 1000 random story requests with different seeds
    • Compare with Typesense baseline

Acceptance Tests

Functional Requirements:

  • All existing search functionality preserved
  • Random story selection is more reliable than with Typesense
  • Faceting accuracy maintained
  • Multi-library separation working

Performance Requirements:

  • Search response time < 100ms for 95th percentile
  • Random story selection < 50ms
  • Index update operations < 10ms
  • Memory usage < 2GB in production
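
Checking the 95th-percentile target requires an agreed definition of the percentile. One common choice is the nearest-rank method, sketched below over recorded latencies in milliseconds; `LatencyPercentile` is a hypothetical helper, and a load-testing tool may use a different interpolation.

```java
import java.util.Arrays;

/** Hypothetical helper: nearest-rank percentile over recorded latencies. */
final class LatencyPercentile {
    private LatencyPercentile() {}

    /** Returns the smallest sample such that at least p percent of all samples are <= it. */
    static long percentile(long[] samplesMs, int p) {
        if (samplesMs.length == 0 || p < 1 || p > 100) {
            throw new IllegalArgumentException("need samples and a percentile in 1..100");
        }
        long[] sorted = samplesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (p * sorted.length + 99) / 100; // ceil(p * n / 100), 1-based rank
        return sorted[rank - 1];
    }
}
```

For 100 samples with the values 1 through 100, the 95th percentile under this definition is 95, so the search target is met when `percentile(samples, 95)` stays below 100.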

Risk Analysis & Mitigation

Technical Risks

Risk: OpenSearch Memory Usage

  • Probability: Medium
  • Impact: High
  • Mitigation: Resource monitoring, index optimization, container limits

Risk: Query Performance Regression

  • Probability: Low
  • Impact: High
  • Mitigation: Performance testing, query optimization, caching layer

Risk: Data Migration Accuracy

  • Probability: Low
  • Impact: Critical
  • Mitigation: Comprehensive validation, dual-write verification, rollback procedures

Risk: Complex Filter Compatibility

  • Probability: Medium
  • Impact: Medium
  • Mitigation: Extensive testing, gradual rollout, feature flags

Operational Risks

Risk: Production Deployment Issues

  • Probability: Medium
  • Impact: High
  • Mitigation: Staging environment testing, gradual rollout, immediate rollback capability

Risk: Team Learning Curve

  • Probability: High
  • Impact: Low
  • Mitigation: Documentation, training, gradual responsibility transfer

Business Continuity

Zero-Downtime Requirements:

  • Maintain Typesense during entire migration
  • Feature flag-based switching
  • Immediate rollback capability
  • Health monitoring with automated alerts

Success Criteria

Functional Requirements

  • All search functionality migrated successfully
  • Random story selection working reliably with seeds
  • Complex filtering (15+ conditions) working accurately
  • Faceting/aggregation results match expected values
  • Multi-library support maintained
  • Autocomplete functionality preserved

Performance Requirements

  • Search response time ≤ 100ms (95th percentile)
  • Random story selection ≤ 50ms
  • Index operations ≤ 10ms
  • Memory usage ≤ 2GB sustained
  • Zero search downtime during migration

Technical Requirements

  • Code quality maintained (test coverage ≥ 80%)
  • Documentation updated and comprehensive
  • Monitoring and alerting implemented
  • Rollback procedures tested and validated
  • Typesense dependencies cleanly removed

Timeline Summary

| Phase | Duration | Key Deliverables | Risk Level |
|---|---|---|---|
| 1. Infrastructure | 1 week | Docker setup, basic service | Low |
| 2. Core Service | 1 week | Basic search operations | Medium |
| 3. Advanced Features | 1 week | Complex filtering, random queries | High |
| 4. Data Migration | 1 week | Full data migration, dual-write | High |
| 5. API Integration | 1 week | Controller updates, testing | Medium |
| 6. Production Rollout | 1 week | Gradual deployment, monitoring | High |
| 7. Cleanup | 1 week | Remove Typesense, documentation | Low |

Total Estimated Duration: 7 weeks


Configuration Management

Environment Variables

# OpenSearch Configuration
OPENSEARCH_HOST=opensearch
OPENSEARCH_PORT=9200
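# Only needed when the security plugin is enabled (the Compose example sets DISABLE_SECURITY_PLUGIN=true)
OPENSEARCH_USERNAME=admin
OPENSEARCH_PASSWORD=${OPENSEARCH_PASSWORD}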

# Feature Flags
STORYCOVE_OPENSEARCH_ENABLED=true
STORYCOVE_SEARCH_PROVIDER=opensearch
STORYCOVE_SEARCH_DUAL_WRITE=true
STORYCOVE_OPENSEARCH_ROLLOUT_PERCENTAGE=100

# Performance Tuning
OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx2g
STORYCOVE_SEARCH_BATCH_SIZE=1000
STORYCOVE_SEARCH_TIMEOUT=30s

Docker Compose Updates

# Add to docker-compose.yml
opensearch:
  image: opensearchproject/opensearch:2.11.0
  environment:
    - discovery.type=single-node
    - DISABLE_SECURITY_PLUGIN=true
    - OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx2g
  volumes:
    - opensearch_data:/usr/share/opensearch/data
  networks:
    - storycove-network

volumes:
  opensearch_data:

Conclusion

This specification lays out a seven-phase plan for migrating StoryCove from Typesense to OpenSearch. The phased approach limits risk while targeting better reliability and performance, particularly for random story queries.

Running both systems in parallel allows thorough validation and keeps the option to roll back if issues arise. Once the migration is complete, StoryCove should have a more robust and scalable search infrastructure that better supports its growth and feature requirements.

Next Steps:

  1. Review and approve this specification
  2. Set up development environment with OpenSearch
  3. Begin Phase 1 implementation
  4. Establish monitoring and success metrics
  5. Execute migration according to timeline

Document Version: 1.0
Last Updated: 2025-01-17
Author: Claude Code Assistant