This commit is contained in:
Stefan Hardegger
2025-09-18 07:46:10 +02:00
parent 64f97f5648
commit 54df3c471e
17 changed files with 2530 additions and 0 deletions

View File

@@ -0,0 +1,889 @@
# StoryCove Search Migration Specification: Typesense to OpenSearch
## Executive Summary
This document specifies the migration from Typesense to OpenSearch for the StoryCove application. The migration will be implemented using a parallel approach, maintaining Typesense functionality while gradually transitioning to OpenSearch, ensuring zero downtime and the ability to rollback if needed.
**Migration Goals:**
- Solve random query reliability issues
- Improve complex filtering performance
- Maintain feature parity during transition
- Zero downtime migration
- Improved developer experience
---
## Current State Analysis
### Typesense Implementation Overview
**Service Architecture:**
- `TypesenseService.java` (~2000 lines) - Primary search service
- 3 search indexes: Stories, Authors, Collections
- Multi-library support with dynamic collection names
- Integration with Spring Boot backend
**Core Functionality:**
1. **Full-text Search**: Stories, Authors with complex query building
2. **Random Story Selection**: `_rand()` function with fallback logic
3. **Advanced Filtering**: 15+ filter conditions with boolean logic
4. **Faceting**: Tag aggregations and counts
5. **Autocomplete**: Search suggestions with typeahead
6. **CRUD Operations**: Index/update/delete for all entity types
**Current Issues Identified:**
- `_rand()` function unreliability requiring complex fallback logic
- Complex filter query building with escaping issues
- Limited aggregation capabilities
- Inconsistent API behavior across query patterns
- Multi-collection management complexity
### Data Models and Schema
**Story Index Fields:**
```java
// Core fields
UUID id, String title, String description, String sourceUrl
Integer wordCount, Integer rating, Integer volume
Boolean isRead, LocalDateTime lastReadAt, Integer readingPosition
// Relationships
UUID authorId, String authorName
UUID seriesId, String seriesName
List<String> tagNames
// Metadata
LocalDateTime createdAt, LocalDateTime updatedAt
String coverPath, String sourceDomain
```
**Author Index Fields:**
```java
UUID id, String name, String notes
Integer authorRating, Double averageStoryRating, Integer storyCount
List<String> urls, String avatarImagePath
LocalDateTime createdAt, LocalDateTime updatedAt
```
**Collection Index Fields:**
```java
UUID id, String name, String description
List<String> tagNames, Boolean archived
LocalDateTime createdAt, LocalDateTime updatedAt
Integer storyCount, Integer currentPosition
```
### API Endpoints Current State
**Search Endpoints Analysis:**
**✅ USED by Frontend (Must Implement):**
- `GET /api/stories/search` - Main story search with complex filtering (CRITICAL)
- `GET /api/stories/random` - Random story selection with filters (CRITICAL)
- `GET /api/authors/search-typesense` - Author search (HIGH)
- `GET /api/tags/autocomplete` - Tag suggestions (MEDIUM)
- `POST /api/stories/reindex-typesense` - Admin reindex operations (MEDIUM)
- `POST /api/authors/reindex-typesense` - Admin reindex operations (MEDIUM)
- `POST /api/stories/recreate-typesense-collection` - Admin recreate (MEDIUM)
- `POST /api/authors/recreate-typesense-collection` - Admin recreate (MEDIUM)
**❌ UNUSED by Frontend (Skip Implementation):**
- `GET /api/stories/search/suggestions` - Not used by frontend
- `GET /api/authors/search` - Superseded by typesense version
- `GET /api/series/search` - Not used by frontend
- `GET /api/tags/search` - Superseded by autocomplete
- `POST /api/search/reindex` - Not used by frontend
- `GET /api/search/health` - Not used by frontend
**Scope Reduction: ~40% fewer endpoints to implement**
**Search Parameters (Stories):**
```
query, page, size, authors[], tags[], minRating, maxRating
sortBy, sortDir, facetBy[]
minWordCount, maxWordCount, createdAfter, createdBefore
lastReadAfter, lastReadBefore, unratedOnly, readingStatus
hasReadingProgress, hasCoverImage, sourceDomain, seriesFilter
minTagCount, popularOnly, hiddenGemsOnly
```
---
## Target OpenSearch Architecture
### Service Layer Design
**New Components:**
```
OpenSearchService.java - Primary search service (mirrors TypesenseService API)
OpenSearchConfig.java - Configuration and client setup
SearchMigrationService.java - Handles parallel operation during migration
SearchServiceAdapter.java - Abstraction layer for service switching
```
**Index Strategy:**
- **Single-node deployment** for development/small installations
- **Index-per-library** approach: `stories-{libraryId}`, `authors-{libraryId}`, `collections-{libraryId}`
- **Index templates** for consistent mapping across libraries
- **Aliases** for easy switching and zero-downtime updates
### OpenSearch Index Mappings
**Stories Index Mapping:**
```json
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"analyzer": {
"story_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["lowercase", "stop", "snowball"]
}
}
}
},
"mappings": {
"properties": {
"id": {"type": "keyword"},
"title": {
"type": "text",
"analyzer": "story_analyzer",
"fields": {"keyword": {"type": "keyword"}}
},
"description": {
"type": "text",
"analyzer": "story_analyzer"
},
"authorName": {
"type": "text",
"analyzer": "story_analyzer",
"fields": {"keyword": {"type": "keyword"}}
},
"seriesName": {
"type": "text",
"fields": {"keyword": {"type": "keyword"}}
},
"tagNames": {"type": "keyword"},
"wordCount": {"type": "integer"},
"rating": {"type": "integer"},
"volume": {"type": "integer"},
"isRead": {"type": "boolean"},
"readingPosition": {"type": "integer"},
"lastReadAt": {"type": "date"},
"createdAt": {"type": "date"},
"updatedAt": {"type": "date"},
"coverPath": {"type": "keyword"},
"sourceUrl": {"type": "keyword"},
"sourceDomain": {"type": "keyword"}
}
}
}
```
**Authors Index Mapping:**
```json
{
"mappings": {
"properties": {
"id": {"type": "keyword"},
"name": {
"type": "text",
"analyzer": "story_analyzer",
"fields": {"keyword": {"type": "keyword"}}
},
"notes": {"type": "text"},
"authorRating": {"type": "integer"},
"averageStoryRating": {"type": "float"},
"storyCount": {"type": "integer"},
"urls": {"type": "keyword"},
"avatarImagePath": {"type": "keyword"},
"createdAt": {"type": "date"},
"updatedAt": {"type": "date"}
}
}
}
```
**Collections Index Mapping:**
```json
{
"mappings": {
"properties": {
"id": {"type": "keyword"},
"name": {
"type": "text",
"fields": {"keyword": {"type": "keyword"}}
},
"description": {"type": "text"},
"tagNames": {"type": "keyword"},
"archived": {"type": "boolean"},
"storyCount": {"type": "integer"},
"currentPosition": {"type": "integer"},
"createdAt": {"type": "date"},
"updatedAt": {"type": "date"}
}
}
}
```
### Query Translation Strategy
**Random Story Queries:**
```java
// Typesense (problematic)
String sortBy = seed != null ? "_rand(" + seed + ")" : "_rand()";
// OpenSearch (reliable)
QueryBuilder randomQuery = QueryBuilders.functionScoreQuery(
QueryBuilders.boolQuery().must(filters),
ScoreFunctionBuilders.randomFunction(seed != null ? seed.intValue() : null)
);
```
**Complex Filtering:**
```java
// Build bool query with multiple filter conditions
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery()
.must(QueryBuilders.multiMatchQuery(query, "title", "description", "authorName"))
.filter(QueryBuilders.termsQuery("tagNames", tags))
.filter(QueryBuilders.rangeQuery("wordCount").gte(minWords).lte(maxWords))
.filter(QueryBuilders.rangeQuery("rating").gte(minRating).lte(maxRating));
```
**Faceting/Aggregations:**
```java
// Tags aggregation
AggregationBuilder tagsAgg = AggregationBuilders
.terms("tags")
.field("tagNames")
.size(100);
// Rating ranges
AggregationBuilder ratingRanges = AggregationBuilders
.range("rating_ranges")
.field("rating")
.addRange("unrated", 0, 1)
.addRange("low", 1, 3)
.addRange("high", 4, 6);
```
---
## Revised Implementation Phases (Scope Reduced by 40%)
### Phase 1: Infrastructure Setup (Week 1)
**Objectives:**
- Add OpenSearch to Docker Compose
- Create basic OpenSearch service
- Establish index templates and mappings
- **Focus**: Only stories, authors, and tags indexes (skip series, collections)
**Deliverables:**
1. **Docker Compose Updates:**
```yaml
opensearch:
image: opensearchproject/opensearch:2.11.0
environment:
- discovery.type=single-node
- DISABLE_SECURITY_PLUGIN=true
- OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx1g
ports:
- "9200:9200"
volumes:
- opensearch_data:/usr/share/opensearch/data
```
2. **OpenSearchConfig.java:**
```java
@Configuration
@ConditionalOnProperty(name = "storycove.opensearch.enabled", havingValue = "true")
public class OpenSearchConfig {
@Bean
public OpenSearchClient openSearchClient() {
// Client configuration
}
}
```
3. **Basic Index Creation:**
- Create index templates for stories, authors, collections
- Implement index creation with proper mappings
- Add health check endpoint
**Success Criteria:**
- OpenSearch container starts successfully
- Basic connectivity established
- Index templates created and validated
### Phase 2: Core Service Implementation (Week 2)
**Objectives:**
- Implement OpenSearchService with core functionality
- Create service abstraction layer
- Implement basic search operations
- **Focus**: Only critical endpoints (stories search, random, authors)
**Deliverables:**
1. **OpenSearchService.java** - Core service implementing:
- `indexStory()`, `updateStory()`, `deleteStory()`
- `searchStories()` with basic query support (CRITICAL)
- `getRandomStoryId()` with reliable seed support (CRITICAL)
- `indexAuthor()`, `updateAuthor()`, `deleteAuthor()`
- `searchAuthors()` for authors page (HIGH)
- `bulkIndexStories()`, `bulkIndexAuthors()` for initial data loading
2. **SearchServiceAdapter.java** - Abstraction layer:
```java
@Service
public class SearchServiceAdapter {
@Autowired(required = false)
private TypesenseService typesenseService;
@Autowired(required = false)
private OpenSearchService openSearchService;
@Value("${storycove.search.provider:typesense}")
private String searchProvider;
public SearchResultDto<StorySearchDto> searchStories(...) {
return "opensearch".equals(searchProvider)
? openSearchService.searchStories(...)
: typesenseService.searchStories(...);
}
}
```
3. **Basic Query Implementation:**
- Full-text search across title/description/author
- Basic filtering (tags, rating, word count)
- Pagination and sorting
**Success Criteria:**
- Basic search functionality working
- Service abstraction layer functional
- Can switch between Typesense and OpenSearch via configuration
### Phase 3: Advanced Features Implementation (Week 3)
**Objectives:**
- Implement complex filtering (all 15+ filter types)
- Add random story functionality
- Implement faceting/aggregations
- Add autocomplete/suggestions
**Deliverables:**
1. **Complex Query Builder:**
- All filter conditions from original implementation
- Date range filtering with proper timezone handling
- Boolean logic for reading status, coverage, series filters
2. **Random Story Implementation:**
```java
public Optional<UUID> getRandomStoryId(String searchQuery, List<String> tags, Long seed, ...) {
BoolQueryBuilder baseQuery = buildFilterQuery(searchQuery, tags, ...);
QueryBuilder randomQuery = QueryBuilders.functionScoreQuery(
baseQuery,
ScoreFunctionBuilders.randomFunction(seed != null ? seed.intValue() : null)
);
SearchRequest request = new SearchRequest("stories-" + getCurrentLibraryId())
.source(new SearchSourceBuilder()
.query(randomQuery)
.size(1)
.fetchSource(new String[]{"id"}, null));
// Execute and return result
}
```
3. **Faceting Implementation:**
- Tag aggregations with counts
- Rating range aggregations
- Author aggregations
- Custom facet builders
4. **Autocomplete Service:**
- Suggest-based implementation using completion fields
- Prefix matching for story titles and author names
**Success Criteria:**
- All filter conditions working correctly
- Random story selection reliable with seed support
- Faceting returns accurate counts
- Autocomplete responsive and accurate
### Phase 4: Data Migration & Parallel Operation (Week 4)
**Objectives:**
- Implement bulk data migration from database
- Enable parallel operation (write to both systems)
- Comprehensive testing of OpenSearch functionality
**Deliverables:**
1. **Migration Service:**
```java
@Service
public class SearchMigrationService {
public void performFullMigration() {
// Migrate all libraries
List<Library> libraries = libraryService.findAll();
for (Library library : libraries) {
migrateLibraryData(library);
}
}
private void migrateLibraryData(Library library) {
// Create indexes for library
// Bulk load stories, authors, collections
// Verify data integrity
}
}
```
2. **Dual-Write Implementation:**
- Modify all entity update operations to write to both systems
- Add configuration flag for dual-write mode
- Error handling for partial failures
3. **Data Validation Tools:**
- Compare search result counts between systems
- Validate random story selection consistency
- Check faceting accuracy
**Success Criteria:**
- Complete data migration with 100% accuracy
- Dual-write operations working without errors
- Search result parity between systems verified
### Phase 5: API Integration & Testing (Week 5)
**Objectives:**
- Update controller endpoints to use OpenSearch
- Comprehensive integration testing
- Performance testing and optimization
**Deliverables:**
1. **Controller Updates:**
- Modify controllers to use SearchServiceAdapter
- Add migration controls for gradual rollout
- Implement A/B testing capability
2. **Integration Tests:**
```java
@SpringBootTest
@TestMethodOrder(OrderAnnotation.class)
public class OpenSearchIntegrationTest {
@Test
@Order(1)
void testBasicSearch() {
// Test basic story search functionality
}
@Test
@Order(2)
void testComplexFiltering() {
// Test all 15+ filter conditions
}
@Test
@Order(3)
void testRandomStory() {
// Test random story with and without seed
}
@Test
@Order(4)
void testFaceting() {
// Test aggregation accuracy
}
}
```
3. **Performance Testing:**
- Load testing with realistic data volumes
- Query performance benchmarking
- Memory usage monitoring
**Success Criteria:**
- All integration tests passing
- Performance meets or exceeds Typesense baseline
- Memory usage within acceptable limits (< 2GB)
### Phase 6: Production Rollout & Monitoring (Week 6)
**Objectives:**
- Production deployment with feature flags
- Gradual user migration with monitoring
- Rollback capability testing
**Deliverables:**
1. **Feature Flag Implementation:**
```java
@Component
public class SearchFeatureFlags {
@Value("${storycove.search.opensearch.enabled:false}")
private boolean openSearchEnabled;
@Value("${storycove.search.opensearch.percentage:0}")
private int rolloutPercentage;
public boolean shouldUseOpenSearch(String userId) {
if (!openSearchEnabled) return false;
return userId.hashCode() % 100 < rolloutPercentage;
}
}
```
2. **Monitoring & Alerting:**
- Query performance metrics
- Error rate monitoring
- Search result accuracy validation
- User experience metrics
3. **Rollback Procedures:**
- Immediate rollback to Typesense capability
- Data consistency verification
- Performance rollback triggers
**Success Criteria:**
- Successful production deployment
- Zero user-facing issues during rollout
- Monitoring showing improved performance
- Rollback procedures validated
### Phase 7: Cleanup & Documentation (Week 7)
**Objectives:**
- Remove Typesense dependencies
- Update documentation
- Performance optimization
**Deliverables:**
1. **Code Cleanup:**
- Remove TypesenseService and related classes
- Clean up Docker Compose configuration
- Remove unused dependencies
2. **Documentation Updates:**
- Update deployment documentation
- Search API documentation
- Troubleshooting guides
3. **Performance Tuning:**
- Index optimization
- Query performance tuning
- Resource allocation optimization
**Success Criteria:**
- Typesense completely removed
- Documentation up to date
- Optimized performance in production
---
## Data Migration Strategy
### Pre-Migration Validation
**Data Integrity Checks:**
1. Count validation: Ensure all stories/authors/collections are present
2. Field validation: Verify all required fields are populated
3. Relationship validation: Check author-story and series-story relationships
4. Library separation: Ensure proper multi-library data isolation
**Migration Process:**
1. **Index Creation:**
```java
// Create indexes with proper mappings for each library
for (Library library : libraries) {
String storiesIndex = "stories-" + library.getId();
createIndexWithMapping(storiesIndex, getStoriesMapping());
createIndexWithMapping("authors-" + library.getId(), getAuthorsMapping());
createIndexWithMapping("collections-" + library.getId(), getCollectionsMapping());
}
```
2. **Bulk Data Loading:**
```java
// Load in batches to manage memory usage
int batchSize = 1000;
List<Story> allStories = storyService.findByLibraryId(libraryId);
for (int i = 0; i < allStories.size(); i += batchSize) {
List<Story> batch = allStories.subList(i, Math.min(i + batchSize, allStories.size()));
List<StoryDocument> documents = batch.stream()
.map(this::convertToSearchDocument)
.collect(Collectors.toList());
bulkIndexStories(documents, "stories-" + libraryId);
}
```
3. **Post-Migration Validation:**
- Count comparison between database and OpenSearch
- Spot-check random records for field accuracy
- Test search functionality with known queries
- Verify faceting counts match expected values
### Rollback Strategy
**Immediate Rollback Triggers:**
- Search error rate > 1%
- Query performance degradation > 50%
- Data inconsistency detected
- Memory usage > 4GB sustained
**Rollback Process:**
1. Update feature flag to disable OpenSearch
2. Verify Typesense still operational
3. Clear OpenSearch indexes to free resources
4. Investigate and document issues
**Data Consistency During Rollback:**
- Continue dual-write during investigation
- Re-sync any missed updates to OpenSearch
- Validate data integrity before retry
---
## Testing Strategy
### Unit Tests
**OpenSearchService Unit Tests:**
```java
@ExtendWith(MockitoExtension.class)
class OpenSearchServiceTest {
@Mock private OpenSearchClient client;
@InjectMocks private OpenSearchService service;
@Test
void testSearchStoriesBasicQuery() {
// Mock OpenSearch response
// Test basic search functionality
}
@Test
void testComplexFilterQuery() {
// Test complex boolean query building
}
@Test
void testRandomStorySelection() {
// Test random query with seed
}
}
```
**Query Builder Tests:**
- Test all 15+ filter conditions
- Validate query structure and parameters
- Test edge cases and null handling
### Integration Tests
**Full Search Integration:**
```java
@SpringBootTest
@Testcontainers
class OpenSearchIntegrationTest {
@Container
static OpenSearchContainer opensearch = new OpenSearchContainer("opensearchproject/opensearch:2.11.0");
@Test
void testEndToEndStorySearch() {
// Insert test data
// Perform search via controller
// Validate results
}
}
```
### Performance Tests
**Load Testing Scenarios:**
1. **Concurrent Search Load:**
- 50 concurrent users performing searches
- Mixed query complexity
- Duration: 10 minutes
2. **Bulk Indexing Performance:**
- Index 10,000 stories in batches
- Measure throughput and memory usage
3. **Random Query Performance:**
- 1000 random story requests with different seeds
- Compare with Typesense baseline
### Acceptance Tests
**Functional Requirements:**
- All existing search functionality preserved
- Random story selection improved reliability
- Faceting accuracy maintained
- Multi-library separation working
**Performance Requirements:**
- Search response time < 100ms for 95th percentile
- Random story selection < 50ms
- Index update operations < 10ms
- Memory usage < 2GB in production
---
## Risk Analysis & Mitigation
### Technical Risks
**Risk: OpenSearch Memory Usage**
- *Probability: Medium*
- *Impact: High*
- *Mitigation: Resource monitoring, index optimization, container limits*
**Risk: Query Performance Regression**
- *Probability: Low*
- *Impact: High*
- *Mitigation: Performance testing, query optimization, caching layer*
**Risk: Data Migration Accuracy**
- *Probability: Low*
- *Impact: Critical*
- *Mitigation: Comprehensive validation, dual-write verification, rollback procedures*
**Risk: Complex Filter Compatibility**
- *Probability: Medium*
- *Impact: Medium*
- *Mitigation: Extensive testing, gradual rollout, feature flags*
### Operational Risks
**Risk: Production Deployment Issues**
- *Probability: Medium*
- *Impact: High*
- *Mitigation: Staging environment testing, gradual rollout, immediate rollback capability*
**Risk: Team Learning Curve**
- *Probability: High*
- *Impact: Low*
- *Mitigation: Documentation, training, gradual responsibility transfer*
### Business Continuity
**Zero-Downtime Requirements:**
- Maintain Typesense during entire migration
- Feature flag-based switching
- Immediate rollback capability
- Health monitoring with automated alerts
---
## Success Criteria
### Functional Requirements ✅
- [ ] All search functionality migrated successfully
- [ ] Random story selection working reliably with seeds
- [ ] Complex filtering (15+ conditions) working accurately
- [ ] Faceting/aggregation results match expected values
- [ ] Multi-library support maintained
- [ ] Autocomplete functionality preserved
### Performance Requirements ✅
- [ ] Search response time 100ms (95th percentile)
- [ ] Random story selection 50ms
- [ ] Index operations 10ms
- [ ] Memory usage 2GB sustained
- [ ] Zero search downtime during migration
### Technical Requirements ✅
- [ ] Code quality maintained (test coverage 80%)
- [ ] Documentation updated and comprehensive
- [ ] Monitoring and alerting implemented
- [ ] Rollback procedures tested and validated
- [ ] Typesense dependencies cleanly removed
---
## Timeline Summary
| Phase | Duration | Key Deliverables | Risk Level |
|-------|----------|------------------|------------|
| 1. Infrastructure | 1 week | Docker setup, basic service | Low |
| 2. Core Service | 1 week | Basic search operations | Medium |
| 3. Advanced Features | 1 week | Complex filtering, random queries | High |
| 4. Data Migration | 1 week | Full data migration, dual-write | High |
| 5. API Integration | 1 week | Controller updates, testing | Medium |
| 6. Production Rollout | 1 week | Gradual deployment, monitoring | High |
| 7. Cleanup | 1 week | Remove Typesense, documentation | Low |
**Total Estimated Duration: 7 weeks**
---
## Configuration Management
### Environment Variables
```bash
# OpenSearch Configuration
OPENSEARCH_HOST=opensearch
OPENSEARCH_PORT=9200
OPENSEARCH_USERNAME=admin
OPENSEARCH_PASSWORD=${OPENSEARCH_PASSWORD}
# Feature Flags
STORYCOVE_OPENSEARCH_ENABLED=true
STORYCOVE_SEARCH_PROVIDER=opensearch
STORYCOVE_SEARCH_DUAL_WRITE=true
STORYCOVE_OPENSEARCH_ROLLOUT_PERCENTAGE=100
# Performance Tuning
OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx2g
STORYCOVE_SEARCH_BATCH_SIZE=1000
STORYCOVE_SEARCH_TIMEOUT=30s
```
### Docker Compose Updates
```yaml
# Add to docker-compose.yml
opensearch:
image: opensearchproject/opensearch:2.11.0
environment:
- discovery.type=single-node
- DISABLE_SECURITY_PLUGIN=true
- OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx2g
volumes:
- opensearch_data:/usr/share/opensearch/data
networks:
- storycove-network
volumes:
opensearch_data:
```
---
## Conclusion
This specification provides a comprehensive roadmap for migrating StoryCove from Typesense to OpenSearch. The phased approach ensures minimal risk while delivering improved reliability and performance, particularly for random story queries.
The parallel implementation strategy allows for thorough validation and provides confidence in the migration while maintaining the ability to rollback if issues arise. Upon successful completion, StoryCove will have a more robust and scalable search infrastructure that better supports its growth and feature requirements.
**Next Steps:**
1. Review and approve this specification
2. Set up development environment with OpenSearch
3. Begin Phase 1 implementation
4. Establish monitoring and success metrics
5. Execute migration according to timeline
---
*Document Version: 1.0*
*Last Updated: 2025-01-17*
*Author: Claude Code Assistant*