# StoryCove Search Migration Specification: Typesense to OpenSearch

## Executive Summary

This document specifies the migration from Typesense to OpenSearch for the StoryCove application. The migration uses a parallel approach: Typesense stays fully functional while traffic gradually transitions to OpenSearch, which ensures zero downtime and the ability to roll back if needed.

**Migration Goals:**
- Solve random query reliability issues
- Improve complex filtering performance
- Maintain feature parity during transition
- Zero-downtime migration
- Improved developer experience

---

## Current State Analysis

### Typesense Implementation Overview

**Service Architecture:**
- `TypesenseService.java` (~2000 lines) - Primary search service
- 3 search indexes: Stories, Authors, Collections
- Multi-library support with dynamic collection names
- Integration with Spring Boot backend

**Core Functionality:**
1. **Full-text Search**: Stories and Authors, with complex query building
2. **Random Story Selection**: `_rand()` function with fallback logic
3. **Advanced Filtering**: 15+ filter conditions with boolean logic
4. **Faceting**: Tag aggregations and counts
5. **Autocomplete**: Search suggestions with typeahead
6. **CRUD Operations**: Index/update/delete for all entity types

**Current Issues Identified:**
- `_rand()` function is unreliable and requires complex fallback logic
- Complex filter query building with escaping issues
- Limited aggregation capabilities
- Inconsistent API behavior across query patterns
- Multi-collection management complexity

### Data Models and Schema

**Story Index Fields:**
```java
// Core fields
UUID id, String title, String description, String sourceUrl
Integer wordCount, Integer rating, Integer volume
Boolean isRead, LocalDateTime lastReadAt, Integer readingPosition

// Relationships
UUID authorId, String authorName
UUID seriesId, String seriesName
List<String> tagNames

// Metadata
LocalDateTime createdAt, LocalDateTime updatedAt
String coverPath, String sourceDomain
```

**Author Index Fields:**
```java
UUID id, String name, String notes
Integer authorRating, Double averageStoryRating, Integer storyCount
List<String> urls, String avatarImagePath
LocalDateTime createdAt, LocalDateTime updatedAt
```

**Collection Index Fields:**
```java
UUID id, String name, String description
List<String> tagNames, Boolean archived
LocalDateTime createdAt, LocalDateTime updatedAt
Integer storyCount, Integer currentPosition
```

### API Endpoints Current State

**Search Endpoints Analysis:**

**✅ USED by Frontend (Must Implement):**
- `GET /api/stories/search` - Main story search with complex filtering (CRITICAL)
- `GET /api/stories/random` - Random story selection with filters (CRITICAL)
- `GET /api/authors/search-typesense` - Author search (HIGH)
- `GET /api/tags/autocomplete` - Tag suggestions (MEDIUM)
- `POST /api/stories/reindex-typesense` - Admin reindex operations (MEDIUM)
- `POST /api/authors/reindex-typesense` - Admin reindex operations (MEDIUM)
- `POST /api/stories/recreate-typesense-collection` - Admin recreate (MEDIUM)
- `POST /api/authors/recreate-typesense-collection` - Admin recreate (MEDIUM)

**❌ UNUSED by Frontend (Skip Implementation):**
- `GET /api/stories/search/suggestions` - Not used by frontend
- `GET /api/authors/search` - Superseded by typesense version
- `GET /api/series/search` - Not used by frontend
- `GET /api/tags/search` - Superseded by autocomplete
- `POST /api/search/reindex` - Not used by frontend
- `GET /api/search/health` - Not used by frontend

**Scope Reduction: ~40% fewer endpoints to implement**

**Search Parameters (Stories):**
```
query, page, size, authors[], tags[], minRating, maxRating
sortBy, sortDir, facetBy[]
minWordCount, maxWordCount, createdAfter, createdBefore
lastReadAfter, lastReadBefore, unratedOnly, readingStatus
hasReadingProgress, hasCoverImage, sourceDomain, seriesFilter
minTagCount, popularOnly, hiddenGemsOnly
```

---

## Target OpenSearch Architecture

### Service Layer Design

**New Components:**
```
OpenSearchService.java      - Primary search service (mirrors TypesenseService API)
OpenSearchConfig.java       - Configuration and client setup
SearchMigrationService.java - Handles parallel operation during migration
SearchServiceAdapter.java   - Abstraction layer for service switching
```

**Index Strategy:**
- **Single-node deployment** for development/small installations
- **Index-per-library** approach: `stories-{libraryId}`, `authors-{libraryId}`, `collections-{libraryId}`
- **Index templates** for consistent mapping across libraries
- **Aliases** for easy switching and zero-downtime updates
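
The naming scheme above can be sketched as a small helper. This is only an illustration: `LibraryIndexNames` is a hypothetical class, and the `-v{n}` suffix for the concrete index behind each alias is an assumed convention, not something the existing codebase defines. Names are lower-cased because OpenSearch rejects index names containing upper-case characters.

```java
import java.util.Locale;

/** Hypothetical helper: derives per-library alias and index names. */
final class LibraryIndexNames {
    private LibraryIndexNames() {}

    /** Alias the application queries, e.g. "stories-lib1". */
    static String alias(String entity, String libraryId) {
        return (entity + "-" + libraryId).toLowerCase(Locale.ROOT);
    }

    /** Concrete index behind the alias, e.g. "stories-lib1-v2" (assumed versioning scheme). */
    static String versionedIndex(String entity, String libraryId, int version) {
        if (version < 1) {
            throw new IllegalArgumentException("version must be >= 1");
        }
        return alias(entity, libraryId) + "-v" + version;
    }
}
```

With this layout a mapping change would build `stories-lib1-v2` in the background and then repoint the `stories-lib1` alias, so readers never see a half-built index.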

### OpenSearch Index Mappings

**Stories Index Mapping:**
```json
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "story_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop", "snowball"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "id": {"type": "keyword"},
      "title": {
        "type": "text",
        "analyzer": "story_analyzer",
        "fields": {"keyword": {"type": "keyword"}}
      },
      "description": {
        "type": "text",
        "analyzer": "story_analyzer"
      },
      "authorName": {
        "type": "text",
        "analyzer": "story_analyzer",
        "fields": {"keyword": {"type": "keyword"}}
      },
      "seriesName": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword"}}
      },
      "tagNames": {"type": "keyword"},
      "wordCount": {"type": "integer"},
      "rating": {"type": "integer"},
      "volume": {"type": "integer"},
      "isRead": {"type": "boolean"},
      "readingPosition": {"type": "integer"},
      "lastReadAt": {"type": "date"},
      "createdAt": {"type": "date"},
      "updatedAt": {"type": "date"},
      "coverPath": {"type": "keyword"},
      "sourceUrl": {"type": "keyword"},
      "sourceDomain": {"type": "keyword"}
    }
  }
}
```

**Authors Index Mapping:**

The `name` field references `story_analyzer`. Analyzers are defined per index, so the authors index must declare it in its own settings; otherwise index creation fails.
```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "story_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop", "snowball"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "id": {"type": "keyword"},
      "name": {
        "type": "text",
        "analyzer": "story_analyzer",
        "fields": {"keyword": {"type": "keyword"}}
      },
      "notes": {"type": "text"},
      "authorRating": {"type": "integer"},
      "averageStoryRating": {"type": "float"},
      "storyCount": {"type": "integer"},
      "urls": {"type": "keyword"},
      "avatarImagePath": {"type": "keyword"},
      "createdAt": {"type": "date"},
      "updatedAt": {"type": "date"}
    }
  }
}
```

**Collections Index Mapping:**
```json
{
  "mappings": {
    "properties": {
      "id": {"type": "keyword"},
      "name": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword"}}
      },
      "description": {"type": "text"},
      "tagNames": {"type": "keyword"},
      "archived": {"type": "boolean"},
      "storyCount": {"type": "integer"},
      "currentPosition": {"type": "integer"},
      "createdAt": {"type": "date"},
      "updatedAt": {"type": "date"}
    }
  }
}
```

### Query Translation Strategy

**Random Story Queries:**
```java
// Typesense (problematic)
String sortBy = seed != null ? "_rand(" + seed + ")" : "_rand()";

// OpenSearch (reliable): random_score inside function_score.
// randomFunction() takes no arguments; the seed is set on the builder,
// together with the field used to derive per-document randomness.
RandomScoreFunctionBuilder randomFunction = ScoreFunctionBuilders.randomFunction();
if (seed != null) {
    randomFunction.seed(seed.longValue()).setField("_seq_no");
}
QueryBuilder randomQuery = QueryBuilders.functionScoreQuery(
    QueryBuilders.boolQuery().must(filters),
    randomFunction
);
```

**Complex Filtering:**
```java
// Build bool query with multiple filter conditions
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery()
    .must(QueryBuilders.multiMatchQuery(query, "title", "description", "authorName"))
    .filter(QueryBuilders.termsQuery("tagNames", tags))
    .filter(QueryBuilders.rangeQuery("wordCount").gte(minWords).lte(maxWords))
    .filter(QueryBuilders.rangeQuery("rating").gte(minRating).lte(maxRating));
```
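
To make the target request shape concrete, the sketch below builds the equivalent query DSL body as nested maps, without any client library. `StoryQuerySketch` and its method names are hypothetical; the use of `_seq_no` as the `random_score` field and `boost_mode: replace` are assumptions chosen for illustration, and only two of the 15+ filters are shown.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: builds OpenSearch query DSL bodies as nested maps. */
final class StoryQuerySketch {
    private StoryQuerySketch() {}

    /** Bool query with optional text, tag, and rating constraints (null means "no constraint"). */
    static Map<String, Object> boolQuery(String text, List<String> tags,
                                         Integer minRating, Integer maxRating) {
        Map<String, Object> must;
        if (text == null || text.isBlank()) {
            must = Map.of("match_all", Map.of());
        } else {
            Map<String, Object> multiMatch = new LinkedHashMap<>();
            multiMatch.put("query", text);
            multiMatch.put("fields", List.of("title", "description", "authorName"));
            must = Map.of("multi_match", multiMatch);
        }

        // Filter clauses do not affect scoring and can be cached by the engine
        List<Object> filters = new ArrayList<>();
        if (tags != null && !tags.isEmpty()) {
            filters.add(Map.of("terms", Map.of("tagNames", tags)));
        }
        if (minRating != null || maxRating != null) {
            Map<String, Object> bounds = new LinkedHashMap<>();
            if (minRating != null) bounds.put("gte", minRating);
            if (maxRating != null) bounds.put("lte", maxRating);
            filters.add(Map.of("range", Map.of("rating", bounds)));
        }

        Map<String, Object> bool = new LinkedHashMap<>();
        bool.put("must", must);
        bool.put("filter", filters);
        return Map.of("bool", bool);
    }

    /** Wraps a query in function_score so that hits are ordered by a random score. */
    static Map<String, Object> randomOrder(Map<String, Object> query, Long seed) {
        Map<String, Object> randomScore = new LinkedHashMap<>();
        if (seed != null) {
            randomScore.put("seed", seed);
            randomScore.put("field", "_seq_no"); // assumed field for seeded randomness
        }
        Map<String, Object> functionScore = new LinkedHashMap<>();
        functionScore.put("query", query);
        functionScore.put("random_score", randomScore);
        functionScore.put("boost_mode", "replace"); // ignore relevance, keep only the random score
        return Map.of("function_score", functionScore);
    }
}
```

Serializing the returned map with any JSON library gives a body that can be compared against what the builder-based code sends, which may help when validating query parity.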

**Faceting/Aggregations:**
```java
// Tags aggregation
AggregationBuilder tagsAgg = AggregationBuilders
    .terms("tags")
    .field("tagNames")
    .size(100);

// Rating ranges. Each range is half-open, [from, to), so adjacent
// buckets must share a boundary or some ratings fall into no bucket.
AggregationBuilder ratingRanges = AggregationBuilders
    .range("rating_ranges")
    .field("rating")
    .addRange("unrated", 0, 1)
    .addRange("low", 1, 4)
    .addRange("high", 4, 6);
```

---

## Revised Implementation Phases (Scope Reduced by 40%)

### Phase 1: Infrastructure Setup (Week 1)

**Objectives:**
- Add OpenSearch to Docker Compose
- Create basic OpenSearch service
- Establish index templates and mappings
- **Focus**: Only stories, authors, and tags indexes (skip series, collections)

**Deliverables:**
1. **Docker Compose Updates:**
   ```yaml
   opensearch:
     image: opensearchproject/opensearch:2.11.0
     environment:
       - discovery.type=single-node
       - DISABLE_SECURITY_PLUGIN=true
       - OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx1g
     ports:
       - "9200:9200"
     volumes:
       - opensearch_data:/usr/share/opensearch/data
   ```

2. **OpenSearchConfig.java:**
   ```java
   @Configuration
   @ConditionalOnProperty(name = "storycove.opensearch.enabled", havingValue = "true")
   public class OpenSearchConfig {
       @Bean
       public OpenSearchClient openSearchClient() {
           // Client configuration
       }
   }
   ```

3. **Basic Index Creation:**
   - Create index templates for stories, authors, collections
   - Implement index creation with proper mappings
   - Add health check endpoint

**Success Criteria:**
- OpenSearch container starts successfully
- Basic connectivity established
- Index templates created and validated

### Phase 2: Core Service Implementation (Week 2)

**Objectives:**
- Implement OpenSearchService with core functionality
- Create service abstraction layer
- Implement basic search operations
- **Focus**: Only critical endpoints (stories search, random, authors)

**Deliverables:**
1. **OpenSearchService.java** - Core service implementing:
   - `indexStory()`, `updateStory()`, `deleteStory()`
   - `searchStories()` with basic query support (CRITICAL)
   - `getRandomStoryId()` with reliable seed support (CRITICAL)
   - `indexAuthor()`, `updateAuthor()`, `deleteAuthor()`
   - `searchAuthors()` for authors page (HIGH)
   - `bulkIndexStories()`, `bulkIndexAuthors()` for initial data loading

2. **SearchServiceAdapter.java** - Abstraction layer:
   ```java
   @Service
   public class SearchServiceAdapter {
       @Autowired(required = false)
       private TypesenseService typesenseService;

       @Autowired(required = false)
       private OpenSearchService openSearchService;

       @Value("${storycove.search.provider:typesense}")
       private String searchProvider;

       public SearchResultDto<StorySearchDto> searchStories(...) {
           return "opensearch".equals(searchProvider)
               ? openSearchService.searchStories(...)
               : typesenseService.searchStories(...);
       }
   }
   ```

3. **Basic Query Implementation:**
   - Full-text search across title/description/author
   - Basic filtering (tags, rating, word count)
   - Pagination and sorting

**Success Criteria:**
- Basic search functionality working
- Service abstraction layer functional
- Can switch between Typesense and OpenSearch via configuration

### Phase 3: Advanced Features Implementation (Week 3)

**Objectives:**
- Implement complex filtering (all 15+ filter types)
- Add random story functionality
- Implement faceting/aggregations
- Add autocomplete/suggestions

**Deliverables:**
1. **Complex Query Builder:**
   - All filter conditions from original implementation
   - Date range filtering with proper timezone handling
   - Boolean logic for reading status, cover image, and series filters

2. **Random Story Implementation:**
   ```java
   public Optional<UUID> getRandomStoryId(String searchQuery, List<String> tags, Long seed, ...) {
       BoolQueryBuilder baseQuery = buildFilterQuery(searchQuery, tags, ...);

       // randomFunction() takes no arguments; a seed, if given, is set on the builder
       RandomScoreFunctionBuilder randomFunction = ScoreFunctionBuilders.randomFunction();
       if (seed != null) {
           randomFunction.seed(seed.longValue()).setField("_seq_no");
       }
       QueryBuilder randomQuery = QueryBuilders.functionScoreQuery(baseQuery, randomFunction);

       SearchRequest request = new SearchRequest("stories-" + getCurrentLibraryId())
           .source(new SearchSourceBuilder()
               .query(randomQuery)
               .size(1)
               .fetchSource(new String[]{"id"}, null));

       // Execute and return result
   }
   ```

3. **Faceting Implementation:**
   - Tag aggregations with counts
   - Rating range aggregations
   - Author aggregations
   - Custom facet builders

4. **Autocomplete Service:**
   - Suggest-based implementation using completion fields
   - Prefix matching for story titles and author names
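
For the date range filtering deliverable, one way to handle time zones is to convert each bound to epoch milliseconds before it reaches the range query. The sketch below assumes that the `LocalDateTime` values carry no zone of their own and that the application knows which zone they were recorded in; `DateFilterBounds` is a hypothetical helper, not existing code.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

/** Hypothetical helper for date-range filter bounds. */
final class DateFilterBounds {
    private DateFilterBounds() {}

    /** Interprets a zone-less timestamp in the given zone and returns epoch milliseconds. */
    static long toEpochMillis(LocalDateTime value, ZoneId zone) {
        return value.atZone(zone).toInstant().toEpochMilli();
    }
}
```

Epoch milliseconds are accepted by `date` fields under the default mapping format, so a bound such as `createdAfter` could be passed as `toEpochMillis(createdAfter, zone)` without relying on string formatting.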

**Success Criteria:**
- All filter conditions working correctly
- Random story selection reliable with seed support
- Faceting returns accurate counts
- Autocomplete responsive and accurate

### Phase 4: Data Migration & Parallel Operation (Week 4)

**Objectives:**
- Implement bulk data migration from database
- Enable parallel operation (write to both systems)
- Comprehensive testing of OpenSearch functionality

**Deliverables:**
1. **Migration Service:**
   ```java
   @Service
   public class SearchMigrationService {
       public void performFullMigration() {
           // Migrate all libraries
           List<Library> libraries = libraryService.findAll();
           for (Library library : libraries) {
               migrateLibraryData(library);
           }
       }

       private void migrateLibraryData(Library library) {
           // Create indexes for library
           // Bulk load stories, authors, collections
           // Verify data integrity
       }
   }
   ```

2. **Dual-Write Implementation:**
   - Modify all entity update operations to write to both systems
   - Add configuration flag for dual-write mode
   - Error handling for partial failures

3. **Data Validation Tools:**
   - Compare search result counts between systems
   - Validate random story selection consistency
   - Check faceting accuracy
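
The dual-write deliverable with partial-failure handling could look roughly like the following. This is a sketch under assumptions: `DualWriter` is a hypothetical class, the primary system (Typesense during the migration) is treated as authoritative, and failed secondary writes are simply remembered in memory for a later re-sync. A real implementation would likely persist that backlog and make it thread-safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical dual-write wrapper; not thread-safe, for illustration only. */
final class DualWriter<T> {
    private final Consumer<T> primary;
    private final Consumer<T> secondary;
    private final List<T> failedSecondaryWrites = new ArrayList<>();

    DualWriter(Consumer<T> primary, Consumer<T> secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    void write(T document) {
        // A primary failure propagates to the caller: the write did not happen
        primary.accept(document);
        try {
            secondary.accept(document);
        } catch (RuntimeException e) {
            // A secondary failure must not break the user-facing operation
            failedSecondaryWrites.add(document);
        }
    }

    /** Documents that reached the primary but still need to be re-sent to the secondary. */
    List<T> pendingResync() {
        return List.copyOf(failedSecondaryWrites);
    }
}
```

The design choice here is asymmetry: users only ever see errors from the system that currently serves reads, while the other system is allowed to lag and be repaired.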

**Success Criteria:**
- Complete data migration with 100% accuracy
- Dual-write operations working without errors
- Search result parity between systems verified

### Phase 5: API Integration & Testing (Week 5)

**Objectives:**
- Update controller endpoints to use OpenSearch
- Comprehensive integration testing
- Performance testing and optimization

**Deliverables:**
1. **Controller Updates:**
   - Modify controllers to use SearchServiceAdapter
   - Add migration controls for gradual rollout
   - Implement A/B testing capability

2. **Integration Tests:**
   ```java
   @SpringBootTest
   @TestMethodOrder(OrderAnnotation.class)
   public class OpenSearchIntegrationTest {
       @Test
       @Order(1)
       void testBasicSearch() {
           // Test basic story search functionality
       }

       @Test
       @Order(2)
       void testComplexFiltering() {
           // Test all 15+ filter conditions
       }

       @Test
       @Order(3)
       void testRandomStory() {
           // Test random story with and without seed
       }

       @Test
       @Order(4)
       void testFaceting() {
           // Test aggregation accuracy
       }
   }
   ```

3. **Performance Testing:**
   - Load testing with realistic data volumes
   - Query performance benchmarking
   - Memory usage monitoring

**Success Criteria:**
- All integration tests passing
- Performance meets or exceeds Typesense baseline
- Memory usage within acceptable limits (< 2GB)

### Phase 6: Production Rollout & Monitoring (Week 6)

**Objectives:**
- Production deployment with feature flags
- Gradual user migration with monitoring
- Rollback capability testing

**Deliverables:**
1. **Feature Flag Implementation:**
   ```java
   @Component
   public class SearchFeatureFlags {
       // Property names match the STORYCOVE_OPENSEARCH_* environment variables
       @Value("${storycove.opensearch.enabled:false}")
       private boolean openSearchEnabled;

       @Value("${storycove.opensearch.rollout-percentage:0}")
       private int rolloutPercentage;

       public boolean shouldUseOpenSearch(String userId) {
           if (!openSearchEnabled) return false;
           // hashCode() can be negative; floorMod keeps the bucket in [0, 100)
           return Math.floorMod(userId.hashCode(), 100) < rolloutPercentage;
       }
   }
   ```

2. **Monitoring & Alerting:**
   - Query performance metrics
   - Error rate monitoring
   - Search result accuracy validation
   - User experience metrics

3. **Rollback Procedures:**
   - Capability for immediate rollback to Typesense
   - Data consistency verification
   - Performance rollback triggers
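
The percentage-based rollout in `SearchFeatureFlags` depends on each user landing in a stable bucket between 0 and 99. The standalone sketch below isolates that bucketing so it can be tested without Spring; `RolloutBucket` is a hypothetical name. It uses `Math.floorMod` because the `%` operator returns a negative remainder for negative hash codes, which would put those users below every positive threshold.

```java
/** Hypothetical helper: deterministic percentage bucketing by user ID. */
final class RolloutBucket {
    private RolloutBucket() {}

    /** Maps a user ID to a stable bucket in [0, 100). */
    static int bucketOf(String userId) {
        // floorMod keeps the result non-negative even when hashCode() is negative
        return Math.floorMod(userId.hashCode(), 100);
    }

    /** True when the user falls inside the first rolloutPercentage buckets. */
    static boolean inRollout(String userId, int rolloutPercentage) {
        return bucketOf(userId) < rolloutPercentage;
    }
}
```

Because the bucket depends only on the user ID, raising the percentage from 10 to 25 keeps every user who was already on OpenSearch there and only adds new ones.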

**Success Criteria:**
- Successful production deployment
- Zero user-facing issues during rollout
- Monitoring showing improved performance
- Rollback procedures validated

### Phase 7: Cleanup & Documentation (Week 7)

**Objectives:**
- Remove Typesense dependencies
- Update documentation
- Performance optimization

**Deliverables:**
1. **Code Cleanup:**
   - Remove TypesenseService and related classes
   - Clean up Docker Compose configuration
   - Remove unused dependencies

2. **Documentation Updates:**
   - Update deployment documentation
   - Search API documentation
   - Troubleshooting guides

3. **Performance Tuning:**
   - Index optimization
   - Query performance tuning
   - Resource allocation optimization

**Success Criteria:**
- Typesense completely removed
- Documentation up to date
- Optimized performance in production

---

## Data Migration Strategy

### Pre-Migration Validation

**Data Integrity Checks:**
1. Count validation: Ensure all stories/authors/collections are present
2. Field validation: Verify all required fields are populated
3. Relationship validation: Check author-story and series-story relationships
4. Library separation: Ensure proper multi-library data isolation

**Migration Process:**

1. **Index Creation:**
   ```java
   // Create indexes with proper mappings for each library
   for (Library library : libraries) {
       String storiesIndex = "stories-" + library.getId();
       createIndexWithMapping(storiesIndex, getStoriesMapping());
       createIndexWithMapping("authors-" + library.getId(), getAuthorsMapping());
       createIndexWithMapping("collections-" + library.getId(), getCollectionsMapping());
   }
   ```

2. **Bulk Data Loading:**
   ```java
   // Index in batches to bound the size of each bulk request.
   // Note: findByLibraryId still loads every story into memory at once;
   // page the database query as well if libraries are large.
   int batchSize = 1000;
   List<Story> allStories = storyService.findByLibraryId(libraryId);

   for (int i = 0; i < allStories.size(); i += batchSize) {
       List<Story> batch = allStories.subList(i, Math.min(i + batchSize, allStories.size()));
       List<StoryDocument> documents = batch.stream()
           .map(this::convertToSearchDocument)
           .collect(Collectors.toList());

       bulkIndexStories(documents, "stories-" + libraryId);
   }
   ```

3. **Post-Migration Validation:**
   - Count comparison between database and OpenSearch
   - Spot-check random records for field accuracy
   - Test search functionality with known queries
   - Verify faceting counts match expected values
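
A count comparison only shows that something is wrong, not which records are affected. As a complement, the sketch below diffs the two ID sets in both directions. `MigrationCheck` is a hypothetical helper, and it assumes the IDs from the database and from the index have already been collected into sets, which is reasonable for library-sized data but not for very large indexes.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical post-migration check based on ID set differences. */
final class MigrationCheck {
    private MigrationCheck() {}

    /** IDs present in the database but absent from the search index. */
    static Set<String> missingFromIndex(Set<String> databaseIds, Set<String> indexedIds) {
        Set<String> missing = new TreeSet<>(databaseIds);
        missing.removeAll(indexedIds);
        return missing;
    }

    /** IDs present in the search index but no longer in the database (stale documents). */
    static Set<String> staleInIndex(Set<String> databaseIds, Set<String> indexedIds) {
        Set<String> stale = new TreeSet<>(indexedIds);
        stale.removeAll(databaseIds);
        return stale;
    }
}
```

Both results being empty is a stronger signal than matching counts, since one missing and one stale document would otherwise cancel out.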

### Rollback Strategy

**Immediate Rollback Triggers:**
- Search error rate > 1%
- Query performance degradation > 50%
- Data inconsistency detected
- Memory usage > 4GB sustained
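
These triggers can be encoded as a single predicate that a monitoring job evaluates. The sketch below is one possible reading, not a definitive rule set: `RollbackTriggers` is hypothetical, "performance degradation > 50%" is interpreted as p95 latency exceeding 1.5 times the Typesense baseline, and the caller is assumed to pass a memory figure that has already been averaged over a sustained window.

```java
/** Hypothetical evaluation of the rollback triggers listed above. */
final class RollbackTriggers {
    private RollbackTriggers() {}

    static boolean shouldRollBack(double errorRate,
                                  double baselineP95Millis,
                                  double currentP95Millis,
                                  double sustainedMemoryGb,
                                  boolean inconsistencyDetected) {
        boolean tooManyErrors = errorRate > 0.01;                          // > 1% of searches fail
        boolean tooSlow = currentP95Millis > baselineP95Millis * 1.5;      // > 50% slower than baseline
        boolean tooMuchMemory = sustainedMemoryGb > 4.0;                   // > 4GB sustained
        return tooManyErrors || tooSlow || tooMuchMemory || inconsistencyDetected;
    }
}
```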

**Rollback Process:**
1. Update feature flag to disable OpenSearch
2. Verify Typesense is still operational
3. Clear OpenSearch indexes to free resources
4. Investigate and document issues

**Data Consistency During Rollback:**
- Continue dual-write during investigation
- Re-sync any missed updates to OpenSearch
- Validate data integrity before retry

---

## Testing Strategy

### Unit Tests

**OpenSearchService Unit Tests:**
```java
@ExtendWith(MockitoExtension.class)
class OpenSearchServiceTest {
    @Mock private OpenSearchClient client;
    @InjectMocks private OpenSearchService service;

    @Test
    void testSearchStoriesBasicQuery() {
        // Mock OpenSearch response
        // Test basic search functionality
    }

    @Test
    void testComplexFilterQuery() {
        // Test complex boolean query building
    }

    @Test
    void testRandomStorySelection() {
        // Test random query with seed
    }
}
```

**Query Builder Tests:**
- Test all 15+ filter conditions
- Validate query structure and parameters
- Test edge cases and null handling

### Integration Tests

**Full Search Integration:**
```java
@SpringBootTest
@Testcontainers
class OpenSearchIntegrationTest {
    @Container
    static OpenSearchContainer opensearch = new OpenSearchContainer("opensearchproject/opensearch:2.11.0");

    @Test
    void testEndToEndStorySearch() {
        // Insert test data
        // Perform search via controller
        // Validate results
    }
}
```

### Performance Tests

**Load Testing Scenarios:**
1. **Concurrent Search Load:**
   - 50 concurrent users performing searches
   - Mixed query complexity
   - Duration: 10 minutes

2. **Bulk Indexing Performance:**
   - Index 10,000 stories in batches
   - Measure throughput and memory usage

3. **Random Query Performance:**
   - 1000 random story requests with different seeds
   - Compare with Typesense baseline

### Acceptance Tests

**Functional Requirements:**
- All existing search functionality preserved
- Random story selection more reliable than with Typesense
- Faceting accuracy maintained
- Multi-library separation working

**Performance Requirements:**
- Search response time < 100ms for 95th percentile
- Random story selection < 50ms
- Index update operations < 10ms
- Memory usage < 2GB in production
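
Checking the 95th-percentile requirement needs an agreed definition of the percentile. The sketch below uses the nearest-rank method on raw latency samples; `LatencyPercentile` is a hypothetical helper, and load-testing tools may use a slightly different interpolation, so results near the threshold should be compared using the same method on both systems.

```java
import java.util.Arrays;

/** Hypothetical helper: nearest-rank percentile over latency samples. */
final class LatencyPercentile {
    private LatencyPercentile() {}

    /** Returns the smallest sample such that at least p percent of samples are <= it. */
    static long percentile(long[] samplesMillis, double p) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samplesMillis.clone();
        Arrays.sort(sorted);
        // Nearest-rank: 1-based rank = ceil(p / 100 * n)
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }
}
```

A benchmark run would then pass when `percentile(searchLatencies, 95) < 100`.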

---

## Risk Analysis & Mitigation

### Technical Risks

**Risk: OpenSearch Memory Usage**
- *Probability: Medium*
- *Impact: High*
- *Mitigation: Resource monitoring, index optimization, container limits*

**Risk: Query Performance Regression**
- *Probability: Low*
- *Impact: High*
- *Mitigation: Performance testing, query optimization, caching layer*

**Risk: Data Migration Accuracy**
- *Probability: Low*
- *Impact: Critical*
- *Mitigation: Comprehensive validation, dual-write verification, rollback procedures*

**Risk: Complex Filter Compatibility**
- *Probability: Medium*
- *Impact: Medium*
- *Mitigation: Extensive testing, gradual rollout, feature flags*

### Operational Risks

**Risk: Production Deployment Issues**
- *Probability: Medium*
- *Impact: High*
- *Mitigation: Staging environment testing, gradual rollout, immediate rollback capability*

**Risk: Team Learning Curve**
- *Probability: High*
- *Impact: Low*
- *Mitigation: Documentation, training, gradual responsibility transfer*

### Business Continuity

**Zero-Downtime Requirements:**
- Maintain Typesense during entire migration
- Feature flag-based switching
- Immediate rollback capability
- Health monitoring with automated alerts

---

## Success Criteria

### Functional Requirements ✅
- [ ] All search functionality migrated successfully
- [ ] Random story selection working reliably with seeds
- [ ] Complex filtering (15+ conditions) working accurately
- [ ] Faceting/aggregation results match expected values
- [ ] Multi-library support maintained
- [ ] Autocomplete functionality preserved

### Performance Requirements ✅
- [ ] Search response time ≤ 100ms (95th percentile)
- [ ] Random story selection ≤ 50ms
- [ ] Index operations ≤ 10ms
- [ ] Memory usage ≤ 2GB sustained
- [ ] Zero search downtime during migration

### Technical Requirements ✅
- [ ] Code quality maintained (test coverage ≥ 80%)
- [ ] Documentation updated and comprehensive
- [ ] Monitoring and alerting implemented
- [ ] Rollback procedures tested and validated
- [ ] Typesense dependencies cleanly removed

---

## Timeline Summary

| Phase | Duration | Key Deliverables | Risk Level |
|-------|----------|------------------|------------|
| 1. Infrastructure | 1 week | Docker setup, basic service | Low |
| 2. Core Service | 1 week | Basic search operations | Medium |
| 3. Advanced Features | 1 week | Complex filtering, random queries | High |
| 4. Data Migration | 1 week | Full data migration, dual-write | High |
| 5. API Integration | 1 week | Controller updates, testing | Medium |
| 6. Production Rollout | 1 week | Gradual deployment, monitoring | High |
| 7. Cleanup | 1 week | Remove Typesense, documentation | Low |

**Total Estimated Duration: 7 weeks**

---

## Configuration Management

### Environment Variables

```bash
# OpenSearch Configuration
OPENSEARCH_HOST=opensearch
OPENSEARCH_PORT=9200
OPENSEARCH_USERNAME=admin
OPENSEARCH_PASSWORD=${OPENSEARCH_PASSWORD}

# Feature Flags
STORYCOVE_OPENSEARCH_ENABLED=true
STORYCOVE_SEARCH_PROVIDER=opensearch
STORYCOVE_SEARCH_DUAL_WRITE=true
STORYCOVE_OPENSEARCH_ROLLOUT_PERCENTAGE=100

# Performance Tuning
OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx2g
STORYCOVE_SEARCH_BATCH_SIZE=1000
STORYCOVE_SEARCH_TIMEOUT=30s
```

### Docker Compose Updates

```yaml
# Add to docker-compose.yml
services:
  opensearch:
    image: opensearchproject/opensearch:2.11.0
    environment:
      - discovery.type=single-node
      - DISABLE_SECURITY_PLUGIN=true
      - OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx2g
    volumes:
      - opensearch_data:/usr/share/opensearch/data
    networks:
      - storycove-network

volumes:
  opensearch_data:
```

---

## Conclusion

This specification provides a comprehensive roadmap for migrating StoryCove from Typesense to OpenSearch. The phased approach minimizes risk while delivering improved reliability and performance, particularly for random story queries.

The parallel implementation strategy allows for thorough validation and preserves the ability to roll back if issues arise. Upon successful completion, StoryCove will have a more robust and scalable search infrastructure that better supports its growth and feature requirements.

**Next Steps:**
1. Review and approve this specification
2. Set up development environment with OpenSearch
3. Begin Phase 1 implementation
4. Establish monitoring and success metrics
5. Execute migration according to timeline

---

*Document Version: 1.0*
*Last Updated: 2025-01-17*
*Author: Claude Code Assistant*