# Solr Library Separation Migration Guide

This guide explains how to migrate existing StoryCove deployments to support proper library separation in Solr search.

## What Changed

The Solr service has been enhanced to support multi-tenant library separation by:

- Adding a `libraryId` field to all Solr documents
- Filtering all search queries by the current library context
- Ensuring complete data isolation between libraries

## Migration Options

### Option 1: Docker Volume Reset (Recommended for Docker)

**Best for**: Development, staging, and Docker-based deployments where data loss is acceptable.

```bash
# Stop the application
docker-compose down

# Remove only the Solr data volume (preserves database and images)
docker volume rm storycove_solr_data

# Restart - Solr will recreate cores with new schema
docker-compose up -d

# Wait for services to start, then trigger reindex via admin panel
```

**Pros**: Clean, simple, guaranteed to work

**Cons**: Requires downtime, loses existing search index

### Option 2: Schema API Migration (Production Safe)

**Best for**: Production environments where you need to preserve uptime.
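
Before running any of the methods below, it can help to check whether a core already has the `libraryId` field. The following is a minimal sketch, not part of StoryCove: it assumes Solr's Schema API is reachable at the placeholder host used in this guide and that Solr answers `GET /schema/fields/libraryId` with HTTP 200 when the field exists and 404 when it does not. The helper names `field_status` and `check_library_field` are hypothetical.

```bash
# Hypothetical helper: map the HTTP status of
# GET /solr/<core>/schema/fields/libraryId to a readable state.
field_status() {
  case "$1" in
    200) echo "present" ;;
    404) echo "missing" ;;
    *)   echo "unknown" ;;   # e.g. 000 when Solr is unreachable
  esac
}

# Hypothetical helper: query one core and print its field state.
# $1 = Solr base URL (assumed, e.g. http://your-solr-host:8983), $2 = core name
check_library_field() {
  local solr_url="$1" core="$2" code
  code=$(curl -s -o /dev/null -w '%{http_code}' \
    "${solr_url}/solr/${core}/schema/fields/libraryId")
  echo "${core}: $(field_status "$code")"
}
```

Calling `check_library_field "http://your-solr-host:8983" storycove_stories` (and again for `storycove_authors`) prints one line per core. A result of `missing` suggests the field still has to be added before documents can be reindexed with library IDs.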

**Method A: Automatic (Recommended)**

```bash
# Single endpoint that adds field and migrates data
curl -X POST "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
```

**Method B: Manual Steps**

```bash
# Step 1: Add libraryId field via app API
curl -X POST "http://your-app-host/api/admin/search/solr/add-library-field" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

# Step 2: Run migration
curl -X POST "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
```

**Method C: Direct Solr API (if app API fails)**

```bash
# Add libraryId field to stories core
curl -X POST "http://your-solr-host:8983/solr/storycove_stories/schema" \
  -H "Content-Type: application/json" \
  -d '{
    "add-field": {
      "name": "libraryId",
      "type": "string",
      "indexed": true,
      "stored": true,
      "required": false
    }
  }'

# Add libraryId field to authors core
curl -X POST "http://your-solr-host:8983/solr/storycove_authors/schema" \
  -H "Content-Type: application/json" \
  -d '{
    "add-field": {
      "name": "libraryId",
      "type": "string",
      "indexed": true,
      "stored": true,
      "required": false
    }
  }'

# Then run the migration
curl -X POST "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
```

**Pros**: No downtime, preserves service availability, automatic field addition

**Cons**: Requires API access

### Option 3: Application-Level Migration (Recommended for Production)

**Best for**: Production environments with proper admin access.

1. **Deploy the code changes** to your environment
2. **Access the admin panel** of your application
3. **Navigate to search settings**
4. **Use the "Migrate Library Schema" button** or API endpoint:

   ```
   POST /api/admin/search/solr/migrate-library-schema
   ```

**Pros**: User-friendly, handles all complexity internally

**Cons**: Requires admin access to application

## Step-by-Step Migration Process

### For Docker Deployments

1. **Backup your data** (optional but recommended):

   ```bash
   # Backup database (-T disables TTY allocation so the redirected dump stays clean)
   docker-compose exec -T postgres pg_dump -U storycove storycove > backup.sql
   ```

2. **Pull the latest code** with library separation fixes

3. **Choose migration approach**:
   - **Quick & Clean**: Use Option 1 (volume reset)
   - **Production**: Use Option 2 or 3

4. **Verify migration**:
   - Log in with different library passwords
   - Perform searches to confirm isolation
   - Check that new content gets indexed with library IDs

### For Kubernetes/Production Deployments

1. **Update your deployment** with the new container images
2. **Add the libraryId field** to Solr schema using Option 2
3. **Use the migration endpoint** (Option 3):

   ```bash
   kubectl exec -it deployment/storycove-backend -- \
     curl -X POST http://localhost:8080/api/admin/search/solr/migrate-library-schema \
       -H "Authorization: Bearer YOUR_JWT_TOKEN"
   ```

4. **Monitor logs** for successful migration

## Verification Steps

After migration, verify that library separation is working:

1. **Test with multiple libraries**:
   - Log in with Library A password
   - Add/search content
   - Log in with Library B password
   - Confirm Library A content is not visible

2. **Check Solr directly** (if accessible):

   ```bash
   # Should show documents with libraryId field
   curl "http://solr:8983/solr/storycove_stories/select?q=*:*&fl=id,title,libraryId&rows=5"
   ```

3. **Monitor application logs** for any library separation errors

## Troubleshooting

### "unknown field 'libraryId'" Error

**Problem**: `ERROR: [doc=xxx] unknown field 'libraryId'`

**Cause**: The Solr schema doesn't have the libraryId field yet.

**Solutions**:

1. **Use the automated migration** (adds field automatically):

   ```bash
   curl -X POST "http://your-app/api/admin/search/solr/migrate-library-schema" \
     -H "Authorization: Bearer YOUR_JWT_TOKEN"
   ```

2. **Add field manually first**:

   ```bash
   # Add field via app API
   curl -X POST "http://your-app/api/admin/search/solr/add-library-field" \
     -H "Authorization: Bearer YOUR_JWT_TOKEN"

   # Then run migration
   curl -X POST "http://your-app/api/admin/search/solr/migrate-library-schema" \
     -H "Authorization: Bearer YOUR_JWT_TOKEN"
   ```

3. **Direct Solr API** (if app API fails):

   ```bash
   # Add to both cores
   curl -X POST "http://solr:8983/solr/storycove_stories/schema" \
     -H "Content-Type: application/json" \
     -d '{"add-field":{"name":"libraryId","type":"string","indexed":true,"stored":true}}'

   curl -X POST "http://solr:8983/solr/storycove_authors/schema" \
     -H "Content-Type: application/json" \
     -d '{"add-field":{"name":"libraryId","type":"string","indexed":true,"stored":true}}'
   ```

4. **For development**: Use Option 1 (volume reset) for clean restart

### Migration Endpoint Returns Error

Common causes:

- Solr is not available (check connectivity)
- No active library context (ensure user is authenticated)
- Insufficient permissions (check JWT token/authentication)

### Search Results Still Mixed

This indicates incomplete migration:

- Clear all Solr data and reindex completely
- Verify that all documents have libraryId field
- Check that search queries include library filters

## Environment-Specific Notes

### Development

- Use Option 1 (volume reset) for simplicity
- Data loss is acceptable in dev environments

### Staging

- Use Option 2 or 3 to test production migration procedures
- Verify migration process before applying to production

### Production

- **Always backup data first**
- Use Option 2 (Schema API) or Option 3 (Admin endpoint)
- Plan for brief performance impact during reindexing
- Monitor system resources during bulk reindexing

## Performance Considerations

- **Reindexing time**: Depends on data size (typically around 1,000 docs/second)
- **Memory usage**: May increase during bulk indexing
- **Search performance**: Minimal impact from library filtering
- **Storage**: Slight increase due to libraryId field

## Rollback Plan

If issues occur:

1. **Immediate**: Restart Solr to previous state (if using Option 1)
2. **Schema revert**: Remove libraryId field via Schema API
3. **Code rollback**: Deploy previous version without library separation
4. **Data restore**: Restore from backup if necessary

This migration enables proper multi-tenant isolation while maintaining search performance and functionality.
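
One of the checks listed under "Search Results Still Mixed" is that every document carries a `libraryId`. A minimal sketch of that check follows. It assumes Solr is reachable at a placeholder base URL and returns its default JSON response containing a `numFound` value; `extract_num_found` and `count_missing_library_id` are hypothetical helper names, not part of StoryCove or Solr. The filter `-libraryId:[* TO *]` is standard Solr query syntax for documents that have no value in the field.

```bash
# Hypothetical helper: read a Solr JSON response on stdin
# and print the first numFound value it contains.
extract_num_found() {
  grep -oE '"numFound": *[0-9]+' | head -n 1 | grep -oE '[0-9]+$'
}

# Hypothetical helper: count documents in a core that lack libraryId.
# $1 = Solr base URL (assumed, e.g. http://solr:8983), $2 = core name
count_missing_library_id() {
  local solr_url="$1" core="$2"
  # -G sends the URL-encoded parameters as a GET query string;
  # rows=0 asks only for the count, not the documents themselves.
  curl -s -G "${solr_url}/solr/${core}/select" \
    --data-urlencode 'q=*:*' \
    --data-urlencode 'fq=-libraryId:[* TO *]' \
    --data-urlencode 'rows=0' | extract_num_found
}
```

For example, `count_missing_library_id "http://solr:8983" storycove_stories` prints the number of documents without the field. Any value other than 0 suggests the reindex did not cover every document.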