solr fix

2025-09-23 13:58:49 +02:00
parent 857871273d
commit 62f017c4ca
6 changed files with 332 additions and 3 deletions
--- a/SOLR_LIBRARY_MIGRATION.md
+++ b/SOLR_LIBRARY_MIGRATION.md
@@ -0,0 +1,196 @@
+# Solr Library Separation Migration Guide
+
+This guide explains how to migrate existing StoryCove deployments to support proper library separation in Solr search.
+
+## What Changed
+
+The Solr service has been enhanced to support multi-tenant library separation by:
+- Adding a `libraryId` field to all Solr documents
+- Filtering all search queries by the current library context
+- Ensuring complete data isolation between libraries
+
+## Migration Options
+
+### Option 1: Docker Volume Reset (Recommended for Docker)
+
+**Best for**: Development, staging, and Docker-based deployments where downtime and a temporarily empty search index are acceptable.
+
+```bash
+# Stop the application
+docker-compose down
+
+# Remove only the Solr data volume (preserves database and images)
+docker volume rm storycove_solr_data
+
+# Restart - Solr will recreate cores with new schema
+docker-compose up -d
+
+# Wait for services to start, then trigger reindex via admin panel
+```
+
+**Pros**: Clean, simple, guaranteed to work
+**Cons**: Requires downtime, loses existing search index
+
+### Option 2: Schema API Migration (Production Safe)
+
+**Best for**: Production environments where you need to preserve uptime.
+
+```bash
+# Add libraryId field to stories core
+curl -X POST "http://your-solr-host:8983/solr/storycove_stories/schema" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "add-field": {
+      "name": "libraryId",
+      "type": "string",
+      "indexed": true,
+      "stored": true,
+      "required": false
+    }
+  }'
+
+# Add libraryId field to authors core
+curl -X POST "http://your-solr-host:8983/solr/storycove_authors/schema" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "add-field": {
+      "name": "libraryId",
+      "type": "string",
+      "indexed": true,
+      "stored": true,
+      "required": false
+    }
+  }'
+
+# Then use the admin migration endpoint
+curl -X POST "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
+  -H "Authorization: Bearer YOUR_JWT_TOKEN"
+```
+
+**Pros**: No downtime, preserves service availability
+**Cons**: More complex, requires API access
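+
+The two Schema API calls above differ only in the core name, so they can be wrapped in one helper. This is a minimal sketch, not part of StoryCove: `SOLR_URL` and `add_library_field` are illustrative names, and the host is the same placeholder used above.
+
+```bash
+# Base URL of the Solr instance (placeholder host, override via the environment)
+SOLR_URL="${SOLR_URL:-http://your-solr-host:8983/solr}"
+
+# POST the add-field command for libraryId to one core's schema endpoint.
+# --fail makes curl exit non-zero on an HTTP error response.
+add_library_field() {
+  local core="$1"
+  curl --fail -sS -X POST "${SOLR_URL}/${core}/schema" \
+    -H "Content-Type: application/json" \
+    -d '{"add-field": {"name": "libraryId", "type": "string", "indexed": true, "stored": true, "required": false}}'
+}
+```
+
+Usage: `for core in storycove_stories storycove_authors; do add_library_field "$core"; done`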
+
+### Option 3: Application-Level Migration (Recommended for Production)
+
+**Best for**: Production environments with proper admin access.
+
+1. **Deploy the code changes** to your environment
+2. **Access the admin panel** of your application
+3. **Navigate to search settings**
+4. **Use the "Migrate Library Schema" button** or API endpoint:
+   ```
+   POST /api/admin/search/solr/migrate-library-schema
+   ```
+
+**Pros**: User-friendly, handles all complexity internally
+**Cons**: Requires admin access to application
+
+## Step-by-Step Migration Process
+
+### For Docker Deployments
+
+1. **Back up your data** (optional but recommended):
+   ```bash
+   # Back up the database (-T disables TTY allocation so the redirected dump stays clean)
+   docker-compose exec -T postgres pg_dump -U storycove storycove > backup.sql
+   ```
+
+2. **Pull the latest code** with library separation fixes
+
+3. **Choose migration approach**:
+   - **Quick & Clean**: Use Option 1 (volume reset)
+   - **Production**: Use Option 2 or 3
+
+4. **Verify migration**:
+   - Log in with different library passwords
+   - Perform searches to confirm isolation
+   - Check that new content gets indexed with library IDs
+
+### For Kubernetes/Production Deployments
+
+1. **Update your deployment** with the new container images
+
+2. **Add the libraryId field** to Solr schema using Option 2
+
+3. **Use the migration endpoint** (Option 3):
+   ```bash
+   kubectl exec deployment/storycove-backend -- \
+     curl -X POST http://localhost:8080/api/admin/search/solr/migrate-library-schema \
+       -H "Authorization: Bearer YOUR_JWT_TOKEN"
+   ```
+
+4. **Monitor logs** for successful migration
+
+## Verification Steps
+
+After migration, verify that library separation is working:
+
+1. **Test with multiple libraries**:
+   - Log in with Library A password
+   - Add/search content
+   - Log in with Library B password
+   - Confirm Library A content is not visible
+
+2. **Check Solr directly** (if accessible):
+   ```bash
+   # Should show documents with libraryId field
+   curl "http://solr:8983/solr/storycove_stories/select?q=*:*&fl=id,title,libraryId&rows=5"
+   ```
+
+3. **Monitor application logs** for any library separation errors
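+
+The direct Solr check in step 2 can be narrowed to count only documents that have no `libraryId` value, which is one way to spot an incomplete reindex. This is a sketch under assumptions: `SOLR_URL` and `count_missing_library_id` are hypothetical names, and the host is the `solr:8983` address used in step 2.
+
+```bash
+# Base URL of the Solr instance (same host as the check in step 2)
+SOLR_URL="${SOLR_URL:-http://solr:8983/solr}"
+
+# Query one core for documents without a libraryId value.
+# rows=0 asks Solr for the match count (numFound) only, not the documents.
+# --get with --data-urlencode lets curl encode the spaces and brackets.
+count_missing_library_id() {
+  local core="$1"
+  curl --fail -sS --get "${SOLR_URL}/${core}/select" \
+    --data-urlencode 'q=*:* -libraryId:[* TO *]' \
+    --data-urlencode 'rows=0'
+}
+```
+
+Usage: `count_missing_library_id storycove_stories`. A `numFound` of 0 in the response suggests every document in that core carries a library ID.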
+
+## Troubleshooting
+
+### "unknown field 'libraryId'" Error
+
+This means the Solr schema wasn't updated. Solutions:
+- Use Option 1 (volume reset) for clean restart
+- Use Option 2 (Schema API) to add the field manually
+- Check that schema files contain the libraryId field definition
+
+### Migration Endpoint Returns Error
+
+Common causes:
+- Solr is not available (check connectivity)
+- No active library context (ensure user is authenticated)
+- Insufficient permissions (check JWT token/authentication)
+
+### Search Results Still Mixed
+
+This indicates incomplete migration:
+- Clear all Solr data and reindex completely
+- Verify that all documents have libraryId field
+- Check that search queries include library filters
+
+## Environment-Specific Notes
+
+### Development
+- Use Option 1 (volume reset) for simplicity
+- Losing the search index is acceptable in dev environments because it can be rebuilt by reindexing
+
+### Staging
+- Use Option 2 or 3 to test production migration procedures
+- Verify migration process before applying to production
+
+### Production
+- **Always back up data first**
+- Use Option 2 (Schema API) or Option 3 (Admin endpoint)
+- Plan for brief performance impact during reindexing
+- Monitor system resources during bulk reindexing
+
+## Performance Considerations
+
+- **Reindexing time**: Scales with data size; roughly 1000 docs/second is typical, but throughput varies with document size and hardware
+- **Memory usage**: May increase during bulk indexing
+- **Search performance**: Minimal impact from library filtering
+- **Storage**: Slight increase due to libraryId field
+
+## Rollback Plan
+
+If issues occur:
+
+1. **Immediate**: Restart Solr. If you used Option 1, the old index was deleted with the volume, so trigger a full reindex after restarting
+2. **Schema revert**: Remove libraryId field via Schema API
+3. **Code rollback**: Deploy previous version without library separation
+4. **Data restore**: Restore from backup if necessary
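+
+For step 2, the Schema API offers a `delete-field` command as the counterpart of `add-field`. The following is a sketch only: `SOLR_URL` and `remove_library_field` are illustrative names, and the host is the placeholder from Option 2. Solr generally expects a reindex after schema changes, so treat the field removal as one part of the rollback rather than the whole of it.
+
+```bash
+# Base URL of the Solr instance (placeholder host, override via the environment)
+SOLR_URL="${SOLR_URL:-http://your-solr-host:8983/solr}"
+
+# POST the delete-field command for libraryId to one core's schema endpoint.
+remove_library_field() {
+  local core="$1"
+  curl --fail -sS -X POST "${SOLR_URL}/${core}/schema" \
+    -H "Content-Type: application/json" \
+    -d '{"delete-field": {"name": "libraryId"}}'
+}
+```
+
+Usage: `for core in storycove_stories storycove_authors; do remove_library_field "$core"; done`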
+
+This migration enables proper multi-tenant isolation while maintaining search performance and functionality.