storycove/SOLR_LIBRARY_MIGRATION.md
Stefan Hardegger 62f017c4ca solr fix
2025-09-23 13:58:49 +02:00

Solr Library Separation Migration Guide

This guide explains how to migrate existing StoryCove deployments to support proper library separation in Solr search.

What Changed

The Solr service has been enhanced to support multi-tenant library separation by:

  • Adding a libraryId field to all Solr documents
  • Filtering all search queries by the current library context
  • Ensuring complete data isolation between libraries
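To make the filtering idea concrete, here is a minimal sketch of what a library-scoped query could look like when sent to Solr directly. It assumes the filter is expressed as a Solr filter query (`fq`) on `libraryId`; the guide does not say how the application builds its queries, and the host and library ID below are placeholders.

```shell
# Sketch only: the use of fq, the host, and the library ID are assumptions.

# Print a select URL that restricts results to a single library.
# Library IDs containing special characters would need URL-encoding.
library_query_url() {
  local solr_url="$1" core="$2" library_id="$3"
  printf '%s/solr/%s/select?q=*:*&fq=libraryId:%s&rows=5\n' \
    "$solr_url" "$core" "$library_id"
}

# Example (requires a reachable Solr instance):
#   curl -s "$(library_query_url http://your-solr-host:8983 storycove_stories my-library)"
```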

Migration Options

Option 1: Volume Reset

Best for: Development, staging, and Docker-based deployments where losing the existing search index is acceptable.

# Stop the application
docker-compose down

# Remove only the Solr data volume (preserves database and images)
docker volume rm storycove_solr_data

# Restart - Solr will recreate cores with new schema
docker-compose up -d

# Wait for services to start, then trigger reindex via admin panel

Pros: Clean, simple, guaranteed to work

Cons: Requires downtime, loses existing search index
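The "wait for services to start" step can be scripted rather than timed by hand. The sketch below polls Solr's CoreAdmin STATUS endpoint until it responds. The function name, retry count, and delay are illustrative choices, and a response only shows that Solr is answering requests, not that reindexing has finished.

```shell
# Sketch only: wait_for_solr, the retry count, and the delay are illustrative.

# Poll the CoreAdmin STATUS endpoint until it answers or attempts run out.
# Returns 0 once Solr responds, 1 if it never does.
wait_for_solr() {
  local solr_url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    if curl -fsS "$solr_url/solr/admin/cores?action=STATUS" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example:
#   docker-compose up -d
#   wait_for_solr http://localhost:8983 && echo "Solr is up; trigger the reindex now"
```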

Option 2: Schema API Migration (Production Safe)

Best for: Production environments where you need to preserve uptime.

# Add libraryId field to stories core
curl -X POST "http://your-solr-host:8983/solr/storycove_stories/schema" \
  -H "Content-Type: application/json" \
  -d '{
    "add-field": {
      "name": "libraryId",
      "type": "string",
      "indexed": true,
      "stored": true,
      "required": false
    }
  }'

# Add libraryId field to authors core
curl -X POST "http://your-solr-host:8983/solr/storycove_authors/schema" \
  -H "Content-Type: application/json" \
  -d '{
    "add-field": {
      "name": "libraryId",
      "type": "string",
      "indexed": true,
      "stored": true,
      "required": false
    }
  }'

# Then use the admin migration endpoint
curl -X POST "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Pros: No downtime, preserves service availability

Cons: More complex, requires API access
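Since the two `add-field` calls differ only in the core name, they can be wrapped in a small helper. This is a sketch under the same assumptions as the commands above (core names `storycove_stories` and `storycove_authors`, Solr on port 8983); the function names are hypothetical. Solr is expected to reject `add-field` for a field that already exists, so a failure on a second run does not necessarily mean the schema is wrong.

```shell
# Sketch only: function names are hypothetical; the payload mirrors the guide.

# Print the Schema API payload that adds a string field.
add_field_payload() {
  local field_name="$1"
  printf '{"add-field":{"name":"%s","type":"string","indexed":true,"stored":true,"required":false}}\n' "$field_name"
}

# Add libraryId to one core; curl -f makes HTTP errors return non-zero.
add_library_field() {
  local solr_url="$1" core="$2"
  curl -fsS -X POST "$solr_url/solr/$core/schema" \
    -H "Content-Type: application/json" \
    -d "$(add_field_payload libraryId)"
}

# Example:
#   for core in storycove_stories storycove_authors; do
#     add_library_field http://your-solr-host:8983 "$core" || echo "failed: $core" >&2
#   done
```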

Option 3: Admin Endpoint

Best for: Production environments with proper admin access.

  1. Deploy the code changes to your environment
  2. Access the admin panel of your application
  3. Navigate to search settings
  4. Use the "Migrate Library Schema" button or API endpoint:
    POST /api/admin/search/solr/migrate-library-schema
    

Pros: User-friendly, handles all complexity internally

Cons: Requires admin access to application

Step-by-Step Migration Process

For Docker Deployments

  1. Backup your data (optional but recommended):

    # Backup database
    docker-compose exec postgres pg_dump -U storycove storycove > backup.sql
    
  2. Pull the latest code with library separation fixes

  3. Choose migration approach:

    • Quick & Clean: Use Option 1 (volume reset)
    • Production: Use Option 2 or 3
  4. Verify migration:

    • Log in with different library passwords
    • Perform searches to confirm isolation
    • Check that new content gets indexed with library IDs

For Kubernetes/Production Deployments

  1. Update your deployment with the new container images

  2. Add the libraryId field to Solr schema using Option 2

  3. Use the migration endpoint (Option 3):

    # The endpoint requires authentication, as in Option 2
    kubectl exec -it deployment/storycove-backend -- \
      curl -X POST http://localhost:8080/api/admin/search/solr/migrate-library-schema \
      -H "Authorization: Bearer YOUR_JWT_TOKEN"
    
  4. Monitor logs for successful migration

Verification Steps

After migration, verify that library separation is working:

  1. Test with multiple libraries:

    • Log in with Library A password
    • Add/search content
    • Log in with Library B password
    • Confirm Library A content is not visible
  2. Check Solr directly (if accessible):

    # Should show documents with libraryId field
    curl "http://solr:8983/solr/storycove_stories/select?q=*:*&fl=id,title,libraryId&rows=5"
    
  3. Monitor application logs for any library separation errors

Troubleshooting

"unknown field 'libraryId'" Error

This means the Solr schema wasn't updated. Solutions:

  • Use Option 1 (volume reset) for clean restart
  • Use Option 2 (Schema API) to add the field manually
  • Check that schema files contain the libraryId field definition

Migration Endpoint Returns Error

Common causes:

  • Solr is not available (check connectivity)
  • No active library context (ensure user is authenticated)
  • Insufficient permissions (check JWT token/authentication)

Search Results Still Mixed

This indicates incomplete migration:

  • Clear all Solr data and reindex completely
  • Verify that all documents have libraryId field
  • Check that search queries include library filters
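One way to check whether every document has a `libraryId` is to ask Solr how many documents lack the field. The sketch below is an assumption-laden helper, not part of StoryCove: it runs a query that excludes any document with a `libraryId` value and reads `numFound` from the JSON response. A result of 0 suggests the reindex covered every document in that core.

```shell
# Sketch only: helper names are hypothetical; requires grep with -o support.

# Extract the numFound value from a Solr JSON response read on stdin.
extract_num_found() {
  grep -o '"numFound":[[:space:]]*[0-9][0-9]*' | head -n 1 | grep -o '[0-9][0-9]*$'
}

# Count documents in a core that have no libraryId value.
count_missing_library_id() {
  local solr_url="$1" core="$2"
  curl -fsS --get "$solr_url/solr/$core/select" \
    --data-urlencode 'q=*:* -libraryId:[* TO *]' \
    --data-urlencode 'rows=0' \
    --data-urlencode 'wt=json' | extract_num_found
}

# Example:
#   count_missing_library_id http://solr:8983 storycove_stories
```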

Environment-Specific Notes

Development

  • Use Option 1 (volume reset) for simplicity
  • Data loss is acceptable in dev environments

Staging

  • Use Option 2 or 3 to test production migration procedures
  • Verify migration process before applying to production

Production

  • Always backup data first
  • Use Option 2 (Schema API) or Option 3 (Admin endpoint)
  • Plan for brief performance impact during reindexing
  • Monitor system resources during bulk reindexing

Performance Considerations

  • Reindexing time: Depends on data size (roughly 1000 docs/second is typical; actual throughput varies with hardware and document size)
  • Memory usage: May increase during bulk indexing
  • Search performance: Minimal impact from library filtering
  • Storage: Slight increase due to libraryId field
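For planning a maintenance window, the throughput figure above turns into a rough duration estimate. The helper below is only arithmetic: it divides the document count by an assumed rate (defaulting to the 1000 docs/second quoted above) and rounds up. Measure the real rate on your own hardware before relying on it.

```shell
# Sketch only: the default rate is the guide's ballpark, not a measurement.

# Estimate reindex duration in whole seconds, rounding up.
estimate_reindex_seconds() {
  local doc_count="$1" docs_per_second="${2:-1000}"
  echo $(( (doc_count + docs_per_second - 1) / docs_per_second ))
}

# Example:
#   estimate_reindex_seconds 250000        # 250000 docs at the default rate
#   estimate_reindex_seconds 250000 400    # same data at a slower assumed rate
```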

Rollback Plan

If issues occur:

  1. Immediate: Restart Solr. If you used Option 1, the removed volume cannot be recovered, so rebuild the index by reindexing
  2. Schema revert: Remove libraryId field via Schema API
  3. Code rollback: Deploy previous version without library separation
  4. Data restore: Restore from backup if necessary
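The schema revert in step 2 can use the Schema API's `delete-field` command, the counterpart of the `add-field` call in Option 2. This is a sketch with hypothetical function names; run it only after rolling back the application code, and expect to reindex afterwards, since documents already in the index were written with the field.

```shell
# Sketch only: function names are hypothetical; host and cores are placeholders.

# Print the Schema API payload that removes a field.
delete_field_payload() {
  local field_name="$1"
  printf '{"delete-field":{"name":"%s"}}\n' "$field_name"
}

# Remove libraryId from one core; curl -f makes HTTP errors return non-zero.
remove_library_field() {
  local solr_url="$1" core="$2"
  curl -fsS -X POST "$solr_url/solr/$core/schema" \
    -H "Content-Type: application/json" \
    -d "$(delete_field_payload libraryId)"
}

# Example:
#   for core in storycove_stories storycove_authors; do
#     remove_library_field http://your-solr-host:8983 "$core" || echo "failed: $core" >&2
#   done
```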

This migration enables proper multi-tenant isolation while maintaining search performance and functionality.