shardegger/storycove

Stefan Hardegger 6ee2d67027 solr migration button

2025-09-23 14:42:38 +02:00

7.3 KiB

Solr Library Separation Migration Guide

This guide explains how to migrate existing StoryCove deployments to support proper library separation in Solr search.

What Changed

The Solr service has been enhanced to support multi-tenant library separation by:

Adding a libraryId field to all Solr documents
Filtering all search queries by the current library context
Ensuring complete data isolation between libraries
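To make the filtering concrete, here is a minimal sketch of what a library-scoped query could look like when sent to Solr directly. The function name `search_library` and the example library ID are hypothetical, and the use of a filter query (`fq`) is an assumption: this guide does not show how the application actually builds its queries.

```shell
# Sketch: query one core, restricted to a single library via a filter query.
# Assumption: the application filters on libraryId in a comparable way.
search_library() {
  local solr_core_url="$1"   # e.g. http://your-solr-host:8983/solr/storycove_stories
  local library_id="$2"      # hypothetical library identifier
  local query="$3"           # search terms, or *:* to match everything

  # -G sends the url-encoded parameters as a GET query string.
  curl -sG "$solr_core_url/select" \
    --data-urlencode "q=$query" \
    --data-urlencode "fq=libraryId:\"$library_id\"" \
    --data-urlencode "fl=id,title,libraryId"
}

# Example: search_library "http://your-solr-host:8983/solr/storycove_stories" "lib-a" "*:*"
```

Documents indexed before the migration carry no libraryId value, so a filter of this kind would exclude them until they are reindexed.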

Migration Options

Option 1: Docker Volume Reset (Recommended for Docker)

Best for: Development, staging, and Docker-based deployments where brief downtime and loss of the existing search index (rebuilt by reindexing) are acceptable.

# Stop the application
docker-compose down

# Remove only the Solr data volume (preserves database and images)
# The volume name depends on the Compose project name; confirm with `docker volume ls`
docker volume rm storycove_solr_data

# Restart - Solr will recreate cores with new schema
docker-compose up -d

# Wait for services to start, then trigger reindex via admin panel

Pros: Clean, simple, guaranteed to work
Cons: Requires downtime, loses existing search index
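The "wait for services to start" step can be scripted instead of guessed at. The sketch below polls a URL until it responds successfully. The function name `wait_for_http` is hypothetical, and the example assumes Solr's port 8983 is published on localhost, which may not match your Compose file.

```shell
# Sketch: poll a URL until it answers successfully or attempts run out.
wait_for_http() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl return non-zero on HTTP errors (4xx/5xx).
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (assumes Solr is reachable on localhost:8983):
#   wait_for_http "http://localhost:8983/solr/admin/cores?action=STATUS" \
#     && echo "Solr is up; trigger the reindex from the admin panel"
```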

Option 2: Schema API Migration (Production Safe)

Best for: Production environments where you need to preserve uptime.

Method A: Automatic (Recommended)

# Single endpoint that adds field and migrates data
curl -X POST "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Method B: Manual Steps

# Step 1: Add libraryId field via app API
curl -X POST "http://your-app-host/api/admin/search/solr/add-library-field" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

# Step 2: Run migration
curl -X POST "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Method C: Direct Solr API (if app API fails)

# Add libraryId field to stories core
curl -X POST "http://your-solr-host:8983/solr/storycove_stories/schema" \
  -H "Content-Type: application/json" \
  -d '{
    "add-field": {
      "name": "libraryId",
      "type": "string",
      "indexed": true,
      "stored": true,
      "required": false
    }
  }'

# Add libraryId field to authors core
curl -X POST "http://your-solr-host:8983/solr/storycove_authors/schema" \
  -H "Content-Type: application/json" \
  -d '{
    "add-field": {
      "name": "libraryId",
      "type": "string",
      "indexed": true,
      "stored": true,
      "required": false
    }
  }'

# Then run the migration
curl -X POST "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Pros: No downtime, preserves service availability, automatic field addition
Cons: Requires API access
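When running Method B from a script, it helps to stop if the first step fails rather than starting a migration against a schema that lacks the field. A minimal sketch, with hypothetical helper names `post_admin` and `migrate_manually`; the endpoints are the ones listed above:

```shell
# Sketch: run Method B as one unit, stopping if the field cannot be added.
post_admin() {
  local app_url="$1" endpoint="$2" token="$3"
  # -f: treat HTTP 4xx/5xx as failure; -sS: quiet, but still print errors.
  curl -fsS -X POST "$app_url$endpoint" -H "Authorization: Bearer $token"
}

migrate_manually() {
  local app_url="$1" token="$2"
  post_admin "$app_url" "/api/admin/search/solr/add-library-field" "$token" || {
    echo "add-library-field failed; migration not started" >&2
    return 1
  }
  post_admin "$app_url" "/api/admin/search/solr/migrate-library-schema" "$token"
}

# Example: migrate_manually "http://your-app-host" "YOUR_JWT_TOKEN"
```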

Option 3: Application-Level Migration (Recommended for Production)

Best for: Production environments with proper admin access.

Deploy the code changes to your environment
Access the admin panel of your application
Navigate to search settings
Use the "Migrate Library Schema" button or API endpoint:
```
POST /api/admin/search/solr/migrate-library-schema
```

Pros: User-friendly, handles all complexity internally
Cons: Requires admin access to application

Step-by-Step Migration Process

For Docker Deployments

Backup your data (optional but recommended):

# Backup database (-T disables TTY allocation so the redirected dump stays clean)
docker-compose exec -T postgres pg_dump -U storycove storycove > backup.sql

Pull the latest code with library separation fixes
Choose migration approach:
- Quick & Clean: Use Option 1 (volume reset)
- Production: Use Option 2 or 3
Verify migration:
- Log in with different library passwords
- Perform searches to confirm isolation
- Check that new content gets indexed with library IDs

For Kubernetes/Production Deployments

Update your deployment with the new container images
Add the libraryId field to the Solr schema using Option 2

Use the migration endpoint (Option 3):

kubectl exec -it deployment/storycove-backend -- \
  curl -X POST http://localhost:8080/api/admin/search/solr/migrate-library-schema \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Monitor logs for successful migration
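One way to monitor the logs is to narrow them to migration-related lines. This is a sketch only: the function name `filter_migration_logs` is hypothetical, and the keywords are guesses, because the application's actual log messages are not documented here.

```shell
# Sketch: keep only log lines that mention Solr, migration, or libraryId.
# Assumption: relevant log messages contain at least one of these words.
filter_migration_logs() {
  grep -iE 'solr|migrat|libraryid'
}

# Example:
#   kubectl logs deployment/storycove-backend --since=15m | filter_migration_logs
```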

Verification Steps

After migration, verify that library separation is working:

Test with multiple libraries:
- Log in with Library A password
- Add/search content
- Log in with Library B password
- Confirm Library A content is not visible

Check Solr directly (if accessible):

# Should show documents with libraryId field
curl "http://solr:8983/solr/storycove_stories/select?q=*:*&fl=id,title,libraryId&rows=5"

Monitor application logs for any library separation errors
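Beyond sampling a few documents, a facet query gives per-library document counts in one request. The sketch below uses Solr's standard faceting parameters; the function name `count_by_library` is hypothetical.

```shell
# Sketch: ask Solr for document counts grouped by libraryId.
# facet.missing=true adds a count of documents that have no libraryId at all.
count_by_library() {
  local solr_core_url="$1"   # e.g. http://solr:8983/solr/storycove_stories
  curl -sG "$solr_core_url/select" \
    --data-urlencode "q=*:*" \
    --data-urlencode "rows=0" \
    --data-urlencode "facet=true" \
    --data-urlencode "facet.field=libraryId" \
    --data-urlencode "facet.missing=true"
}

# Example: count_by_library "http://solr:8983/solr/storycove_stories"
```

After a complete migration, the count of documents without a libraryId would be expected to be zero.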

Troubleshooting

"unknown field 'libraryId'" Error

Problem: ERROR: [doc=xxx] unknown field 'libraryId'

Cause: The Solr schema doesn't have the libraryId field yet.
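To confirm this cause before changing anything, the schema can be queried for the field. This sketch relies on Solr's Schema API (`GET .../schema/fields/<name>`), which responds with an HTTP error when the field is not defined; the function names are hypothetical.

```shell
# Sketch: report whether each core's schema already defines libraryId.
has_library_field() {
  local solr_url="$1" core="$2"
  # -f turns an HTTP error (field not defined) into a non-zero exit status.
  curl -fsS -o /dev/null "$solr_url/solr/$core/schema/fields/libraryId" 2>/dev/null
}

report_library_field() {
  local solr_url="$1" core
  for core in storycove_stories storycove_authors; do
    if has_library_field "$solr_url" "$core"; then
      echo "$core: libraryId present"
    else
      echo "$core: libraryId missing"
    fi
  done
}

# Example: report_library_field "http://your-solr-host:8983"
```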

Solutions:

Use the automated migration (adds field automatically):

curl -X POST "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Add field manually first:

# Add field via app API
curl -X POST "http://your-app-host/api/admin/search/solr/add-library-field" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

# Then run migration
curl -X POST "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Direct Solr API (if app API fails):

# Add to both cores
curl -X POST "http://solr:8983/solr/storycove_stories/schema" \
  -H "Content-Type: application/json" \
  -d '{"add-field":{"name":"libraryId","type":"string","indexed":true,"stored":true}}'

curl -X POST "http://solr:8983/solr/storycove_authors/schema" \
  -H "Content-Type: application/json" \
  -d '{"add-field":{"name":"libraryId","type":"string","indexed":true,"stored":true}}'

For development: Use Option 1 (volume reset) for a clean restart

Migration Endpoint Returns Error

Common causes:

Solr is not available (check connectivity)
No active library context (ensure user is authenticated)
Insufficient permissions (check JWT token/authentication)
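The HTTP status code usually narrows down which of these causes applies. The sketch below captures the status of the migration call and maps it to a likely cause using general HTTP semantics. The mapping is an assumption, since the status codes the application actually returns are not documented here, and both function names are hypothetical.

```shell
# Sketch: capture the HTTP status of the migration request.
migration_status_code() {
  local app_url="$1" token="$2"
  # -w '%{http_code}' prints only the status; curl reports 000 if no response arrived.
  curl -s -o /dev/null -w '%{http_code}' -X POST \
    "$app_url/api/admin/search/solr/migrate-library-schema" \
    -H "Authorization: Bearer $token"
}

# Sketch: map a status code to a likely cause (generic HTTP meaning only).
explain_status() {
  case "$1" in
    2??) echo "request accepted" ;;
    401|403) echo "authentication or permission problem: check the JWT" ;;
    000) echo "no HTTP response: check host, port, and network" ;;
    5??) echo "server-side failure: check application logs and Solr connectivity" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# Example:
#   explain_status "$(migration_status_code "http://your-app-host" "YOUR_JWT_TOKEN")"
```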

Search Results Still Mixed

This indicates incomplete migration:

Clear all Solr data and reindex completely
Verify that all documents have a libraryId field
Check that search queries include library filters
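To check the second point, Solr can count the documents that still lack a libraryId. The sketch below uses a negated range filter and extracts `numFound` with `sed`, on the assumption that the response is Solr's default JSON format; the function name is hypothetical.

```shell
# Sketch: count documents in a core that have no libraryId value.
count_missing_library_id() {
  local solr_core_url="$1"   # e.g. http://solr:8983/solr/storycove_stories
  # fq=-libraryId:[* TO *] keeps only documents where the field is absent.
  curl -sG "$solr_core_url/select" \
    --data-urlencode "q=*:*" \
    --data-urlencode "fq=-libraryId:[* TO *]" \
    --data-urlencode "rows=0" |
    sed -n 's/.*"numFound":[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example: count_missing_library_id "http://solr:8983/solr/storycove_stories"
```

Any result other than 0 suggests that some documents were indexed before the migration and still need reindexing.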

Environment-Specific Notes

Development

Use Option 1 (volume reset) for simplicity
Data loss is acceptable in dev environments

Staging

Use Option 2 or 3 to test production migration procedures
Verify migration process before applying to production

Production

Always backup data first
Use Option 2 (Schema API) or Option 3 (Admin endpoint)
Plan for brief performance impact during reindexing
Monitor system resources during bulk reindexing

Performance Considerations

Reindexing time: Scales with the number of documents (typical throughput is about 1000 docs/second)
Memory usage: May increase during bulk indexing
Search performance: Minimal impact from library filtering
Storage: Slight increase due to libraryId field
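For planning, the reindexing figure translates into a rough duration estimate. This sketch just divides the document count by the throughput and rounds up. The function name is hypothetical, and the default rate is the typical figure quoted above, not a measurement of your system.

```shell
# Sketch: estimate reindex duration in seconds from a document count.
estimate_reindex_seconds() {
  local docs="$1" rate="${2:-1000}"   # rate in docs/second
  # Integer ceiling division: (docs + rate - 1) / rate
  echo $(( ($docs + $rate - 1) / $rate ))
}

# Example: estimate_reindex_seconds 250000
```

With the default rate, 250000 documents come out at 250 seconds, a little over four minutes.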

Rollback Plan

If issues occur:

Immediate: Restart Solr to return it to its previous state (if using Option 1)
Schema revert: Remove libraryId field via Schema API
Code rollback: Deploy previous version without library separation
Data restore: Restore from backup if necessary
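The schema revert can be done with the Schema API's `delete-field` command, the counterpart of the `add-field` calls used during migration. A minimal sketch with a hypothetical function name follows. It assumes nothing else in the schema (for example a copyField rule) references libraryId, and a full reindex afterward is generally advisable.

```shell
# Sketch: remove the libraryId field from one core's schema.
remove_library_field() {
  local solr_url="$1" core="$2"
  # -f makes curl fail on HTTP errors, e.g. if Solr rejects the change.
  curl -fsS -X POST "$solr_url/solr/$core/schema" \
    -H "Content-Type: application/json" \
    -d '{"delete-field": {"name": "libraryId"}}'
}

# Example:
#   remove_library_field "http://your-solr-host:8983" storycove_stories
#   remove_library_field "http://your-solr-host:8983" storycove_authors
```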

This migration enables proper multi-tenant isolation while maintaining search performance and functionality.
