
# Solr Library Separation Migration Guide

This guide explains how to migrate existing StoryCove deployments to support proper library separation in Solr search.

## What Changed

The Solr service has been enhanced to support multi-tenant library separation by:
- Adding a `libraryId` field to all Solr documents
- Filtering all search queries by the current library context
- Ensuring complete data isolation between libraries
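
To illustrate what library filtering looks like at the Solr level, the sketch below builds a select URL that restricts results with a filter query (`fq`) on `libraryId`. The helper name, host, and library ID are placeholders, and the ID is assumed to be URL-safe. The application's own query construction is not shown in this guide and may differ.

```bash
# Build a Solr select URL restricted to one library via a filter query (fq).
# Hypothetical helper; arguments: <solr-base-url> <core> <library-id>
library_filter_url() {
  local base="$1" core="$2" library_id="$3"
  # %22 is a URL-encoded double quote, so the ID is matched as an exact value
  printf '%s/solr/%s/select?q=*:*&fq=libraryId:%%22%s%%22&rows=5\n' \
    "$base" "$core" "$library_id"
}

# Example (requires a reachable Solr instance):
#   curl -s "$(library_filter_url http://solr:8983 storycove_stories my-library-id)"
```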

## Migration Options

### Option 1: Docker Volume Reset (Recommended for Docker)

**Best for**: Development, staging, and Docker-based deployments where temporarily losing the search index is acceptable. The database and images are not affected.

```bash
# Stop the application
docker-compose down

# Remove only the Solr data volume (preserves database and images)
# The volume name is prefixed with your Compose project name; confirm it with `docker volume ls`
docker volume rm storycove_solr_data

# Restart - Solr will recreate cores with new schema
docker-compose up -d

# Wait for services to start, then trigger reindex via admin panel
```

- **Pros**: Clean and simple; no in-place schema change is needed
- **Cons**: Requires downtime; search is unavailable until the reindex completes
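
The "wait for services to start" step can be scripted. The following is a minimal sketch that polls a URL until it answers successfully. The helper name is hypothetical, and the example assumes Solr's port 8983 is published on `localhost`, which may not match your Compose file.

```bash
# Poll a URL until it responds successfully or the attempts run out.
# Arguments: <url> [max-attempts, default 30] [delay-seconds, default 2]
wait_for_url() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -s silences progress output
    if curl -sf -o /dev/null "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s)" >&2
  return 1
}

# Example, using Solr's CoreAdmin STATUS action as the readiness probe:
#   wait_for_url "http://localhost:8983/solr/admin/cores?action=STATUS"
```

Once the function returns successfully, trigger the reindex from the admin panel as described above.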

### Option 2: Schema API Migration (Production Safe)

**Best for**: Production environments where you need to preserve uptime.

**Method A: Automatic (Recommended)**
```bash
# Single endpoint that adds field and migrates data
curl -X POST "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
```

**Method B: Manual Steps**
```bash
# Step 1: Add libraryId field via app API
curl -X POST "http://your-app-host/api/admin/search/solr/add-library-field" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

# Step 2: Run migration
curl -X POST "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
```

**Method C: Direct Solr API (if app API fails)**
```bash
# Add libraryId field to stories core
curl -X POST "http://your-solr-host:8983/solr/storycove_stories/schema" \
  -H "Content-Type: application/json" \
  -d '{
    "add-field": {
      "name": "libraryId",
      "type": "string",
      "indexed": true,
      "stored": true,
      "required": false
    }
  }'

# Add libraryId field to authors core
curl -X POST "http://your-solr-host:8983/solr/storycove_authors/schema" \
  -H "Content-Type: application/json" \
  -d '{
    "add-field": {
      "name": "libraryId",
      "type": "string",
      "indexed": true,
      "stored": true,
      "required": false
    }
  }'

# Then run the migration
curl -X POST "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
```

- **Pros**: No downtime; Method A adds the field automatically
- **Cons**: Requires API access to the application (or to Solr directly for Method C)
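
Whichever method you use, you can confirm that the field now exists by asking Solr's Schema API for it. This is a sketch under the assumption that Solr is reachable from where you run it; the helper name is hypothetical.

```bash
# Report whether a field is present in a core's schema, using the Schema API's
# GET /schema/fields/<name> endpoint.
# Arguments: <solr-base-url> <core> <field>
schema_has_field() {
  local base="$1" core="$2" field="$3"
  if curl -sf -o /dev/null "$base/solr/$core/schema/fields/$field"; then
    echo "$core: $field present"
  else
    echo "$core: $field missing or Solr unreachable"
    return 1
  fi
}

# Example: check both cores used in this guide
#   for core in storycove_stories storycove_authors; do
#     schema_has_field "http://your-solr-host:8983" "$core" libraryId
#   done
```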

### Option 3: Application-Level Migration (Recommended for Production)

**Best for**: Production environments with proper admin access.

1. **Deploy the code changes** to your environment
2. **Access the admin panel** of your application
3. **Navigate to search settings**
4. **Use the "Migrate Library Schema" button** or API endpoint:
   ```
   POST /api/admin/search/solr/migrate-library-schema
   ```

- **Pros**: User-friendly, handles all complexity internally
- **Cons**: Requires admin access to application

## Step-by-Step Migration Process

### For Docker Deployments

1. **Back up your data** (optional but recommended):
   ```bash
   # Back up the database (-T disables TTY allocation so the redirected dump stays clean)
   docker-compose exec -T postgres pg_dump -U storycove storycove > backup.sql
   ```

2. **Pull the latest code** with library separation fixes

3. **Choose migration approach**:
   - **Quick & Clean**: Use Option 1 (volume reset)
   - **Production**: Use Option 2 or 3

4. **Verify migration**:
   - Log in with different library passwords
   - Perform searches to confirm isolation
   - Check that new content gets indexed with library IDs

### For Kubernetes/Production Deployments

1. **Update your deployment** with the new container images

2. **Add the libraryId field** to Solr schema using Option 2

3. **Use the migration endpoint** (Option 3):
   ```bash
   kubectl exec deployment/storycove-backend -- \
     curl -X POST http://localhost:8080/api/admin/search/solr/migrate-library-schema \
     -H "Authorization: Bearer YOUR_JWT_TOKEN"
   ```

4. **Monitor logs** for successful migration
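
For the log monitoring step, a small filter can cut the output down to relevant lines. This is only a sketch: the keyword list is a guess at how the application words its log messages, so adjust it to what your logs actually contain.

```bash
# Keep only log lines that mention Solr, library handling, or migration.
# Reads log text on stdin. The keywords are an assumption about log wording.
filter_migration_logs() {
  # "|| true" keeps the exit status clean when nothing matches
  grep -i -E 'solr|library|migrat' || true
}

# Example:
#   kubectl logs deployment/storycove-backend --since=10m | filter_migration_logs
```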

## Verification Steps

After migration, verify that library separation is working:

1. **Test with multiple libraries**:
   - Log in with Library A password
   - Add/search content
   - Log in with Library B password
   - Confirm Library A content is not visible

2. **Check Solr directly** (if accessible):
   ```bash
   # Should show documents with libraryId field
   curl "http://solr:8983/solr/storycove_stories/select?q=*:*&fl=id,title,libraryId&rows=5"
   ```

3. **Monitor application logs** for any library separation errors
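
A per-library document count gives a quick overview of how the index is split. The sketch below builds a faceting query on `libraryId` that returns counts only (`rows=0`). The helper name is hypothetical, and the host is the same placeholder used above.

```bash
# Build a Solr URL that returns document counts per libraryId and no documents.
# Arguments: <solr-base-url> <core>
library_facet_url() {
  printf '%s/solr/%s/select?q=*:*&rows=0&facet=true&facet.field=libraryId\n' "$1" "$2"
}

# Example:
#   curl -s "$(library_facet_url http://solr:8983 storycove_stories)"
```

Each library you expect to contain content should appear in the `facet_counts` section of the response.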

## Troubleshooting

### "unknown field 'libraryId'" Error

**Problem**: `ERROR: [doc=xxx] unknown field 'libraryId'`

**Cause**: The Solr schema doesn't have the `libraryId` field yet.

**Solutions**:

1. **Use the automated migration** (adds field automatically):
   ```bash
   curl -X POST "http://your-app/api/admin/search/solr/migrate-library-schema" \
     -H "Authorization: Bearer YOUR_JWT_TOKEN"
   ```

2. **Add field manually first**:
   ```bash
   # Add field via app API
   curl -X POST "http://your-app/api/admin/search/solr/add-library-field" \
     -H "Authorization: Bearer YOUR_JWT_TOKEN"

   # Then run migration
   curl -X POST "http://your-app/api/admin/search/solr/migrate-library-schema" \
     -H "Authorization: Bearer YOUR_JWT_TOKEN"
   ```

3. **Direct Solr API** (if app API fails):
   ```bash
   # Add to both cores
   curl -X POST "http://solr:8983/solr/storycove_stories/schema" \
     -H "Content-Type: application/json" \
     -d '{"add-field":{"name":"libraryId","type":"string","indexed":true,"stored":true}}'

   curl -X POST "http://solr:8983/solr/storycove_authors/schema" \
     -H "Content-Type: application/json" \
     -d '{"add-field":{"name":"libraryId","type":"string","indexed":true,"stored":true}}'
   ```

4. **For development**: Use Option 1 (volume reset) for clean restart

### Migration Endpoint Returns Error

Common causes:
- Solr is not available (check connectivity)
- No active library context (ensure user is authenticated)
- Insufficient permissions (check JWT token/authentication)
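
Capturing the HTTP status code helps narrow down which of these causes applies. The mapping below is an assumption based on common HTTP conventions. This guide does not document the endpoint's actual status codes, so treat the messages as hints rather than a diagnosis.

```bash
# Map an HTTP status code to a likely cause from the list above.
# Assumed mapping; the endpoint's real status codes may differ.
explain_status() {
  case "$1" in
    2??)     echo "success" ;;
    401|403) echo "authentication or permission problem: check the JWT token" ;;
    000)     echo "no response: check that the application host is reachable" ;;
    5??)     echo "server error: check application logs and Solr connectivity" ;;
    *)       echo "unexpected status $1: check application logs" ;;
  esac
}

# Example (curl reports 000 when it receives no HTTP response at all):
#   status=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
#     "http://your-app-host/api/admin/search/solr/migrate-library-schema" \
#     -H "Authorization: Bearer YOUR_JWT_TOKEN")
#   explain_status "$status"
```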

### Search Results Still Mixed

This usually indicates an incomplete migration:
- Verify that all documents have the `libraryId` field
- Check that search queries include library filters
- If documents are still missing the field, clear all Solr data and reindex completely
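
To check whether any documents lack a `libraryId` value, you can query Solr for documents where the field is absent and read the `numFound` value. The sketch below extracts that number with standard text tools so it does not depend on `jq`. The helper name is hypothetical, and the host is a placeholder.

```bash
# Extract the first numFound value from a Solr JSON response read on stdin.
extract_num_found() {
  grep -o '"numFound":[[:space:]]*[0-9][0-9]*' | head -n 1 | grep -o '[0-9][0-9]*$'
}

# Example: count documents in a core that have no libraryId value.
# --get with --data-urlencode lets curl encode the spaces and brackets.
#   curl -s --get "http://solr:8983/solr/storycove_stories/select" \
#     --data-urlencode 'q=*:* -libraryId:[* TO *]' \
#     --data-urlencode 'rows=0' | extract_num_found
```

A result of `0` for both cores suggests every document carries a library ID; any other value means some documents still need to be reindexed.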

## Environment-Specific Notes

### Development
- Use Option 1 (volume reset) for simplicity
- Data loss is acceptable in dev environments

### Staging
- Use Option 2 or 3 to test production migration procedures
- Verify migration process before applying to production

### Production
- **Always back up data first**
- Use Option 2 (Schema API) or Option 3 (Admin endpoint)
- Plan for brief performance impact during reindexing
- Monitor system resources during bulk reindexing

## Performance Considerations

- **Reindexing time**: Scales with data size (typically around 1,000 documents per second)
- **Memory usage**: May increase during bulk indexing
- **Search performance**: Minimal impact from library filtering
- **Storage**: Slight increase due to libraryId field
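
The throughput figure above can be turned into a rough planning estimate. This sketch divides the document count by the indexing rate and rounds up. The default of 1,000 documents per second comes from the figure above; measure your own rate on staging if you need a tighter number.

```bash
# Estimate reindex duration in whole seconds, rounding up.
# Arguments: <document-count> [docs-per-second, default 1000]
estimate_reindex_seconds() {
  local docs="$1" rate="${2:-1000}"
  echo $(( (docs + rate - 1) / rate ))
}

# Example: 250,000 documents at the default rate
#   estimate_reindex_seconds 250000    # prints 250
```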

## Rollback Plan

If issues occur:

1. **Immediate**: Restart Solr. If you used Option 1, the previous index no longer exists, so trigger a full reindex after the restart
2. **Schema revert**: Remove libraryId field via Schema API
3. **Code rollback**: Deploy previous version without library separation
4. **Data restore**: Restore from backup if necessary
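
For the schema revert step, Solr's Schema API accepts a `delete-field` command in the same way Method C uses `add-field`. The sketch below only builds the request body; the helper name is hypothetical. It is likely safest to roll back the application code first so that nothing keeps sending `libraryId`, and a reindex may be needed afterwards for documents that were indexed with the field.

```bash
# Print the Schema API command body that removes a field.
# Argument: <field-name>
delete_field_payload() {
  printf '{"delete-field":{"name":"%s"}}\n' "$1"
}

# Example (run once per core, here for the stories core):
#   curl -X POST "http://your-solr-host:8983/solr/storycove_stories/schema" \
#     -H "Content-Type: application/json" \
#     -d "$(delete_field_payload libraryId)"
```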

This migration enables proper multi-tenant isolation while maintaining search performance and functionality.