# StoryCove Data Model Documentation

## Overview

StoryCove uses PostgreSQL as its primary database with UUID-based primary keys throughout. The data model is designed to support a personal library of short stories with rich metadata, author information, and flexible organization through tags and series.

## Entity Relationship Diagram

```
┌─────────────┐    ┌──────────────┐    ┌─────────────┐
│   Authors   │────│   Stories    │────│   Series    │
│             │    │              │    │             │
│ - id (PK)   │    │ - id (PK)    │    │ - id (PK)   │
│ - name      │    │ - title      │    │ - name      │
│ - notes     │    │ - content*   │    │ - desc      │
│ - rating    │    │ - rating     │    │             │
│ - avatar    │    │ - volume     │    │             │
└─────────────┘    │ - cover      │    └─────────────┘
       │           │ - word_count │
       │           │ - source_url │
       │           │ - timestamps │
       │           └──────────────┘
       │                  │
       │                  │
┌─────────────┐           │         ┌─────────────┐
│ Author_URLs │           │         │    Tags     │
│             │           │         │             │
│ - author_id │           │         │ - id (PK)   │
│ - url       │           │         │ - name      │
└─────────────┘           │         └─────────────┘
                          │                │
                          │                │
                   ┌─────────────┐         │
                   │ Story_Tags  │─────────┘
                   │             │
                   │ - story_id  │
                   │ - tag_id    │
                   └─────────────┘
```

## Detailed Entity Specifications

### Stories Table

**Table Name**: `stories`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PRIMARY KEY, NOT NULL | Unique identifier |
| title | VARCHAR(255) | NOT NULL | Story title |
| summary | TEXT | NULL | Optional story summary |
| description | VARCHAR(1000) | NULL | Optional description |
| content_html | TEXT | NULL | HTML content of the story |
| content_plain | TEXT | NULL | Plain text version (auto-generated) |
| source_url | VARCHAR(255) | NULL | Source URL where story was found |
| cover_path | VARCHAR(255) | NULL | Path to cover image file |
| word_count | INTEGER | NOT NULL, DEFAULT 0 | Word count (auto-calculated) |
| rating | INTEGER | NULL, CHECK (rating >= 1 AND rating <= 5) | Story rating |
| volume | INTEGER | NULL | Volume number if part of series |
| author_id | UUID | FOREIGN KEY | Reference to authors table |
| series_id | UUID | FOREIGN KEY, NULL | Reference to series table |
| created_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Creation timestamp |
| updated_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Last update timestamp |

**Indexes:**
- Primary key on `id`
- Foreign key index on `author_id`
- Foreign key index on `series_id`
- Index on `created_at` for recent stories queries
- Index on `rating` for top-rated queries
- Unique constraint on `source_url` where not null

**Business Rules:**
- Word count is automatically calculated from `content_plain` or `content_html`
- Plain text content is automatically extracted from HTML content using Jsoup
- Volume is only meaningful when `series_id` is set
- Rating must be between 1 and 5 if provided
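
The word-count rule above could be sketched as follows. This is only an illustration: the class and method names are hypothetical, the document does not specify how StoryCove splits words, and the sketch assumes the HTML has already been reduced to plain text (the step the document attributes to Jsoup).

```java
// Hypothetical helper; the actual counting rules used by StoryCove are not documented here.
final class WordCountSketch {
    private WordCountSketch() {}

    /**
     * Counts whitespace-separated words in already-extracted plain text.
     * Null or blank input yields 0, which lines up with the column's DEFAULT 0.
     */
    static int countWords(String plainText) {
        if (plainText == null || plainText.isBlank()) {
            return 0;
        }
        // Assumption: a "word" is any run of non-whitespace characters.
        return plainText.trim().split("\\s+").length;
    }
}
```

Under these assumptions, `WordCountSketch.countWords("Once upon a time")` returns 4, and a story with no content keeps a word count of 0.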

### Authors Table

**Table Name**: `authors`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PRIMARY KEY, NOT NULL | Unique identifier |
| name | VARCHAR(255) | NOT NULL, UNIQUE | Author name |
| notes | TEXT | NULL | Notes about the author |
| author_rating | INTEGER | NULL, CHECK (author_rating >= 1 AND author_rating <= 5) | Author rating |
| avatar_image_path | VARCHAR(255) | NULL | Path to avatar image |
| created_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Creation timestamp |
| updated_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Last update timestamp |

**Indexes:**
- Primary key on `id`
- Unique index on `name`
- Index on `author_rating` for top-rated queries

**Business Rules:**
- Author names must be unique across the system
- Rating must be between 1 and 5 if provided
- Author statistics (story count, average rating) are calculated dynamically
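
As a rough illustration of "calculated dynamically", the sketch below derives both statistics from an author's story ratings at read time rather than storing them. It assumes the average is taken over the author's rated stories and that unrated stories (null rating) are skipped; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Objects;
import java.util.OptionalDouble;

// Hypothetical helper: one list entry per story, null when the story is unrated.
final class AuthorStatsSketch {
    private AuthorStatsSketch() {}

    /** Every story counts, rated or not. */
    static int storyCount(List<Integer> storyRatings) {
        return storyRatings.size();
    }

    /** Averages only the rated stories; empty when the author has no rated stories. */
    static OptionalDouble averageRating(List<Integer> storyRatings) {
        return storyRatings.stream()
                .filter(Objects::nonNull)
                .mapToInt(Integer::intValue)
                .average();
    }
}
```

Returning `OptionalDouble` keeps "no rated stories yet" distinct from an average of zero, which the 1-5 rating range could never produce anyway.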

### Author URLs Table

**Table Name**: `author_urls`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| author_id | UUID | FOREIGN KEY, NOT NULL | Reference to authors table |
| url | VARCHAR(255) | NOT NULL | URL associated with author |

**Indexes:**
- Foreign key index on `author_id`
- Composite index on `(author_id, url)` for uniqueness

**Business Rules:**
- One author can have multiple URLs
- URLs are stored as simple strings without validation
- Duplicate URLs for the same author are prevented by application logic
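
One possible shape for the application-level duplicate check is sketched below. It is an assumption-heavy illustration: the class name is hypothetical, duplicates are detected by exact string comparison, and no URL format validation is performed, in line with the rules above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for an author's URL collection.
final class AuthorUrlsSketch {
    private final List<String> urls = new ArrayList<>();

    /**
     * Adds the URL unless this author already has the exact same string.
     * Null is rejected because the column is NOT NULL; nothing else is validated.
     *
     * @return true if the URL was added
     */
    boolean addUrl(String url) {
        if (url == null || urls.contains(url)) {
            return false;
        }
        urls.add(url);
        return true;
    }

    /** Read-only snapshot in insertion order. */
    List<String> urls() {
        return List.copyOf(urls);
    }
}
```

Because the comparison is exact, two spellings of the same address (for example with and without a trailing slash) would both be stored under this sketch.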

### Series Table

**Table Name**: `series`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PRIMARY KEY, NOT NULL | Unique identifier |
| name | VARCHAR(255) | NOT NULL, UNIQUE | Series name |
| description | VARCHAR(1000) | NULL | Series description |
| created_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Creation timestamp |

**Indexes:**
- Primary key on `id`
- Unique index on `name`

**Business Rules:**
- Series names must be unique
- Stories in a series are ordered by volume number
- Series without stories are allowed (placeholder series)
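
The ordering rule could look like the sketch below. Since `volume` is nullable, the sketch has to pick a position for stories without a volume; placing them last is purely an assumption, as are the `StoryRef` record and the method name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SeriesOrderSketch {
    private SeriesOrderSketch() {}

    /** Hypothetical minimal view of a story: just a title and a nullable volume. */
    record StoryRef(String title, Integer volume) {}

    /** Returns a new list sorted by volume ascending; null volumes go last (assumption). */
    static List<StoryRef> orderedByVolume(List<StoryRef> stories) {
        List<StoryRef> sorted = new ArrayList<>(stories);
        sorted.sort(Comparator.comparing(
                StoryRef::volume,
                Comparator.nullsLast(Comparator.naturalOrder())));
        return sorted;
    }
}
```

`List.sort` is stable, so stories that share a volume (or have none) keep their original relative order.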

### Tags Table

**Table Name**: `tags`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PRIMARY KEY, NOT NULL | Unique identifier |
| name | VARCHAR(100) | NOT NULL, UNIQUE | Tag name |
| created_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Creation timestamp |

**Indexes:**
- Primary key on `id`
- Unique index on `name`
- Index on `name` for autocomplete queries

**Business Rules:**
- Tag names must be unique and are stored in lowercase
- Tags are created automatically when referenced by stories
- Tag usage statistics are calculated dynamically
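
The lowercase and create-on-reference rules can be pictured as a "find or create" step, sketched below with an in-memory map standing in for the `tags` table. The class and method names are hypothetical, and trimming surrounding whitespace is an extra assumption not stated above.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.UUID;

// Hypothetical in-memory stand-in for the tags table, keyed by normalized name.
final class TagRegistrySketch {
    private final Map<String, UUID> idsByName = new LinkedHashMap<>();

    /** Lowercases with a fixed locale so results do not depend on the server's default locale. */
    static String normalize(String rawName) {
        return rawName.trim().toLowerCase(Locale.ROOT);
    }

    /** Returns the existing tag id for the normalized name, creating the tag on first reference. */
    UUID findOrCreate(String rawName) {
        return idsByName.computeIfAbsent(normalize(rawName), name -> UUID.randomUUID());
    }

    int size() {
        return idsByName.size();
    }
}
```

With this sketch, `findOrCreate("Sci-Fi")` and `findOrCreate(" sci-fi ")` resolve to the same tag, so the unique constraint on `name` is never hit by case differences alone.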

### Story Tags Junction Table

**Table Name**: `story_tags`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| story_id | UUID | FOREIGN KEY, NOT NULL | Reference to stories table |
| tag_id | UUID | FOREIGN KEY, NOT NULL | Reference to tags table |

**Constraints:**
- Primary key on `(story_id, tag_id)`
- Foreign key to `stories(id)` with CASCADE DELETE
- Foreign key to `tags(id)` with CASCADE DELETE

**Indexes:**
- Composite primary key index
- Index on `tag_id` for reverse lookups

## Data Types and Conventions

### UUID Strategy
- All primary keys use UUID (Universally Unique Identifier)
- Generated using `GenerationType.UUID` in Hibernate
- Provides natural uniqueness across distributed systems
- 36-character string representation (e.g., `123e4567-e89b-12d3-a456-426614174000`)
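
For a concrete feel of the 36-character form, the sketch below uses the JDK's `java.util.UUID`. It is a stand-in only: `UUID.randomUUID()` produces random (version 4) identifiers, and whether Hibernate's generator is configured the same way in StoryCove is not stated in this document. The class name is hypothetical.

```java
import java.util.UUID;

final class UuidSketch {
    private UuidSketch() {}

    /** Generates a random UUID and returns its canonical 36-character text form. */
    static String newId() {
        return UUID.randomUUID().toString();
    }

    /** Parses the canonical text form back into a UUID; throws IllegalArgumentException if malformed. */
    static UUID parse(String text) {
        return UUID.fromString(text);
    }
}
```

The text form is 32 hexadecimal digits plus 4 hyphens, which is where the 36 characters come from; parsing the example value above and calling `toString()` yields the same string back.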

### Timestamp Management
- All primary entities (stories, authors, series, tags) have a `created_at` timestamp
- Stories and Authors also have an `updated_at` timestamp (automatically updated)
- Series and Tags only have `created_at` (they are rarely modified)
- All timestamps use `LocalDateTime` in Java, stored as `TIMESTAMP` in PostgreSQL
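
The `created_at` / `updated_at` behaviour can be illustrated with a plain Java object, as below. This is a sketch under assumptions: the class is hypothetical, and the document does not say which mechanism StoryCove uses to maintain the values (JPA lifecycle callbacks such as `@PrePersist` and `@PreUpdate` are one common option). The current time is passed in explicitly so the behaviour is easy to test.

```java
import java.time.LocalDateTime;

// Hypothetical stand-in for an entity with both timestamps (stories, authors).
final class TimestampedSketch {
    private final LocalDateTime createdAt;
    private LocalDateTime updatedAt;

    /** On creation both timestamps start out equal. */
    TimestampedSketch(LocalDateTime now) {
        this.createdAt = now;
        this.updatedAt = now;
    }

    /** Called on every modification; createdAt never changes afterwards. */
    void touch(LocalDateTime now) {
        this.updatedAt = now;
    }

    LocalDateTime createdAt() {
        return createdAt;
    }

    LocalDateTime updatedAt() {
        return updatedAt;
    }
}
```

Note that `LocalDateTime` carries no time zone, so the stored values are only comparable if every writer uses the same zone.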

### Text Fields
- **VARCHAR(n)**: For length-constrained text fields (names, descriptions, paths, URLs)
- **TEXT**: For unbounded text content (story content, summaries, notes)
- **HTML Content**: Stored as-is but sanitized on input and output
- **Plain Text**: Automatically extracted from HTML using Jsoup

### Validation Rules
- **Required Fields**: Entity names/titles are always required
- **Length Limits**: Names limited to 255 characters (100 for tag names), descriptions to 1000
- **Rating Range**: All ratings constrained to the 1-5 range
- **URL Format**: No format validation at database level
- **Uniqueness**: Names are unique within their entity type
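
A minimal application-side version of these checks might look like the sketch below. The class, constants, and method names are hypothetical, and treating a blank name as missing is an assumption. Lengths are measured in Unicode code points because PostgreSQL's `VARCHAR(n)` limit counts characters, whereas Java's `String.length()` counts UTF-16 code units.

```java
// Hypothetical validation helpers mirroring the column limits listed in this document.
final class ValidationSketch {
    static final int MAX_NAME_LENGTH = 255;
    static final int MAX_TAG_NAME_LENGTH = 100;
    static final int MAX_DESCRIPTION_LENGTH = 1000;

    private ValidationSketch() {}

    /** Ratings are optional; when present they must fall in the 1-5 range. */
    static boolean isValidRating(Integer rating) {
        return rating == null || (rating >= 1 && rating <= 5);
    }

    /** Names are required (blank counts as missing here) and bounded in length. */
    static boolean isValidName(String name, int maxLength) {
        return name != null && !name.isBlank() && codePoints(name) <= maxLength;
    }

    /** Descriptions are optional but bounded in length. */
    static boolean isValidDescription(String description) {
        return description == null || codePoints(description) <= MAX_DESCRIPTION_LENGTH;
    }

    private static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }
}
```

Checks like these complement the database constraints rather than replace them: the application can return a readable error before the insert is attempted, while the `CHECK` and `UNIQUE` constraints remain the final safeguard.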

## Relationships and Cascading

### One-to-Many Relationships
- **Author → Stories**: Lazy loaded, cascade ALL operations
- **Series → Stories**: Lazy loaded, ordered by volume, cascade ALL
- **Author → Author URLs**: Eager loaded via `@ElementCollection`

### Many-to-Many Relationships
- **Stories ↔ Tags**: Via `story_tags` junction table
  - Managed bidirectionally with helper methods
  - Cascade DELETE on both sides
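
"Managed bidirectionally with helper methods" usually means that one method updates both sides of the association so they cannot drift apart. The sketch below shows that pattern with plain Java classes; it omits the JPA mapping annotations entirely, and the class and method names are hypothetical rather than taken from StoryCove's code.

```java
import java.util.HashSet;
import java.util.Set;

final class StoryTagLinkSketch {
    private StoryTagLinkSketch() {}

    /** Hypothetical tag: holds the inverse side of the association. */
    static final class Tag {
        final String name;
        final Set<Story> stories = new HashSet<>();

        Tag(String name) {
            this.name = name;
        }
    }

    /** Hypothetical story: owns the helper methods that keep both sides in sync. */
    static final class Story {
        final String title;
        final Set<Tag> tags = new HashSet<>();

        Story(String title) {
            this.title = title;
        }

        /** Links story and tag in both directions. */
        void addTag(Tag tag) {
            tags.add(tag);
            tag.stories.add(this);
        }

        /** Unlinks story and tag in both directions. */
        void removeTag(Tag tag) {
            tags.remove(tag);
            tag.stories.remove(this);
        }
    }
}
```

Callers are expected to go through `addTag` and `removeTag` instead of mutating either set directly; otherwise the in-memory object graph can disagree with what ends up in `story_tags`.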

### Foreign Key Constraints
- All foreign keys are enforced with referential integrity constraints
- DELETE operations cascade where configured (for example, deleting a story or a tag removes its `story_tags` rows)
- No orphaned records are allowed

## Performance Considerations

### Indexing Strategy
- Primary keys are automatically indexed
- Foreign keys have dedicated indexes
- Frequently queried fields (`rating`, `created_at`) are indexed
- Unique constraints automatically create indexes

### Query Optimization
- Lazy loading avoids fetching associations until they are accessed (accessing lazy associations row by row can still trigger N+1 queries unless they are fetched together)
- Pagination used for large result sets
- Specialized queries for common access patterns
- Typesense search engine for full-text search (separate from PostgreSQL)
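
The arithmetic behind pagination is sketched below on an in-memory list. In the real system the slicing would happen in the database query rather than in Java, and the document does not say which paging API StoryCove uses, so the class and method names here are hypothetical.

```java
import java.util.List;

final class PageSketch {
    private PageSketch() {}

    /**
     * Returns the zero-based page of the given size, or an empty list past the end.
     * The offset is computed as a long to avoid int overflow for large page indexes.
     */
    static <T> List<T> page(List<T> items, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long from = (long) pageIndex * pageSize;
        if (from >= items.size()) {
            return List.of();
        }
        int to = (int) Math.min(from + pageSize, items.size());
        return items.subList((int) from, to);
    }
}
```

For five items and a page size of two, page 1 holds the third and fourth items, page 2 holds the last item, and page 3 is empty.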

### Data Volume Estimates
- **Stories**: Expected 1K-10K records per user
- **Authors**: Expected 100-1K records per user
- **Tags**: Expected 50-500 records per user
- **Series**: Expected 10-100 records per user
- **Join Tables**: Scale with story count and tagging usage

## Backup and Migration Considerations

### Schema Evolution
- Uses Hibernate `ddl-auto: update` for development
- Production should use controlled migration tools (Flyway/Liquibase)
- UUID keys allow safe data migration between environments

### Data Integrity
- Foreign key constraints ensure referential integrity
- Check constraints validate data ranges
- Application-level validation provides user-friendly error messages
- Unique constraints prevent duplicate data

### Backup Strategy
- Full PostgreSQL dumps for complete backup
- Image files stored separately in filesystem
- Consider incremental backups for large installations
- Test restore procedures regularly

This data model provides a solid foundation for personal story library management with room for future enhancements while maintaining data integrity and performance.