# StoryCove Data Model Documentation

## Overview

StoryCove uses PostgreSQL as its primary database, with UUID-based primary keys throughout. The data model supports a personal library of short stories with rich metadata, author information, and flexible organization through tags and series.

## Entity Relationship Diagram

```
┌─────────────┐    ┌──────────────┐    ┌─────────────┐
│   Authors   │────│   Stories    │────│   Series    │
│             │    │              │    │             │
│ - id (PK)   │    │ - id (PK)    │    │ - id (PK)   │
│ - name      │    │ - title      │    │ - name      │
│ - notes     │    │ - content*   │    │ - desc      │
│ - rating    │    │ - rating     │    │             │
│ - avatar    │    │ - volume     │    │             │
└─────────────┘    │ - cover      │    └─────────────┘
       │           │ - word_count │
       │           │ - source_url │
       │           │ - timestamps │
       │           └──────────────┘
       │                  │
       │                  │
┌─────────────┐           │            ┌─────────────┐
│ Author_URLs │           │            │    Tags     │
│             │           │            │             │
│ - author_id │           │            │ - id (PK)   │
│ - url       │           │            │ - name      │
└─────────────┘           │            └─────────────┘
                          │                   │
                          │                   │
                   ┌─────────────┐            │
                   │ Story_Tags  │────────────┘
                   │             │
                   │ - story_id  │
                   │ - tag_id    │
                   └─────────────┘
```

## Detailed Entity Specifications

### Stories Table

**Table Name**: `stories`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PRIMARY KEY, NOT NULL | Unique identifier |
| title | VARCHAR(255) | NOT NULL | Story title |
| summary | TEXT | NULL | Optional story summary |
| description | VARCHAR(1000) | NULL | Optional description |
| content_html | TEXT | NULL | HTML content of the story |
| content_plain | TEXT | NULL | Plain text version (auto-generated) |
| source_url | VARCHAR(255) | NULL | Source URL where story was found |
| cover_path | VARCHAR(255) | NULL | Path to cover image file |
| word_count | INTEGER | NOT NULL, DEFAULT 0 | Word count (auto-calculated) |
| rating | INTEGER | NULL, CHECK (rating >= 1 AND rating <= 5) | Story rating |
| volume | INTEGER | NULL | Volume number if part of series |
| author_id | UUID | FOREIGN KEY | Reference to authors table |
| series_id | UUID | FOREIGN KEY, NULL | Reference to series table |
| created_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Creation timestamp |
| updated_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Last update timestamp |

**Indexes:**
- Primary key on `id`
- Foreign key index on `author_id`
- Foreign key index on `series_id`
- Index on `created_at` for recent stories queries
- Index on `rating` for top-rated queries
- Unique constraint on `source_url` where not null

**Business Rules:**
- Word count is automatically calculated from `content_plain` or `content_html`
- Plain text content is automatically extracted from HTML content using Jsoup
- Volume is only meaningful when `series_id` is set
- Rating must be between 1 and 5 if provided

### Authors Table

**Table Name**: `authors`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PRIMARY KEY, NOT NULL | Unique identifier |
| name | VARCHAR(255) | NOT NULL, UNIQUE | Author name |
| notes | TEXT | NULL | Notes about the author |
| author_rating | INTEGER | NULL, CHECK (author_rating >= 1 AND author_rating <= 5) | Author rating |
| avatar_image_path | VARCHAR(255) | NULL | Path to avatar image |
| created_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Creation timestamp |
| updated_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Last update timestamp |

**Indexes:**
- Primary key on `id`
- Unique index on `name`
- Index on `author_rating` for top-rated queries

**Business Rules:**
- Author names must be unique across the system
- Rating must be between 1 and 5 if provided
- Author statistics (story count, average rating) are calculated dynamically

### Author URLs Table

**Table Name**: `author_urls`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| author_id | UUID | FOREIGN KEY, NOT NULL | Reference to authors table |
| url | VARCHAR(255) | NOT NULL | URL associated with author |

**Indexes:**
- Foreign key index on `author_id`
- Composite index on `(author_id, url)` for uniqueness

**Business Rules:**
- One author can have multiple URLs
- URLs are stored as simple strings without validation
- Duplicate URLs for the same author are prevented by application logic

### Series Table

**Table Name**: `series`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PRIMARY KEY, NOT NULL | Unique identifier |
| name | VARCHAR(255) | NOT NULL, UNIQUE | Series name |
| description | VARCHAR(1000) | NULL | Series description |
| created_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Creation timestamp |

**Indexes:**
- Primary key on `id`
- Unique index on `name`

**Business Rules:**
- Series names must be unique
- Stories in a series are ordered by volume number
- Series without stories are allowed (placeholder series)

### Tags Table

**Table Name**: `tags`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PRIMARY KEY, NOT NULL | Unique identifier |
| name | VARCHAR(100) | NOT NULL, UNIQUE | Tag name |
| created_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Creation timestamp |

**Indexes:**
- Primary key on `id`
- Unique index on `name`
- Index on `name` for autocomplete queries

**Business Rules:**
- Tag names must be unique and are stored in lowercase
- Tags are created automatically when referenced by stories
- Tag usage statistics are calculated dynamically

### Story Tags Junction Table

**Table Name**: `story_tags`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| story_id | UUID | FOREIGN KEY, NOT NULL | Reference to stories table |
| tag_id | UUID | FOREIGN KEY, NOT NULL | Reference to tags table |

**Constraints:**
- Primary key on `(story_id, tag_id)`
- Foreign key to `stories(id)` with CASCADE DELETE
- Foreign key to `tags(id)` with CASCADE DELETE

**Indexes:**
- Composite primary key index
- Index on `tag_id` for reverse lookups

## Data Types and Conventions

### UUID Strategy

- All primary keys use UUID (Universally Unique Identifier)
- Generated using `GenerationType.UUID` in Hibernate
- Provides natural uniqueness across distributed systems
- 36-character string representation (e.g., `123e4567-e89b-12d3-a456-426614174000`)

### Timestamp Management

- All primary entities (Stories, Authors, Series, Tags) have a `created_at` timestamp
- Stories and Authors also have an `updated_at` timestamp (automatically updated)
- Series and Tags only have `created_at` (they are rarely modified)
- All timestamps use `LocalDateTime` in Java, stored as `TIMESTAMP` in PostgreSQL

### Text Fields

- **VARCHAR(n)**: For length-constrained text fields (names, paths, URLs, descriptions)
- **TEXT**: For unlimited text content (story content, summaries, notes)
- **HTML Content**: Stored as-is but sanitized on input and output
- **Plain Text**: Automatically extracted from HTML using Jsoup

### Validation Rules

- **Required Fields**: Entity names/titles are always required
- **Length Limits**: Names limited to 255 characters (100 for tag names), descriptions to 1000
- **Rating Range**: All ratings constrained to the 1-5 range
- **URL Format**: No format validation at database level
- **Uniqueness**: Names are unique within their entity type

## Relationships and Cascading

### One-to-Many Relationships

- **Author → Stories**: Lazy loaded, cascade ALL operations
- **Series → Stories**: Lazy loaded, ordered by volume, cascade ALL
- **Author → Author URLs**: Eager loaded via `@ElementCollection`

### Many-to-Many Relationships

- **Stories ↔ Tags**: Via `story_tags` junction table
  - Managed bidirectionally with helper methods
  - Cascade DELETE on both sides

### Foreign Key Constraints

- All foreign keys enforce referential integrity
- DELETE operations cascade as described in the relationships above
- No orphaned records are allowed

## Performance Considerations

### Indexing Strategy

- Primary keys are automatically indexed
- Foreign keys have dedicated indexes
- Frequently queried fields (`rating`, `created_at`) are indexed
- Unique constraints automatically create indexes

### Query Optimization

- Lazy loading defers fetching related entities until they are accessed (on its own it does not prevent N+1 queries)
- Pagination is used for large result sets
- Specialized queries cover common access patterns
- Typesense search engine handles full-text search (separate from PostgreSQL)

### Data Volume Estimates

- **Stories**: Expected 1K-10K records per user
- **Authors**: Expected 100-1K records per user
- **Tags**: Expected 50-500 records per user
- **Series**: Expected 10-100 records per user
- **Join Tables**: Scale with story count and tagging usage

## Backup and Migration Considerations

### Schema Evolution

- Uses Hibernate `ddl-auto: update` for development
- Production should use controlled migration tools (Flyway/Liquibase)
- UUID keys allow safe data migration between environments

### Data Integrity

- Foreign key constraints ensure referential integrity
- Check constraints validate data ranges
- Application-level validation provides user-friendly error messages
- Unique constraints prevent duplicate data

### Backup Strategy

- Full PostgreSQL dumps for complete backup
- Image files are stored separately on the filesystem
- Consider incremental backups for large installations
- Test restore procedures regularly

This data model provides a solid foundation for personal story library management, with room for future enhancements while maintaining data integrity and performance.
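
As an illustration of the business rules listed for the Stories and Tags tables (optional ratings in the 1-5 range, lowercase tag names, auto-calculated word count, volume tied to a series), here is a minimal plain-Java sketch. The class `StoryRules` and all of its method names are hypothetical and not taken from the StoryCove codebase. The sketch assumes that a "word" is a whitespace-separated token, that tag names are trimmed before lowercasing, and that plain text has already been extracted from HTML (the Jsoup step is deliberately left out).

```java
import java.util.Locale;
import java.util.UUID;

/** Hypothetical helper; names are illustrative, not StoryCove's actual code. */
final class StoryRules {

    private StoryRules() {}

    /** A rating is optional (null); if present it must lie in 1..5, mirroring the CHECK constraint. */
    static boolean isValidRating(Integer rating) {
        return rating == null || (rating >= 1 && rating <= 5);
    }

    /** Tags are stored in lowercase; trimming surrounding whitespace is an assumption of this sketch. */
    static String normalizeTagName(String raw) {
        return raw.trim().toLowerCase(Locale.ROOT);
    }

    /** Counts whitespace-separated tokens in already-extracted plain text (assumed definition of a word). */
    static int wordCount(String contentPlain) {
        if (contentPlain == null || contentPlain.isBlank()) {
            return 0; // same value as the word_count column default
        }
        return contentPlain.trim().split("\\s+").length;
    }

    /** Volume is only meaningful when the story belongs to a series, so it is dropped otherwise. */
    static Integer effectiveVolume(UUID seriesId, Integer volume) {
        return seriesId == null ? null : volume;
    }
}
```

With these assumptions, `StoryRules.wordCount("Once upon a time")` returns 4 and `StoryRules.normalizeTagName("  Sci-Fi ")` returns `sci-fi`. Keeping such checks in application code complements the database constraints, which remain the final safeguard.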