Update of documentation
docs/DATA_MODEL.md
@@ -0,0 +1,263 @@
# StoryCove Data Model Documentation

## Overview

StoryCove uses PostgreSQL as its primary database with UUID-based primary keys throughout. The data model is designed to support a personal library of short stories with rich metadata, author information, and flexible organization through tags and series.

## Entity Relationship Diagram

```
┌─────────────┐    ┌──────────────┐    ┌─────────────┐
│   Authors   │────│   Stories    │────│   Series    │
│             │    │              │    │             │
│ - id (PK)   │    │ - id (PK)    │    │ - id (PK)   │
│ - name      │    │ - title      │    │ - name      │
│ - notes     │    │ - content*   │    │ - desc      │
│ - rating    │    │ - rating     │    │             │
│ - avatar    │    │ - volume     │    │             │
└─────────────┘    │ - cover      │    └─────────────┘
       │           │ - word_count │
       │           │ - source_url │
       │           │ - timestamps │
       │           └──────────────┘
       │                  │
       │                  │
┌─────────────┐           │         ┌─────────────┐
│ Author_URLs │           │         │    Tags     │
│             │           │         │             │
│ - author_id │           │         │ - id (PK)   │
│ - url       │           │         │ - name      │
└─────────────┘           │         └─────────────┘
                          │                │
                          │                │
                   ┌─────────────┐         │
                   │ Story_Tags  │─────────┘
                   │             │
                   │ - story_id  │
                   │ - tag_id    │
                   └─────────────┘
```

## Detailed Entity Specifications

### Stories Table

**Table Name**: `stories`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PRIMARY KEY, NOT NULL | Unique identifier |
| title | VARCHAR(255) | NOT NULL | Story title |
| summary | TEXT | NULL | Optional story summary |
| description | VARCHAR(1000) | NULL | Optional description |
| content_html | TEXT | NULL | HTML content of the story |
| content_plain | TEXT | NULL | Plain text version (auto-generated) |
| source_url | VARCHAR(255) | NULL | Source URL where story was found |
| cover_path | VARCHAR(255) | NULL | Path to cover image file |
| word_count | INTEGER | NOT NULL, DEFAULT 0 | Word count (auto-calculated) |
| rating | INTEGER | NULL, CHECK (rating >= 1 AND rating <= 5) | Story rating |
| volume | INTEGER | NULL | Volume number if part of a series |
| author_id | UUID | FOREIGN KEY | Reference to authors table |
| series_id | UUID | FOREIGN KEY, NULL | Reference to series table |
| created_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Creation timestamp |
| updated_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Last update timestamp |

**Indexes:**
- Primary key on `id`
- Foreign key index on `author_id`
- Foreign key index on `series_id`
- Index on `created_at` for recent stories queries
- Index on `rating` for top-rated queries
- Unique constraint on `source_url` where not null

**Business Rules:**
- Word count is automatically calculated from `content_plain` or `content_html`
- Plain text content is automatically extracted from HTML content using Jsoup
- Volume is only meaningful when `series_id` is set
- Rating must be between 1 and 5 if provided
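
The word-count rule above can be sketched as follows. This is a minimal illustration, not the project's actual code: `WordCounter` and `countWords` are hypothetical names, the input is assumed to be plain text that has already been extracted from the HTML (the document says Jsoup does this), and splitting on whitespace is an assumed definition of a "word".

```java
// Hypothetical helper; class and method names are illustrative only.
final class WordCounter {
    private WordCounter() {}

    /** Counts whitespace-separated words; null or blank text counts as 0. */
    static int countWords(String plainText) {
        if (plainText == null || plainText.isBlank()) {
            return 0;
        }
        // strip() removes leading/trailing whitespace so split() yields no empty tokens
        return plainText.strip().split("\\s+").length;
    }
}
```

Returning 0 for missing content matches the `NOT NULL, DEFAULT 0` constraint on `word_count`, so a story saved without content still gets a valid value.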

### Authors Table

**Table Name**: `authors`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PRIMARY KEY, NOT NULL | Unique identifier |
| name | VARCHAR(255) | NOT NULL, UNIQUE | Author name |
| notes | TEXT | NULL | Notes about the author |
| author_rating | INTEGER | NULL, CHECK (author_rating >= 1 AND author_rating <= 5) | Author rating |
| avatar_image_path | VARCHAR(255) | NULL | Path to avatar image |
| created_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Creation timestamp |
| updated_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Last update timestamp |

**Indexes:**
- Primary key on `id`
- Unique index on `name`
- Index on `author_rating` for top-rated queries

**Business Rules:**
- Author names must be unique across the system
- Rating must be between 1 and 5 if provided
- Author statistics (story count, average rating) are calculated dynamically
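
As a rough sketch of what "calculated dynamically" could look like, the snippet below derives both statistics from a list of story ratings instead of reading stored columns. It assumes that "average rating" means the mean of the author's rated stories and that unrated stories (null) are skipped; `AuthorStats` and its methods are hypothetical names.

```java
import java.util.List;
import java.util.Objects;
import java.util.OptionalDouble;

// Hypothetical helper; the real application may compute these in SQL instead.
final class AuthorStats {
    private AuthorStats() {}

    /** Number of stories by the author; unrated stories (null) still count. */
    static int storyCount(List<Integer> storyRatings) {
        return storyRatings.size();
    }

    /** Mean of the non-null ratings, or empty when no story is rated. */
    static OptionalDouble averageRating(List<Integer> storyRatings) {
        return storyRatings.stream()
                .filter(Objects::nonNull)
                .mapToInt(Integer::intValue)
                .average();
    }
}
```

Returning `OptionalDouble` keeps "no rated stories" distinct from an average of zero, which is outside the 1 to 5 range anyway.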

### Author URLs Table

**Table Name**: `author_urls`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| author_id | UUID | FOREIGN KEY, NOT NULL | Reference to authors table |
| url | VARCHAR(255) | NOT NULL | URL associated with author |

**Indexes:**
- Foreign key index on `author_id`
- Composite index on `(author_id, url)` for uniqueness

**Business Rules:**
- One author can have multiple URLs
- URLs are stored as simple strings without validation
- Duplicate URLs for the same author are prevented by application logic

### Series Table

**Table Name**: `series`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PRIMARY KEY, NOT NULL | Unique identifier |
| name | VARCHAR(255) | NOT NULL, UNIQUE | Series name |
| description | VARCHAR(1000) | NULL | Series description |
| created_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Creation timestamp |

**Indexes:**
- Primary key on `id`
- Unique index on `name`

**Business Rules:**
- Series names must be unique
- Stories in a series are ordered by volume number
- Series without stories are allowed (placeholder series)
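
The ordering rule can be illustrated with a small in-memory sort. This is only a sketch: `StoryRef` is a hypothetical stand-in for a story row, and placing stories with a null `volume` last is an assumption, since the document does not say how missing volumes are ordered.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SeriesOrdering {
    private SeriesOrdering() {}

    /** Minimal stand-in for a story row: title plus nullable volume. */
    record StoryRef(String title, Integer volume) {}

    /** Returns a new list sorted by volume ascending, null volumes last. */
    static List<StoryRef> orderByVolume(List<StoryRef> stories) {
        List<StoryRef> sorted = new ArrayList<>(stories);
        sorted.sort(Comparator.comparing(StoryRef::volume,
                Comparator.nullsLast(Comparator.naturalOrder())));
        return sorted;
    }
}
```

In the database the equivalent would be an `ORDER BY volume` clause on the stories of a series; the Java version just makes the null handling explicit.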

### Tags Table

**Table Name**: `tags`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PRIMARY KEY, NOT NULL | Unique identifier |
| name | VARCHAR(100) | NOT NULL, UNIQUE | Tag name |
| created_at | TIMESTAMP | NOT NULL, DEFAULT CURRENT_TIMESTAMP | Creation timestamp |

**Indexes:**
- Primary key on `id`
- Unique index on `name`
- Index on `name` for autocomplete queries

**Business Rules:**
- Tag names must be unique and are stored in lowercase
- Tags are created automatically when referenced by stories
- Tag usage statistics are calculated dynamically
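
The first two rules amount to a find-or-create lookup on a normalized name. The sketch below uses an in-memory map as a stand-in for the `tags` table; `TagRegistry` is a hypothetical name, and trimming surrounding whitespace is an assumption added here (the document only states that names are lowercased).

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.UUID;

final class TagRegistry {
    // In-memory stand-in for the tags table: normalized name -> id
    private final Map<String, UUID> idsByName = new LinkedHashMap<>();

    /** Trims and lowercases a tag name; Locale.ROOT avoids locale-specific casing. */
    static String normalize(String rawName) {
        return rawName.strip().toLowerCase(Locale.ROOT);
    }

    /** Returns the id of the existing tag, creating the tag on first reference. */
    UUID findOrCreate(String rawName) {
        return idsByName.computeIfAbsent(normalize(rawName), name -> UUID.randomUUID());
    }

    int size() {
        return idsByName.size();
    }
}
```

Normalizing before the lookup is what lets the unique constraint on `name` treat "Sci-Fi" and "sci-fi" as the same tag.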

### Story Tags Junction Table

**Table Name**: `story_tags`

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| story_id | UUID | FOREIGN KEY, NOT NULL | Reference to stories table |
| tag_id | UUID | FOREIGN KEY, NOT NULL | Reference to tags table |

**Constraints:**
- Primary key on `(story_id, tag_id)`
- Foreign key to `stories(id)` with CASCADE DELETE
- Foreign key to `tags(id)` with CASCADE DELETE

**Indexes:**
- Composite primary key index
- Index on `tag_id` for reverse lookups

## Data Types and Conventions

### UUID Strategy
- All primary keys use UUID (Universally Unique Identifier)
- Generated using `GenerationType.UUID` in Hibernate
- Provides natural uniqueness across distributed systems
- 36-character string representation (e.g., `123e4567-e89b-12d3-a456-426614174000`)
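
For readers unfamiliar with the string form, the following sketch uses the standard `java.util.UUID` class to generate and parse identifiers. It does not show how Hibernate generates keys internally; `UuidDemo` is a hypothetical name used only for illustration.

```java
import java.util.UUID;

final class UuidDemo {
    private UuidDemo() {}

    /** Generates a random (version 4) UUID via java.util.UUID.randomUUID(). */
    static UUID newId() {
        return UUID.randomUUID();
    }

    /** Parses the canonical 36-character form (32 hex digits plus 4 hyphens). */
    static UUID parse(String text) {
        return UUID.fromString(text);
    }
}
```

Parsing and re-printing an identifier round-trips to the same lowercase 36-character string, which is the form shown in the example above.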

### Timestamp Management
- Stories, Authors, Series, and Tags all have a `created_at` timestamp; the `author_urls` and `story_tags` tables carry no timestamps
- Stories and Authors also have an `updated_at` timestamp (automatically updated)
- Series and Tags only have `created_at` (they're rarely modified)
- All timestamps use `LocalDateTime` in Java, stored as `TIMESTAMP` in PostgreSQL
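
The create/update behaviour described above can be sketched with a plain Java class. In a JPA entity the two methods would typically be wired to lifecycle callbacks such as `@PrePersist` and `@PreUpdate`; whether StoryCove does it that way is an assumption, and `AuditedRecord` is a hypothetical name.

```java
import java.time.LocalDateTime;

// Hypothetical stand-in for an entity with audit timestamps.
final class AuditedRecord {
    private LocalDateTime createdAt;
    private LocalDateTime updatedAt;

    /** Run once before the first insert; both timestamps start equal. */
    void onCreate(LocalDateTime now) {
        createdAt = now;
        updatedAt = now;
    }

    /** Run before each later update; createdAt is never touched again. */
    void onUpdate(LocalDateTime now) {
        updatedAt = now;
    }

    LocalDateTime getCreatedAt() { return createdAt; }
    LocalDateTime getUpdatedAt() { return updatedAt; }
}
```

Passing the clock value in as a parameter, instead of calling `LocalDateTime.now()` inside the methods, keeps the sketch deterministic and easy to test.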

### Text Fields
- **VARCHAR(n)**: For length-constrained text fields (names, titles, paths, URLs, descriptions)
- **TEXT**: For unbounded text content (story content, summaries, author notes)
- **HTML Content**: Stored as-is but sanitized on input and output
- **Plain Text**: Automatically extracted from HTML using Jsoup

### Validation Rules
- **Required Fields**: Entity names/titles are always required
- **Length Limits**: Names and titles limited to 255 characters (tag names to 100), descriptions to 1000
- **Rating Range**: All ratings constrained to the 1-5 range
- **URL Format**: No format validation at database level
- **Uniqueness**: Author, series, and tag names are unique within their entity type
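
A minimal sketch of how the story-related rules might be checked at the application level is shown below. `StoryValidator`, its constants, and the error messages are hypothetical; note also that `String.length()` counts UTF-16 code units, so the limits here only approximate the character limits PostgreSQL enforces on `VARCHAR(n)`.

```java
import java.util.ArrayList;
import java.util.List;

final class StoryValidator {
    private StoryValidator() {}

    static final int MAX_TITLE_LENGTH = 255;
    static final int MAX_DESCRIPTION_LENGTH = 1000;

    /** Returns a list of human-readable problems; an empty list means valid. */
    static List<String> validate(String title, String description, Integer rating) {
        List<String> errors = new ArrayList<>();
        if (title == null || title.isBlank()) {
            errors.add("title is required");
        } else if (title.length() > MAX_TITLE_LENGTH) {
            errors.add("title exceeds " + MAX_TITLE_LENGTH + " characters");
        }
        if (description != null && description.length() > MAX_DESCRIPTION_LENGTH) {
            errors.add("description exceeds " + MAX_DESCRIPTION_LENGTH + " characters");
        }
        // rating is optional; when present it must be within 1..5
        if (rating != null && (rating < 1 || rating > 5)) {
            errors.add("rating must be between 1 and 5");
        }
        return errors;
    }
}
```

Collecting all problems in one pass, instead of failing on the first, is one way to produce the user-friendly error messages that the database constraints alone cannot give.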

## Relationships and Cascading

### One-to-Many Relationships
- **Author → Stories**: Lazy loaded, cascade ALL operations
- **Series → Stories**: Lazy loaded, ordered by volume, cascade ALL
- **Author → Author URLs**: Eager loaded via `@ElementCollection`

### Many-to-Many Relationships
- **Stories ↔ Tags**: Via `story_tags` junction table
  - Managed bidirectionally with helper methods
  - Cascade DELETE on both sides
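
"Managed bidirectionally with helper methods" usually means that adding or removing a link updates both sides of the association in one call. The sketch below shows that pattern with plain Java classes and no JPA annotations; `DemoStory`, `DemoTag`, `addTag`, and `removeTag` are hypothetical names, not the project's actual entities.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the Tag entity (inverse side of the association).
final class DemoTag {
    final String name;
    final Set<DemoStory> stories = new HashSet<>();

    DemoTag(String name) { this.name = name; }
}

// Hypothetical stand-in for the Story entity (owning side of the association).
final class DemoStory {
    final String title;
    final Set<DemoTag> tags = new HashSet<>();

    DemoStory(String title) { this.title = title; }

    /** Links both sides so story.tags and tag.stories stay consistent. */
    void addTag(DemoTag tag) {
        tags.add(tag);
        tag.stories.add(this);
    }

    /** Unlinks both sides. */
    void removeTag(DemoTag tag) {
        tags.remove(tag);
        tag.stories.remove(this);
    }
}
```

Routing every change through the helpers avoids the situation where one side of the in-memory relationship is updated and the other is stale until the next reload.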

### Foreign Key Constraints
- All foreign keys enforce referential integrity
- DELETE operations cascade where configured; for example, deleting a story or a tag removes its `story_tags` rows
- No orphaned records are allowed

## Performance Considerations

### Indexing Strategy
- Primary keys are automatically indexed
- Foreign keys have dedicated indexes
- Frequently queried fields (`rating`, `created_at`) are indexed
- Unique constraints automatically create indexes

### Query Optimization
- Lazy loading defers fetching of associations until they are accessed; on its own it does not prevent N+1 queries, which arise when lazy associations are accessed in a loop
- Pagination is used for large result sets
- Specialized queries cover common access patterns
- Typesense handles full-text search (separate from PostgreSQL)

### Data Volume Estimates
- **Stories**: Expected 1K-10K records per user
- **Authors**: Expected 100-1K records per user
- **Tags**: Expected 50-500 records per user
- **Series**: Expected 10-100 records per user
- **Join Tables**: Scale with story count and tagging usage

## Backup and Migration Considerations

### Schema Evolution
- Uses Hibernate `ddl-auto: update` for development
- Production should use controlled migration tools (Flyway/Liquibase)
- UUID keys allow safe data migration between environments

### Data Integrity
- Foreign key constraints ensure referential integrity
- Check constraints validate data ranges
- Application-level validation provides user-friendly error messages
- Unique constraints prevent duplicate data

### Backup Strategy
- Full PostgreSQL dumps for complete backup
- Image files stored separately in filesystem
- Consider incremental backups for large installations
- Test restore procedures regularly

This data model provides a solid foundation for personal story library management with room for future enhancements while maintaining data integrity and performance.