Stefan Hardegger 0bb7c4f5e6 Initial Commit

2025-11-18 08:16:12 +01:00

MemoHanzi - Implementation Specification

Version: 1.0
Status: Ready for Implementation
Target: Claude Code
Application Name: MemoHanzi (记汉字 - "Remember Hanzi")

Quick Start Summary

What: MemoHanzi is a self-hosted web app for learning Chinese characters (hanzi) through spaced repetition (the SM-2 algorithm).

Tech Stack:

Next.js 16 (TypeScript, App Router, Server Actions)
PostgreSQL 18 + Prisma ORM
NextAuth.js v5 for authentication
Docker Compose deployment with Nginx reverse proxy
Tailwind CSS, React Hook Form, Zod validation, Recharts

MVP Timeline: 10-12 weeks

1. Core Features (MVP)

User Features

✅ Registration/Login with email & password
✅ Create and manage personal hanzi collections
✅ Browse and use global HSK-level collections
✅ Learning sessions with 4-choice pinyin quiz
✅ SM-2 spaced repetition algorithm
✅ Progress tracking & statistics dashboard
✅ Search hanzi database (by character, pinyin, meaning)
✅ User preferences (language, display options, learning settings)

Admin Features

✅ Import hanzi data (JSON/CSV from HSK vocabulary source)
✅ Manage global collections
✅ User management (roles, activation)

2. System Architecture

Deployment Stack

[Nginx Reverse Proxy:80/443] 
    ↓ HTTPS/Rate Limiting/Caching
[Next.js App:3000]
    ↓ Prisma ORM
[PostgreSQL:5432]

Project Structure

memohanzi/
├── src/
│   ├── app/              # Next.js App Router
│   │   ├── (auth)/       # Login, register
│   │   ├── (app)/        # Dashboard, learn, collections, hanzi, progress, settings
│   │   └── (admin)/      # Admin pages
│   ├── actions/          # Server Actions (auth, collections, hanzi, learning, etc.)
│   ├── components/       # React components
│   ├── lib/              # Utils (SM-2 algorithm, parsers, validation)
│   └── types/            # TypeScript types
├── prisma/
│   └── schema.prisma     # Database schema
├── docker/
│   ├── Dockerfile
│   └── nginx.conf
└── docker-compose.yml

3. Database Schema (Prisma)

Core Models

Language - Stores supported translation languages

Fields: code (ISO 639-1), name, nativeName, isActive

Hanzi - Base hanzi information

Fields: simplified (unique), radical, frequency
Relations: forms, hskLevels, partsOfSpeech, userProgress, collectionItems

HanziForm - Traditional variants

Fields: hanziId, traditional, isDefault
Relations: transcriptions, meanings, classifiers

HanziTranscription - Multiple transcription types

Fields: formId, type (pinyin/numeric/wadegiles/etc), value

HanziMeaning - Multi-language meanings

Fields: formId, languageId, meaning, orderIndex

HanziHSKLevel - HSK level tags

Fields: hanziId, level (e.g., "new-1", "old-3")

HanziPOS - Parts of speech

Fields: hanziId, pos (n/v/adj/etc)

HanziClassifier - Measure words

Fields: formId, classifier

User & Auth Models

User

Fields: email, password (hashed), name, role (USER/ADMIN/MODERATOR), isActive
Relations: collections, hanziProgress, preferences, sessions

UserPreference

Fields: preferredLanguageId, characterDisplay (SIMPLIFIED/TRADITIONAL/BOTH), transcriptionType, cardsPerSession, dailyGoal, removalThreshold, allowManualDifficulty

Account, Session, VerificationToken - NextAuth.js standard models

Learning Models

Collection

Fields: name, description, isGlobal, createdBy, isPublic
Relations: items (CollectionItem join table)

CollectionItem - Join table

Fields: collectionId, hanziId, orderIndex

UserHanziProgress - Tracks learning per hanzi

Fields: userId, hanziId, correctCount, incorrectCount, consecutiveCorrect
SM-2 fields: easeFactor (default 2.5), interval (default 1), nextReviewDate
Manual override: manualDifficulty (EASY/MEDIUM/HARD/SUSPENDED)

LearningSession - Track study sessions

Fields: userId, startedAt, endedAt, cardsReviewed, correctAnswers, incorrectAnswers, collectionId
Relations: reviews (SessionReview)

SessionReview - Individual card reviews

Fields: sessionId, hanziId, isCorrect, responseTime

4. Server Actions API

All actions return: { success: boolean, data?: T, message?: string, errors?: Record<string, string[]> }
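
The shared return shape can be written once as a generic type. The sketch below is one possible way to do that; the helper names `ok` and `fail` are hypothetical and not part of this specification.

```typescript
// Result envelope shared by all Server Actions, as described above.
type ActionResult<T> = {
  success: boolean;
  data?: T;
  message?: string;
  errors?: Record<string, string[]>;
};

// Hypothetical helper: build a successful result.
function ok<T>(data: T, message?: string): ActionResult<T> {
  return { success: true, data, message };
}

// Hypothetical helper: build a failed result, optionally with per-field errors.
function fail<T = never>(
  message: string,
  errors?: Record<string, string[]>,
): ActionResult<T> {
  return { success: false, message, errors };
}

// Example: a validation failure keyed by form field name.
const exampleFailure = fail("Validation failed", {
  email: ["Invalid email address"],
});
```

Keeping `errors` keyed by field name lets a form map each message list directly onto the matching input.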

Authentication (`src/actions/auth.ts`)

register(email, password, name) - Create account
login(email, password) - Authenticate
logout() - End session
updatePassword(currentPassword, newPassword) - Change password
updateProfile(name, email, image) - Update user

Collections (`src/actions/collections.ts`)

createCollection(name, description, isPublic) - New collection
updateCollection(id, data) - Modify (owner/admin only)
deleteCollection(id) - Remove (owner/admin only)
getCollection(id) - Get with hanzi
getUserCollections() - List user's collections
getGlobalCollections() - List HSK collections
addHanziToCollection(collectionId, hanziIds[]) - Add hanzi
removeHanziFromCollection(collectionId, hanziId) - Remove hanzi

Hanzi (`src/actions/hanzi.ts`)

searchHanzi(query, hskLevel?, limit, offset) - Search database (public)
getHanzi(id) - Get details (public)
getHanziBySimplified(char) - Lookup by character (public)

Learning (`src/actions/learning.ts`)

startLearningSession(collectionId?, cardsCount) - Begin session, returns cards
submitAnswer(sessionId, hanziId, selected, correct, time) - Record answer, updates SM-2
endSession(sessionId) - Complete, return summary
getDueCards() - Get counts (now, today, week)
updateCardDifficulty(hanziId, difficulty) - Manual override
removeFromLearning(hanziId) - Stop learning card
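
The three counts returned by getDueCards() could be derived from the stored nextReviewDate values roughly as follows. This is a sketch under assumptions the spec does not fix: counts are cumulative, "today" ends at midnight UTC, and "week" means today plus the following six days. The function name `countDueCards` is hypothetical.

```typescript
interface DueCounts {
  now: number;   // already due at this moment
  today: number; // due by the end of the current day (UTC, assumed)
  week: number;  // due within today plus the next six days (assumed)
}

const DAY_MS = 24 * 60 * 60 * 1000;

function countDueCards(nextReviewDates: readonly Date[], now: Date): DueCounts {
  const endOfToday = new Date(now.getTime());
  endOfToday.setUTCHours(23, 59, 59, 999);
  const endOfWeek = endOfToday.getTime() + 6 * DAY_MS;

  // Count every card whose review date falls at or before the given limit.
  const countUntil = (limit: number): number =>
    nextReviewDates.filter((d) => d.getTime() <= limit).length;

  return {
    now: countUntil(now.getTime()),
    today: countUntil(endOfToday.getTime()),
    week: countUntil(endOfWeek),
  };
}
```

In the real action these counts would more likely come from three database count queries; the in-memory version only illustrates the boundaries.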

Progress (`src/actions/progress.ts`)

getUserProgress(dateRange?) - Overall stats & charts
getHanziProgress(hanziId) - Individual hanzi stats
getLearningSessions(limit?) - Session history
getStatistics() - Dashboard stats
resetHanziProgress(hanziId) - Reset card

Preferences (`src/actions/preferences.ts`)

getPreferences() - Get settings
updatePreferences(data) - Update settings
getAvailableLanguages() - List languages

Admin (`src/actions/admin.ts`)

createGlobalCollection(name, description, hskLevel) - HSK collection
importHanzi(fileData, format) - Bulk import (JSON/CSV)
getImportHistory() - Past imports
getUserManagement(page, pageSize) - List users
updateUserRole(userId, role) - Change role
toggleUserStatus(userId) - Activate/deactivate

5. SM-2 Algorithm Implementation

Initial Values

easeFactor: 2.5
interval: 1 day
consecutiveCorrect: 0

On Correct Answer

if (consecutiveCorrect === 0) {
  interval = 1
} else if (consecutiveCorrect === 1) {
  interval = 6
} else {
  interval = Math.round(interval * easeFactor)
}

easeFactor = easeFactor + 0.1  // Fixed bonus; full SM-2 scales this adjustment by answer quality
consecutiveCorrect++
nextReviewDate = now + interval days

On Incorrect Answer

interval = 1
consecutiveCorrect = 0
nextReviewDate = now + 1 day
easeFactor = Math.max(1.3, easeFactor - 0.2)
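
The two branches above can be combined into a single pure function. The following is a minimal sketch of that; the names `Sm2State` and `applyReview` are assumptions for illustration, and the interval is computed from the ease factor before the +0.1 bonus is applied, following the order of the pseudocode.

```typescript
// Subset of UserHanziProgress fields that the scheduling rules touch.
interface Sm2State {
  easeFactor: number;         // starts at 2.5
  interval: number;           // days until the next review, starts at 1
  consecutiveCorrect: number; // starts at 0
  nextReviewDate: Date;
}

const MIN_EASE_FACTOR = 1.3;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the next state without mutating the input.
function applyReview(state: Sm2State, isCorrect: boolean, now: Date): Sm2State {
  if (!isCorrect) {
    // Incorrect: reset the streak, review again tomorrow, lower the ease factor.
    return {
      easeFactor: Math.max(MIN_EASE_FACTOR, state.easeFactor - 0.2),
      interval: 1,
      consecutiveCorrect: 0,
      nextReviewDate: new Date(now.getTime() + MS_PER_DAY),
    };
  }

  // Correct: 1 day, then 6 days, then grow by the current ease factor.
  let interval: number;
  if (state.consecutiveCorrect === 0) {
    interval = 1;
  } else if (state.consecutiveCorrect === 1) {
    interval = 6;
  } else {
    interval = Math.round(state.interval * state.easeFactor);
  }

  return {
    easeFactor: state.easeFactor + 0.1,
    interval,
    consecutiveCorrect: state.consecutiveCorrect + 1,
    nextReviewDate: new Date(now.getTime() + interval * MS_PER_DAY),
  };
}
```

Passing `now` as a parameter keeps the function deterministic, which makes the "all calculation paths" unit tests straightforward to write.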

Card Selection

Query: WHERE nextReviewDate <= now AND userId = currentUser
Apply manual difficulty (SUSPENDED = exclude, HARD = prioritize, EASY = deprioritize)
Sort: nextReviewDate ASC, incorrectCount DESC, consecutiveCorrect ASC
Limit to user's cardsPerSession
If not enough, add new cards from collections
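
The selection steps above can be sketched in memory as follows. The spec does not say how manual difficulty interacts with the sort order, so this sketch assumes it acts as the primary sort key (HARD first, EASY last). The names `ProgressRow` and `selectDueCards`, and the use of string IDs, are hypothetical.

```typescript
type ManualDifficulty = "EASY" | "MEDIUM" | "HARD" | "SUSPENDED";

interface ProgressRow {
  hanziId: string;
  nextReviewDate: Date;
  incorrectCount: number;
  consecutiveCorrect: number;
  manualDifficulty: ManualDifficulty | null;
}

// Assumed ranking: HARD first, EASY last, everything else in between.
function difficultyRank(d: ManualDifficulty | null): number {
  if (d === "HARD") return 0;
  if (d === "EASY") return 2;
  return 1;
}

function selectDueCards(
  rows: readonly ProgressRow[],
  now: Date,
  cardsPerSession: number,
  newHanziIds: readonly string[] = [],
): string[] {
  const due = rows
    .filter((r) => r.manualDifficulty !== "SUSPENDED")
    .filter((r) => r.nextReviewDate.getTime() <= now.getTime())
    .sort(
      (a, b) =>
        difficultyRank(a.manualDifficulty) - difficultyRank(b.manualDifficulty) ||
        a.nextReviewDate.getTime() - b.nextReviewDate.getTime() || // oldest due first
        b.incorrectCount - a.incorrectCount ||                     // most mistakes first
        a.consecutiveCorrect - b.consecutiveCorrect,               // shortest streak first
    )
    .map((r) => r.hanziId);

  // Top up with cards the user has no progress row for yet.
  const known = new Set(rows.map((r) => r.hanziId));
  const fresh = newHanziIds.filter((id) => !known.has(id));
  return due.concat(fresh).slice(0, cardsPerSession);
}
```

In production the filter, sort, and limit would normally be pushed into the Prisma query; the in-memory form is easier to unit test.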

Wrong Answer Generation

Select 3 random incorrect pinyin from same HSK level
Ensure no duplicates
Randomize order (Fisher-Yates shuffle)
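
A sketch of how the four answer options might be assembled is shown below. The function names `shuffle` and `buildChoices` are hypothetical, and the pool of same-level pinyin is assumed to be loaded by the caller.

```typescript
// Fisher-Yates shuffle on a copy; `random` is injectable so tests can be deterministic.
function shuffle<T>(items: readonly T[], random: () => number = Math.random): T[] {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = out[i] as T;
    out[i] = out[j] as T;
    out[j] = tmp;
  }
  return out;
}

// Returns the correct pinyin plus three distinct wrong options, in random order.
function buildChoices(
  correct: string,
  sameLevelPinyin: readonly string[],
  random: () => number = Math.random,
): string[] {
  // Drop duplicates and anything equal to the correct answer.
  const pool = Array.from(new Set(sameLevelPinyin)).filter((p) => p !== correct);
  if (pool.length < 3) {
    throw new Error("Need at least 3 distinct wrong answers");
  }
  const wrong = shuffle(pool, random).slice(0, 3);
  return shuffle([correct, ...wrong], random);
}
```

Throwing when the pool is too small is one option; a fallback to neighbouring HSK levels would be another, but the spec does not define that case.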

6. UI/UX Pages

Public

/ - Landing page
/login - Login form
/register - Registration form

Authenticated

/dashboard - Due cards, progress widgets, recent activity, quick start
/learn/[collectionId] - Learning session with cards
/collections - List all collections (global + user's)
/collections/[id] - Collection detail, hanzi list, edit
/collections/new - Create collection
/hanzi - Search hanzi (filters, pagination)
/hanzi/[id] - Hanzi detail (all transcriptions, meanings, etc)
/progress - Charts, stats, session history
/settings - User preferences

Admin

/admin/collections - Manage global collections
/admin/hanzi - Manage hanzi database
/admin/import - Import data (JSON/CSV upload)
/admin/users - User management

Key UI Components

LearningCard: Large hanzi, 4 pinyin options in 2x2 grid, progress bar
AnswerFeedback: Green/red feedback, show correct answer, streak, removal suggestion
CollectionCard: Name, count, progress, quick actions
DashboardWidgets: Due cards, daily progress, streak, recent activity
Charts: Activity heatmap, accuracy line chart, HSK breakdown bar chart

Design

Mobile-first responsive
Dark mode support
Tailwind CSS
Keyboard shortcuts (1-4 for answers, Space to continue)
WCAG 2.1 AA accessibility
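
The keyboard shortcuts listed above could be handled by a small pure mapping from key to action, which a keydown handler in the LearningCard component would then call. This sketch assumes that the number keys are only active while a question is open and that Space is only active once feedback is shown; the names `QuizAction` and `keyToAction` are hypothetical.

```typescript
type QuizAction =
  | { kind: "answer"; index: number } // zero-based index into the 4 options
  | { kind: "continue" }
  | null;

// `key` is the value of KeyboardEvent.key; Space is reported as " ".
function keyToAction(key: string, awaitingAnswer: boolean): QuizAction {
  if (awaitingAnswer && /^[1-4]$/.test(key)) {
    return { kind: "answer", index: Number(key) - 1 };
  }
  if (!awaitingAnswer && key === " ") {
    return { kind: "continue" };
  }
  return null; // any other key is ignored
}
```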

7. Data Import Formats

HSK JSON (from github.com/drkameleon/complete-hsk-vocabulary)

{
  "simplified": "爱好",
  "radical": "爫",
  "level": ["new-1", "old-3"],
  "frequency": 4902,
  "pos": ["n", "v"],
  "forms": [{
    "traditional": "愛好",
    "transcriptions": {
      "pinyin": "ài hào",
      "numeric": "ai4 hao4"
    },
    "meanings": ["to like; hobby"],
    "classifiers": ["个"]
  }]
}
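
To show how one entry in this format relates to the schema, the sketch below types the entry and flattens its transcriptions into rows shaped like HanziTranscription. It uses hand-written checks so that it stays self-contained; the project itself specifies Zod for validation. The names `HskEntry`, `parseHskEntry`, and `toTranscriptionRows`, and the defaults for missing optional fields, are assumptions.

```typescript
interface HskForm {
  traditional: string;
  transcriptions: Record<string, string>;
  meanings: string[];
  classifiers: string[];
}

interface HskEntry {
  simplified: string;
  radical: string;
  level: string[];
  frequency: number;
  pos: string[];
  forms: HskForm[];
}

// Hypothetical row shape mirroring HanziTranscription (formId resolved later).
interface TranscriptionRow {
  traditional: string;
  type: string;
  value: string;
}

// Parses one entry and applies assumed defaults for missing optional fields.
function parseHskEntry(json: string): HskEntry {
  const e = JSON.parse(json) as Partial<HskEntry>;
  if (typeof e.simplified !== "string" || e.simplified === "") {
    throw new Error("Missing 'simplified'");
  }
  if (!Array.isArray(e.forms) || e.forms.length === 0) {
    throw new Error("Entry needs at least one form");
  }
  return {
    simplified: e.simplified,
    radical: e.radical ?? "",
    level: e.level ?? [],
    frequency: e.frequency ?? 0,
    pos: e.pos ?? [],
    forms: e.forms,
  };
}

// One row per (form, transcription type) pair.
function toTranscriptionRows(entry: HskEntry): TranscriptionRow[] {
  const rows: TranscriptionRow[] = [];
  for (const form of entry.forms) {
    for (const type of Object.keys(form.transcriptions)) {
      rows.push({
        traditional: form.traditional,
        type,
        value: form.transcriptions[type] as string,
      });
    }
  }
  return rows;
}
```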

CSV Format

simplified,traditional,pinyin,meaning,hsk_level,radical,frequency,pos,classifiers
爱好,愛好,ài hào,"to like; hobby",new-1,爫,4902,"n,v",个
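
Because the meaning and pos columns may themselves contain commas, a plain split on "," is not enough. The following is a minimal sketch of a quote-aware line parser; it assumes one record per line (no embedded newlines), and the names `parseCsvLine` and `toRecord` are hypothetical.

```typescript
// Splits one CSV line into fields, honouring double-quoted fields
// and doubled quotes ("") as an escaped quote character.
function parseCsvLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"' && line.charAt(i + 1) === '"') {
        current += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Pairs a header line with a data line; missing trailing columns become "".
function toRecord(headerLine: string, dataLine: string): Record<string, string> {
  const header = parseCsvLine(headerLine);
  const values = parseCsvLine(dataLine);
  const record: Record<string, string> = {};
  header.forEach((column, i) => {
    record[column] = values[i] ?? "";
  });
  return record;
}
```

With the sample row above, `toRecord` yields `pos` as the single string "n,v", which the importer would then split into individual parts of speech.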

8. Testing Strategy

Unit Tests (70% coverage target)

SM-2 algorithm - All calculation paths
Card selection logic - Sorting, filtering, limits
Parsers - JSON/CSV parsing, error handling
Validation schemas - Zod schemas

Integration Tests (80% of Server Actions)

Auth actions with database
Learning flow (start session, submit answers, end session)
Collection CRUD
Import process

E2E Tests (Critical paths)

Complete learning session
Create collection and add hanzi
Search hanzi
Admin import
Auth flow

Tools: Vitest (unit/integration), Playwright (E2E)

9. Development Milestones

Week 1: Foundation

Setup Next.js 16 project
Configure Prisma + PostgreSQL
Setup Docker Compose
Create all data models
Configure NextAuth.js

Week 2: Authentication

Registration/login pages
Middleware protection
User preferences
Integration tests

Week 3-4: Data Import

Admin role middleware
HSK JSON parser
CSV parser
Import UI and actions
Test with real HSK data

Week 5: Collections

Collections CRUD
Add/remove hanzi
Global HSK collections

Week 5 (continued): Hanzi Search

Search page
Filters (HSK level)
Hanzi detail view
Pagination

Week 6: SM-2 Algorithm

Implement algorithm
Card selection logic
Progress tracking
Unit tests (90%+ coverage)

Week 7-8: Learning Interface

Learning session pages
Card component
Answer submission
Feedback UI
Session summary
Keyboard shortcuts
E2E tests

Week 9: Dashboard & Progress

Dashboard widgets
Progress page
Charts (Recharts)
Statistics calculations

Week 10: UI Polish

Responsive layouts
Mobile navigation
Dark mode
Loading/empty states
Toast notifications
Accessibility improvements

Week 11: Testing & Docs

Complete test coverage
E2E tests for all critical flows
README and documentation
Security audit

Week 12: Deployment

Production environment
Docker deployment
SSL certificates
Database backup
Import HSK data
Final testing

10. Docker Configuration

docker-compose.yml

version: '3.8'
services:
  nginx:
    image: nginx:alpine
    ports: ["80:80", "443:443"]
    volumes:
      - ./docker/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./docker/ssl:/etc/nginx/ssl:ro
    depends_on: [app]
  
  app:
    build:
      context: .
      dockerfile: docker/Dockerfile
    expose: ["3000"]
    environment:
      - DATABASE_URL=postgresql://memohanzi_user:password@postgres:5432/memohanzi_db
      - NEXTAUTH_URL=https://yourdomain.com
      - NEXTAUTH_SECRET=${NEXTAUTH_SECRET}
    depends_on:
      postgres:
        condition: service_healthy
  
  postgres:
    image: postgres:18-alpine
    environment:
      POSTGRES_USER: memohanzi_user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: memohanzi_db
    volumes:
      - postgres-data:/var/lib/postgresql  # postgres:18+ images keep data under /var/lib/postgresql/<major>/docker
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U memohanzi_user"]

volumes:
  postgres-data:

Environment Variables

# .env.local
DATABASE_URL="postgresql://memohanzi_user:password@localhost:5432/memohanzi_db"
NEXTAUTH_URL="http://localhost:3000"
NEXTAUTH_SECRET="generate-with-openssl-rand-base64-32"
NODE_ENV="development"

11. Security Checklist

Passwords hashed with bcrypt (10 rounds)
Session cookies set with httpOnly and sameSite
CSRF protection (NextAuth.js)
Rate limiting (Nginx)
Input validation (Zod, server-side)
SQL injection prevented (Prisma)
XSS prevention (React escaping)
HTTPS enforced (Nginx)
Secure headers (Nginx)
Role-based access enforced server-side
No sensitive data in logs
Environment variables for secrets

12. Phase 2 Features

Additional Languages - Multi-language support for meanings
Learning Modes - Radical identification, hanzi-to-meaning, meaning-to-hanzi, tone practice
Autocomplete Data - Auto-fill missing hanzi info from APIs
User Suggestions - Allow users to report/suggest corrections

13. Phase 3 Ideas

Writing practice (stroke order validation)
Social features (public collections, sharing)
Gamification (streaks, badges, leaderboards)
Mobile apps (React Native)
Audio pronunciation
Example sentences
Advanced SRS algorithms

14. Quick Reference Commands

Development:

# Start
docker-compose up
npm run dev

# Database
npx prisma migrate dev
npx prisma db seed
npx prisma studio

# Testing
npm run test
npm run test:e2e

Production:

# Deploy
docker-compose up -d --build

# Monitor
docker-compose logs -f

15. Success Criteria (MVP)

Technical:

All tests passing (70%+ coverage)
Can import complete HSK vocabulary (5000+ hanzi)
Page load <2s
Learning session interactions respond in under 100 ms
Mobile responsive

Functional:

Complete learning session works end-to-end
SM-2 algorithm calculates correctly
Progress tracking accurate
Collections management works
Search works efficiently

User Experience:

Can learn 20+ cards in 5-10 minutes
Interface intuitive
Daily use sustainable

Implementation Notes

Priority Order

Authentication (foundational)
Data import (need data)
Collections (organize learning)
Search (browse data)
Learning algorithm (core logic)
Learning interface (user interaction)
Progress tracking (motivation)
Polish & deploy

Critical Paths to Test

Register → Login → Create Collection → Add Hanzi → Start Learning → Complete Session → View Progress
Admin → Import HSK Data → Create Global Collection → User uses global collection
Search Hanzi → View Detail → Add to Collection → Learn

Key Implementation Files

prisma/schema.prisma - All data models
src/lib/learning/sm2.ts - SM-2 algorithm
src/lib/learning/card-selector.ts - Card selection
src/lib/import/hsk-parser.ts - Parse HSK JSON
src/actions/learning.ts - Learning Server Actions
src/app/(app)/learn/[collectionId]/page.tsx - Learning UI

Resources

HSK Data Source: https://github.com/drkameleon/complete-hsk-vocabulary
Next.js Docs: https://nextjs.org/docs
Prisma Docs: https://www.prisma.io/docs
NextAuth Docs: https://authjs.dev
SM-2 Algorithm: https://www.supermemo.com/en/archives1990-2015/english/ol/sm2

This specification is complete and ready for implementation with Claude Code.

Start with Milestone 1 (Week 1: Foundation) and proceed sequentially through the milestones.
