MemoHanzi - Implementation Package
Welcome! This package contains everything needed to implement MemoHanzi (记汉字 - "Remember Hanzi"), a self-hosted Chinese character learning application.
📦 What's Included
1. HANZI-LEARNING-APP-SPECIFICATION.md (Main Spec)
Complete technical specification including:
- ✅ System architecture
- ✅ Complete database schema (Prisma)
- ✅ All Server Actions API
- ✅ SM-2 algorithm implementation
- ✅ UI/UX specifications
- ✅ Testing strategy
- ✅ 12-week milestone plan
- ✅ Docker configuration
- ✅ Phase 2 & 3 roadmaps
2. CLAUDE.md (Instructions for Claude Code)
Strict implementation guidelines to prevent deviation from spec:
- Critical rules for following the specification
- Milestone approach
- Common mistakes to avoid
- Progress reporting format
3. PROJECT-NAMING.md (Naming Conventions)
All naming conventions for the MemoHanzi project:
- Application naming (display vs technical)
- Database naming
- Code conventions
- Branding guidelines
🚀 Quick Start
For Claude Code:
- Read CLAUDE.md first
- Read HANZI-LEARNING-APP-SPECIFICATION.md
- Read PROJECT-NAMING.md for naming conventions
- Start with Milestone 1 (Week 1: Foundation)
For Human Developers:
- Read HANZI-LEARNING-APP-SPECIFICATION.md; this is your blueprint
- Check PROJECT-NAMING.md for naming consistency
- Follow the 12-week milestone plan
- Use the testing strategy throughout
📋 Project Overview
Name: MemoHanzi (记汉字)
Tagline: Remember Hanzi, effortlessly
Type: Self-hosted web application
Purpose: Learn Chinese characters using spaced repetition (SM-2 algorithm)
Tech Stack:
- Next.js 16 (TypeScript, App Router)
- PostgreSQL 18 + Prisma
- NextAuth.js v5
- Docker Compose + Nginx
- Tailwind CSS
Timeline: 10-12 weeks for MVP
🎯 Key Features (MVP)
For Users:
- Learn Hanzi with spaced repetition
- Multiple choice pinyin quiz (4 options)
- Create custom collections or use HSK levels
- Track progress with charts and statistics
- Search complete Hanzi database
For Admins:
- Import HSK vocabulary (JSON/CSV)
- Manage global collections
- User administration
📁 Project Structure
memohanzi/ # Your project root
├── src/
│ ├── app/ # Next.js App Router
│ ├── actions/ # Server Actions
│ ├── components/ # React components
│ ├── lib/ # Utils (SM-2, parsers, etc)
│ └── types/ # TypeScript types
├── prisma/
│ └── schema.prisma # Database schema
├── docker/
│ ├── Dockerfile
│ └── nginx.conf
├── docker-compose.yml
└── README.md
🔑 Critical Implementation Notes
1. Follow the Specification EXACTLY
The specification is the single source of truth. Do not deviate without approval.
2. Implement Milestones Sequentially
Complete Week 1 → Week 2 → Week 3... in order.
3. SM-2 Algorithm is Critical
Use the EXACT formulas provided in Section 5 of the spec.
4. Testing is Mandatory
Write tests as you implement. Target: 70%+ coverage.
5. Database Schema is Fixed
Implement ALL models exactly as specified in the Prisma schema.
📊 Development Milestones
| Week | Milestone | Focus | Status |
|---|---|---|---|
| 1 | Foundation | Setup project, Docker, Prisma schema | ✅ Complete |
| 2 | Authentication | User registration, login, preferences | ✅ Complete |
| 3-4 | Data Import | Admin imports HSK data (JSON/CSV) | ✅ Complete |
| 5 | Collections | User collections + global HSK collections | ✅ Complete |
| 5 | Hanzi Search | Search interface and detail views | ✅ Complete |
| 6 | SM-2 Algorithm | Core learning algorithm + tests | ✅ Complete |
| 7-8 | Learning UI | Learning session interface | ✅ Complete |
| 9 | Dashboard | Progress tracking and visualizations | ✅ Complete |
| 10 | UI Polish | Responsive design, dark mode | 🔄 Next |
| 11 | Testing & Docs | Complete test coverage | |
| 12 | Deployment | Production deployment + data import | |
✅ Milestone 3 Completed Features
Data Import System:
- ✅ HSK JSON parser supporting complete-hsk-vocabulary format
- ✅ CSV parser with flexible column mapping
- ✅ Admin import page with file upload and paste functionality
- ✅ Update existing entries or skip duplicates option
- ✅ Detailed import results with success/failure counts and line-level errors
- ✅ Format validation and error reporting
- ✅ Support for multi-character hanzi (words like 中国)
- ✅ All transcription types (pinyin, numeric, wade-giles, zhuyin, ipa)
- ✅ 14 passing integration tests for both JSON and CSV parsers
Database Initialization System:
- ✅ Multi-file selection for batch initialization
- ✅ Real-time progress updates via Server-Sent Events (SSE)
- ✅ Progress bar showing current operation and percentage
- ✅ Automatic HSK level collection creation
- ✅ Auto-populate collections with hanzi based on level attribute
- ✅ Optional clean data mode (delete all existing data before import)
- ✅ Comprehensive statistics: hanzi imported, collections created, items added
- ✅ Admin initialization page at /admin/initialize
- ✅ SSE API route at /api/admin/initialize for long-running operations
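As a rough illustration of the real-time progress updates listed above, the sketch below serializes one progress update as a Server-Sent Events message. The `InitProgress` shape and the `formatSseEvent` helper are hypothetical names chosen for this example and are not taken from the specification; only the `data: ...` line followed by a blank line comes from the SSE wire format itself.

```typescript
// Hypothetical progress payload; the field names are assumptions for illustration.
interface InitProgress {
  operation: string; // current operation, e.g. "Importing hanzi"
  percent: number;   // completion percentage, 0-100
}

// Serialize one progress update as a Server-Sent Events message.
// An SSE message is one or more "data:" lines terminated by a blank line.
function formatSseEvent(progress: InitProgress): string {
  return `data: ${JSON.stringify(progress)}\n\n`;
}

const sseMessage = formatSseEvent({ operation: "Importing hanzi", percent: 50 });
console.log(sseMessage);
```

A route handler would write such strings to a streaming response with the `text/event-stream` content type, and the page could parse each message body as JSON to drive the progress bar.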
Files Created:
- src/lib/import/json-parser.ts - HSK JSON format parser
- src/lib/import/csv-parser.ts - CSV format parser
- src/lib/import/json-parser.test.ts - JSON parser tests
- src/lib/import/csv-parser.test.ts - CSV parser tests
- src/actions/admin.ts - Admin-only import and initialization actions
- src/actions/admin.integration.test.ts - Admin action tests
- src/app/(admin)/admin/import/page.tsx - Import UI
- src/app/(admin)/admin/initialize/page.tsx - Initialization UI with SSE progress
- src/app/api/admin/initialize/route.ts - SSE API endpoint for real-time progress
✅ Milestone 4 Completed Features
Collections Management:
- ✅ Complete CRUD operations for collections (create, read, update, delete)
- ✅ Global HSK collections (admin-created, read-only for users)
- ✅ User personal collections (full control)
- ✅ Add hanzi to collections via:
  - Search & multi-select with checkboxes
  - Paste list (comma, space, or newline separated)
  - Create collection with hanzi list
- ✅ Remove hanzi (individual and bulk selection)
- ✅ Collection detail view with hanzi list
- ✅ Order preservation for added hanzi
- ✅ Duplicate detection and validation
- ✅ 21 passing integration tests
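The paste-list input, order preservation, and duplicate detection described above could look roughly like the following. This is a minimal sketch, not the project's actual code: `parseHanziList` is a hypothetical helper, and accepting the full-width comma (，) as a separator is an assumption added for convenience.

```typescript
// Split pasted text on commas (ASCII or full-width), spaces, or newlines,
// drop empty entries, and remove duplicates while keeping first-seen order.
function parseHanziList(input: string): string[] {
  const entries = input
    .split(/[\s,，]+/)
    .filter((entry) => entry.length > 0);
  // A Set iterates in insertion order, so the pasted order is preserved.
  return Array.from(new Set(entries));
}

console.log(parseHanziList("你好, 中国\n学习 你好"));
```

Deduplicating on the client side like this only handles repeats within one paste; duplicates against hanzi already in the collection would still need a check on the server.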
Files Created:
- src/actions/collections.ts - Collection Server Actions
- src/actions/collections.integration.test.ts - Complete test suite
- src/app/(app)/collections/page.tsx - Collections list page
- src/app/(app)/collections/[id]/page.tsx - Collection detail page
- src/app/(app)/collections/new/page.tsx - Create collection page
✅ Milestone 5 Completed Features
Hanzi Search & Detail Views:
- ✅ Public hanzi search (no authentication required)
- ✅ Search by simplified, traditional, pinyin, or meaning
- ✅ HSK level filtering (12 levels: new-1 through new-6, old-1 through old-6)
- ✅ Pagination with hasMore indicator (20 results per page)
- ✅ Comprehensive detail view showing:
  - All forms (simplified, traditional with isDefault indicator)
  - All transcriptions (pinyin, numeric, wade-giles, etc.)
  - All meanings with language codes
  - HSK level badges
  - Parts of speech
  - Classifiers, radical, frequency
- ✅ Add to collection from detail page
- ✅ 16 passing integration tests
Files Created:
- src/actions/hanzi.ts - Public hanzi search actions
- src/app/(app)/hanzi/page.tsx - Search page with filters
- src/app/(app)/hanzi/[id]/page.tsx - Detail page with all data
- src/actions/hanzi.integration.test.ts - Complete test suite
Key Features:
- searchHanzi(): Fuzzy search across simplified, traditional, pinyin, and meanings
- HSK level filtering for targeted vocabulary
- Pagination with hasMore indicator for infinite scroll support
- Complete hanzi data display including rare transcription types
- Direct integration with collections (add from detail page)
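One common way to produce a hasMore indicator without a separate count query is to fetch one row more than the page size. The sketch below shows that technique on an in-memory array; whether searchHanzi() works this way is an assumption, and `SearchPage` and `paginate` are hypothetical names.

```typescript
interface SearchPage<T> {
  results: T[];
  hasMore: boolean;
}

// Read pageSize + 1 rows starting at the page offset. If the extra row
// exists, at least one more page is available; the extra row is not returned.
function paginate<T>(rows: T[], page: number, pageSize: number = 20): SearchPage<T> {
  const start = (page - 1) * pageSize;
  const fetched = rows.slice(start, start + pageSize + 1);
  return {
    results: fetched.slice(0, pageSize),
    hasMore: fetched.length > pageSize,
  };
}

const sampleRows = Array.from({ length: 45 }, (_, i) => `hanzi-${i + 1}`);
console.log(paginate(sampleRows, 1).hasMore, paginate(sampleRows, 3).hasMore);
```

Against a database, the same idea corresponds to requesting `pageSize + 1` rows at the page offset and trimming the last row before returning.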
✅ Milestone 6 Completed Features
SM-2 Algorithm Implementation:
- ✅ Core SM-2 spaced repetition algorithm following SuperMemo specification
- ✅ Exact formulas for correct and incorrect answer calculations
- ✅ Initial values: easeFactor=2.5, interval=1, consecutiveCorrect=0
- ✅ Correct answer logic:
  - First correct: interval = 1 day
  - Second correct: interval = 6 days
  - Third+ correct: interval = Math.round(interval × easeFactor)
  - Increase easeFactor by 0.1 with each correct answer
- ✅ Incorrect answer logic:
  - Reset interval to 1 day
  - Reset consecutiveCorrect to 0
  - Decrease easeFactor by 0.2 (minimum 1.3)
  - Increment incorrectCount
- ✅ Card selection algorithm:
  - Filter out SUSPENDED cards
  - Select due cards (nextReviewDate ≤ now)
  - Priority: HARD > NORMAL > EASY
  - Sort by: nextReviewDate ASC, incorrectCount DESC, consecutiveCorrect ASC
  - Limit to cardsPerSession
- ✅ Wrong answer generation with Fisher-Yates shuffle
- ✅ 38 passing unit tests with 100% statement and line coverage
- ✅ 94.11% branch coverage (exceeds 90% requirement)
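The correct and incorrect answer rules listed above can be sketched as two pure functions. This is an illustration under stated assumptions rather than the project's implementation: the `CardProgress` shape and the function names `applyCorrectAnswer` and `applyIncorrectAnswer` are hypothetical, the starting incorrectCount of 0 is assumed, and for the third and later correct answers the sketch multiplies by the ease factor as it was before the 0.1 increase, which the list above does not specify. Section 5 of the spec remains the authority.

```typescript
// Hypothetical progress record; field names follow the list above.
interface CardProgress {
  easeFactor: number;
  interval: number; // days until the next review
  consecutiveCorrect: number;
  incorrectCount: number;
}

// Initial values from the list above (incorrectCount = 0 is an assumption).
const INITIAL_PROGRESS: CardProgress = {
  easeFactor: 2.5,
  interval: 1,
  consecutiveCorrect: 0,
  incorrectCount: 0,
};

function applyCorrectAnswer(p: CardProgress): CardProgress {
  const consecutiveCorrect = p.consecutiveCorrect + 1;
  let interval: number;
  if (consecutiveCorrect === 1) {
    interval = 1; // first correct answer
  } else if (consecutiveCorrect === 2) {
    interval = 6; // second correct answer
  } else {
    // Assumption: uses the ease factor before this answer's +0.1 increase.
    interval = Math.round(p.interval * p.easeFactor);
  }
  return { ...p, consecutiveCorrect, interval, easeFactor: p.easeFactor + 0.1 };
}

function applyIncorrectAnswer(p: CardProgress): CardProgress {
  return {
    easeFactor: Math.max(1.3, p.easeFactor - 0.2), // never below 1.3
    interval: 1,
    consecutiveCorrect: 0,
    incorrectCount: p.incorrectCount + 1,
  };
}

// Three correct answers in a row give intervals of 1, 6, then round(6 × 2.7) = 16 days.
let progress = INITIAL_PROGRESS;
for (let i = 0; i < 3; i++) {
  progress = applyCorrectAnswer(progress);
  console.log(progress.interval);
}
```

Returning a new object instead of mutating the input keeps the functions easy to unit test, which matters given the coverage targets above.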
Files Created:
- src/lib/learning/sm2.ts - Core algorithm implementation
- src/lib/learning/sm2.test.ts - Comprehensive unit tests
Functions Implemented:
- calculateCorrectAnswer() - Update progress for correct answers
- calculateIncorrectAnswer() - Update progress for incorrect answers
- selectCardsForSession() - Select due cards with priority sorting
- generateWrongAnswers() - Generate 3 incorrect options from same HSK level
- shuffleOptions() - Fisher-Yates shuffle for randomizing answer positions
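The Milestone 6 card selection rules (skip SUSPENDED cards, keep due cards, prioritize HARD over NORMAL over EASY, then sort and limit) might be expressed as a single filter, sort, and slice. The sketch assumes that difficulty priority is applied before the other sort keys, which the rules do not state explicitly; `ReviewCard` and `pickDueCards` are hypothetical names, not the signature of selectCardsForSession().

```typescript
type Difficulty = "HARD" | "NORMAL" | "EASY" | "SUSPENDED";

// Hypothetical card shape for illustration.
interface ReviewCard {
  id: string;
  difficulty: Difficulty;
  nextReviewDate: Date;
  incorrectCount: number;
  consecutiveCorrect: number;
}

// Lower rank sorts first: HARD > NORMAL > EASY.
const PRIORITY_RANK: Record<Difficulty, number> = {
  HARD: 0,
  NORMAL: 1,
  EASY: 2,
  SUSPENDED: 3,
};

function pickDueCards(cards: ReviewCard[], now: Date, cardsPerSession: number): ReviewCard[] {
  return cards
    // filter() returns a new array, so the sort below does not mutate the input.
    .filter((c) => c.difficulty !== "SUSPENDED" && c.nextReviewDate.getTime() <= now.getTime())
    .sort(
      (a, b) =>
        PRIORITY_RANK[a.difficulty] - PRIORITY_RANK[b.difficulty] ||
        a.nextReviewDate.getTime() - b.nextReviewDate.getTime() || // oldest due first
        b.incorrectCount - a.incorrectCount ||                     // most mistakes first
        a.consecutiveCorrect - b.consecutiveCorrect                // least mastered first
    )
    .slice(0, cardsPerSession);
}
```

Chaining comparisons with `||` works because each difference is 0 when two cards tie on that key, so evaluation falls through to the next key.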
✅ Milestone 7-8 Completed Features
Learning Interface:
- ✅ Learning session page (/learn/[collectionId]) with dynamic routing
- ✅ Large hanzi display (text-9xl) for easy reading
- ✅ 4 pinyin answer options in 2x2 grid layout
- ✅ Auto-submit after answer selection (100ms delay)
- ✅ Progress bar showing "Card X of Y" with percentage
- ✅ Green/red feedback overlay with checkmark/X icons
- ✅ Correct answer display for incorrect responses
- ✅ English meaning display after answer submission
- ✅ Session summary screen with statistics:
  - Total cards, correct/incorrect counts
  - Accuracy percentage
  - Session duration in minutes
- ✅ Keyboard shortcuts:
  - Press 1-4 to select answer options
  - Press Space to continue after feedback
- ✅ Loading and error states
- ✅ Responsive mobile-first design
Learning Server Actions:
- ✅ startLearningSession() - Initialize session with card selection and answer generation
- ✅ submitAnswer() - Record answer and update SM-2 progress
- ✅ endSession() - Mark session complete and return summary
- ✅ getDueCards() - Count cards due today/this week
- ✅ updateCardDifficulty() - Manual difficulty override (EASY/MEDIUM/HARD/SUSPENDED)
- ✅ removeFromLearning() - Suspend card from learning
SM-2 Integration:
- ✅ Automatic progress tracking with SM-2 algorithm
- ✅ Due card selection with priority sorting
- ✅ New card introduction when insufficient due cards
- ✅ Two-stage card randomization:
  - Random tiebreaker for equal-priority cards during selection
  - Final shuffle of selected cards for presentation
- ✅ Wrong answer generation from same HSK level
- ✅ Session tracking in database (LearningSession, SessionReview)
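To make the answer generation and shuffling concrete, here is a minimal sketch of a Fisher-Yates shuffle used to pick 3 wrong pinyin options and then randomize the 4 answer positions. The helper names `fisherYatesShuffle` and `buildAnswerOptions` are hypothetical, and passing in a ready-made list of same-level pinyin candidates is an assumption about how the data is supplied.

```typescript
// Fisher-Yates shuffle: returns a shuffled copy and leaves the input untouched.
function fisherYatesShuffle<T>(items: readonly T[], random: () => number = Math.random): T[] {
  const result = items.slice();
  for (let i = result.length - 1; i > 0; i--) {
    // Pick a random index in [0, i] and swap it into position i.
    const j = Math.floor(random() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

// Build the quiz options: the correct pinyin plus up to 3 distinct wrong
// answers drawn from candidates of the same HSK level, in random order.
function buildAnswerOptions(correct: string, sameLevelPinyin: readonly string[]): string[] {
  const distractorPool = Array.from(new Set(sameLevelPinyin)).filter((p) => p !== correct);
  const wrong = fisherYatesShuffle(distractorPool).slice(0, 3);
  return fisherYatesShuffle([correct, ...wrong]);
}

console.log(buildAnswerOptions("nǐ hǎo", ["xué xí", "zhōng guó", "nǐ hǎo", "péng you", "xué xí", "lǎo shī"]));
```

Filtering out the correct pinyin and deduplicating the pool first avoids presenting two identical options; if fewer than 3 distinct candidates exist, the sketch simply returns fewer options.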
Navigation Integration:
- ✅ "Start Learning" button on collection detail pages
- ✅ "Learn All" option on dashboard
- ✅ Routes: /learn/all and /learn/[collectionId]
Files Created:
- src/actions/learning.ts - Learning session Server Actions (700+ lines)
- src/app/(app)/learn/[collectionId]/page.tsx - Learning session UI (340+ lines)
Enhancements:
- ✅ English meaning display for vocabulary reinforcement
- ✅ Randomized card presentation to prevent demoralization
- ✅ All 38 SM-2 algorithm tests passing with 98.92% coverage
✅ Milestone 9 Completed Features
Dashboard Enhancements:
- ✅ Real-time statistics widgets replacing hardcoded zeros
- ✅ Due cards counter (now, today, this week)
- ✅ Total learned cards count
- ✅ Daily goal progress tracker (reviewed today / daily goal)
- ✅ Learning streak calculation (consecutive days with reviews)
- ✅ Recent activity section showing last 5 learning sessions
- ✅ Session cards with accuracy percentages and collection names
- ✅ Navigation link to progress page
Progress Page:
- ✅ Comprehensive progress page at /progress
- ✅ Date range selector (7/30/90/365 days)
- ✅ Summary statistics cards:
  - Cards reviewed in selected period
  - Overall accuracy percentage
  - Total cards in learning
  - Average session length (minutes)
- ✅ Daily Activity bar chart (Recharts):
  - Stacked correct/incorrect reviews by date
  - Interactive tooltips with detailed counts
- ✅ Accuracy Trend line chart:
  - Daily accuracy percentage over time
  - Smooth line visualization
- ✅ Session history table:
  - Sortable by date
  - Shows collection, cards reviewed, accuracy, session length
  - Responsive design
- ✅ Dark mode compatible color schemes
Progress Server Actions:
- ✅ getStatistics() - Returns due cards, total learned, daily goal, streak
- ✅ getUserProgress() - Returns overview stats and daily activity breakdown
- ✅ getLearningSessions() - Returns paginated session history
- ✅ getHanziProgress() - Individual hanzi progress details
- ✅ resetHanziProgress() - Reset card to initial state
Statistics Calculations:
- ✅ Streak calculation algorithm (consecutive days with reviews)
- ✅ Daily activity aggregation using Map for efficient grouping
- ✅ Accuracy calculations (correct / total reviews)
- ✅ Average session length (total duration / session count)
- ✅ Date range filtering for historical data
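The Map-based daily aggregation and the streak calculation above could be sketched as follows. Several details are assumptions made for this example: days are bucketed by UTC date, the streak is counted backward from today so a day without reviews ends it (and no reviews today means a streak of 0), and `ReviewRecord`, `aggregateByDay`, and `currentStreak` are hypothetical names.

```typescript
// Hypothetical review record for illustration.
interface ReviewRecord {
  reviewedAt: Date;
  correct: boolean;
}

interface DayTotals {
  correct: number;
  incorrect: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// UTC calendar day as "YYYY-MM-DD"; a real app may prefer the user's time zone.
function dayKey(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Group reviews by day in a single pass using a Map.
function aggregateByDay(reviews: ReviewRecord[]): Map<string, DayTotals> {
  const byDay = new Map<string, DayTotals>();
  for (const review of reviews) {
    const key = dayKey(review.reviewedAt);
    const totals = byDay.get(key) ?? { correct: 0, incorrect: 0 };
    if (review.correct) totals.correct += 1;
    else totals.incorrect += 1;
    byDay.set(key, totals);
  }
  return byDay;
}

// Count consecutive days with at least one review, walking back from today.
function currentStreak(byDay: Map<string, DayTotals>, today: Date): number {
  let streak = 0;
  let cursor = today.getTime();
  while (byDay.has(dayKey(new Date(cursor)))) {
    streak += 1;
    cursor -= DAY_MS; // UTC days are always 24 hours, so this steps one day back
  }
  return streak;
}

// Accuracy for one day: correct / total reviews (0 when there are no reviews).
function accuracy(totals: DayTotals): number {
  const total = totals.correct + totals.incorrect;
  return total === 0 ? 0 : totals.correct / total;
}
```

The same Map can feed both the stacked Daily Activity chart (correct and incorrect counts per day) and the Accuracy Trend chart (one accuracy value per day).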
Recharts Integration:
- ✅ Installed and configured Recharts library
- ✅ Line chart component for trends
- ✅ Bar chart component with stacking for activity
- ✅ Responsive containers for mobile/desktop
- ✅ Custom tooltips and legends
Files Created:
- src/actions/progress.ts - Progress tracking Server Actions (550+ lines)
- src/app/(app)/progress/page.tsx - Progress visualization page (380+ lines)
Files Modified:
- src/app/(app)/dashboard/page.tsx - Added real statistics and recent activity
- Navigation updated across dashboard and progress pages
🎨 Naming Conventions
User-Facing:
- Always: "MemoHanzi" (capitalized)
- Tagline: "Remember Hanzi, effortlessly"
- Include Chinese: 记汉字
Technical:
- Directory: memohanzi/
- Database: memohanzi_db
- User: memohanzi_user
- Container: memohanzi-app, memohanzi-postgres
See PROJECT-NAMING.md for complete guidelines.
📚 Data Source
HSK Vocabulary: https://github.com/drkameleon/complete-hsk-vocabulary/
This repository contains comprehensive HSK (Chinese proficiency test) vocabulary data that will be imported into MemoHanzi.
🔐 Security Highlights
- ✅ Passwords hashed with bcrypt
- ✅ NextAuth.js session management
- ✅ HTTPS via Nginx
- ✅ Rate limiting
- ✅ Input validation (Zod)
- ✅ SQL injection prevention (Prisma)
🧪 Testing Strategy
- Unit Tests (70%): Business logic, SM-2 algorithm, parsers
- Integration Tests (20%): Server Actions, database operations
- E2E Tests (10%): Critical user flows
Tools: Vitest (unit/integration), Playwright (E2E)
📖 Additional Resources
- Next.js Docs: https://nextjs.org/docs
- Prisma Docs: https://www.prisma.io/docs
- NextAuth Docs: https://authjs.dev
- SM-2 Algorithm: https://www.supermemo.com/en/archives1990-2015/english/ol/sm2
🎯 Success Criteria (MVP)
Technical:
- All tests passing (70%+ coverage)
- Can import complete HSK vocabulary
- Page load <2s
- Mobile responsive
Functional:
- Complete learning session works end-to-end
- SM-2 algorithm accurate
- Progress tracking working
- Collections management functional
- Search efficient
User Experience:
- Can learn 20+ cards in 5-10 minutes
- Interface intuitive
- Daily use sustainable
🚦 Getting Started Checklist
Before beginning implementation:
- Read complete specification document
- Understand the tech stack
- Review the milestone approach
- Check naming conventions
- Set up development environment
- Confirm understanding of SM-2 algorithm
💡 Important Reminders
- Read the spec before implementing anything
- Ask questions if anything is unclear
- Don't make up alternatives or "improvements"
- Write tests as you go
- Follow the milestone sequence
- Use consistent naming (check PROJECT-NAMING.md)
📞 Next Steps
For Claude Code: Start by saying: "I've read CLAUDE.md and HANZI-LEARNING-APP-SPECIFICATION.md. Ready to begin Milestone 1: Project Foundation."
For Human Developers:
- Set up your development environment
- Create project directory: memohanzi/
- Begin Milestone 1 tasks
- Reference the specification frequently
Good luck building MemoHanzi! 🎯
记汉字 - Remember Hanzi, effortlessly.