memohanzi/README.md
Stefan Hardegger 33377009d0 intialization
2025-11-21 13:27:37 +01:00

# MemoHanzi - Implementation Package
Welcome! This package contains everything needed to implement MemoHanzi (记汉字 - "Remember Hanzi"), a self-hosted Chinese character learning application.
## 📦 What's Included
### 1. **HANZI-LEARNING-APP-SPECIFICATION.md** (Main Spec)
Complete technical specification including:
- ✅ System architecture
- ✅ Complete database schema (Prisma)
- ✅ Complete Server Actions API
- ✅ SM-2 algorithm implementation
- ✅ UI/UX specifications
- ✅ Testing strategy
- ✅ 12-week milestone plan
- ✅ Docker configuration
- ✅ Phase 2 & 3 roadmaps
### 2. **CLAUDE.md** (Instructions for Claude Code)
Strict implementation guidelines to prevent deviation from the spec:
- Critical rules for following the specification
- Milestone approach
- Common mistakes to avoid
- Progress reporting format
### 3. **PROJECT-NAMING.md** (Naming Conventions)
All naming conventions for the MemoHanzi project:
- Application naming (display vs technical)
- Database naming
- Code conventions
- Branding guidelines
## 🚀 Quick Start
### For Claude Code:
1. Read `CLAUDE.md` first
2. Read `HANZI-LEARNING-APP-SPECIFICATION.md`
3. Read `PROJECT-NAMING.md` for naming conventions
4. Start with Milestone 1 (Week 1: Foundation)
### For Human Developers:
1. Read `HANZI-LEARNING-APP-SPECIFICATION.md` - this is your blueprint
2. Check `PROJECT-NAMING.md` for naming consistency
3. Follow the 12-week milestone plan
4. Use the testing strategy throughout
## 📋 Project Overview
**Name:** MemoHanzi (记汉字)
**Tagline:** Remember Hanzi, effortlessly
**Type:** Self-hosted web application
**Purpose:** Learn Chinese characters using spaced repetition (SM-2 algorithm)
**Tech Stack:**
- Next.js 16 (TypeScript, App Router)
- PostgreSQL 18 + Prisma
- NextAuth.js v5
- Docker Compose + Nginx
- Tailwind CSS
**Timeline:** 10-12 weeks for MVP
## 🎯 Key Features (MVP)
**For Users:**
- Learn Hanzi with spaced repetition
- Multiple choice pinyin quiz (4 options)
- Create custom collections or use HSK levels
- Track progress with charts and statistics
- Search complete Hanzi database
**For Admins:**
- Import HSK vocabulary (JSON/CSV)
- Manage global collections
- User administration
## 📁 Project Structure
```
memohanzi/ # Your project root
├── src/
│ ├── app/ # Next.js App Router
│ ├── actions/ # Server Actions
│ ├── components/ # React components
│ ├── lib/ # Utils (SM-2, parsers, etc)
│ └── types/ # TypeScript types
├── prisma/
│ └── schema.prisma # Database schema
├── docker/
│ ├── Dockerfile
│ └── nginx.conf
├── docker-compose.yml
└── README.md
```
## 🔑 Critical Implementation Notes
### 1. Follow the Specification EXACTLY
The specification is the single source of truth. Do not deviate without approval.
### 2. Implement Milestones Sequentially
Complete Week 1 → Week 2 → Week 3... in order.
### 3. SM-2 Algorithm is Critical
Use the EXACT formulas provided in Section 5 of the spec.
### 4. Testing is Mandatory
Write tests as you implement. Target: 70%+ coverage.
### 5. Database Schema is Fixed
Implement ALL models exactly as specified in the Prisma schema.
## 📊 Development Milestones
| Week | Milestone | Focus | Status |
|------|-----------|-------|--------|
| 1 | Foundation | Setup project, Docker, Prisma schema | ✅ Complete |
| 2 | Authentication | User registration, login, preferences | ✅ Complete |
| 3-4 | Data Import | Admin imports HSK data (JSON/CSV) | ✅ Complete |
| 5 | Collections | User collections + global HSK collections | ✅ Complete |
| 5 | Hanzi Search | Search interface and detail views | ✅ Complete |
| 6 | SM-2 Algorithm | Core learning algorithm + tests | ✅ Complete |
| 7-8 | Learning UI | Learning session interface | 🔄 Next |
| 9 | Dashboard | Progress tracking and visualizations | |
| 10 | UI Polish | Responsive design, dark mode | |
| 11 | Testing & Docs | Complete test coverage | |
| 12 | Deployment | Production deployment + data import | |
### ✅ Milestone 3 Completed Features
**Data Import System:**
- ✅ HSK JSON parser supporting complete-hsk-vocabulary format
- ✅ CSV parser with flexible column mapping
- ✅ Admin import page with file upload and paste functionality
- ✅ Option to update existing entries or skip duplicates
- ✅ Detailed import results with success/failure counts and line-level errors
- ✅ Format validation and error reporting
- ✅ Support for multi-character hanzi (words like 中国)
- ✅ All transcription types (pinyin, numeric, wade-giles, zhuyin, ipa)
- ✅ 14 passing integration tests for both JSON and CSV parsers
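
The CSV path above (flexible column mapping, success/failure counts, line-level errors) could be sketched roughly as follows. This is a minimal illustration and not the code in `src/lib/import/csv-parser.ts`: the `ColumnMapping`, `ImportRow`, and `ImportResult` types, the `parseCsvSketch` function, and the choice of three required columns are all assumptions made for the example. The naive comma split does not handle quoted fields.

```typescript
// Hypothetical types: the real parser's shapes may differ.
interface ColumnMapping { simplified: string; pinyin: string; meaning: string }
interface ImportRow { simplified: string; pinyin: string; meaning: string }
interface ImportResult {
  rows: ImportRow[];
  successCount: number;
  failureCount: number;
  errors: { line: number; message: string }[];
}

// Maps caller-supplied header names onto fields and records errors per line.
function parseCsvSketch(text: string, mapping: ColumnMapping): ImportResult {
  const lines = text.split(/\r?\n/);
  const header = (lines[0] ?? "").split(",").map((h) => h.trim());
  const idx = {
    simplified: header.indexOf(mapping.simplified),
    pinyin: header.indexOf(mapping.pinyin),
    meaning: header.indexOf(mapping.meaning),
  };
  const result: ImportResult = { rows: [], successCount: 0, failureCount: 0, errors: [] };

  if (idx.simplified < 0 || idx.pinyin < 0 || idx.meaning < 0) {
    result.errors.push({ line: 1, message: "Header is missing a mapped column" });
    return result;
  }

  for (let i = 1; i < lines.length; i++) {
    const line = lines[i] ?? "";
    if (line.trim() === "") continue; // skip blank lines
    const cells = line.split(",").map((c) => c.trim());
    const row: ImportRow = {
      simplified: cells[idx.simplified] ?? "",
      pinyin: cells[idx.pinyin] ?? "",
      meaning: cells[idx.meaning] ?? "",
    };
    if (!row.simplified || !row.pinyin || !row.meaning) {
      // Line numbers are 1-based and include the header row.
      result.errors.push({ line: i + 1, message: "Missing required value" });
      result.failureCount++;
      continue;
    }
    result.rows.push(row);
    result.successCount++;
  }
  return result;
}
```

Because the header names are passed in rather than hard-coded, the same function can read files whose columns are named or ordered differently.
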
**Database Initialization System:**
- ✅ Multi-file selection for batch initialization
- ✅ Real-time progress updates via Server-Sent Events (SSE)
- ✅ Progress bar showing current operation and percentage
- ✅ Automatic HSK level collection creation
- ✅ Auto-populate collections with hanzi based on level attribute
- ✅ Optional clean data mode (delete all existing data before import)
- ✅ Comprehensive statistics: hanzi imported, collections created, items added
- ✅ Admin initialization page at /admin/initialize
- ✅ SSE API route at /api/admin/initialize for long-running operations
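
To illustrate the real-time progress updates, here is a minimal sketch of turning a progress record into a Server-Sent Events frame. The `InitProgress` shape and the `toSseEvent` helper are hypothetical and are not taken from `src/app/api/admin/initialize/route.ts`. The only fixed part is the SSE wire format itself: a `data:` line terminated by a blank line.

```typescript
// Hypothetical payload: which fields the real route sends is an assumption.
interface InitProgress {
  operation: string; // e.g. "Importing hanzi"
  current: number;   // items processed so far
  total: number;     // items expected overall
}

// Serialises one progress update as a single SSE frame.
function toSseEvent(progress: InitProgress): string {
  const percent =
    progress.total > 0 ? Math.round((progress.current / progress.total) * 100) : 0;
  // An SSE message is "data: <payload>" followed by an empty line.
  return `data: ${JSON.stringify({ ...progress, percent })}\n\n`;
}
```

On the client, an `EventSource` (or a streamed `fetch` response) would receive each frame and could pass the parsed `percent` straight to the progress bar.
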
**Files Created:**
- `src/lib/import/json-parser.ts` - HSK JSON format parser
- `src/lib/import/csv-parser.ts` - CSV format parser
- `src/lib/import/json-parser.test.ts` - JSON parser tests
- `src/lib/import/csv-parser.test.ts` - CSV parser tests
- `src/actions/admin.ts` - Admin-only import and initialization actions
- `src/actions/admin.integration.test.ts` - Admin action tests
- `src/app/(admin)/admin/import/page.tsx` - Import UI
- `src/app/(admin)/admin/initialize/page.tsx` - Initialization UI with SSE progress
- `src/app/api/admin/initialize/route.ts` - SSE API endpoint for real-time progress
### ✅ Milestone 4 Completed Features
**Collections Management:**
- ✅ Complete CRUD operations for collections (create, read, update, delete)
- ✅ Global HSK collections (admin-created, read-only for users)
- ✅ User personal collections (full control)
- ✅ Add hanzi to collections via:
  - Search & multi-select with checkboxes
  - Paste list (comma, space, or newline separated)
  - Create collection with hanzi list
- ✅ Remove hanzi (individual and bulk selection)
- ✅ Collection detail view with hanzi list
- ✅ Order preservation for added hanzi
- ✅ Duplicate detection and validation
- ✅ 21 passing integration tests
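
The paste-list input (comma, space, or newline separated), together with order preservation and duplicate detection, could look roughly like the sketch below. `parseHanziList` is a hypothetical helper and not necessarily how `src/actions/collections.ts` does it. In particular, reporting duplicates in a separate array rather than rejecting the input is an assumption.

```typescript
// Splits pasted text into entries, keeping first-seen order and
// collecting repeated entries separately.
function parseHanziList(input: string): { hanzi: string[]; duplicates: string[] } {
  const hanzi: string[] = [];
  const duplicates: string[] = [];
  // One or more commas or whitespace characters act as a single separator.
  for (const entry of input.split(/[\s,]+/)) {
    if (entry === "") continue; // leading or trailing separators
    if (hanzi.indexOf(entry) === -1) {
      hanzi.push(entry);
    } else if (duplicates.indexOf(entry) === -1) {
      duplicates.push(entry);
    }
  }
  return { hanzi, duplicates };
}
```

For example, pasting `中国, 你好` on one line and `中国 好` on the next would yield the hanzi 中国, 你好, 好 in that order, with 中国 reported once as a duplicate.
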
**Files Created:**
- `src/actions/collections.ts` - Collection Server Actions
- `src/actions/collections.integration.test.ts` - Complete test suite
- `src/app/(app)/collections/page.tsx` - Collections list page
- `src/app/(app)/collections/[id]/page.tsx` - Collection detail page
- `src/app/(app)/collections/new/page.tsx` - Create collection page
### ✅ Milestone 5 Completed Features
**Hanzi Search & Detail Views:**
- ✅ Public hanzi search (no authentication required)
- ✅ Search by simplified, traditional, pinyin, or meaning
- ✅ HSK level filtering (12 levels: new-1 through new-6, old-1 through old-6)
- ✅ Pagination with `hasMore` indicator (20 results per page)
- ✅ Comprehensive detail view showing:
  - All forms (simplified, traditional with `isDefault` indicator)
  - All transcriptions (pinyin, numeric, wade-giles, etc.)
  - All meanings with language codes
  - HSK level badges
  - Parts of speech
  - Classifiers, radical, frequency
- ✅ Add to collection from detail page
- ✅ 16 passing integration tests
**Files Created:**
- `src/actions/hanzi.ts` - Public hanzi search actions
- `src/app/(app)/hanzi/page.tsx` - Search page with filters
- `src/app/(app)/hanzi/[id]/page.tsx` - Detail page with all data
- `src/actions/hanzi.integration.test.ts` - Complete test suite
**Key Features:**
- `searchHanzi()`: Fuzzy search across simplified, traditional, pinyin, and meanings
- HSK level filtering for targeted vocabulary
- Pagination with `hasMore` indicator for infinite scroll support
- Complete hanzi data display including rare transcription types
- Direct integration with collections (add from detail page)
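
One common way to produce a `hasMore` flag is to read one row more than the page size and check whether that extra row exists. The sketch below shows the idea on an in-memory array. The `paginate` helper and `Page` type are hypothetical, and whether `searchHanzi()` uses this technique is an assumption; with a database the same idea usually maps to a query that takes `pageSize + 1` rows.

```typescript
interface Page<T> {
  items: T[];
  hasMore: boolean;
}

// Returns one page (1-based) and whether at least one more result follows it.
function paginate<T>(all: readonly T[], page: number, pageSize = 20): Page<T> {
  const start = (page - 1) * pageSize;
  // Take one extra element so the caller learns about a next page
  // without a separate count.
  const chunk = all.slice(start, start + pageSize + 1);
  return {
    items: chunk.slice(0, pageSize),
    hasMore: chunk.length > pageSize,
  };
}
```

An infinite-scroll list can keep requesting the next page while `hasMore` is true and stop as soon as it turns false.
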
### ✅ Milestone 6 Completed Features
**SM-2 Algorithm Implementation:**
- ✅ Core SM-2 spaced repetition algorithm following the SuperMemo specification
- ✅ Exact formulas for correct and incorrect answer calculations
- ✅ Initial values: `easeFactor = 2.5`, `interval = 1`, `consecutiveCorrect = 0`
- ✅ Correct answer logic:
  - First correct: interval = 1 day
  - Second correct: interval = 6 days
  - Third+ correct: `interval = Math.round(interval × easeFactor)`
  - Increase `easeFactor` by 0.1 with each correct answer
- ✅ Incorrect answer logic:
  - Reset interval to 1 day
  - Reset `consecutiveCorrect` to 0
  - Decrease `easeFactor` by 0.2 (minimum 1.3)
  - Increment `incorrectCount`
- ✅ Card selection algorithm:
  - Filter out SUSPENDED cards
  - Select due cards (`nextReviewDate` ≤ now)
  - Priority: HARD > NORMAL > EASY
  - Sort by: `nextReviewDate` ASC, `incorrectCount` DESC, `consecutiveCorrect` ASC
  - Limit to `cardsPerSession`
- ✅ Wrong answer generation with Fisher-Yates shuffle
- ✅ 38 passing unit tests with 100% statement and line coverage
- ✅ 94.11% branch coverage (exceeds 90% requirement)
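
The update and selection rules listed above can be sketched as follows. This is an illustration under stated assumptions, not the code in `src/lib/learning/sm2.ts`: the `Progress` shape, the single `difficulty` field that also carries `SUSPENDED`, and the names `onCorrect`, `onIncorrect`, and `selectDueCards` are hypothetical. The sketch also assumes that the third-and-later interval uses the ease factor from before the +0.1 increase, that the ease factor is rounded to two decimals to avoid floating-point drift, and that the HARD > NORMAL > EASY priority is the primary sort key. Section 5 of the spec remains authoritative.

```typescript
// Hypothetical record: field names follow the list above, the model may differ.
type Difficulty = "EASY" | "NORMAL" | "HARD" | "SUSPENDED";

interface Progress {
  hanziId: string;
  easeFactor: number;
  interval: number; // days until the next review
  consecutiveCorrect: number;
  incorrectCount: number;
  nextReviewDate: Date;
  difficulty: Difficulty;
}

const DAY_MS = 24 * 60 * 60 * 1000;
const round2 = (x: number): number => Math.round(x * 100) / 100;

function onCorrect(p: Progress, now: Date): Progress {
  const consecutiveCorrect = p.consecutiveCorrect + 1;
  // Assumption: the multiplication uses the ease factor before this answer's +0.1.
  const interval =
    consecutiveCorrect === 1 ? 1
    : consecutiveCorrect === 2 ? 6
    : Math.round(p.interval * p.easeFactor);
  return {
    ...p,
    consecutiveCorrect,
    interval,
    easeFactor: round2(p.easeFactor + 0.1),
    nextReviewDate: new Date(now.getTime() + interval * DAY_MS),
  };
}

function onIncorrect(p: Progress, now: Date): Progress {
  return {
    ...p,
    consecutiveCorrect: 0,
    interval: 1,
    easeFactor: Math.max(1.3, round2(p.easeFactor - 0.2)), // floor at 1.3
    incorrectCount: p.incorrectCount + 1,
    nextReviewDate: new Date(now.getTime() + DAY_MS),
  };
}

const PRIORITY: Record<Difficulty, number> = { HARD: 0, NORMAL: 1, EASY: 2, SUSPENDED: 3 };

function selectDueCards(cards: readonly Progress[], now: Date, cardsPerSession: number): Progress[] {
  return cards
    .filter((c) => c.difficulty !== "SUSPENDED" && c.nextReviewDate.getTime() <= now.getTime())
    .sort(
      (a, b) =>
        PRIORITY[a.difficulty] - PRIORITY[b.difficulty] ||
        a.nextReviewDate.getTime() - b.nextReviewDate.getTime() ||
        b.incorrectCount - a.incorrectCount ||
        a.consecutiveCorrect - b.consecutiveCorrect,
    )
    .slice(0, cardsPerSession);
}
```

Under these assumptions, a new card answered correctly three times in a row gets intervals of 1, 6, and then `Math.round(6 × 2.7) = 16` days, while its ease factor rises from 2.5 to 2.8.
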
**Files Created:**
- `src/lib/learning/sm2.ts` - Core algorithm implementation
- `src/lib/learning/sm2.test.ts` - Comprehensive unit tests
**Functions Implemented:**
- `calculateCorrectAnswer()` - Update progress for correct answers
- `calculateIncorrectAnswer()` - Update progress for incorrect answers
- `selectCardsForSession()` - Select due cards with priority sorting
- `generateWrongAnswers()` - Generate 3 incorrect options from same HSK level
- `shuffleOptions()` - Fisher-Yates shuffle for randomizing answer positions
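
As a rough sketch of how the last two functions might fit together, the example below builds the four quiz options for one card. The `QuizCandidate` type and the `shuffle` and `buildOptions` helpers are hypothetical, and their signatures are not taken from `sm2.ts`. Throwing when fewer than three distractors exist is also an assumption; the real implementation may fall back to other levels instead.

```typescript
// Hypothetical candidate shape used only for this example.
interface QuizCandidate {
  hanzi: string;
  pinyin: string;
  hskLevel: string; // e.g. "new-1"
}

// Fisher-Yates shuffle that returns a new array; `rng` is injectable for tests.
function shuffle<T>(items: readonly T[], rng: () => number = Math.random): T[] {
  const out = [...items];
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1)); // 0 <= j <= i
    const tmp = out[i] as T;
    out[i] = out[j] as T;
    out[j] = tmp;
  }
  return out;
}

// Picks 3 distinct wrong pinyin from the same HSK level, then shuffles
// them together with the correct answer.
function buildOptions(
  correct: QuizCandidate,
  pool: readonly QuizCandidate[],
  rng: () => number = Math.random,
): string[] {
  const wrong: string[] = [];
  for (const candidate of shuffle(pool, rng)) {
    const usable =
      candidate.hskLevel === correct.hskLevel &&
      candidate.pinyin !== correct.pinyin &&
      wrong.indexOf(candidate.pinyin) === -1;
    if (usable) wrong.push(candidate.pinyin);
    if (wrong.length === 3) break;
  }
  if (wrong.length < 3) {
    throw new Error("Not enough distinct pinyin at this HSK level");
  }
  return shuffle([correct.pinyin, ...wrong], rng);
}
```

Shuffling the pool before picking keeps the distractors varied between sessions, and the final shuffle prevents the correct answer from always landing in the same position.
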
## 🎨 Naming Conventions
**User-Facing:**
- Always: "MemoHanzi" (capitalized)
- Tagline: "Remember Hanzi, effortlessly"
- Include Chinese: 记汉字
**Technical:**
- Directory: `memohanzi/`
- Database: `memohanzi_db`
- User: `memohanzi_user`
- Containers: `memohanzi-app`, `memohanzi-postgres`
See `PROJECT-NAMING.md` for complete guidelines.
## 📚 Data Source
HSK Vocabulary: https://github.com/drkameleon/complete-hsk-vocabulary/
This repository contains comprehensive HSK (Chinese proficiency test) vocabulary data that will be imported into MemoHanzi.
## 🔐 Security Highlights
- ✅ Passwords hashed with bcrypt
- ✅ NextAuth.js session management
- ✅ HTTPS via Nginx
- ✅ Rate limiting
- ✅ Input validation (Zod)
- ✅ SQL injection prevention (Prisma)
## 🧪 Testing Strategy
- **Unit Tests (70%):** Business logic, SM-2 algorithm, parsers
- **Integration Tests (20%):** Server Actions, database operations
- **E2E Tests (10%):** Critical user flows
**Tools:** Vitest (unit/integration), Playwright (E2E)
## 📖 Additional Resources
- **Next.js Docs:** https://nextjs.org/docs
- **Prisma Docs:** https://www.prisma.io/docs
- **NextAuth Docs:** https://authjs.dev
- **SM-2 Algorithm:** https://www.supermemo.com/en/archives1990-2015/english/ol/sm2
## 🎯 Success Criteria (MVP)
**Technical:**
- [ ] All tests passing (70%+ coverage)
- [ ] Can import complete HSK vocabulary
- [ ] Page load <2s
- [ ] Mobile responsive
**Functional:**
- [ ] Complete learning session works end-to-end
- [ ] SM-2 algorithm accurate
- [ ] Progress tracking working
- [ ] Collections management functional
- [ ] Search efficient
**User Experience:**
- [ ] Can learn 20+ cards in 5-10 minutes
- [ ] Interface intuitive
- [ ] Daily use sustainable
## 🚦 Getting Started Checklist
Before beginning implementation:
- [ ] Read complete specification document
- [ ] Understand the tech stack
- [ ] Review the milestone approach
- [ ] Check naming conventions
- [ ] Set up development environment
- [ ] Confirm understanding of SM-2 algorithm
## 💡 Important Reminders
1. **Read the spec before implementing anything**
2. **Ask questions if anything is unclear**
3. **Don't make up alternatives or "improvements"**
4. **Write tests as you go**
5. **Follow the milestone sequence**
6. **Use consistent naming (check PROJECT-NAMING.md)**
## 📞 Next Steps
**For Claude Code:**
Start by saying: "I've read CLAUDE.md and HANZI-LEARNING-APP-SPECIFICATION.md. Ready to begin Milestone 1: Project Foundation."
**For Human Developers:**
1. Set up your development environment
2. Create project directory: `memohanzi/`
3. Begin Milestone 1 tasks
4. Reference the specification frequently
---
**Good luck building MemoHanzi! 🎯**
记汉字 - Remember Hanzi, effortlessly.