A Rails 8 app for tracking Chinese vocabulary (HSK) learning progress using Anki review history.
The core idea: take a user's Anki flashcard collection and a Chinese dictionary (CC-CEDICT), combine them, and display HSK vocabulary grouped by learning state:
- Mastered — well-established characters reviewed many times
- Learning — characters seen but not yet reliably recalled
- Struggling — characters with a high lapse rate
- Not started — characters in the HSK list not yet studied
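As a rough illustration of how the four states could be told apart, the sketch below classifies an entry from its review and lapse counts. The thresholds and the method name `learning_state` are hypothetical and are not taken from the app's code; the real rules may differ.

```ruby
# Hypothetical thresholds, chosen only for illustration.
MASTERED_MIN_REVIEWS = 10
STRUGGLING_LAPSE_RATE = 0.3

# Classify one HSK entry from its total reviews and lapses.
def learning_state(reviews:, lapses:)
  return :not_started if reviews.zero?

  lapse_rate = lapses.to_f / reviews
  return :struggling if lapse_rate >= STRUGGLING_LAPSE_RATE
  return :mastered if reviews >= MASTERED_MIN_REVIEWS

  :learning
end
```

Checking the lapse rate before the review count means a frequently reviewed but frequently forgotten entry is reported as struggling rather than mastered.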
- Ruby 4.0.1 (see .ruby-version)
- SQLite3
- An Anki collection export (.colpkg file): optional, but needed for learning history
bundle install
bin/rails db:schema:load
Download and import CC-CEDICT:
bin/rails dictionary_download:cc_cedict
bin/rails dictionary_import:cc_cedict
bin/rails tag_download:hsk
bin/rails tag_import:hsk_2
bin/rails tag_import:hsk_3
A small number of HSK vocabulary items are not present in CC-CEDICT (multi-character phrases, erhua variants, modern terms). A curated set is included:
bin/rails dictionary_import:custom_entries
To report words still absent from both the dictionary and TSV fallback stubs:
bin/rails tag_import:audit_hsk_3_gaps
See docs/DATA_GAPS.md for current known counts and context.
Download and import frequency data used to populate DictionaryEntry#frequency_rank:
bin/rails frequency:download
bin/rails frequency:import
Export a backup from Anki (File → Export → Anki Collection Package) and unzip it:
unzip -o your_collection.colpkg collection.anki21 -d tmp/anki/
The app reads the Anki SQLite file directly as a second, read-only database connection. The path is configured in config/database.yml (tmp/anki/collection.anki21 for development).
Note: the Anki connection is marked database_tasks: false, so Rails migration commands cannot overwrite your collection file.
bin/rails anki:migrate_to_models[your@email.com]
This reads cards and review logs from the Anki collection and writes UserLearning and ReviewLog records into the primary database. It is safe to re-run because all imports are idempotent.
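To show what idempotent means here, the following minimal sketch imports the same cards twice into an in-memory Hash keyed by Anki card id. The Hash and the `import_cards` helper are stand-ins invented for this example; the real task writes ActiveRecord models and may key its records differently.

```ruby
# In-memory stand-in for the UserLearning table, keyed by Anki card id.
# (Hypothetical: the real task persists ActiveRecord models.)
def import_cards(store, cards)
  cards.each do |card|
    # Find-or-create by a stable key, then overwrite the attributes,
    # so a repeated run updates in place instead of duplicating.
    record = (store[card[:id]] ||= {})
    record[:reps] = card[:reps]
    record[:lapses] = card[:lapses]
  end
  store
end

store = {}
cards = [{ id: 1, reps: 12, lapses: 1 }, { id: 2, reps: 3, lapses: 2 }]
import_cards(store, cards)
import_cards(store, cards) # the second run leaves the store unchanged
```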
The migration targets the deck named "Mandarin: Vocabulary::a. HSK" and also picks up cards temporarily moved to Custom Study Sessions (via Anki's odid field).
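In Anki's cards table, did is the deck a card currently sits in, and odid holds its original deck while the card is in a filtered deck (0 otherwise). A minimal sketch of resolving a card's home deck follows; the `home_deck_id` helper and the deck ids are hypothetical, and the cards are plain Hashes rather than the app's models.

```ruby
# Return the deck a card really belongs to, looking through filtered decks.
def home_deck_id(card)
  odid = card.fetch(:odid, 0)
  odid.zero? ? card.fetch(:did) : odid
end

# Hypothetical deck ids for illustration.
HSK_DECK_ID = 100
cards = [
  { id: 1, did: 100, odid: 0 },   # sits in the HSK deck
  { id: 2, did: 999, odid: 100 }, # temporarily in a custom study session
  { id: 3, did: 200, odid: 0 }    # unrelated deck
]

hsk_card_ids = cards.select { |c| home_deck_id(c) == HSK_DECK_ID }.map { |c| c[:id] }
```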
bin/rails server
Create an account at http://localhost:3000/sign_up.
bundle exec rspec
The test suite uses a self-contained Anki test database built from in-memory seed data, so no real Anki file is required.
The app uses two SQLite databases simultaneously:
| Connection | Purpose | File |
|---|---|---|
| primary | Main app data (dictionary, users, learning records) | storage/development.sqlite3 |
| anki | Read-only connection to an Anki collection | tmp/anki/collection.anki21 |
- DictionaryEntry: a single Chinese character or phrase. Has many Meanings and Tags.
- Meaning: an English translation with pinyin and a Source (e.g. CC-CEDICT or learn_hanzi).
- Tag: hierarchical (self-referential). Used to organise vocab by HSK level and lesson.
- UserLearning: join between a User and a DictionaryEntry, with state (new, learning, mastered, suspended) migrated from Anki card queue values.
- ReviewLog: individual review events migrated from Anki, linked to a UserLearning.
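The queue-to-state migration could look roughly like the sketch below. The queue numbers follow Anki's card schema, but the mapping onto the app's four states (in particular treating the review queue as mastered) is an assumption made for illustration, as are the names `QUEUE_TO_STATE` and `state_for_queue`.

```ruby
# Anki card queue values: -3/-2 buried, -1 suspended, 0 new, 1 learning,
# 2 review, 3 learning with the next step at least a day away, 4 preview.
# The mapping onto app states below is an assumption, not the app's code.
QUEUE_TO_STATE = {
  -1 => "suspended",
  0 => "new",
  1 => "learning",
  2 => "mastered",
  3 => "learning"
}.freeze

# Returns nil for queues with no obvious state (buried, preview),
# leaving the decision to the caller.
def state_for_queue(queue)
  QUEUE_TO_STATE[queue]
end
```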
Anki::DB is an abstract ActiveRecord base that connects to the Anki SQLite. Anki::Note, Anki::Card, and Anki::Revlog are read-only models on top of it. ReadOnlyRecord (app/models/read_only_record.rb) raises on any write attempt.
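The raise-on-write behaviour can be pictured with the framework-free sketch below. It is not the app's ReadOnlyRecord: `ReadOnlyGuard`, `ReadOnlyError`, and `FakeAnkiCard` are hypothetical names, and the list of guarded methods is an assumption. In ActiveRecord the same effect is commonly achieved by overriding readonly? to return true.

```ruby
# Framework-free stand-in for a read-only base class.
class ReadOnlyError < StandardError; end

module ReadOnlyGuard
  # Assumed set of write operations to block.
  WRITE_METHODS = %i[save update destroy].freeze

  WRITE_METHODS.each do |name|
    define_method(name) do |*_args|
      raise ReadOnlyError, "#{self.class.name} is read-only"
    end
  end
end

# Hypothetical model used only to exercise the guard.
class FakeAnkiCard
  include ReadOnlyGuard

  attr_reader :id

  def initialize(id)
    @id = id
  end
end
```

Reads such as `FakeAnkiCard.new(1).id` work as usual, while calling `save`, `update`, or `destroy` raises `ReadOnlyError`.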
bin/rubocop # lint
bin/brakeman --no-pager # security scan
Production runs as a Docker container on a Synology NAS via Portainer.
Pushing to main triggers a GitHub Actions pipeline that builds a Docker image, pushes it to a private registry at gismo:5000 (Tailscale-accessible), and redeploys the Portainer stack automatically.
The compose stack definition and deployment tooling live in the companion composer project (stacks/learn-hanzi/docker-compose.yml). To redeploy manually:
cd ../composer
rake provision:learn_hanzi
Issues and pull requests welcome. See CLAUDE.md for development conventions (commit format, branching, TDD workflow) and docs/DEVELOPMENT.md for the project's early development history.