
繁轉簡一對多 (Traditional-to-Simplified: one-to-many mappings) #464

Open
danny0838 wants to merge 15 commits into BYVoid:master from danny0838:繁轉簡一對多



@danny0838

Contributor

No description provided.

@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch 2 times, most recently from 35cac23 to e86636a Compare June 29, 2020 13:52
@danny0838 danny0838 mentioned this pull request Jun 29, 2020
@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch 6 times, most recently from e7454e9 to b64c704 Compare July 2, 2020 12:50
@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch from b64c704 to 336bb20 Compare July 8, 2020 17:04
@danny0838

Contributor Author

Are there any major issues still outstanding with this PR?

@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch 3 times, most recently from f00d15e to a7693a1 Compare July 16, 2020 12:30
@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch from a7693a1 to 2b426cf Compare July 21, 2020 13:18
@frankslin

Collaborator

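This PR was opened a long time ago. Could you explain the following:

  1. At present there are no rules or tests linking STCharacters and TSCharacters, so why does the OpenCC TSCharacters.txt file need "one-to-many" entries? What new problem is this expected to solve?
  2. Is it the case that records such as 「仝 -> 仝 同」 and 「剋 -> 克 剋」 do not affect the conversion behavior of any mode? Could the change be limited to the few entries that do have an effect?
  3. For the entries to be added that do affect conversion behavior, could you explain why each of them is necessary?
  4. What makes the entries newly added to TSPhrases.txt notable?

Also, ts_multi.txt is an explanatory document that comes from 韵典网, and in most cases it does not need to be modified. If you insist on changing it, I suggest sending a PR directly to the 韵典网 project, since that is the source of this material.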
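Question 2 above can be illustrated with a small sketch. It assumes the text dictionary format `key<TAB>candidate1 candidate2 ...` and assumes that conversion always emits the first candidate; the helper names `parse_entry` and `convert_chars` are hypothetical and are not OpenCC APIs. Under those assumptions, extra candidates after the first one leave the output unchanged.

```python
def parse_entry(line):
    """Split one dictionary line into (key, [candidates]).

    Assumed format: key, a tab, then space-separated candidates.
    """
    key, values = line.rstrip("\n").split("\t")
    return key, values.split(" ")


def convert_chars(text, table):
    """Replace each character with its first candidate, if it has an entry."""
    out = []
    for ch in text:
        candidates = table.get(ch)
        out.append(candidates[0] if candidates else ch)
    return "".join(out)


# One-to-many records as quoted in the review comment.
with_extra = dict(parse_entry(l) for l in ["仝\t仝 同", "剋\t克 剋"])
# The same table reduced to entries whose first candidate differs from the key.
without_extra = dict(parse_entry(l) for l in ["剋\t克"])

sample = "仝剋"
print(convert_chars(sample, with_extra))
print(convert_chars(sample, without_extra))
```

Both calls print the same string in this sketch, which is the situation the question describes: the additional candidates document alternatives but do not change what a first-candidate converter emits.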

@danny0838

Contributor Author

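I basically covered this in #492 (comment), so I won't repeat it here.

I don't see a ts_multi.txt on 韵典网. Moreover, 韵典网 has been in disrepair for years (the maintenance date shown on the site only goes up to 2020), and it has no public source code or GitHub repo, so I don't know how to submit a proposal, nor can I expect that one would receive any attention.

Since OpenCC chose to include ts_multi as a document rather than simply linking to 韵典网, the character conversion standard should be updated as this project's needs require, instead of leaving an outdated or even erroneous document in place forever.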

@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch 9 times, most recently from 92c7ff9 to 6da8330 Compare May 30, 2026 14:15
danny0838 added a commit to danny0838/OpenCC that referenced this pull request May 30, 2026
@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch 2 times, most recently from cfb5fdf to eb6dfa2 Compare May 30, 2026 15:26
danny0838 added a commit to danny0838/OpenCC that referenced this pull request Jun 1, 2026
@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch from 8f96f49 to 71e5a02 Compare June 1, 2026 13:27
danny0838 added a commit to danny0838/OpenCC that referenced this pull request Jun 2, 2026
@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch from 71e5a02 to 51c0bd9 Compare June 2, 2026 01:57
danny0838 added a commit to danny0838/OpenCC that referenced this pull request Jun 3, 2026
@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch from 51c0bd9 to 8273b09 Compare June 3, 2026 06:22
danny0838 added a commit to danny0838/OpenCC that referenced this pull request Jun 7, 2026
@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch 2 times, most recently from c591bc4 to edb8578 Compare June 7, 2026 12:22
danny0838 added a commit to danny0838/OpenCC that referenced this pull request Jun 8, 2026
@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch 2 times, most recently from d9518d1 to 8e73599 Compare June 8, 2026 08:00
danny0838 added a commit to danny0838/OpenCC that referenced this pull request Jun 8, 2026
@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch from 8e73599 to 8c49c80 Compare June 8, 2026 12:19
danny0838 added 15 commits June 12, 2026 17:13
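* t2s: convert to 「札」 in principle, except for 「目劄」; additionally handle the variant character 「箚」
* s2t: most words containing 「劄」 can equally be written with 「札」; proper names are restored to 「劄」
* t2s: convert to 「捍」 in principle, except for 「扞格」
* s2t: most words containing 「扞」 can equally be written with 「捍」
* In Traditional Chinese, 「甯」 is used almost exclusively in personal names and rarely in ordinary vocabulary, so in principle it is output unchanged.
* t2s: convert to 「夹」, except for 「袷袢」
* s2t: words such as 「袷衣」 prefer 「袷」
* t2s: convert to 「溪」 in principle, except in personal names
* s2t: ancient place names and prefecture names are restored to 「谿」; formal traditional Chinese medicine acupoint names use 「谿」
* t2s: convert to 「径」 in principle, except in personal names and place names
* s2t: use 「逕」 for the sense of "directly"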
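The "default character mapping plus phrase-level exceptions" pattern in the commit notes above can be sketched as a forward longest-match lookup: phrase entries are tried first, and the per-character table is the fallback. This is a minimal illustration, not OpenCC's implementation; the function `convert` and every dictionary entry below are hypothetical examples chosen to mirror the notes, not lines taken from the PR's dictionary files.

```python
def convert(text, phrases, chars):
    """Forward longest-match conversion: phrases first, then single characters."""
    max_len = max((len(k) for k in phrases), default=1)
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible phrase first, down to length 2.
        for n in range(min(max_len, len(text) - i), 1, -1):
            piece = text[i:i + n]
            if piece in phrases:
                out.append(phrases[piece])
                i += n
                break
        else:
            # No phrase matched here: fall back to the character table.
            ch = text[i]
            out.append(chars.get(ch, ch))
            i += 1
    return "".join(out)


# t2s: 「扞」 becomes 「捍」 by default, but 「扞格」 is kept as an exception.
t2s_phrases = {"扞格": "扞格"}
t2s_chars = {"扞": "捍", "衛": "卫"}

# s2t: 「径」 becomes 「徑」 by default, but 「逕」 is used for the "directly" sense.
s2t_phrases = {"径自": "逕自"}
s2t_chars = {"径": "徑"}

print(convert("扞衛", t2s_phrases, t2s_chars))
print(convert("扞格不入", t2s_phrases, t2s_chars))
print(convert("路径", s2t_phrases, s2t_chars))
print(convert("径自", s2t_phrases, s2t_chars))
```

Listing an exception as a phrase that maps to itself keeps the default rule simple: the character table states the general case, and only the words that deviate from it need their own entries.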
@danny0838 danny0838 force-pushed the 繁轉簡一對多 branch from 8c49c80 to bed8279 Compare June 12, 2026 09:14
3 participants