Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 97,148 14,858 Updated Jun 2, 2026

Large Scale Chinese Corpus for NLP

1 Updated Jun 2, 2023

A Python module to bypass Cloudflare's anti-bot page.

Python 6,593 633 Updated Jun 10, 2025

Traditional Mandarin LLMs for Taiwan

Python 1,415 120 Updated Apr 20, 2025

DSPy: The framework for programming—not prompting—language models

Python 35,018 2,976 Updated Jun 11, 2026

🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.

MDX 75,611 8,208 Updated Mar 11, 2026

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 42,512 4,857 Updated Jun 14, 2026

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model. A low-resource Chinese llama+lora approach, with a structure modeled on alpaca

C 4,122 407 Updated Apr 18, 2025

We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tuning) together for easy use. We welcome open-source enthusiasts…

Jupyter Notebook 2,797 250 Updated Dec 12, 2023

A Traditional-Chinese instruction-following model with datasets based on Alpaca.

Python 137 17 Updated Mar 28, 2023

Toolkit for Chinese natural language processing

Java 2,690 716 Updated Nov 17, 2023

Categorized examples for fChart 6.0 and later

Fortran 9 16 Updated Oct 22, 2021

Moedict (萌典) website

Objective-C 649 99 Updated Jun 7, 2026

Taiwanese (Taigi) part-of-speech tagging, syntax, and tone sandhi

Python 2 1 Updated Sep 14, 2022

Takes all-Han-character and all-romanized input, aligns the two, and tags every word with its part of speech

Python 1 Updated Sep 2, 2022

Database mirror of the Taiwanese translation of the 台日大辭典 (Taiwanese-Japanese Dictionary)

Shell 18 11 Updated Oct 4, 2018

Comprehensive Python Cheatsheet

Python 1 Updated Jul 21, 2019

Taigi CWS/POS/NER natural language processing tool with Articut as kernel.

Python 25 6 Updated Jan 4, 2025

ACoLi CoNLL libraries: Several tools for processing, manipulating and transforming TSV formats (CoNLL-RDF, CoNLL-Merge, CQP4RDF)

7 1 Updated Nov 12, 2021

A web-based collaborative LaTeX editor

JavaScript 17,823 2,001 Updated Jun 12, 2026

A WaveRNN implementation

Python 201 46 Updated Oct 14, 2019

WaveRNN Vocoder + TTS

Python 2,188 688 Updated Jul 2, 2022

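API of Articut, a Chinese word segmenter that also provides semantic part-of-speech tagging. Word segmentation is the foundation of Chinese information processing. Articut uses no machine learning and needs no data model; relying only on the grammar rules of modern vernacular Chinese, it achieves an F1-measure above 94% and recall above 96% on SIGHAN 2005.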

Python 415 37 Updated Jun 4, 2026

Illegal factory reporting system

Python 73 33 Updated Apr 22, 2026

POS Tag for Bahasa Indonesia

Java 60 27 Updated Jun 11, 2016

working with data from map.coa.gov.tw

PHP 15 9 Updated Feb 26, 2018
Python 1 Updated Jul 27, 2018