Do interesting things
Basic Info
- China - Shanghai
- GitHub: Chi-Shan0707
I am currently an undergraduate student at the School of Mathematical Sciences, Fudan University, pursuing a double degree in Information and Computing Science and Artificial Intelligence.
At the same time, I have been actively self-studying interesting topics across computer science.
I am still looking for the intersection of these fields that I truly want to work on and study.
I hope to contribute to making society better.
My research interests
Currently, I am exploring the mathematical foundations of intelligence in large models and sequential decision-making. More specifically, I want to ground the concept of AI trust in mathematical interpretability rather than philosophy, and delve deeper into these areas through Reinforcement Learning methods.
See more of my motivation on my plan page.
My featured works
I enjoy building small but complete research systems, writing technical notes, and turning vague ideas into runnable artifacts.
code-not-text
Cross-domain limits of hand-crafted CoT-surface features: the same CoT features that predict math correctness (0.958) are noise for coding (0.434). See demo here~
Overview
A core probe of five hand-crafted features (token confidence, trajectory continuity, reflection count, novelty, neuron width) predicts solution correctness from the reasoning trace alone. On 31,040 runs across three domains:
- *Math (AIME, HMMT)*: AoA 0.958, signal at 10% of trace
- *Science (GPQA)*: AoA 0.799, confidence, not structure
- *Coding (LiveCodeBench)*: AoA 0.434, below token-confidence baseline
This isn't a feature engineering problem. I swept 83+ coding-specific scalars, added SSL pre-training, nonlinear MLPs, de-knotting, and a coding-specific run judge—all fail. The features measure reasoning quality in math but mere text fluency in coding: a measurement invariance failure. Correctness lives in the runtime, not in the text.
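To make "CoT-surface features" concrete, here is a minimal sketch of how two of the five scalars named above (token confidence and reflection count) could be computed from a trace prefix. The marker list, function names, and exact definitions are assumptions for illustration only, not the feature definitions used in code-not-text.

```python
import math
import re

# Hypothetical reflection markers; the project's actual list is not given here.
REFLECTION_MARKERS = ("wait", "hmm", "let me check", "actually")


def token_confidence(logprobs):
    """Mean per-token probability, given natural-log token logprobs."""
    if not logprobs:
        return 0.0
    return sum(math.exp(lp) for lp in logprobs) / len(logprobs)


def reflection_count(trace):
    """Count case-insensitive, whole-word occurrences of reflection markers."""
    text = trace.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(marker) + r"\b", text))
        for marker in REFLECTION_MARKERS
    )


def surface_features(trace, logprobs, prefix_fraction=1.0):
    """Compute features on the first `prefix_fraction` of the trace (0.1 = 10%)."""
    n_chars = max(1, int(len(trace) * prefix_fraction))
    n_tokens = max(1, int(len(logprobs) * prefix_fraction)) if logprobs else 0
    return {
        "token_confidence": token_confidence(logprobs[:n_tokens]),
        "reflection_count": reflection_count(trace[:n_chars]),
    }
```

Both scalars depend only on the text and its token probabilities, never on executing anything. That is consistent with the point above: such features can track reasoning quality in math while missing correctness that only shows up when code runs.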
TinyLoRA-GRPO-Coder
Low-parameter adaptation and reinforcement learning for code generation
Overview
An independent open-source reimplementation and adaptation of TinyLoRA + GRPO from [Learning to Reason in 13 Parameters](https://arxiv.org/abs/2602.04118), migrated from math reasoning to verifiable competitive-programming code generation. Built on Qwen2.5-Coder-3B with only a tiny number of shared trainable parameters, the project uses real compile-and-run rewards rather than static heuristics. I developed the full pipeline end to end, including data processing, training, multi-GPU setup, reward design, evaluation, and validation, which significantly strengthened my ability to turn a paper into a working research system.
My repos
- The following projects were primarily completed independently, with AI used only as an auxiliary tool where appropriate. These works were not conducted under a laboratory or research group; rather, they reflect my self-directed exploration, sustained learning, and independent implementation outside a formal research environment.
KaggleCompetitions
Participated in several Kaggle competitions (see repo: KaggleCompetitions), gaining broad exposure to and practical experience with various machine learning tools.
Hone My C Plus Plus
Explorations of Advanced Algorithms and Modern C++ can be found in Hone My C Plus Plus.
microgpt.cpp
A simple microgpt.cpp in ~300 lines (repo: microgpt.cpp).
Baseball
A Strike/Ball Classification Model using CNN + ResNet18 for baseball pitch analysis (repo: Baseball).
Sample Java
The essence of Randomized Algorithms—the "Art of Sampling"—and my first taste of Java. Repo: Sample Java.
DeepLearning / GenerativeModel / ReinforcementLearning
These three repositories serve as learning records covering theory → implementation.
- The following projects were completed in collaboration with others.
SVDomain
I propose SVDomain: a domain-conditioned low-rank framework for chain-of-thought analysis.
Overview
SVDomain is a domain-conditioned low-rank framework that builds feature views from token-level confidence and uncertainty statistics, trajectory summaries, and availability indicators, and learns a shared latent basis with a lightweight linear readout.
- Canonical pipeline: StandardScaler → TruncatedSVD → LogisticRegression
- Downstream tasks: EarlyStop, Best-of-N bridging, RL checkpoint ranking.
- Focus: when low-rank structure becomes predictive, how bases transfer across anchors, and how the same low-rank object can support both prediction and explanation.
This repository contains a paper-style writeup and code to reproduce experiments and analyses.
This project was completed in collaboration with others. My collaborator contributed the meta-level raw data foundation; I proposed the framework, designed and ran the experiments, and conducted the validation.
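The canonical pipeline named above maps directly onto scikit-learn components. The following is a minimal sketch on synthetic data, assuming scikit-learn and NumPy are installed. The feature matrix, labels, number of components, and train/test split are all invented for illustration and are not taken from the SVDomain repository. Domain conditioning (for example, fitting one such pipeline per domain) is not shown.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in: 400 runs x 20 features generated from 3 latent factors.
n_runs, n_features, n_latent = 400, 20, 3
latent = rng.normal(size=(n_runs, n_latent))
mixing = rng.normal(size=(n_latent, n_features))
X = latent @ mixing + 0.1 * rng.normal(size=(n_runs, n_features))
y = (latent[:, 0] > 0).astype(int)  # label depends on the first latent factor

# StandardScaler -> TruncatedSVD -> LogisticRegression, as one estimator.
probe = make_pipeline(
    StandardScaler(),
    TruncatedSVD(n_components=4, random_state=0),  # assumed rank
    LogisticRegression(max_iter=1000),
)

probe.fit(X[:300], y[:300])
scores = probe.predict_proba(X[300:])[:, 1]  # per-run correctness score
accuracy = probe.score(X[300:], y[300:])
```

Because the readout is linear over a small number of SVD components, each score can be traced back through the basis to the original features, which is one way a single low-rank object could serve both prediction and explanation.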
Service and Community Involvement
Beyond my personal projects, I also contribute to community-oriented open-source work.
- github-unflag-playbook-cn
view website here ~~~
A Chinese playbook documenting GitHub account flagging/recovery experiences, appeal processes, and case archives for mainland China developers.
- FDUGuideBook/nav-site
visit our website here
I contribute to this navigation site for the Fudan community on an ongoing basis.
Tech stack and tools
| Domain | Skills |
|---|---|
| Language | |
| IDE | |
| OS | |
| Other | |
Education
School of Mathematical Sciences, Fudan University
When clouds gather, the mountain grows lovelier still; when they part, it stands like a painting.
Clouds lend it shadow and light, and give shape to its height.