I'm a Master's student at Korea University's NLP&AI Lab, advised by Dr. Heuiseok Lim. I work at the intersection of LLM evaluation and mechanistic interpretability to make models measurable, transparent, and trustworthy.
- LLM Evaluation
  - Ability decomposition and benchmark auditing (mixture-of-abilities, contamination checks, robustness sweeps)
  - Reproducible pipelines, unified metrics, longitudinal tracking, and leaderboard design
  - Evaluation that correlates with user-perceived capability and downstream utility
- Mechanistic Interpretability
  - Discovering circuits and features via sparse autoencoders, probing, attribution, and targeted patching/ablations
  - Causal tracing and intervention studies to identify the mechanisms behind reasoning and coding
  - Model-edit-aware analyses to understand when changes help or harm capabilities
- AI Safety and Reliability
  - Auditing models for harmful behaviors and failure modes (e.g., deception, bias, adversarial vulnerability)
  - Continuous knowledge editing with retrieval for time-evolving domains (e.g., law)
If you're working on evaluation, interpretability, or AI safety, I'm happy to connect.