Designing language models that think, remember, and reason like humans, but at scale.
I'm an AI Research Engineer at KakaoBank, focusing on post-training and optimization of large language models that power real-world services for 26M+ active users.
My work explores how reasoning, memory, and alignment can be reimagined to make LLMs both cognitively inspired and production-ready.
Rather than scaling parameters alone, I study how models can think more like humans: reasoning in parallel, integrating contextual memory, and aligning with human trust and intention.
Current LLMs reason sequentially, generating one token at a time.
Inspired by recent directions such as Soft Token Reasoning (arXiv:2509.19170), I'm exploring ways to enable parallel and continuous inference: models that can revise, aggregate, and evolve thoughts before producing answers.
This connects symbolic reasoning with diffusion-like latent dynamics, aiming for human-parallel cognition.
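As a rough illustration of the soft-token idea, the sketch below feeds the full next-token distribution back into a toy model as a probability-weighted mixture of embeddings instead of a sampled token, discretizing only at the end. The GRU backbone, vocabulary size, and step count are stand-ins for illustration, not the method from arXiv:2509.19170 or any production model.

```python
# Minimal sketch of the soft-token idea: instead of sampling a discrete token at
# each step, feed the probability-weighted mixture of token embeddings back into
# the model, and discretize only at the end. The GRU backbone and sizes are toy
# stand-ins, not the method from arXiv:2509.19170.
import torch
import torch.nn as nn

VOCAB, D_MODEL, N_SOFT_STEPS = 1000, 64, 4

embed = nn.Embedding(VOCAB, D_MODEL)                     # token embedding table
backbone = nn.GRU(D_MODEL, D_MODEL, batch_first=True)    # stand-in for a transformer
lm_head = nn.Linear(D_MODEL, VOCAB)                      # hidden state -> vocab logits

def soft_step(hidden, state):
    """One continuous 'thought' step: logits -> soft mixture of embeddings."""
    probs = torch.softmax(lm_head(hidden), dim=-1)       # keep the full distribution
    soft_embedding = probs @ embed.weight                # (B, 1, D_MODEL) weighted mix
    return backbone(soft_embedding, state)               # revise the latent thought

prompt_ids = torch.randint(0, VOCAB, (1, 5))             # dummy prompt
out, state = backbone(embed(prompt_ids))                 # encode the prompt
hidden = out[:, -1:, :]                                  # last hidden state
for _ in range(N_SOFT_STEPS):                            # think before answering
    hidden, state = soft_step(hidden, state)

answer_token = lm_head(hidden).argmax(dim=-1)            # commit to a token at the end
print(answer_token)
```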
At KakaoBank, I lead post-training and inference optimization for 200B+ parameter LLMs, building high-impact reasoning agents in financial and service domains.
My work centers on:
- Interleaved reasoning combining function calls, memory, and tool use (a minimal sketch follows this list)
- Multi-instruction reasoning, enabling one instruction to branch into multiple sub-tasks
- Latency-optimized alignment, balancing inference speed with reasoning depth
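The sketch below illustrates the interleaved-reasoning loop from the first bullet: a model alternates free-form steps with function calls and memory lookups, folding each observation back into the context before answering. The tool names, action format, and `call_llm` stub are hypothetical placeholders, not the production agents.

```python
# Illustrative interleaved-reasoning loop: the model alternates free-form steps
# with function calls and memory lookups, folding each observation back into the
# context before answering. The tools, action format, and call_llm stub are
# hypothetical placeholders, not the production system.
import json
from typing import Callable, Dict

def lookup_memory(query: str) -> str:
    return f"(memory) nothing stored about '{query}'"        # placeholder store

def get_fx_rate(pair: str) -> str:
    return f"(tool) {pair} rate unavailable in this sketch"  # placeholder tool

TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup_memory": lookup_memory,
    "get_fx_rate": get_fx_rate,
}

def call_llm(transcript: str) -> str:
    """Stand-in for a model call: returns a JSON tool action or a final answer."""
    if "(tool)" not in transcript:
        return json.dumps({"action": "get_fx_rate", "input": "USD/KRW"})
    return "FINAL: I could not retrieve the rate, so I will ask the user to retry."

def interleaved_reasoning(user_request: str, max_steps: int = 4) -> str:
    transcript = f"User: {user_request}"
    for _ in range(max_steps):
        step = call_llm(transcript)
        if step.startswith("FINAL:"):                    # model decided to answer
            return step.removeprefix("FINAL:").strip()
        action = json.loads(step)                        # model chose a tool instead
        observation = TOOLS[action["action"]](action["input"])
        transcript += f"\n{step}\n{observation}"         # interleave the result
    return "Stopped after max_steps without a final answer."

print(interleaved_reasoning("What's the USD/KRW rate right now?"))
```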
Following earlier work on episodic and structured memory (PREMem, 2025), I study how models can construct and manage internal memory representations, learning to consolidate, forget, and contextualize experiences across sessions.
The goal is a reasoning loop that grounds decisions in structured, evolving memory.
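As a toy illustration of that loop (not PREMem itself), the sketch below keeps a keyed episodic store whose entries are strengthened when re-encountered, decay over time, and are dropped once they fall below a threshold. The keys, decay rate, and threshold are assumptions chosen for the example.

```python
# Toy episodic-memory sketch (not PREMem itself): entries are strengthened when
# re-encountered, decay over time, and are forgotten below a threshold. Keys,
# decay rate, and threshold are assumptions for illustration.
from __future__ import annotations

import time
from dataclasses import dataclass, field

@dataclass
class Episode:
    content: str
    strength: float = 1.0                       # grows on re-encounter, decays with time
    last_seen: float = field(default_factory=time.time)

class EpisodicMemory:
    def __init__(self, decay_per_sec: float = 0.01, forget_below: float = 0.2):
        self.episodes: dict[str, Episode] = {}
        self.decay_per_sec = decay_per_sec
        self.forget_below = forget_below

    def write(self, key: str, content: str) -> None:
        """Consolidate: strengthen an existing episode or store a new one."""
        if key in self.episodes:
            ep = self.episodes[key]
            ep.strength += 1.0
            ep.content = content                # keep the most recent phrasing
            ep.last_seen = time.time()
        else:
            self.episodes[key] = Episode(content)

    def tick(self) -> None:
        """Forget: decay strengths and drop episodes below the threshold."""
        now = time.time()
        for key in list(self.episodes):
            ep = self.episodes[key]
            ep.strength -= self.decay_per_sec * (now - ep.last_seen)
            if ep.strength < self.forget_below:
                del self.episodes[key]

    def recall(self, key: str) -> str | None:
        ep = self.episodes.get(key)
        return ep.content if ep else None

memory = EpisodicMemory()
memory.write("user.preferred_channel", "prefers push notifications over email")
memory.tick()
print(memory.recall("user.preferred_channel"))
```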
Reasoning and memory must ultimately be safe.
I develop and evaluate methods that ensure consistency, transparency, and calibration in model outputs: AI systems that reflect before responding and can justify their reasoning processes.
This connects deeply to my broader pursuit: aligning artificial reasoning with human cognition and ethics.
- Finding Diamonds in Conversation Haystacks: A Benchmark for Conversational Data Retrieval (EMNLP 2025 Industry). Yohan Lee, Yongwoo Song, Sangyeop Kim†
- Pre-Storage Reasoning for Episodic Memory: Shifting Inference Burden to Memory for Personalized Dialogue (EMNLP 2025). Sangyeop Kim*, Yohan Lee*, Sanghwa Kim, Hyunjong Kim, Sungzoon Cho†
- What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs (ACL 2025). Sangyeop Kim*, Yohan Lee*, Yongwoo Song*, Kimin Lee†
- HEISIR: Hierarchical Expansion of Inverted Semantic Indexing for Training-free Retrieval of Conversational Data using LLMs (NAACL 2025). Sangyeop Kim†, Hangyeul Lee, Yohan Lee
- SAFARI: Sample-specific Assessment Framework for AI in Real-world Interactions (NAACL 2025). Yohan Lee*, Sungho Park*, Sangwoo Han*, Yunsung Lee*†, Yongwoo Song, Adam Lee, Jiwung Hyun, Jaemin Kim, HyeJin Gong
Seoul, South Korea
Portfolio
LinkedIn
yhlee.nlp [at] gmail.com
"AI should not only scale in size, but in understanding: reasoning with reflection, memory, and humanity."