🧠 Yohan Lee (μš”ν•œ)

AI Research Engineer @ KakaoBank | Building Reasoning Systems at Human & Model Scale

Designing language models that think, remember, and reason β€” like humans, but at scale.


🧩 About Me

I’m an AI Research Engineer at KakaoBank, focusing on post-training and optimization of large language models that power real-world services for 26M+ active users.
My work explores how reasoning, memory, and alignment can be reimagined to make LLMs both cognitively inspired and production-ready.

Rather than scaling parameters alone, I study how models can think more like humans β€” reasoning in parallel, integrating contextual memory, and aligning with human trust and intention.


πŸ”­ Research Focus

🧠 Human-like Reasoning

Current LLMs reason sequentially, generating one token at a time.
Inspired by recent directions such as Soft Token Reasoning (arXiv:2509.19170), I’m exploring ways to enable parallel and continuous inference β€” models that can revise, aggregate, and evolve thoughts before producing answers.
This connects symbolic reasoning with diffusion-like latent dynamics, aiming for human-parallel cognition.
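
As a toy illustration of the parallel/continuous inference idea (my own sketch under stated assumptions, not an implementation of arXiv:2509.19170): instead of committing to one sampled token per step, feed back the probability-weighted mixture of token embeddings, so several candidate "thoughts" evolve together in the latent stream. The `gpt2` checkpoint and the step count are placeholder assumptions.

```python
# Minimal soft-token (continuous) reasoning sketch with a Hugging Face causal LM.
# Assumption: any model exposing an input-embedding table works the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder checkpoint, not the paper's model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()
embed = model.get_input_embeddings()  # token id -> embedding matrix

prompt = "Question: what is 17 + 25? Reasoning:"
inputs_embeds = embed(tok(prompt, return_tensors="pt").input_ids)

with torch.no_grad():
    for _ in range(8):  # 8 latent reasoning steps before decoding an answer
        logits = model(inputs_embeds=inputs_embeds).logits[:, -1, :]
        probs = torch.softmax(logits, dim=-1)
        # Soft token: expected embedding under the model's own distribution,
        # a superposition of candidate next tokens rather than a hard argmax.
        soft_token = probs @ embed.weight                      # (1, hidden)
        inputs_embeds = torch.cat([inputs_embeds, soft_token.unsqueeze(1)], dim=1)
```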

βš™οΈ Scalable Reasoning Systems

At KakaoBank, I lead post-training and inference optimization for 200B+ parameter LLMs, building high-impact reasoning agents in financial and service domains.

My work centers on:

  • Interleaved reasoning combining function calls, memory, and tool use (a minimal sketch follows this list)
  • Multi-instruction reasoning, enabling one instruction to branch into multiple sub-tasks
  • Latency-optimized alignment, balancing inference speed with reasoning depth
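
The sketch below shows the interleaved loop in its simplest form: the model alternates between emitting tool calls and plain-text answers, with each tool result fed back into context. `llm_complete`, `get_balance`, and the JSON call format are hypothetical placeholders for illustration, not KakaoBank APIs.

```python
import json

def get_balance(account_id: str) -> str:
    """Hypothetical tool: look up an account balance."""
    return json.dumps({"account_id": account_id, "balance": 1250000})

TOOLS = {"get_balance": get_balance}

def llm_complete(messages: list[dict]) -> str:
    """Scripted stand-in for a real model call: emit one tool call, then a
    final answer once a tool result is present in context."""
    if not any(m["role"] == "tool" for m in messages):
        return json.dumps({"tool": "get_balance", "args": {"account_id": "A-1"}})
    return "Your balance is 1,250,000 KRW."

def interleaved_reasoning(user_query: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_query}]
    for _ in range(max_steps):
        reply = llm_complete(messages)
        try:
            call = json.loads(reply)           # JSON => the model wants a tool
        except json.JSONDecodeError:
            return reply                       # plain text => final answer
        result = TOOLS[call["tool"]](**call["args"])
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "tool", "content": result})  # feed result back
    return "Step budget exhausted."

print(interleaved_reasoning("What is my balance?"))
```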

🧬 Memory & Cognitive Modeling

Following earlier work on episodic and structured memory (PREMem, 2025), I study how models can construct and manage internal memory representations β€” learning to consolidate, forget, and contextualize experiences across sessions.
The goal is a reasoning loop that grounds decisions in structured, evolving memory.
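
A minimal sketch of that loop, in the spirit of pre-storage reasoning: distill each raw episode into a compact fact before storing it (paying inference cost at write time rather than read time), let salience decay, and prune what falls below a floor. The `distill` heuristic and the overlap-based retrieval are illustrative assumptions, not PREMem's method.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Memory:
    fact: str                    # consolidated, structured form of the episode
    salience: float = 1.0        # decays over time; pruned when too low
    created: float = field(default_factory=time.time)

class EpisodicStore:
    def __init__(self, half_life_s: float = 3600.0, floor: float = 0.1):
        self.half_life_s, self.floor = half_life_s, floor
        self.items: list[Memory] = []

    def distill(self, utterance: str) -> str:
        """Pre-storage reasoning step (toy): normalize the raw episode into a
        compact fact at write time."""
        return utterance.strip().lower()

    def write(self, utterance: str) -> None:
        self.items.append(Memory(fact=self.distill(utterance)))

    def _decayed(self, m: Memory, now: float) -> float:
        return m.salience * 0.5 ** ((now - m.created) / self.half_life_s)

    def consolidate(self) -> None:
        """Forgetting pass: drop memories whose decayed salience is too low."""
        now = time.time()
        self.items = [m for m in self.items if self._decayed(m, now) >= self.floor]

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        q = set(query.lower().split())
        ranked = sorted(self.items,
                        key=lambda m: len(q & set(m.fact.split())),
                        reverse=True)
        return [m.fact for m in ranked[:k]]
```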

πŸ›‘οΈ Trustworthy & Human-Aligned AI

Reasoning and memory must ultimately be safe.
I develop and evaluate methods that ensure consistency, transparency, and calibration in model outputs β€” AI systems that reflect before responding and can justify their reasoning processes.
This connects deeply to my broader pursuit: aligning artificial reasoning with human cognition and ethics.
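
One concrete calibration check from this line of work is expected calibration error (ECE), shown below as a small self-contained sketch. Inputs are assumed to be per-example confidences and correctness flags from any evaluation run; the binning scheme is the standard equal-width variant.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    """Average |confidence - accuracy| gap over equal-width confidence bins,
    weighted by the fraction of examples falling in each bin."""
    conf = np.asarray(confidences, dtype=float)
    acc = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            gap = abs(conf[in_bin].mean() - acc[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

# A well-calibrated model's confidences track its accuracy (low ECE).
print(expected_calibration_error([0.9, 0.8, 0.6, 0.95], [1, 1, 0, 1]))
```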


πŸ† Publications

  • Finding Diamonds in Conversation Haystacks: A Benchmark for Conversational Data Retrieval (EMNLP 2025 Industry)
    Yohan Lee, Yongwoo Song, Sangyeop Kim†

  • Pre-Storage Reasoning for Episodic Memory: Shifting Inference Burden to Memory for Personalized Dialogue (EMNLP 2025)
    Sangyeop Kim*, Yohan Lee*, Sanghwa Kim, Hyunjong Kim, Sungzoon Cho†

  • What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs (ACL 2025)
    Sangyeop Kim*, Yohan Lee*, Yongwoo Song*, Kimin Lee†

  • HEISIR: Hierarchical Expansion of Inverted Semantic Indexing for Training-free Retrieval of Conversational Data using LLMs (NAACL 2025)
    Sangyeop Kim†, Hangyeul Lee, Yohan Lee

  • SAFARI: Sample-specific Assessment Framework for AI in Real-world Interactions (NAACL 2025)
    Yohan Lee*, Sungho Park*, Sangwoo Han*, Yunsung Lee*†, Yongwoo Song, Adam Lee, Jiwung Hyun, Jaemin Kim, HyeJin Gong


πŸ“« Connect

πŸ“ Seoul, South Korea
🌐 Portfolio
πŸ’Ό LinkedIn
πŸ“§ yhlee.nlp [at] gmail.com


β€œAI should not only scale in size, but in understanding reasoning with reflection, memory, and humanity.”

📌 Pinned Repositories

  1. deepspeedai/DeepSpeed-MII: MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

  2. tunib-ai/oslo (archived): OSLO, an Open Source framework for Large-scale model Optimization.

  3. jskwak98/Bookathon3_Bookie_On_And_On

  4. LostCow/KLUE: KLUE Benchmark 1st place (2021.12) solutions (RE, MRC, NLI, STS, TC).

  5. daily_papers_ko (archived): Automatically translates and summarizes Hugging Face's daily papers into Korean using ChatGPT.

  6. CDR-Benchmark: Finding Diamonds in Conversation Haystacks: A Benchmark for Conversational Data Retrieval (EMNLP 2025 Industry Track).