- M.S., Graduate School of Artificial Intelligence, POSTECH (2023.02 - 2025.02)
  - Lab: POSTECH NLP Lab
  - Advisor: Gary Geunbae Lee
  - Co-Advisors: Jungseul Ok, Yunsu Kim
  - Master's Thesis: Leveraging Code-Switched Data to Improve Multi-Vector Retrievers for Cross-Language Information Retrieval
- B.S. in Information Convergence, Kwangwoon University (2017.03 - 2023.02)
  - Major: Data Science
- Internship at AIRC-KETI (2022.04 - 2022.11)
- I am an NLP researcher specializing in Information Retrieval. My research interests span search across diverse language scenarios and domains, such as Cross-Language and Mixed-Language Information Retrieval, as well as advancing Retrieval-Augmented Generation (RAG) systems.
- Keywords: Cross-Language & Mixed-Language Information Retrieval, Information Retrieval, Re-ranking, RAG, Code-Switching
- Jonghwi Kim, Deokhyung Kang, Seonjeong Hwang, Yunsu Kim, Jungseul Ok, Gary Geunbae Lee. (2025). MiLQ: Benchmarking IR Models for Bilingual Web Search with Mixed Language Queries. (EMNLP 2025 main). [Paper]
- Daehee Kim, Deokhyung Kang, Jonghwi Kim, Sangwon Ryu, Gary Geunbae Lee. (2025). GuRE: Generative Query REwriter for Legal Passage Retrieval. arXiv preprint arXiv:2505.12950. [Paper]
- San Kim, Jonghwi Kim, Yejin Jeon, Gary Geunbae Lee. (2025). Safeguarding RAG Pipelines with GMTP: A Gradient-based Masked Token Probability Method for Poisoned Document Detection. In Findings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025 findings). [Paper]
- Heejin Do*, Sangwon Ryu*, Jonghwi Kim, Gary Geunbae Lee. (2025). Multi-Facet Blending for Faceted Query-by-Example Retrieval. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025 main). [Paper]
- Jonghwi Kim, Yunsu Kim, Gary Geunbae Lee. (2023). ColBERT with Adversarial Language Adaptation for Multilingual Information Retrieval. In Proceedings of the Annual Conference on Human and Cognitive Language Technology (HCLT 2023 oral), pp. 239–244. [Paper]
- Jonghwi Kim, Saim Shin, Jin Yea Jang. (2022). A Clustering-based Undersampling Method to Prevent Information Loss from Text Data. In Proceedings of the Annual Conference on Human and Cognitive Language Technology (HCLT 2022 oral), pp. 251–256. [Paper]