Juhyun Oh

Hi! I am a third-year Ph.D. student in the School of Computing at KAIST, advised by Alice Oh. My research focuses on developing evaluation methods for language models that reflect multilingual, multicultural, and interactive real-world use. I am particularly interested in the gap between benchmark performance and how people experience models in everyday interactions, especially for users whose linguistic and cultural backgrounds differ from those of AI developers. To better understand and close this gap, I study both the behavioral patterns of language models and human-centered evaluation practices.

My work centers on two themes:

Understanding model behaviors: identifying where and why language models diverge from human reasoning in generation, evaluation, and interaction. (Eval Paradox, Uncovering Factor Level Preferences, Lovers or Friends)
Human-centered evaluation: developing interactive, context-aware, and culturally informed evaluation methods for multilingual and multicultural settings. (Multi-FAct, Intentionally Cultural Evaluation, OLA)

My long-term goal is to build evaluation practices that make language models more reliable and meaningful for real-world users.

Email: 411juhyun [at] kaist.ac.kr

Links: [Google Scholar] [Twitter] [CV]

News

(February 2026) Joining Mila as a visiting scholar for four months, working with Prof. David Adelani. Looking forward to living in Montreal!
(January 2026) One paper accepted at EACL Findings 2026!
(December 2025) Attending NeurIPS to present a paper at the LLM-eval workshop.
(November 2025) Attending EMNLP to present two papers (Uncovering Factor Level Preferences, Intentionally Cultural Evaluation).

Selected Publications

For a complete list, see my Google Scholar profile.

OLA: Output Language Alignment in Code-Switched LLM Interactions
Juhyun Oh, Haneul Yoo*, Faiz Ghifari Haznitrama*, Alice Oh
arXiv, 2026
Are they lovers or friends? Evaluating LLMs' Social Reasoning in English and Korean Dialogues
Eunsu Kim, Junyeong Park, Juhyun Oh, Alice Oh
arXiv, 2026
Culture is Everywhere: A Call for Intentionally Cultural Evaluation
Juhyun Oh, Inha Cha, Michael Saxon, Hyunseung Lim, Shaily Bhatt, Alice Oh
Findings of EMNLP, 2025
Uncovering Factor Level Preferences to Improve Human-Model Alignment
Juhyun Oh*, Eunsu Kim*, Jiseon Kim, Wenda Xu, William Yang Wang, Alice Oh
Findings of EMNLP, 2025
Flex-TravelPlanner: A Benchmark for Flexible Planning with Language Agents
Juhyun Oh, Eunsu Kim, Alice Oh
Reasoning and Planning for LLMs @ ICLR, 2025
Multi-FAct: Assessing Multilingual LLMs' Multi-Regional Knowledge using FActScore
Sheikh Shafayat, Eunsu Kim*, Juhyun Oh*, Alice Oh
COLM, 2024
The Generative AI Paradox in Evaluation: "What It Can Solve, It May Not Evaluate"
Juhyun Oh*, Eunsu Kim*, Inha Cha*, Alice Oh
Student Research Workshop @ EACL, 2024
Unlocking the tacit knowledge of data work in machine learning
Inha Cha*, Juhyun Oh*, Cheul Young Park*, Jiyoon Han, Hwalsuk Lee
CHI Extended Abstracts, 2023
KOLD: Korean offensive language dataset
Younghoon Jeong, Juhyun Oh, Jaimeen Ahn, Jongwon Lee, Jihyung Moon, Sungjoon Park, Alice Oh
EMNLP, 2022
KLUE: Korean Language Understanding Evaluation
Sungjoon Park, Jihyung Moon, ..., Juhyun Oh, ..., Alice Oh, Jung-Woo Ha, Kyunghyun Cho
NeurIPS D&B, 2021

Teaching Experience

(Spring, 2025) Teaching Assistant @ CS204 Discrete Mathematics (KAIST)
(Fall, 2024) Teaching Assistant @ AI Tech Boostcamp (NAVER Connect Foundation)
(Spring, 2024) Teaching Assistant @ CS575 AI Ethics (KAIST)