Muhammad Dehan Al Kautsar
I am a Research Engineer at MBZUAI working on multilinguality and dialogue in NLP. I like tweaking and tinkering with tokenization and representations to understand how LLMs handle underrepresented languages. I am also interested in language code-mixing and code-switching in NLP. My goal is to make those models more inclusive, especially for languages across the Global South.
I also enjoy playing piano and football in my spare time. If you’re interested in collaborating or discussing research (or anything else), feel free to get in touch!
Easter Egg🥚
Yep... this whole site is rocking a Red Velvet–themed vibe🍰.
The 🟡🟣 accents and the 🐻 (bear) + 🦄 (unicorn) icons are tiny tributes to Seulgi and Yeri, Dehan’s faves.
That’s it. Just a fun small detail :)
Education
M.Sc. in Informatics
— Institut Teknologi Bandung
Final GPA: 3.96 / 4.00
Thesis: “End-to-end Fused Dialogue System in Open-Source Large Language Model”.
Supervised by Ayu Purwarianti, Samuel Cahyawijaya, and Genta Indra Winata.
B.Sc. in Informatics
— Institut Teknologi Bandung
Final GPA: 3.93 / 4.00
Final Task: “End-to-end Task-oriented Dialogue System in Indonesia”.
Supervised by Ayu Purwarianti, Samuel Cahyawijaya, and Genta Indra Winata.
Work Experience
Research Engineer II
— Mohamed bin Zayed University of Artificial Intelligence
Dept: Natural Language Processing
AI Engineer Intern
— GLAIR
Dept: Computer Vision
AI Engineer Intern
— Prosa.ai
Dept: Natural Language Processing
Academic & Laboratory Assistant
— Institut Teknologi Bandung
Courses: Natural Language Processing, Programming Fundamentals, Introduction to Computation.
Publications
Selected papers & preprints (reverse chronological order).
What Do Indonesians Really Need from Language Technology? A Nationwide Survey
In: EMNLP 2025 Main
Parallel Tokenizers: Rethinking Vocabulary Design for Cross-Lingual Transfer
Preprint — arXiv:2510.06128
SEADialogues: A Multilingual Culturally Grounded Multi-turn Dialogue Dataset on Southeast Asian Languages
Preprint — arXiv:2508.07069
Role-Aware Language Models for Secure and Contextualized Access Control in Organizations
In: AACL-IJCNLP 2025 Main
Evaluating Vision-Language and Large Language Models for Automated Student Assessment in Indonesian Classrooms
Preprint — arXiv:2506.04822
Simulating Training Data Leakage in Multiple-Choice Benchmarks for LLM Evaluation
In: Eval4NLP 2025 Workshop (Co-located with AACL-IJCNLP 2025)
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
In: EMNLP 2024 Main
IndoToD: A Multi-Domain Indonesian Benchmark for End-to-End Task-Oriented Dialogue Systems
In: SEALP 2023 Workshop (Co-located with AACL-IJCNLP 2023). BEST PAPER
News
Recent updates & highlights.
Oct 2025
Our papers titled
- Role-Aware Language Models for Secure and Contextualized Access Control in Organizations
- Simulating Training Data Leakage in Multiple-Choice Benchmarks for LLM Evaluation
were accepted at AACL-IJCNLP and Eval4NLP 2025 in Mumbai, India!
Jul 2025
Our papers titled
- IndoSafety: Culturally Grounded Safety for LLMs in Indonesian Languages
- What Do Indonesians Really Need from Language Technology? A Nationwide Survey
were accepted at EMNLP 2025 Main. I presented the latter in Suzhou, China. Nice to meet you there!👋
Nov 2024
Started a new position as Research Associate I (now: Research Engineer II) at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) under the supervision of Fajri Koto.
Nov 2023
Our paper titled 'IndoToD: A multi-domain Indonesian benchmark for end-to-end task-oriented dialogue systems' was accepted and selected as the Best Paper at the SEALP 2023 Workshop, co-located with AACL-IJCNLP 2023 in Bali, Indonesia.🏆
Oct 2023
I went to Tokyo, Japan, as a delegate of Institut Teknologi Bandung in a technology and cultural exchange program hosted by The University of Electro-Communications (UEC).🌸
Nov 2021
Became a finalist in Pusat Prestasi Nasional GEMASTIK, Smart City Division. We developed 'Virtual Hospital', an IoT-based system for monitoring patients remotely in response to the impact of COVID-19.
Contact
You can reach me via email or follow me on social media.