- MBZUAI, IndoNLP
- Abu Dhabi
- fajrikoto.com
- @FajriKoto
Stars
[EACL 2026 Main] Framework to construct a Cultural Commonsense Knowledge Graph (CCKG) that has geographical context.
Paper list, dataset, and tools for radiology report generation
A curated list of research papers and resources on Cultural LLM.
Open Implementations of LLM Analyses
A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
NusaWrites is an in-depth analysis of corpora collection strategy and a comprehensive language modeling benchmark for underrepresented and extremely low-resource Indonesian local languages.
CMMLU: Measuring massive multitask language understanding in Chinese
A Multilingual Replicable Instruction-Following Model
Discourse Probing of Pretrained Language Models. In Proceedings of NAACL 2021.
A framework for assessing and improving classification fairness.
High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper at EACL 2023)
Evaluating the Efficacy of Summarization Evaluation across Languages. In Findings of ACL 2021.
KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation
IndoBERTweet is the first large-scale pretrained model for Indonesian Twitter. Published at EMNLP 2021 (main conference)
Complete Web Scraping of TED.com for Metadata, Transcript, Audio, Video, Images using Parallel Programming
Classification of Twitter users' personalities based on their tweets, using the Big Five model.
The Dataset for Hate Speech Detection in Indonesian (Bahasa Indonesia)