- KAIST AI
- Seongnam, South Korea
- UTC +09:00
- https://brightjade.github.io/
- in/brightjade
- @minseok__choi
Stars
Demonstration and Template Projects
A curated list of free/libre plugins, scripts and add-ons for Godot
Large Language Models for Software Engineering: A Systematic Literature Review
[ICLR 2026] LLM/VLM gaming agents and model evaluation through games.
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
Harbor is a framework for running agent evaluations and creating and using RL environments.
Must-read papers on Repository-level Code Generation & Issue Resolution 🔥
Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel, exciting jailbreak methods on LLMs. It contains papers, code, datasets, evaluations, and analyses.
[CCS'24] A dataset consisting of 15,140 ChatGPT prompts from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak prompts).
The AILuminate v1.1 benchmark suite is an AI risk assessment benchmark developed with broad involvement from leading AI companies, academia, and civil society.
A simple evaluation of generative language models and safety classifiers.
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
A curated list of awesome open-source libraries for production LLM
A curated list of awesome Multimodal studies.
Set of tools to assess and improve LLM security.
✨✨Latest Advances on Multimodal Large Language Models
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights i…
Official repository for "Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration" (arXiv 2024.06, https://arxiv.org/pdf/2406.16469)
Instruction Tuning with GPT-4
Holistic evaluation of multimodal foundation models
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
A resource repository for machine unlearning in large language models
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.