🧠
LLMs Alignment and Safety

Organizations

@UNHSAILLab

Jupyter Notebook 1 Updated May 26, 2025

Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vectors

Python 280 17 Updated Jan 10, 2026

Training Sparse Autoencoders on Language Models

Python 1,425 238 Updated Jun 16, 2026

Large Concept Models: Language modeling in a sentence representation space

Python 2,363 210 Updated Jan 29, 2025

Unofficial implementation of "Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection"

Python 27 1 Updated Jul 6, 2024

Working Memory Attack on LLMs

Jupyter Notebook 18 5 Updated May 27, 2025

Stop configuring your AI stack. Start using it. One command brings a complete pre-wired LLM stack with hundreds of services to explore.

Python 3,086 207 Updated Jun 17, 2026
Python 13 Updated Sep 8, 2024

Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization

JavaScript 7,847 858 Updated Jun 6, 2026
Jupyter Notebook 48 15 Updated Sep 29, 2024

Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps"

Python 152 12 Updated Oct 13, 2025

A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights i…

HTML 1,868 112 Updated Jun 16, 2026

RevLLM -- Reverse Engineering Tools for Large Language Models

Python 22 3 Updated Feb 29, 2024

🎨 ASCII art library for Python

Python 1 Updated Feb 9, 2024

Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers".

Jupyter Notebook 1 Updated Mar 8, 2024

visualizing attention for LLM users

Python 1 Updated May 24, 2023

Public repo for HF blog posts

Jupyter Notebook 1 Updated Jun 9, 2022

Instruct-tune LLaMA on consumer hardware

Jupyter Notebook 1 1 Updated May 22, 2023

The project is built on Google Colaboratory using Python. The scripts extract the first five feeds from the website and convert them to an audio file using gTTS.

Jupyter Notebook 2 Updated Oct 31, 2019

Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers".

Jupyter Notebook 84 19 Updated Mar 11, 2024

LLM experiments done during SERI MATS - focusing on activation steering / interpreting activation spaces

Jupyter Notebook 103 27 Updated Sep 21, 2023

Science Parse parses scientific papers (in PDF form) and returns them in structured form.

Java 700 88 Updated May 26, 2024

LLaMA 2 implemented from scratch in PyTorch

Python 371 72 Updated Sep 25, 2023

TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in LLMs through Translation-Assisted Chain-of-Thought Processes

14 2 Updated Jul 1, 2025

A tool for calculating character n-gram F-score

Python 80 16 Updated Feb 4, 2023

A fast inference library for running LLMs locally on modern consumer-class GPUs

Python 4,556 337 Updated Mar 4, 2026
Python 15 2 Updated Jul 8, 2023

Inference Llama 2 in one file of pure C

C 19,648 2,565 Updated Aug 6, 2024