- PostDoc Fellow
- Montreal, Canada
- @Hussein_Abdala1
Highlights
- Pro
- andrej-karpathy-skills Public
  Forked from forrestchang/andrej-karpathy-skills
  A single CLAUDE.md file to improve Claude Code behavior, derived from Andrej Karpathy's observations on LLM coding pitfalls.
  Updated Apr 13, 2026
- inference-hive Public
  Forked from ellamind/inference-hive
  inference-hive is a toolkit to run distributed LLM inference on SLURM clusters. Configure a few cluster, inference server and data settings, and scale your inference workload across thousands of GPUs.
  Python · Apache License 2.0 · Updated Mar 10, 2026
- web-languages-eg Public
  Forked from commoncrawl/web-languages
  Crowd-sourced lists of URLs to help Common Crawl crawl under-resourced languages. See https://github.com/commoncrawl/web-languages-code/ for the code.
  Updated Dec 24, 2025
- web-content-extraction-benchmark Public
  Forked from chatnoir-eu/web-content-extraction-benchmark
  Web Content Extraction Benchmark
  Python · Apache License 2.0 · Updated Dec 16, 2025
- openpagerank-fetcher Public
  Forked from KavehKadkhoda/openpagerank-fetcher
  Utility to fetch Open PageRank scores for many domains at once, with caching, batching, and API quota handling.
  Jupyter Notebook · MIT License · Updated Nov 24, 2025
- datatrove Public
  Forked from huggingface/datatrove
  Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
  Python · Apache License 2.0 · Updated Nov 17, 2025
- MedQA Public
  A fine-tuning + RAG pipeline for medical QA using the MedQA dataset
- domain-quality-evaluation Public
  Forked from KavehKadkhoda/domain-quality-evaluation
  The project introduces a machine learning framework for automatically evaluating the trustworthiness of online domains using link-based and domain-level features. It predicts credibility scores for…
  Jupyter Notebook · MIT License · Updated Nov 7, 2025
- News-Data-Online Public
  Forked from KavehKadkhoda/News-Data-Online
  Global news data collection
  Jupyter Notebook · Updated Sep 29, 2025
- dclm Public
  Forked from mlfoundations/dclm
  DataComp for Language Models
  HTML · MIT License · Updated Sep 9, 2025
- AI4trust-News-observatory Public
  Forked from KavehKadkhoda/AI4trust-News-observatory
  These notebooks analyze daily trends in online news coverage, examining news volume, topic distribution, source reliability, disinformation tactics, check-worthy claims, and visual-text alignment.
  Jupyter Notebook · Updated Aug 1, 2025
- PCoT Public
  Forked from ArkadiusDS/PCoT
  Persuasion-Augmented Chain-of-Thought for Disinformation Detection
  Python · Other · Updated Jul 29, 2025
- GraphRAG_Bench Public
  Forked from JayLZhou/GraphRAG
  In-depth study of GraphRAG
- Domain-Specific-Small-Language-Models Public
  Forked from virtualramblas/Domain-Specific-Small-Language-Models
  Repository for the companion Colab notebook of the Domain-Specific Small Language Models book.
  Jupyter Notebook · Apache License 2.0 · Updated Jun 17, 2025
- Misinformation-Resilient-Search-Rankings Public
  Forked from CASOS-IDeaS-CMU/Misinformation-Resilient-Search-Rankings
  Design and evaluate search engine interventions for safe and fair search rankings.
  Jupyter Notebook · Updated May 29, 2025
- commoncrawl-cc-pyspark Public
  Forked from commoncrawl/cc-pyspark
  Process Common Crawl data with Python and Spark
  Python · MIT License · Updated May 27, 2025
- granite-snack-cookbook Public
  Forked from ibm-granite-community/granite-snack-cookbook
  Granite Snack Cookbook: easily consumable recipes (Python notebooks) that showcase the capabilities of the Granite models
  Jupyter Notebook · Creative Commons Attribution 4.0 International · Updated May 13, 2025
- news-please Public
  Forked from KavehKadkhoda/news-please
  news-please - an integrated web crawler and information extractor for news that just works
  Python · Apache License 2.0 · Updated Mar 25, 2025
- trust-align Public
  Forked from declare-lab/trust-align
  Code and datasets for the paper "Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse"
  Python · Updated Mar 3, 2025
- usc-tg-24-us-election Public
  Forked from leonardo-blas/usc-tg-24-us-election
  Python · Creative Commons Attribution 4.0 International · Updated Feb 22, 2025
- CAG Public
  Forked from hhhuang/CAG
  Cache-Augmented Generation
  Python · MIT License · Updated Dec 31, 2024
- newswire-graphs Public
  Forked from poonamsahoo/newswire-graphs
  A graph analysis of the Dell Research Newswire Dataset, specifically focused on characterizing newspaper similarities and behaviors
  Jupyter Notebook · Updated Dec 13, 2024
- NExT-GPT Public
  Forked from NExT-GPT/NExT-GPT
  Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
  Python · BSD 3-Clause "New" or "Revised" License · Updated Nov 3, 2024
- LightRAG Public
  Forked from HKUDS/LightRAG
  "LightRAG: Simple and Fast Retrieval-Augmented Generation"
  Python · MIT License · Updated Oct 20, 2024