-
synthid-text Public
Forked from google-deepmind/synthid-text
Python Apache License 2.0 Updated Dec 13, 2024 -
AI-Data-Analysis-MultiAgent Public
Forked from starpig1129/DATAGEN
AI-Driven Research Assistant: An advanced multi-agent system for automating complex research processes. Leveraging LangChain, OpenAI GPT, and LangGraph, this tool streamlines hypothesis generation,…
Python MIT License Updated Dec 4, 2024 -
ai-workshop-code Public
Forked from trancethehuman/ai-workshop-code
Code I wrote for my AI & LLM workshops
Jupyter Notebook Updated Dec 4, 2024 -
MasteringRAG Public
Forked from Steven-Luo/MasteringRAG
Enterprise-grade RAG systems, from beginner to mastery
Jupyter Notebook MIT License Updated Nov 27, 2024 -
nsfw_detector Public
Forked from tmplink/nsfw_detector
Solution for checking whether a file contains NSFW content.
Python Apache License 2.0 Updated Nov 20, 2024 -
promptimizer Public
Forked from hinthornw/promptimizer
Prompt optimization scratch
Python MIT License Updated Nov 18, 2024 -
paper-reading Public
Forked from mli/paper-reading
Paragraph-by-paragraph close reading of classic and new deep learning papers
Apache License 2.0 Updated Nov 17, 2024 -
Awesome-LLM-Synthetic-Data Public
Forked from wasiahmad/Awesome-LLM-Synthetic-Data
A reading list on LLM based Synthetic Data Generation 🔥
MIT License Updated Nov 5, 2024 -
chunkr Public
Forked from lumina-ai-inc/chunkr
Vision model based PDF chunking.
Python GNU Affero General Public License v3.0 Updated Oct 13, 2024 -
Pyramid-Flow Public
Forked from jy0205/Pyramid-Flow
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Python MIT License Updated Oct 13, 2024 -
swarm Public
Forked from openai/swarm
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
Python MIT License Updated Oct 12, 2024 -
Aria Public
Forked from rhymes-ai/Aria
Codebase for Aria - an Open Multimodal Native MoE
Jupyter Notebook Apache License 2.0 Updated Oct 10, 2024 -
libcom Public
Forked from bcmi/libcom
Image composition toolbox: everything you want to know about image composition or object insertion
Python Apache License 2.0 Updated Oct 8, 2024 -
ProX Public
Forked from GAIR-NLP/ProX
Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"
Python Apache License 2.0 Updated Sep 26, 2024 -
flux_triton Public
Forked from timudk/flux_triton
Writing FLUX in Triton
Python Apache License 2.0 Updated Sep 22, 2024 -
MegaParse Public
Forked from QuivrHQ/MegaParse
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
Python Apache License 2.0 Updated Jun 14, 2024 -
datatrove Public
Forked from huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Python Apache License 2.0 Updated Jun 14, 2024 -
open-speech-corpora Public
Forked from coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
MIT License Updated Jun 6, 2024 -
Adala Public
Forked from HumanSignal/Adala
Adala: Autonomous DAta (Labeling) Agent framework
Python Apache License 2.0 Updated May 31, 2024 -
khoj Public
Forked from khoj-ai/khoj
Your AI second brain. Get answers to your questions, whether they be online or in your own notes. Use foundation models or private, local LLMs. Self-host locally or use our cloud instance. Access f…
Python GNU Affero General Public License v3.0 Updated May 27, 2024 -
The-Ph.D.-journey-scenery Public
Forked from CHAOZHAO-1/The-Ph.D.-journey-scenery
A collection of problems encountered during a Ph.D. and related resources
Updated May 20, 2024 -
OpenRefine Public
Forked from OpenRefine/OpenRefine
OpenRefine is a free, open source power tool for working with messy data and improving it
Java BSD 3-Clause "New" or "Revised" License Updated Apr 29, 2024 -
unstructured Public
Forked from Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
HTML Apache License 2.0 Updated Apr 19, 2024 -
Open-Sora Public
Forked from hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Python Apache License 2.0 Updated Apr 15, 2024 -
Douyin_TikTok_Download_API Public
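Forked from Evil0ctal/Douyin_TikTok_Download_API
🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, supporting API calls as well as online batch parsing and downloading.
Python MIT License Updated Mar 27, 2024 -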
lilac Public
Forked from databricks/lilac
Curate better data for LLMs
Python Apache License 2.0 Updated Mar 19, 2024 -
sensitive-word Public
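Forked from houbb/sensitive-word
👮‍♂️ The sensitive word tool for Java. (Sensitive words / banned words / illegal words / profanity. A high-performance Java sensitive-word filtering framework based on the DFA algorithm. Please do not post content involving politics, advertising, marketing, firewall circumvention, or violations of national laws and regulations. A high-performance sensitive-word detection and filtering component, with Traditional/Simplified Chinese conversion, full-width/half-width conversion, Chinese-character-to-pinyin conversion, fuzzy search, and other features.)
Java Apache License 2.0 Updated Feb 28, 2024 -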
OLMo-Eval Public
Forked from allenai/OLMo-Eval
Evaluation suite for LLMs
Python Apache License 2.0 Updated Jan 31, 2024 -
Prompt-Engineering-Guide Public
Forked from dair-ai/Prompt-Engineering-Guide
🐙 Guides, papers, lecture, notebooks and resources for prompt engineering
MDX MIT License Updated Jan 22, 2024