  • 1
    bge-large-en-v1.5

    BGE-Large v1.5: High-accuracy English embedding model for retrieval

    BAAI/bge-large-en-v1.5 is a powerful English sentence embedding model designed by the Beijing Academy of Artificial Intelligence to enhance retrieval-augmented language model systems. It uses a BERT-based architecture fine-tuned to produce high-quality dense vector representations optimized for sentence similarity, search, and retrieval. This model is part of the BGE (BAAI General Embedding) family and delivers improved similarity distribution and state-of-the-art results on the MTEB benchmark. It is recommended for use in document retrieval tasks, semantic search, and passage reranking, particularly when paired with a reranker like BGE-Reranker. The model supports inference through multiple frameworks, including FlagEmbedding, Sentence-Transformers, LangChain, and Hugging Face Transformers. It accepts English text as input and returns normalized 1024-dimensional embeddings suitable for cosine similarity comparisons.
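
    A minimal usage sketch, assuming the sentence-transformers package can download the BAAI/bge-large-en-v1.5 checkpoint named above; the query and passages are placeholders. Because the embeddings are normalized, cosine similarity reduces to a dot product.

```python
# Encode a query and candidate passages with Sentence-Transformers and
# compare them by cosine similarity. normalize_embeddings=True yields
# unit-length 1024-dimensional vectors, as described above.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("BAAI/bge-large-en-v1.5")

queries = ["how do dense retrievers work?"]
passages = [
    "Dense retrieval encodes queries and documents into vectors.",
    "The weather in Lisbon is mild in spring.",
]

q_emb = model.encode(queries, normalize_embeddings=True)
p_emb = model.encode(passages, normalize_embeddings=True)

scores = util.cos_sim(q_emb, p_emb)  # shape (1, 2); higher = more relevant
print(scores)
```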
  • 2
    bge-small-en-v1.5

    Compact English sentence embedding model for semantic search tasks

    BAAI/bge-small-en-v1.5 is a lightweight English sentence embedding model developed by the Beijing Academy of Artificial Intelligence (BAAI) as part of the BGE (BAAI General Embedding) series. Designed for dense retrieval, semantic search, and similarity tasks, it produces 384-dimensional embeddings that can be used to compare and rank sentences or passages. This version (v1.5) improves similarity distribution, enhancing performance without the need for special query instructions. The model is optimized for speed and efficiency, making it suitable for resource-constrained environments. It is compatible with popular libraries such as FlagEmbedding, Sentence-Transformers, and Hugging Face Transformers. The model achieves competitive results on the MTEB benchmark, especially in retrieval and classification tasks. With only 33.4M parameters, it provides a strong balance of accuracy and performance for English-only use cases.
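
    A sketch of the same idea with plain Hugging Face Transformers, assuming CLS pooling followed by L2 normalization, as is conventional for BGE models; the example sentences are placeholders.

```python
# Produce 384-dimensional sentence embeddings with BAAI/bge-small-en-v1.5
# using Transformers directly: CLS-token pooling, then L2 normalization.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-small-en-v1.5")
model = AutoModel.from_pretrained("BAAI/bge-small-en-v1.5")
model.eval()

sentences = ["semantic search with small models", "recipes for sourdough bread"]
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**batch)
    embeddings = outputs.last_hidden_state[:, 0]  # CLS token, 384-dim
    embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)

# Cosine similarity between the two sentences (dot product of unit vectors)
print(embeddings[0] @ embeddings[1])
```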
  • 3
    fairseq-lua

    Facebook AI Research Sequence-to-Sequence Toolkit

    fairseq-lua is the original Lua/Torch7 version of Facebook AI Research’s sequence modeling toolkit, designed for neural machine translation (NMT) and sequence generation. It introduced early attention-based architectures and training pipelines that later evolved into the modern PyTorch-based fairseq. The framework implements sequence-to-sequence models with attention, beam search decoding, and distributed training, providing a research platform for exploring translation, summarization, and language modeling. Its modular design made it easy to prototype new architectures by modifying encoders, decoders, or attention mechanisms. Although now deprecated in favor of the PyTorch rewrite, fairseq-lua played a key role in advancing large-scale NMT systems, such as early versions of Facebook’s production translation models. It remains an important historical reference for neural sequence learning frameworks.
  • 4
    fashion-clip

    CLIP model fine-tuned for zero-shot fashion product classification

    FashionCLIP is a domain-adapted CLIP model fine-tuned specifically for the fashion industry, enabling zero-shot classification and retrieval of fashion products. Developed by Patrick John Chia and collaborators, it builds on the CLIP ViT-B/32 architecture and was trained on over 800K image-text pairs from the Farfetch dataset. The model learns to align product images and descriptive text using contrastive learning, enabling it to perform well across various fashion-related tasks without additional supervision. FashionCLIP 2.0, the latest version, uses the laion/CLIP-ViT-B-32-laion2B-s34B-b79K checkpoint for improved accuracy, achieving better F1 scores across multiple benchmarks compared to earlier versions. It supports multilingual fashion queries and works best with clean, product-style images against white backgrounds. The model can be used for product search, recommendation systems, or visual tagging in e-commerce platforms.
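
    A hedged zero-shot classification sketch illustrating the CLIP-style interface; the checkpoint id patrickjohncyh/fashion-clip, the image path, and the candidate labels are assumptions, so substitute the checkpoint and product image you actually use.

```python
# Zero-shot fashion product classification with a CLIP-style checkpoint:
# the image is scored against free-text labels via image-text similarity.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("patrickjohncyh/fashion-clip")      # assumed repo id
processor = CLIPProcessor.from_pretrained("patrickjohncyh/fashion-clip")

image = Image.open("product.jpg")  # ideally a clean product shot on a white background
labels = ["a red dress", "a leather handbag", "white sneakers"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(labels, probs[0].tolist())))
```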
  • 5
    gpt-oss-120b

    OpenAI’s open-weight 120B model optimized for reasoning and tooling

    GPT-OSS-120B is a powerful open-weight language model by OpenAI, optimized for high-level reasoning, tool use, and agentic tasks. With 117B total parameters and 5.1B active parameters, it’s designed to fit on a single H100 GPU using native MXFP4 quantization. The model supports fine-tuning, chain-of-thought reasoning, and structured outputs, making it ideal for complex workflows. It operates in OpenAI’s Harmony response format and can be deployed via Transformers, vLLM, Ollama, LM Studio, and PyTorch. Developers can control the reasoning level (low, medium, high) to balance speed and depth depending on the task. Released under the Apache 2.0 license, it enables both commercial and research applications. The model supports function calling, web browsing, and code execution, streamlining intelligent agent development.
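
    A minimal serving sketch via the Transformers text-generation pipeline, whose chat template applies the Harmony format; steering the reasoning level through a "Reasoning: high" system message is an assumption based on the documented low/medium/high settings, and the prompt is a placeholder.

```python
# Chat-style generation with gpt-oss-120b through the Transformers pipeline.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="openai/gpt-oss-120b",
    torch_dtype="auto",
    device_map="auto",   # expects hardware on the scale of a single H100
)

messages = [
    {"role": "system", "content": "Reasoning: high"},   # assumed reasoning control
    {"role": "user", "content": "Outline a plan to parse and summarize a CSV file."},
]

outputs = pipe(messages, max_new_tokens=512)
print(outputs[0]["generated_text"][-1])  # last message is the assistant reply
```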
  • 6
    gpt-oss-20b

    OpenAI’s compact 20B open model for fast, agentic, and local use

GPT-OSS-20B is OpenAI’s smaller, open-weight language model optimized for low latency, agentic tasks, and local deployment. With 21B total parameters and 3.6B active parameters (MoE), it fits within 16GB of memory thanks to native MXFP4 quantization. Designed for high-performance reasoning, it supports the Harmony response format, function calling, web browsing, and code execution. Like its larger sibling (gpt-oss-120b), it offers adjustable reasoning depth and full chain-of-thought visibility for better interpretability. It’s released under the permissive Apache 2.0 license, allowing unrestricted commercial and research use. GPT-OSS-20B is compatible with Transformers, vLLM, Ollama, PyTorch, and other tools. It is ideal for developers building lightweight AI agents or experimenting with fine-tuning on consumer-grade hardware.
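
    A hedged local-inference sketch with AutoModelForCausalLM and the tokenizer's chat template (which emits the Harmony format); exact memory behavior depends on your hardware and available MXFP4 kernels, and the prompt is a placeholder.

```python
# Local generation with gpt-oss-20b via Transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Write a haiku about unit tests."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    generated = model.generate(inputs, max_new_tokens=200)

# Decode only the newly generated tokens
print(tokenizer.decode(generated[0][inputs.shape[-1]:], skip_special_tokens=True))
```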
  • 7
    granite-timeseries-ttm-r2

    Tiny pre-trained IBM model for multivariate time series forecasting

    granite-timeseries-ttm-r2 is part of IBM’s TinyTimeMixers (TTM) series—compact, pre-trained models for multivariate time series forecasting. Unlike massive foundation models, TTM models are designed to be lightweight yet powerful, with only ~805K parameters, enabling high performance even on CPU or single-GPU machines. The r2 version is pre-trained on ~700M samples (r2.1 expands to ~1B), delivering up to 15% better accuracy than the r1 version. TTM supports both zero-shot and fine-tuned forecasting, handling minutely, hourly, daily, and weekly resolutions. It can integrate exogenous variables, static categorical features, and perform channel-mixing for richer multivariate forecasting. The get_model() utility makes it easy to auto-select the best TTM model for specific context and prediction lengths. These models significantly outperform benchmarks like Chronos, GPT4TS, and Moirai while demanding a fraction of the compute.
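
    A hedged sketch of the get_model() helper mentioned above, which auto-selects the TTM checkpoint matching a given context and forecast length. The import path, keyword names, and length values are assumptions based on IBM's granite-tsfm (tsfm_public) package and should be checked against its current documentation.

```python
# Select a TTM variant for a chosen history length and forecast horizon.
from tsfm_public.toolkit.get_model import get_model  # assumed import path

model = get_model(
    "ibm-granite/granite-timeseries-ttm-r2",
    context_length=512,      # history length supplied to the model (placeholder)
    prediction_length=96,    # forecast horizon in time steps (placeholder)
)
print(model.config)
```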
  • 8
    layoutlm-base-uncased

    Multimodal Transformer for document image understanding and layout

    layoutlm-base-uncased is a multimodal transformer model developed by Microsoft for document image understanding tasks. It incorporates both text and layout (position) features to effectively process structured documents like forms, invoices, and receipts. This base version has 113 million parameters and is pre-trained on 11 million documents from the IIT-CDIP dataset. LayoutLM enables better performance in tasks where the spatial arrangement of text plays a crucial role. The model uses a standard BERT-like architecture but enriches input with 2D positional embeddings. It achieves state-of-the-art results in form understanding and information extraction benchmarks. This model is particularly useful for document AI applications like document classification, question answering, and named entity recognition.
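
    A minimal sketch of feeding text plus 2D layout coordinates to LayoutLM. Bounding boxes use 0-1000 normalized page coordinates; the words and boxes below are made-up examples, and real pipelines usually obtain them from an OCR engine.

```python
# Encode words together with their bounding boxes using layoutlm-base-uncased.
import torch
from transformers import LayoutLMModel, LayoutLMTokenizer

tokenizer = LayoutLMTokenizer.from_pretrained("microsoft/layoutlm-base-uncased")
model = LayoutLMModel.from_pretrained("microsoft/layoutlm-base-uncased")

words = ["Invoice", "Total:", "$1,280.00"]
boxes = [[60, 50, 200, 80], [60, 700, 150, 730], [160, 700, 300, 730]]

# Tokenize word by word and repeat each word's box for its sub-tokens
tokens, token_boxes = [], []
for word, box in zip(words, boxes):
    word_tokens = tokenizer.tokenize(word)
    tokens.extend(word_tokens)
    token_boxes.extend([box] * len(word_tokens))

input_ids = tokenizer.convert_tokens_to_ids(["[CLS]"] + tokens + ["[SEP]"])
token_boxes = [[0, 0, 0, 0]] + token_boxes + [[1000, 1000, 1000, 1000]]

outputs = model(
    input_ids=torch.tensor([input_ids]),
    bbox=torch.tensor([token_boxes]),
)
print(outputs.last_hidden_state.shape)  # (1, seq_len, 768)
```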
  • 9
    minGPT

    A minimal PyTorch re-implementation of the OpenAI GPT

    minGPT is a minimalist, educational re-implementation of the GPT (Generative Pretrained Transformer) architecture built in PyTorch, designed by Andrej Karpathy to expose the core structure of a transformer-based language model in as few lines of code as possible. It strips away extraneous bells and whistles, aiming to show how a sequence of token indices is fed into a stack of transformer blocks and then decoded into the next token probabilities, with both training and inference supported. Because the whole model is around 300 lines of code, users can follow each step—from embedding lookup, positional encodings, multi-head attention, feed-forward layers, to output heads—and thus demystify how GPT-style models work beneath the surface. It provides a practical sandbox for experimentation, letting learners tweak the architecture, dataset, or training loop without being overwhelmed by framework abstraction.
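
    An illustrative sketch (not minGPT's actual source) of the structure the description walks through: token and position embeddings feed a stack of causal transformer blocks, and a linear head maps the final hidden states to next-token logits.

```python
# Tiny GPT-style model built from stock PyTorch modules, for illustration only.
import torch
import torch.nn as nn

class TinyGPT(nn.Module):
    def __init__(self, vocab_size=100, block_size=64, d_model=128, n_layer=2, n_head=4):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(block_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_head, 4 * d_model, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layer)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):                      # idx: (batch, seq) token indices
        b, t = idx.shape
        pos = torch.arange(t, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        mask = nn.Transformer.generate_square_subsequent_mask(t).to(idx.device)
        x = self.blocks(x, mask=mask)            # causal self-attention stack
        return self.head(x)                      # (batch, seq, vocab) next-token logits

logits = TinyGPT()(torch.randint(0, 100, (1, 16)))
print(logits.shape)
```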
  • 10
    mms-300m-1130-forced-aligner

    CTC-based forced aligner for audio-text in 158 languages

    mms-300m-1130-forced-aligner is a multilingual forced alignment model based on Meta’s MMS-300M wav2vec2 checkpoint, adapted for Hugging Face’s Transformers library. It supports forced alignment between audio and corresponding text across 158 languages, offering broad multilingual coverage. The model enables accurate word- or phoneme-level timestamping using Connectionist Temporal Classification (CTC) emissions. Unlike other tools, it provides significant memory efficiency compared to the TorchAudio forced alignment API. Users can integrate it easily through the Python package ctc-forced-aligner, and it supports GPU acceleration via PyTorch. The alignment pipeline includes audio processing, emission generation, tokenization, and span detection, making it suitable for speech analysis, transcription syncing, and dataset creation. This model is especially useful for researchers and developers working with low-resource languages or building multilingual speech systems.
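
    A hedged sketch of the emission-generation step only, treating the checkpoint as a standard wav2vec2 CTC model in Transformers. The repository id MahmoudAshraf/mms-300m-1130-forced-aligner is an assumption, and the full word-level alignment (tokenization and span detection) is handled by the ctc-forced-aligner package described above rather than reproduced here.

```python
# Compute CTC emissions from 16 kHz audio with the aligner checkpoint.
import numpy as np
import torch
from transformers import AutoModelForCTC, AutoProcessor

model_id = "MahmoudAshraf/mms-300m-1130-forced-aligner"  # assumed repo id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCTC.from_pretrained(model_id)

# 5 seconds of 16 kHz mono audio; replace with real samples from your file
waveform = np.zeros(16000 * 5, dtype=np.float32)
inputs = processor(waveform, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits     # (1, frames, vocab)
emissions = torch.log_softmax(logits, dim=-1)       # CTC emission matrix for alignment
print(emissions.shape)
```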
  • 11
    pyllama

    LLaMA: Open and Efficient Foundation Language Models

📢 pyllama is a hacked version of LLaMA, based on Facebook's original implementation but made more convenient to run on a single consumer-grade GPU.
  • 12
    roberta-base

    Robust BERT-based model for English with improved MLM training

    roberta-base is a robustly optimized variant of BERT, pretrained on a significantly larger corpus of English text using dynamic masked language modeling. Developed by Facebook AI, RoBERTa improves on BERT by removing the Next Sentence Prediction objective, using longer training, larger batches, and more data, including BookCorpus, English Wikipedia, CC-News, OpenWebText, and Stories. It captures contextual representations of language by masking 15% of input tokens and predicting them. RoBERTa is designed to be fine-tuned for a wide range of NLP tasks such as classification, QA, and sequence labeling, achieving strong performance on the GLUE benchmark and other downstream applications.
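
    A minimal masked-language-modeling sketch via the fill-mask pipeline; the example sentence is a placeholder. Note that RoBERTa uses <mask> rather than BERT's [MASK] token.

```python
# Predict the masked token with roberta-base.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="roberta-base")
for candidate in unmasker("The goal of unsupervised pretraining is to learn good <mask>."):
    print(f"{candidate['token_str']!r}: {candidate['score']:.3f}")
```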
  • 13
    rwkv.cpp

    INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model

Besides the usual FP32, rwkv.cpp supports FP16 and quantized INT4, INT5, and INT8 inference. The project is focused on CPU, but cuBLAS is also supported. RWKV is a novel large language model architecture, with the largest model in the family having 14B parameters. In contrast to Transformers with O(n^2) attention, RWKV needs only the state from the previous step to calculate logits, which makes it very CPU-friendly at long context lengths.
  • 14
    t5-base

    Flexible text-to-text transformer model for multilingual NLP tasks

    t5-base is a pre-trained transformer model from Google’s T5 (Text-To-Text Transfer Transformer) family that reframes all NLP tasks into a unified text-to-text format. With 220 million parameters, it can handle a wide range of tasks, including translation, summarization, question answering, and classification. Unlike traditional models like BERT, which output class labels or spans, T5 always generates text outputs. It was trained on the C4 dataset, along with a variety of supervised NLP benchmarks, using both unsupervised denoising and supervised objectives. The model supports multiple languages, including English, French, Romanian, and German. Its flexible architecture and consistent input/output format simplify model reuse and transfer learning across different NLP tasks. T5-base achieves competitive performance across 24 language understanding tasks, as documented in its research paper.
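
    A minimal sketch of the text-to-text interface: the task is selected purely by the text prefix ("translate English to German:"), and the model replies with generated text rather than class labels. The sentence is a placeholder.

```python
# Translation with t5-base via its unified text-to-text interface.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

prompt = "translate English to German: The house is small."
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
output_ids = model.generate(input_ids, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```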
  • 15
    t5-small

    T5-Small: Lightweight text-to-text transformer for NLP tasks

    T5-Small is a lightweight variant of the Text-To-Text Transfer Transformer (T5), designed to handle a wide range of NLP tasks using a unified text-to-text approach. Developed by researchers at Google, this model reframes all tasks—such as translation, summarization, classification, and question answering—into the format of input and output as plain text strings. With only 60 million parameters, T5-Small is compact and suitable for fast inference or deployment in constrained environments. It was pretrained on the C4 dataset using both unsupervised denoising and supervised learning on tasks like sentiment analysis, NLI, and QA. Despite its size, it performs competitively across 24 NLP benchmarks, making it a strong candidate for prototyping and fine-tuning. T5-Small is compatible with major deep learning frameworks including PyTorch, TensorFlow, JAX, and ONNX. The model is open-source under the Apache 2.0 license and has wide support across Hugging Face's ecosystem.
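
    The same text-to-text pattern with the smaller checkpoint, sketched here through the summarization pipeline, which prepends T5's "summarize:" task prefix automatically; the input article is a placeholder.

```python
# Summarization with t5-small via the Transformers pipeline.
from transformers import pipeline

summarizer = pipeline("summarization", model="t5-small")
article = (
    "The Text-To-Text Transfer Transformer casts every NLP problem as "
    "generating an output string from an input string, so translation, "
    "summarization, and classification all share one model interface."
)
print(summarizer(article, max_length=40, min_length=10)[0]["summary_text"])
```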
  • 16
    unidepth-v2-vitl14

    Metric monocular depth estimation (vision model)

UniDepth v2 (ViT-L/14) estimates absolute (metric) depth from a single RGB image, along with camera intrinsics and uncertainty. It is designed to generalize across domains (zero-shot) using a self-prompting camera module and a pseudo-spherical prediction space.
  • 17
    wav2vec2-large-xlsr-53-portuguese

    Portuguese ASR model fine-tuned on XLSR-53 for 16kHz audio input

    wav2vec2-large-xlsr-53-portuguese is an automatic speech recognition (ASR) model fine-tuned on Portuguese using the Common Voice 6.1 dataset. It is based on Facebook’s wav2vec2-large-xlsr-53, a multilingual self-supervised learning model, and is optimized to transcribe Portuguese speech sampled at 16kHz. The model performs well without a language model, though adding one can improve word error rate (WER) and character error rate (CER). It achieves a WER of 11.3% (or 9.01% with LM) on Common Voice test data, demonstrating high accuracy for a single-language ASR model. Inference can be done using HuggingSound or via a custom PyTorch script using Hugging Face Transformers and Librosa. Training scripts and evaluation methods are open source and available on GitHub. It is released under the Apache 2.0 license and intended for ASR tasks in Brazilian Portuguese.
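
    A hedged sketch of the "custom PyTorch script" route mentioned above: load 16kHz audio with Librosa and decode with Wav2Vec2ForCTC. The repository id jonatasgrosman/wav2vec2-large-xlsr-53-portuguese and the audio file path are assumptions.

```python
# Greedy CTC transcription of a Portuguese audio clip.
import librosa
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "jonatasgrosman/wav2vec2-large-xlsr-53-portuguese"  # assumed repo id
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

speech, _ = librosa.load("sample_pt.wav", sr=16000)  # model expects 16 kHz input
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```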
  • 18
    wav2vec2-large-xlsr-53-russian

    Russian ASR model fine-tuned on Common Voice and CSS10 datasets

    wav2vec2-large-xlsr-53-russian is a fine-tuned automatic speech recognition (ASR) model based on Facebook’s wav2vec2-large-xlsr-53 and optimized for Russian. It was trained using Mozilla’s Common Voice 6.1 and CSS10 datasets to recognize Russian speech with high accuracy. The model operates best with audio sampled at 16kHz and can transcribe Russian speech directly without a language model. It achieves a Word Error Rate (WER) of 13.3% and Character Error Rate (CER) of 2.88% on the Common Voice test set, with even better results when used with a language model. The model supports both PyTorch and JAX and is compatible with the Hugging Face Transformers and HuggingSound libraries. It is ideal for Russian voice transcription tasks in research, accessibility, and interface development. The training was made possible with compute support from OVHcloud, and the training scripts are publicly available for replication.
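
    A hedged sketch using the HuggingSound wrapper mentioned above. The repository id jonatasgrosman/wav2vec2-large-xlsr-53-russian and the audio paths are assumptions; check the huggingsound package documentation for the current API.

```python
# Batch transcription of Russian audio files with HuggingSound.
from huggingsound import SpeechRecognitionModel

model = SpeechRecognitionModel("jonatasgrosman/wav2vec2-large-xlsr-53-russian")  # assumed repo id
audio_paths = ["sample_ru_1.wav", "sample_ru_2.wav"]  # 16 kHz recordings (placeholders)

transcriptions = model.transcribe(audio_paths)
for path, result in zip(audio_paths, transcriptions):
    print(path, "->", result["transcription"])
```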
  • 19
    xFormers

    Hackable and optimized Transformers building blocks

xFormers is a modular, performance-oriented library of transformer building blocks, designed to let researchers and engineers compose, experiment with, and optimize transformer architectures more flexibly than monolithic frameworks allow. It abstracts components like attention layers, feedforward modules, normalization, and positional encoding, so you can mix and match or swap in optimized kernels easily. One of its key goals is efficient attention: it supports dense, sparse, low-rank, and approximate attention mechanisms (e.g. FlashAttention, Linformer, Performer) via interchangeable modules. The library includes memory-efficient operator implementations in both Python and optimized C++/CUDA, ensuring that performance isn’t sacrificed for modularity. It also integrates seamlessly with PyTorch, so you can drop its blocks into existing models, replace default attention layers, or build new architectures from scratch. xFormers also includes training, deployment, and memory-profiling tools.
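
    A minimal sketch of xFormers' drop-in memory-efficient attention. Tensors are shaped (batch, seq_len, num_heads, head_dim); the optimized kernels target CUDA GPUs, so this assumes a CUDA device with xformers installed, and the sizes are placeholders.

```python
# Memory-efficient (optionally causal) attention with xformers.ops.
import torch
import xformers.ops as xops

B, S, H, D = 2, 1024, 8, 64
q = torch.randn(B, S, H, D, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# LowerTriangularMask gives causal attention for autoregressive decoding
out = xops.memory_efficient_attention(q, k, v, attn_bias=xops.LowerTriangularMask())
print(out.shape)  # (2, 1024, 8, 64)
```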