
[NeurIPS'22] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Jupyter Notebook 22 3 Updated Jan 19, 2024

This CNN diagnoses breast cancer from eosin-stained images. The model was trained on 400 images and reaches 80% accuracy.

Python 65 36 Updated Apr 15, 2023

Python3 library for downloading YouTube Videos.

Python 1,338 171 Updated Oct 7, 2025

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 2,071 127 Updated Aug 7, 2025

The official Meta Llama 3 GitHub site

Python 29,022 3,469 Updated Jan 26, 2025

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python 3,077 281 Updated Jun 4, 2024

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 5,902 558 Updated Feb 26, 2025

OpenMMLab YOLO series toolbox and benchmark. Implemented RTMDet, RTMDet-Rotated, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOX, PPYOLOE, etc.

Python 3,312 607 Updated Jul 14, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 59,695 10,578 Updated Oct 9, 2025

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 23,689 2,640 Updated Aug 12, 2024

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Python 9,013 922 Updated Aug 12, 2024

The official implementation of "Divergence of Features and Mean: A BatchNorm-based Abnormality Criterion for Weakly Supervised Video Anomaly Detection"

Python 65 15 Updated Nov 30, 2023

Images to inference with no labeling (use foundation models to train supervised models).

Python 2,412 197 Updated May 14, 2025

A state-of-the-art-level open visual language model | Multimodal pre-trained model

Python 6,671 441 Updated May 29, 2024

Classification of Fundus Images into 5 stages of Diabetic Retinopathy, and segmentation of blood vessels in fundus images

Python 16 2 Updated Sep 18, 2023

Refine high-quality datasets and visual AI models

Python 9,927 673 Updated Oct 9, 2025

Zero-shot crack detection with SAM and Grounding DINO.

Python 4 1 Updated Nov 9, 2023

PLVS is a real-time SLAM system with points, lines, volumetric mapping and 3D unsupervised incremental segmentation.

C++ 522 77 Updated Sep 21, 2025

Efficient vision foundation models for high-resolution generation and perception.

Python 3,093 232 Updated Sep 5, 2025

Easily train or fine-tune SOTA computer vision models with one open source training library. The home of YOLO-NAS.

Jupyter Notebook 4,924 565 Updated Sep 17, 2024

The repo contains an audio emotion detection model, facial emotion detection model, and a model that combines both these models to predict emotions from a video

Jupyter Notebook 85 23 Updated Sep 13, 2023

Package for imputing the arterial blood pressure (ABP) waveform from non-invasive physiological waveforms (PPG & ECG) using a deep neural network

Python 32 6 Updated Jul 24, 2022

Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.

Python 6,832 501 Updated May 31, 2024

Code repository for the paper "On the Benefits of 3D Pose and Tracking for Human Action Recognition" (CVPR 2023)

Jupyter Notebook 281 31 Updated Jan 19, 2024

Pretrained ConvNets for PyTorch: NASNet, ResNeXt, ResNet, InceptionV4, InceptionResnetV2, Xception, DPN, etc.

Python 9,106 1,828 Updated Apr 22, 2022

ImageBind: One Embedding Space to Bind Them All

Python 8,809 827 Updated Oct 3, 2025

Feature-rich WhatsApp Client for Desktop Linux

C++ 2,600 71 Updated Nov 1, 2024

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 52,030 6,094 Updated Sep 18, 2024

OpenMMLab Semantic Segmentation Toolbox and Benchmark.

Python 9,271 2,769 Updated Aug 13, 2024