Relational Time Engine (RTE): runtime density regulation for compute-efficient transformer inference. Demonstrates up to 75% layer reduction with improved latency and throughput.
Frozen KV Context for Mixture-of-Recursions on a Modernized BERT
End-to-end model compression pipeline using architecture reduction, knowledge distillation, pruning, and INT8 quantization. Achieves 83.43% accuracy, 3.31 ms latency, and 0.220 MB size, optimized for efficient edge inference.
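One stage of such a compression pipeline is unstructured magnitude pruning. A minimal NumPy sketch (function name and shapes are illustrative, not taken from the repo):

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights,
    the unstructured pruning step of a typical compression pipeline."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    # k-th smallest absolute value becomes the pruning threshold
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= thresh, 0.0, w)

rng = np.random.default_rng(0)
w = rng.standard_normal((32, 32))
w_pruned = magnitude_prune(w, 0.5)   # at least half the entries are now zero
```

In practice the pruned model is then fine-tuned (or distilled) to recover accuracy before quantization.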
Task-Aware Dynamic Model Optimization for Multi-Task Learning (IEEE Access 2023)
This repo explains quantization: the process of reducing the precision of a model’s parameters and/or activations (e.g., from 32-bit floating point to 8-bit integers) to make neural networks smaller, faster, and more energy-efficient with minimal accuracy loss.
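The core of that precision reduction can be sketched in a few lines of NumPy — affine (asymmetric) uint8 quantization with a scale and zero-point, the scheme most 8-bit post-training tools use. Function names here are illustrative:

```python
import numpy as np

def quantize_uint8(x: np.ndarray):
    """Map the range [x.min(), x.max()] onto the integers [0, 255]."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0 or 1.0            # avoid div-by-zero for constant tensors
    zero_point = int(round(-lo / scale))
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximate float tensor from the uint8 representation."""
    return (q.astype(np.float32) - zero_point) * scale

w = np.random.randn(64, 64).astype(np.float32)
q, s, zp = quantize_uint8(w)
max_err = np.abs(w - dequantize(q, s, zp)).max()   # bounded by about one step `s`
```

Storage drops 4x (float32 to uint8), and integer kernels can exploit the low-precision representation directly at inference time.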
Symbolic Transformers: 2.2MB models for logical reasoning. Achieves 47% accuracy with 566K parameters—220× smaller than GPT-2. Proves data quality > model size for symbolic AI. 🔬 Novel base-625 symbolic encoding | 🚀 Edge-deployable | 📊 Open research
Code for paper "Dynamic Deep Neural Network Inference via Adaptive Channel Skipping"
Transformer (GPT) implemented from scratch in C++. Runs on modest hardware with complete mathematical derivations and optimized tensor operations.
A non-Transformer hierarchical recurrent network with differentiable Gumbel-Softmax routing and bounded memory slots. Runs 7B+ parameter models layer-by-layer on low-budget GPUs.
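The differentiable routing mentioned above typically relies on the Gumbel-Softmax relaxation: add Gumbel noise to the routing logits, then apply a temperature-controlled softmax so routing stays differentiable while approaching a hard one-hot choice as the temperature drops. A minimal sketch (not the repo's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits: np.ndarray, tau: float = 1.0) -> np.ndarray:
    """Differentiable relaxation of sampling from a categorical router.

    As tau -> 0 the output approaches a one-hot routing decision;
    larger tau gives a softer, easier-to-train mixture.
    """
    # Gumbel(0, 1) noise via inverse transform sampling
    g = -np.log(-np.log(rng.uniform(1e-10, 1.0, size=logits.shape)))
    y = (logits + g) / tau
    y = np.exp(y - y.max())          # numerically stable softmax
    return y / y.sum()

route = gumbel_softmax(np.array([2.0, 0.5, 0.1]), tau=0.5)  # weights over 3 memory slots
```

During training the soft weights flow gradients to the router; at inference one can take the argmax for a hard, cheap routing decision.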
An open and practical guide to Edge Language
🔬 Curiosity-Driven Quantized Mixture of Experts
MOCA-Net: Novel neural architecture with sparse MoE, external memory, and budget-aware computation. Real Stanford SST-2 integration, O(L) complexity, 96.40% accuracy. Built for efficient sequence modeling.
⚡ Fast, concise, LLM-first Generative UI language
"TRM (Tiny Recursive Model) integration architecture for Symbion.space ecosystem"
Mixture-of-Recursions on a Modernized BERT (Prototype)
QuantLab-8bit is a reproducible benchmark of 8-bit quantization on compact vision backbones. It includes FP32 baselines, PTQ (dynamic & static), QAT, ONNX exports, parity checks, ORT CPU latency, and visual diagnostics.
A deep learning framework that implements Early Exit strategies in Convolutional Neural Networks (CNNs) using Deep Q-Learning (DQN). This project enhances computational efficiency by dynamically determining the optimal exit point in a neural network for image classification tasks on CIFAR-10.
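The early-exit mechanic itself is simple; the repo's contribution is learning the exit policy with DQN. In this sketch a fixed confidence threshold stands in for that learned policy, and the stages/heads are toy stand-ins:

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()

def early_exit_forward(x, stages, exit_heads, threshold=0.9):
    """Run network stages in order; after each, an exit head produces class
    probabilities. If the top probability clears `threshold`, stop early
    and skip the remaining (more expensive) stages."""
    for i, (stage, head) in enumerate(zip(stages, exit_heads)):
        x = stage(x)
        probs = softmax(head(x))
        if probs.max() >= threshold:
            return int(probs.argmax()), i        # exited at stage i
    return int(probs.argmax()), len(stages) - 1  # fell through to the final exit

# toy example: two "stages" and random linear exit heads over 10 classes
rng = np.random.default_rng(1)
W1, W2 = rng.standard_normal((10, 8)), rng.standard_normal((10, 8))
stages = [np.tanh, lambda x: np.tanh(2 * x)]
exit_heads = [lambda x: W1 @ x, lambda x: W2 @ x]
pred, exit_idx = early_exit_forward(rng.standard_normal(8), stages, exit_heads)
```

Replacing the threshold test with a Q-value comparison ("exit now" vs. "continue") recovers the DQN-controlled variant.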
An awesome list of papers, datasets, and tools for efficient sensor-based Human Activity Recognition (HAR), with a focus on lightweight and edge-friendly deep learning.
Dynamic Attention Mask (DAM) generates adaptive sparse attention masks per layer and head for Transformer models, enabling long-context inference with lower compute and memory overhead, without fine-tuning.
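One simple way to build an adaptive sparse mask is per-query top-k selection over the attention scores — keep only the strongest keys for each query and mask the rest. This is a generic sketch of that idea, not DAM's actual mask-generation rule:

```python
import numpy as np

def topk_sparse_mask(scores: np.ndarray, keep: int) -> np.ndarray:
    """For each query row, keep only the `keep` highest-scoring keys.

    The resulting boolean mask lets attention skip the masked entries,
    cutting the effective cost of the L x L score matrix.
    """
    L = scores.shape[0]
    mask = np.zeros_like(scores, dtype=bool)
    top = np.argsort(scores, axis=-1)[:, -keep:]   # top-k key indices per query
    mask[np.arange(L)[:, None], top] = True
    return mask

rng = np.random.default_rng(0)
scores = rng.standard_normal((16, 16))             # toy attention scores
mask = topk_sparse_mask(scores, keep=4)            # 75% of entries masked out
```

Applying the mask means setting masked scores to -inf before the softmax, so masked keys receive zero attention weight.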
🌟 Build efficient models with Transformer Hierarchical Layers for powerful text processing and enhanced performance in natural language tasks.