🔥 Deep Learning Cave: A Stone Age Retreat

"You don't need to understand how it works, just use the API."
- Horrible advice I once believed


🗿 The Origin Story

Building AI from First Principles

A Tale of Modern Cavemen

It all started with horrible advice.

"Just use the API," they said. "Why build from scratch?" they asked. "It's already solved!" they proclaimed. "You don't need to understand neural networks to use them!" they confidently assured me.

And I believed them.

For months, I happily typed:

model = API.get_magic_ai("gpt-9000-ultra-mega")
result = model.generate("solve world hunger")

Life was good. Until it wasn't.

When the model failed, I stared at error messages like ancient hieroglyphics. When it hallucinated, I shrugged and tweaked the prompt 47 times. When asked "But how does attention work?" in a meeting, I froze like a deer in headlights and mumbled something about "tokens" and "weights" before excusing myself to the bathroom.

I had become an API archaeologist: digging through documentation, praying to the gods of Stack Overflow, and offering sacrifices to the error log deities.

One fateful day, during my 3 AM debugging session (fueled by questionable coffee ☕ and existential dread), I had an epiphany:

"What if... I actually learned how this thing works?"

Revolutionary, I know.

And thus began my journey back to the cave. Not to rediscover fire, but to rediscover how deep learning actually works. Chiseling neural networks onto stone tablets. Building Transformers with my bare hands. Creating CNNs from raw PyTorch ore.

Like my ancestors who didn't just use fire but learned to create it, I decided to retreat to fundamentals. No black boxes. No magic. No more horrible advice. Just pure mathematics, code, and the stubborn determination to understand every neuron, every gradient, every backprop.

This repository is that retreat. A stone age sanctuary where I build everything from scratch. Where "state-of-the-art" means understanding the art, not just using it. Where cave paintings become architecture diagrams, and stone tools become tensor operations.

Am I reinventing the wheel? Yes. But I'm learning why wheels are round, why square wheels fail, and how to craft better wheels for tomorrow.

Could I just use frameworks? Absolutely. But then I'd still be taking horrible advice: being a user instead of a creator, a consumer instead of a craftsman.

So grab your chisel 🔨 (the PyTorch kind), light your torch 🔥, and join me in this ancient-modern cave. Let's carve deep learning into stone, one implementation at a time.

Welcome to the retreat. Welcome to the cave. 🏔️


🌟 About This Repository

Welcome to Deep Learning Cave, my stone age retreat for mastering AI from first principles!

This isn't just another tutorial repository. It's a sanctuary for learning where we abandon modern conveniences and build everything from scratch. From basic neural networks to cutting-edge Transformers, from simple perceptrons to LLaMA architectures.

No fluff. No hand-waving. Just pure implementation.

Every line of code is explained. Every architecture decision is justified. Every notebook is executable. Every concept is built from raw materials.


🎯 What You'll Learn

By exploring this cave together, we'll master:

✅ PyTorch fundamentals - The bedrock of modern deep learning
✅ Neural network primitives - From perceptrons to deep architectures
✅ Computer vision - CNNs, ResNets, Vision Transformers (coming soon)
✅ Natural language processing - RNNs, Transformers, LLaMA
✅ Modern architectures - Attention mechanisms, normalization techniques
✅ Training strategies - Optimizers, schedulers, regularization
✅ Production patterns - From research code to deployable models

Target Audience: Stone age learners who refuse horrible advice. Anyone who wants to truly understand AI, not just use it.


πŸ—ΊοΈ Learning Expeditions

This cave has many chambers, each teaching a different aspect of deep learning:

🔥 Chamber 1: The Fundamentals (Complete)

PyTorch Essentials → pytorch_functions_overview.ipynb

Master the 20 core PyTorch concepts essential for deep learning

  • Sections 1-8: Foundation (tensors, embeddings, attention mechanics)
  • Sections 9-16: Architecture (residuals, FFN, training loops)
  • Sections 17-20: Advanced (einsum, inference optimization)

Each section includes:

  • 🎯 What it does → 🔧 Why it matters → 💻 Code → 💡 Key insight

Deep Neural Networks → Included in pytorch_functions_overview.ipynb

Complete DNN training example with modern techniques (a condensed sketch follows the list below)

  • Multi-layer perceptrons
  • Batch normalization & dropout
  • Adam optimizer & training loops
  • Train/validation splits
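
To make the recipe concrete before you open the notebook, here is a condensed sketch of that training pattern: an MLP with batch normalization and dropout, trained with Adam on a train/validation split. It uses toy random data purely for illustration and is not the notebook's exact code.

import torch
import torch.nn as nn

# Toy data stands in for a real dataset: 1,000 samples, 20 features, 3 classes.
X, y = torch.randn(1000, 20), torch.randint(0, 3, (1000,))
X_train, y_train, X_val, y_val = X[:800], y[:800], X[800:], y[800:]  # train/validation split

model = nn.Sequential(
    nn.Linear(20, 64), nn.BatchNorm1d(64), nn.ReLU(), nn.Dropout(0.2),
    nn.Linear(64, 3),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    model.train()                              # dropout active, batch-norm stats updating
    optimizer.zero_grad()
    loss = loss_fn(model(X_train), y_train)
    loss.backward()
    optimizer.step()

    model.eval()                               # inference behavior for validation
    with torch.no_grad():
        val_acc = (model(X_val).argmax(dim=1) == y_val).float().mean()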

πŸ›οΈ Chamber 2: Classical Architectures (Coming Soon)

Convolutional Networks → cnn_from_scratch.ipynb

Visual pattern recognition

  • Convolution operations
  • Pooling layers
  • ResNet architecture
  • Image classification

Recurrent Networks → rnn_from_scratch.ipynb

Sequential data processing

  • Vanilla RNNs
  • LSTMs & GRUs
  • Sequence-to-sequence models
  • Text generation

🦙 Chamber 3: Transformer Architectures (Complete)

Vanilla Transformer → transformer_from_scratch.ipynb

The "Attention Is All You Need" revolution

  • ✅ Complete encoder-decoder implementation
  • ✅ Multi-head attention from scratch (its scaled dot-product core is sketched below)
  • ✅ Sinusoidal positional encoding
  • ✅ Position-wise feed-forward networks
  • ✅ Layer normalization and residual connections

Key Learning: Understanding the foundational architecture that started it all.
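
The centerpiece is attention. As a taste of what the notebook builds (multi-head attention is several of these running in parallel over projected inputs), here is a minimal scaled dot-product attention sketch; the names and shapes are illustrative, not the notebook's exact code.

import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, seq_len, d_k); mask is 0 where attention is disallowed.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)    # each query's distribution over keys
    return weights @ v

q = k = v = torch.randn(2, 5, 64)              # hypothetical batch of 5-token sequences
out = scaled_dot_product_attention(q, k, v)    # (2, 5, 64)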

Modern LLaMA → llama_from_scratch.ipynb & llama_complete.ipynb

State-of-the-art language models

  • ✅ RoPE (Rotary Position Embeddings) - Better position encoding
  • ✅ RMSNorm - More efficient normalization than LayerNorm (sketched below)
  • ✅ Grouped Query Attention (GQA) - Memory-efficient attention
  • ✅ SwiGLU - Advanced activation function
  • ✅ Character-level tokenization - Simple but effective
  • ✅ Complete training pipeline - From data to generation

Key Learning: How modern LLMs differ from the original Transformer and why.
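
Of these components, RMSNorm is compact enough to show whole. A minimal sketch (details such as the eps value may differ from the notebook): unlike LayerNorm, it skips mean subtraction and the bias term, dividing only by the root-mean-square of the features.

import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learned per-feature gain

    def forward(self, x):
        # Divide by the RMS over the last dimension; no mean subtraction, no bias.
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return self.weight * x * rms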

Vision Transformers → vit_from_scratch.ipynb ✅

Transformers for computer vision

  • ✅ Patch embeddings - Images as sequences (sketched below)
  • ✅ Self-attention for images - Global receptive field
  • ✅ Learned positional encoding - 1D positions for 2D images
  • ✅ CLS token classification - Global feature aggregation
  • ✅ Attention visualization - See what ViT looks at

Key Learning: How to apply Transformers to vision tasks without convolutions.
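
The key move is the first one: turning an image into a token sequence. A minimal patch-embedding sketch, assuming the standard ViT-Base sizes (224x224 images, 16x16 patches, 768-dim embeddings) rather than whatever the notebook uses:

import torch
import torch.nn as nn

# A stride-16 conv with a 16x16 kernel embeds each non-overlapping patch.
patch_embed = nn.Conv2d(3, 768, kernel_size=16, stride=16)

imgs = torch.randn(8, 3, 224, 224)             # (batch, channels, H, W)
x = patch_embed(imgs)                          # (8, 768, 14, 14): one vector per patch
x = x.flatten(2).transpose(1, 2)               # (8, 196, 768): a sequence of patch tokens

From here the CLS token is prepended, positional embeddings are added, and the rest is a standard Transformer encoder.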

🧠 Chamber 4: Self-Supervised Learning (Complete)

I-JEPA → jepa_from_scratch.ipynb ✅

Yann LeCun's vision for the future of AI

  • ✅ Multi-block masking - Semantic region prediction
  • ✅ EMA target encoder - Stable learning without collapse (update rule sketched below)
  • ✅ Predictor network - Narrow Transformer for latent prediction
  • ✅ Smooth L1 loss - No pixel targets, no contrastive pairs, just representations
  • ✅ Linear probing - Evaluate learned features

Key Learning: Predict abstract representations, not pixels β€” the next paradigm in self-supervised learning.
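
The anti-collapse trick is worth seeing in code: the target encoder is never trained by backprop; its weights are an exponential moving average (EMA) of the context encoder's. A minimal sketch of that update, with a toy linear layer standing in for the real encoder and 0.996 as an assumed momentum value:

import copy
import torch
import torch.nn as nn

encoder = nn.Linear(16, 16)                    # stand-in for the context encoder
target_encoder = copy.deepcopy(encoder)        # target starts as an exact copy
for p in target_encoder.parameters():
    p.requires_grad_(False)                    # no gradients ever flow to the target

@torch.no_grad()
def ema_update(momentum=0.996):
    # Target weights drift slowly toward the online encoder's weights.
    for p, p_t in zip(encoder.parameters(), target_encoder.parameters()):
        p_t.mul_(momentum).add_(p, alpha=1.0 - momentum)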

🎓 Chamber 5: Advanced Techniques (Planned)

Optimization Strategies

  • Adam, AdamW, Lion optimizers
  • Learning rate schedules
  • Gradient accumulation (previewed in the sketch below)
  • Mixed precision training
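
This chamber is still being carved, but as a preview, here is a minimal sketch of gradient accumulation with AdamW and gradient clipping on toy data: the loss is divided by the number of micro-batches so the accumulated gradients average out to one large-batch step.

import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
loss_fn = nn.CrossEntropyLoss()
accum_steps = 4                                # assumed: effective batch = 4 micro-batches

optimizer.zero_grad()
for step in range(8):                          # toy loop over micro-batches
    x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
    loss = loss_fn(model(x), y) / accum_steps  # scale so accumulated grads average
    loss.backward()                            # gradients add up across micro-batches
    if (step + 1) % accum_steps == 0:
        nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        optimizer.step()
        optimizer.zero_grad()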

Regularization Methods

  • Dropout variations
  • Data augmentation
  • Label smoothing
  • Weight decay

Model Compression

  • Quantization (8-bit, 4-bit)
  • Pruning techniques
  • Knowledge distillation
  • LoRA fine-tuning (previewed in the sketch below)
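
As another preview of this planned chamber, a minimal LoRA sketch: the base weight is frozen and a low-rank update (with hypothetical rank and layer sizes) is the only thing that trains. B starts at zero, so the adapted layer initially behaves exactly like the frozen one.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # y = base(x) + scale * (x A^T) B^T, with only A and B trainable.
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)            # freeze the pretrained weight
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(512, 512))        # hypothetical layer sizes
out = layer(torch.randn(4, 512))               # (4, 512)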

🔬 Chamber 6: Research Frontiers (Future)

Efficient Architectures

  • Flash Attention
  • Linear attention variants
  • State Space Models (Mamba)
  • Mixture of Experts

Multi-Modal Learning

  • CLIP architecture
  • Text-to-image models
  • Cross-modal attention

Advanced Self-Supervised

  • V-JEPA (Video prediction)
  • Hierarchical JEPA
  • World models

📂 Cave Layout (Repository Structure)

deep-learning-cave/
│
├── 1. pytorch_functions_overview.ipynb   # 20 essential PyTorch concepts + DNN example
├── 2. transformer_from_scratch.ipynb     # Vanilla Transformer (Vaswani et al., 2017)
├── 3. llama from scratch.ipynb           # Modern LLaMA implementation
├── 4. vit_from_scratch.ipynb             # Vision Transformer (Dosovitskiy et al., 2020)
├── 5. jepa_from_scratch.ipynb            # I-JEPA self-supervised learning (Assran et al., 2023)
│
├── requirements.txt                       # Project dependencies
├── llama_checkpoint.pt                    # Trained model checkpoint
│
├── assets/
│   └── origin.jpg                         # Origin story image
│
└── .github/
    └── copilot-instructions.md            # Cave coding guidelines

🚀 Starting Your Retreat

Prerequisites

# Python 3.8+
# Install dependencies
pip install -r requirements.txt

# Or manually:
pip install torch torchvision numpy matplotlib scikit-learn

Enter the Cave

  1. Clone the repository:

git clone https://github.com/yourusername/deep-learning-cave.git
cd deep-learning-cave

  2. Start at the cave entrance (fundamentals):

    • Open pytorch_functions_overview.ipynb
    • Learn the ancient art of tensors and neural networks

  3. Explore deeper chambers:

    • Build your first Transformer: 2. transformer_from_scratch.ipynb
    • Master modern architectures: 3. llama from scratch.ipynb
    • Learn vision Transformers: 4. vit_from_scratch.ipynb
    • Explore self-supervised learning: 5. jepa_from_scratch.ipynb

  4. Carve your own path:

    • Modify examples to test understanding
    • Break things and fix them
    • Compare classical vs modern approaches

💡 Cave Philosophy

✨ Our Stone Age Principles

  • Build everything from scratch - No external AI libraries (except PyTorch)
  • Understand every line - No magic, no "just trust me"
  • Progressive mastery - Start simple, earn complexity
  • Executable knowledge - Run and modify every example

🎓 Learning by Chiseling

  • Carve, don't copy - Implement, don't just read
  • Break things - Modify code, see what happens
  • Ask "why" - Every design choice has a reason
  • Compare eras - Classical vs modern approaches

🔧 Craftsman Patterns

  • Proper training rituals - Gradient clipping, checkpointing, validation splits
  • Sacred geometry - Shape checking, dimension tracking
  • Tool mastery - Temperature sampling, beam search, optimization (temperature sampling is sketched after this list)
  • Cave paintings - Visual diagrams, step-by-step traces
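
One of those tools in full: a minimal temperature-sampling sketch (the function name and vocabulary size are illustrative). Lower temperatures sharpen the distribution toward greedy decoding; higher temperatures flatten it toward uniform randomness.

import torch

def sample_next_token(logits, temperature=1.0):
    # logits: (vocab_size,) unnormalized scores for the next token.
    if temperature <= 0:
        return int(logits.argmax())            # greedy decoding as the limit case
    probs = torch.softmax(logits / temperature, dim=-1)
    return int(torch.multinomial(probs, num_samples=1))

logits = torch.randn(1000)                     # hypothetical 1,000-token vocabulary
token = sample_next_token(logits, temperature=0.8)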

🗿 Stone Tablets (Learning Paths)

🟢 Apprentice (Beginner)

Just arrived at the cave, knows basic Python

  1. pytorch_functions_overview.ipynb (sections 1-8)
  2. Run and modify the DNN example
  3. Build transformer_from_scratch.ipynb step-by-step
  4. Experiment with small modifications

Time investment: 2-3 weeks
Milestone: Successfully train a simple neural network

🟡 Craftsman (Intermediate)

Comfortable with PyTorch, ready for architectures

  1. Complete 1. pytorch_functions_overview.ipynb (all 20 sections)
  2. Build 2. transformer_from_scratch.ipynb independently
  3. Compare vanilla Transformer with 3. llama from scratch.ipynb
  4. Understand modern improvements (RoPE, GQA, SwiGLU)
  5. Build 4. vit_from_scratch.ipynb for vision understanding

Time investment: 1-2 months
Milestone: Implement Transformer without reference

🔴 Master (Advanced)

Deep understanding, ready to innovate

  1. Master all notebooks in the cave
  2. Implement 5. jepa_from_scratch.ipynb - self-supervised learning
  3. Optimize for speed and memory
  4. Contribute new tutorials or chambers

Time investment: 3-6 months
Milestone: Create a novel architecture variation


🤝 Join the Tribe (Contributing)

This cave grows with each visitor! Contributions welcome:

  • πŸ› Fix broken stones β€” Found a bug? Patch it!
  • πŸ“ Improve cave paintings β€” Better explanations
  • πŸŽ“ Add new chambers β€” New architectures or techniques (CNNs, RNNs, etc.)
  • πŸ’‘ Share wisdom β€” Better teaching methods

Open an issue to discuss major expeditions.


🌟 Support This Retreat

If this cave helped you, please:

  • ⭐ Star this repository - Help others find the cave
  • 🔄 Share your journey - Tell your tribe
  • 💬 Provide feedback - What chamber should we build next?
  • 🪨 Contribute - Add your own stone tablets

📫 Find Me (The Cave Elder)

I carved this cave to make deep learning accessible to myself and others. Let's connect!

LinkedIn · GitHub

Open to:

  • 💼 Collaborating on educational AI projects
  • 🎤 Speaking about deep learning fundamentals
  • 💬 Discussing the stone age approach to learning
  • 🏔️ Organizing learning retreats

📚 Ancient Scrolls (References)

Sacred Texts (Papers)

  • Vaswani et al. (2017). Attention Is All You Need.
  • Dosovitskiy et al. (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale.
  • Touvron et al. (2023). LLaMA: Open and Efficient Foundation Language Models.
  • Assran et al. (2023). Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture (I-JEPA).

📄 Cave Laws (License)

MIT License - Share the knowledge freely, like cave paintings.


πŸ™ Gratitude to Fellow Travelers

  • Vaswani et al. for the Transformer revolution
  • Meta AI for open-sourcing LLaMA
  • PyTorch team for the ultimate stone age tools
  • The open-source tribe for endless learning resources
  • Every learner who refuses horrible advice

🎯 Expedition Status

Current Phase: ✅ Core chambers complete (PyTorch, Transformers, LLaMA, ViT, JEPA)
Next Expedition: 🚧 Building CNN and RNN chambers (Classical Architectures)
Long-term Vision: 🌟 Complete stone age retreat covering all deep learning


Carved with ❤️ by a stone age learner, for stone age learners

"In the beginner's mind there are many possibilities, in the expert's mind there are few." β€” Shunryu Suzuki

πŸ”₯πŸ—ΏπŸ”οΈ
