davidkimai
💭
optimizing my reward function

Pinned

  1. Context-Engineering Public

    "Context engineering is the delicate art and science of filling the context window with just the right information for the next step." — Andrej Karpathy. A frontier, first-principles handbook inspi…

    Python 7.1k 792

  2. quant-lab Public

    An AI Quant Trading Lab

    Python 2

  3. universal-deep-research Public

    A minimal practical implementation of NVIDIA's Universal Deep Research system. https://arxiv.org/abs/2509.00244

    Python 4

  4. aisecforge Public

    AISecForge is an experimental open-source resource for systematic adversarial testing, evaluation, and security hardening of large language models. This repository consolidates novel methodologies …

  5. RL101 Public

    Agentic Reinforcement Learning 101. A pragmatic course for AI/ML Engineers based on "The Landscape of Agentic Reinforcement Learning for LLMs: A Survey" https://arxiv.org/abs/2509.02547

    Roff 4

  6. google-logistic-regression Public

    A minimal, dataset-agnostic implementation for training, evaluating, and deploying logistic regression models. Built for enterprise production with configuration-driven experiments, comprehensive m…

    Python 1