pyshka501/README.md

Konstantin Pchelin (pyshka501) 🚀

Machine Learning Researcher | Reinforcement Learning | LLM Systems



🧠 About Me

ML researcher focused on reinforcement learning and LLM-based systems.

  • Research: RL, RLHF, agent systems, vulnerability discovery
  • Building: production-grade ML systems & high-performance inference
  • Teaching: ML & RL courses with 500+ students

I work at the intersection of:

  • theory (RL, optimization, convergence)
  • systems (LLM infra, high-load serving)
  • applied research (agents, code intelligence, security)

🌐 Personal website: https://mountainai.tech/


🔬 Research Interests

  • Reinforcement Learning (from bandits → PPO / GRPO / RLHF)
  • LLM agents & tool-use systems
  • Offline / Online RL for language models
  • Structural reasoning & context reconstruction in code

⚙️ Selected Work

  • 📊 RL Atlas: interactive visualization of RL algorithms
    → https://github.com/pyshka501/mountainai-rl-atlas

  • 🤗 HuggingFace Spaces (interactive demos & experiments)
    → https://huggingface.co/pyshka501/spaces

  • 🧪 Research on blind vulnerability discovery with LLMs
    (context reconstruction as core bottleneck)

  • ⚡ High-performance LLM inference systems
    (latency optimization, multi-token prediction, scaling)


🛠 Tech Stack

ML / Research

  • PyTorch, RL (TD, MC, PG, PPO, RLHF)
  • Transformers, LLM fine-tuning, evaluation
  • Statistical modeling & optimization

Systems / Infra

  • vLLM, Triton, high-load inference
  • Docker, FastAPI, distributed systems
  • Performance optimization & scaling

Data

  • pandas, NumPy, scikit-learn
  • Large-scale dataset processing

📚 Teaching

Author and instructor of multiple courses:

  • Reinforcement Learning: from Bandits to PPO/GRPO & RLHF
  • Machine Learning (fundamentals → production)
  • NLP & Semantic Search

📈 Courses launched and scaled to 500+ students

  • Mathematical rigor + intuition
  • Practical assignments & real implementations
  • Focus on modern ML systems and real-world use cases

🤝 Open to Collaboration

  • RL / LLM research
  • Agent systems & reasoning
  • ML infra & optimization
  • Early-stage AI products

📬 Contact


Pinned

  1. mountainai-rl-atlas (Public)

    Interactive platform for exploring and visualizing reinforcement learning algorithms, from tabular methods to deep RL and RLHF. Compare methods, tune hyperparameters, and analyze training dynamics…

    Python 6

  2. rl-textbook (Public)

    Reinforcement Learning: From Bandits to LLM Alignment. Open textbook with 17 chapters, Colab notebooks, and exercises.

    TeX 65 7