winstonsmith1897/README.md

Marco Simoni

Ph.D. in Artificial Intelligence | LLM Alignment & Foundation Models

I hold a Ph.D. in Artificial Intelligence from Sapienza University of Rome, specializing in Reinforcement Learning (RL) for Large Language Model (LLM) post-training and alignment. My research and engineering work focuses on developing Transformer-based Foundation Models and building autonomous reasoning agents.


Core Focus

My work focuses on:

  • LLM Alignment & RL: Designing policy optimization algorithms (PPO, GRPO, GTPO) to mitigate LLM policy collapse and improve reasoning capabilities.
  • LLM Architecture & Retrieval: Building LLM architectures from scratch (Sparse MoE, RoPE, GQA) and implementing complex RAG frameworks for knowledge extraction.
  • AI for Cybersecurity: Engineering Foundation Models for cyber attack prediction and modeling Knowledge Graphs for Cyber Threat Intelligence (CTI).

Technical Skills

  • Machine Learning & Frameworks: Python, PyTorch, TensorFlow, JAX/Flax, HuggingFace, LangChain, Unsloth, vLLM, TRL.
  • Cybersecurity & Data: NetworkX, MITRE ATT&CK, MBC, CAPEC, Metasploit, Pwngdb, SQL, MongoDB, Neo4j.
  • DevOps & Tools: Docker, Linux, Git.

Featured Projects

  • GTPO: Trajectory-Based Policy Optimization: Designed a KL-free policy optimization algorithm for LLM post-training that mitigates policy collapse. It boosted reasoning performance by up to 15% on OOD benchmarks (AIME2024, AIME2025, AMC) compared to GRPO.
  • MORSE: Mixture-of-RAG-Security-Experts: Developed a dual-cascaded RAG framework with 7 parallel retrievers tailored for cybersecurity Q&A. It outperformed GPT-4 by 15% in response accuracy for general and multi-hop cybersecurity questions.
  • DantinoX: From-Scratch LLM: Built an LLM architecture from scratch in JAX/Flax featuring Sparse MoE, RoPE, attention gating, and GQA. Improved hardware throughput and memory efficiency through Sliding Window Attention, Static KV Caching, and Gradient Checkpointing.
  • TITAN: Context-Aware Reasoning for CTI: Architected a Knowledge Graph reasoning framework for Cyber Threat Intelligence, automating complex threat analysis by modeling relationships across IoCs, TTPs, and CVEs.

Research Experience

  • CNR-IIT & NetGroup | AI Researcher: Engineered an LLM-driven framework that automates the translation of natural language requirements into structured XACML access control policies.
  • Horus Project | AI Researcher: Architected and trained a custom Transformer-based Foundation Model from scratch, designed for proactive cyber-attack prediction.

Technical Writing


Contact & Links

Popular repositories

  1. GTPO (Public)

    Group-relative Trajectory-based Policy Optimization: Increasing Quality and Training Stability

    Jupyter Notebook · 40 stars · 1 fork

  2. DantinoX (Public)

    DantinoX: A modular, memory-efficient Transformer implementation in JAX/Flax NNX. Includes Sparse MoE, GQA, Sliding Window Attention, Gradient Accumulation and Checkpointing

    HTML · 4 stars · 1 fork

  3. GRPO-Vulnerability-Detection (Public)

    Code and experiments for the paper ‘Improving LLM Reasoning for Vulnerability Detection via Group Relative Policy Optimization’

    Python · 3 stars · 1 fork

  4. Self-Supervised-BERT-Transformer-for-Android-Malware-Detection-and-Categorization (Public)

    Python · 2 stars · 1 fork

  5. FantaBookApp (Public)

    Large Scale Project

    Java

  6. VortexGPU-Report (Public)

    Review on Vortex GP-GPU

    2