jogonba2/jogonba2

Bio 🌱

Hi!👋😊

I'm José, an NLP researcher deeply passionate about exploring the possibilities of natural language processing. My PhD focused on summarization and attention-based models, but my work spans a wide range of NLP topics, including 📚 [Zero- and few-shot] Text Classification, 📜 Automatic Summarization, 😊 Sentiment and Emotion Analysis, 🌟 Figurative Language Understanding, 🗣️ Dialogue Systems, 📄 Information Extraction, and 🤖 Machine-Generated Multimodal Content Detection.

Since 2016, my research has centered on the intersection of deep learning and NLP, striving to develop efficient solutions for complex language challenges. I'm also dedicated to advancing NLP for Spanish and co-official languages in Spain, working on initiatives that bridge linguistic and technological gaps.

Over the years, I've been an active participant in shared tasks across a variety of NLP domains. I was part of the winning teams in several competitions, including TASS 2017 to 2020, IroSVA, COSET, and SemEval 2024 Task 8. I’ve also achieved strong results in other SemEval editions: 2017, 2018 (1), 2018 (2), and 2019.

In 2023, I began organizing a series of shared tasks at the Iberian Languages Evaluation Forum (IberLEF) on machine-generated text detection and attribution: AuTexTification, IberAuTexTification, and MIMIC. I also served on the program committee for the GenAI content detection task at COLING 2025, and I am one of the three organizers of IberLEF from 2025 to 2027.

Outside of research, I’m passionate about teaching. I currently teach courses on information retrieval, intelligent agents, and programming at Universidad Europea, as well as advanced machine learning techniques in the Master’s in Big Data program at Universidad de Barcelona. I am also a recurring invited speaker in the Master's in Artificial Intelligence at the UPV, where I give a talk on language modeling and embeddings.

I'm also proud to share that my PhD thesis was awarded cum laude and received the best NLP thesis award from the Spanish Society for Natural Language Processing.

Works 👨🏻‍🔧

Here are some of my projects with public source code, along with a few publications from over the years:

| Work | Repo | Paper | Journal/Conference |
| --- | --- | --- | --- |
| BERT for tweets before HuggingFace's era | Link | Link | Neurocomputing |
| Hierarchical attention-based models for summarization | Link, Link | Link, Link, Link | Intelligent & Fuzzy Systems, IberSpeech 2022 |
| Spanish and Catalan datasets for summarization | Link | Link | NAACL |
| Source summary entity aggregations in abstractive summarization | Link | Link | COLING |
| Transformer-based contextualization for irony detection | Link | Link | Information Processing & Management |
| LLMixtic, winning system at SemEval 2024 Task 8 | Link | Link | Proceedings of SemEval 2024 |
| TextMachina, a framework to build MGT datasets | Link | Link | KES 2024 |
| Text & Multimodal machine-generated content detection & attribution | Link, Link, Link | Link, Link | SEPLN |
| IberBench, a benchmark of LLMs in Iberian languages | Link | Link | Computer Speech & Language |
| Copy mechanism for Transformers | Link | N/A | N/A |
| MinGRU implementation | Link | N/A | N/A |
| Tuning LLMs by Proxy implementation | Link | N/A | N/A |
| Implementation of Group Relative Policy Optimization from DeepSeek R1-zero | Link, Link | N/A | N/A |

Stack and stats 🛠️

Python   Torch   TensorFlow   Transformers   SentenceTransformers   spaCy   scikit-learn   Pandas  

LangChain   LangGraph   LlamaIndex  

Triton   vLLM   Docker   Azure   Streamlit   FastAPI  

Git   GitHub   LaTeX  

GitHub Stats   Top Langs

Reach me! 🤙

I look forward to collaborating in any area of NLP. Feel free to reach me through LinkedIn, Google Scholar, ResearchGate, and HuggingFace!
