Implementation of an interactive chatbot for summarizing legal and policy documents. Includes data preprocessing (cleaning, tokenization, chunking), extractive summarization baselines, and fine-tuned abstractive models (PEGASUS and LED). Integrates a retrieval layer for document relevance and uses ROUGE, BLEU, and cosine similarity for evaluation.
This project uses OCR and machine learning to extract CBC values from lab reports and predict urgency levels. It currently supports image and PDF inputs, manual corrections, and SHAP explainability. Suited to medical AI, healthcare OCR, and automated lab report analysis.
NLP text-mining dashboard for exploring current trends and extracting the most-used keywords from software engineering and data science articles. Tech stack: Django, Python, PostgreSQL, HTML/CSS, JavaScript, Docker, AWS
A multi-source supplier risk dataset combining KG scores, semantic similarity, and external disruption signals, intended for supply chain resilience research. Useful for researchers working on supplier risk assessment, multi-objective optimization, or reinforcement learning in dynamic sourcing environments.
Social network analysis and content moderation system that detects harmful content, identifies influencers, and analyzes information spread using graph neural networks and natural language processing.
Advanced OCR and document understanding system that extracts, classifies, and analyzes complex documents. Handles tables, forms, invoices, and contracts using transformer-based models and layout understanding.
A web application that extracts text from PDF files, processes it with OpenAI's GPT-4o model to identify structured information, and displays the results in a tabular format with export options.
How to build a baby-BERT: I analyze BiLSTMs combined with Conditional Random Fields for Named Entity Recognition and contrast a Neural-CRF tagger against a baseline BiLSTM model, exploring how probabilistic sequence dependencies improve contextual understanding beyond token-level classification.
An NLP-integrated visual novel game that redefines the mystery-solving genre with dynamic deduction. It goes beyond static puzzles and branching narratives: the project focuses on creating a living investigation in which player input, processed via Natural Language Processing (NLP), genuinely drives the narrative.
KeywordX is a lightweight Python library for extracting and matching keywords from text using semantic similarity and entity-based boosting. Perfect for NLP pipelines, chatbots, search systems, and event extraction.
Enhancing Transparency in Medical Text Transcription Classification: Assessing BERT Pre-trained Language Model Performance and Decision-Making Using Explainable AI (XAI) Techniques
Corpus-based study using Latent Dirichlet Allocation (LDA) to analyze how "innovation" is contextualized in English and Spanish company documents with NLP techniques.