Thanuja B S
Senior System Engineer - Infosys Limited | Aspiring Data Scientist / AI/ML Engineer
thanuja1bs2000@gmail.com | +91 8431605597 | LinkedIn | Bangalore
Profile Summary
Passionate engineer with 3.2 years of experience specializing in AI and ML, seeking to contribute technical expertise and innovative
solutions to a dynamic team. Proven experience in developing AI solutions, generating synthetic data, and building RAG pipelines.
Professional Experience
Infosys (April 2022 - Present)
Role: Senior System Engineer – AI/ML
Roles and Responsibilities:
1. Synthetic Data Generation:
• Created a synthetic data generator to overcome test data limitations, ensuring compliance with GDPR and PII regulations.
• Leveraged Azure Document Intelligence for document parsing and Azure Custom Vision for image-based classification models.
• Applied OpenCV to overlay synthetic data onto original templates for various client projects.
2. Chatbot Development with RAG:
• Built a Retrieval-Augmented Generation (RAG) pipeline for a personalized document chatbot, achieving 90% accuracy in responses.
• Supported multi-modal inputs and multi-tenancy for enhanced document search and retrieval.
• Reduced manual processing efforts by 40% by providing AI-generated citations.
3. Document Analysis and Natural Language Processing (NLP):
• Integrated NLP solutions with LangChain’s Map Reduce Chain for summarizing legal contracts and financial reports.
• Conducted similarity searches on document embeddings, improving retrieval accuracy.
4. AI Security and Compliance:
• Implemented prompt defense techniques to ensure AI solutions adhere to responsible and secure AI practices.
• Ensured compliance with ethical AI standards and reduced risk in AI-driven decision-making processes.
5. Project Management and Leadership:
• Collaborated with cross-functional teams to manage tasks and lead data science projects.
• Demonstrated ability to handle multiple responsibilities, including data preprocessing, feature engineering, and model deployment.
6. Legacy Code Documentation:
• Developed an AI Assistant to automate documentation for legacy code, reducing documentation time by 50%.
• Generated method, class, and application-level summaries across multiple programming languages (Python, Java, C++).
7. AI Solution Development:
• Collaborated on and optimized AI solutions using machine learning algorithms.
• Worked with transformer-based and YOLO models for various deep learning tasks, including image and text processing.
Technical Skills
Programming Languages: Python, HTML, CSS
Libraries and Frameworks: Pandas, NumPy, Matplotlib, Flask, Pytest, OpenCV, LangChain
IDEs: VS Code, Google Colab, Jupyter Notebook
MS-Office: Word, Excel, PPT
Databases: MySQL, MongoDB
Data Visualization: Power BI
Project Management Tools: Azure
Vector Databases: FAISS, Chroma DB, Weaviate
Others: AI, ML, DI, NLP, LLM, GenAI
Soft Skills
• Excellent Communication
• Problem-Solving
• Analytical Thinking and Time-Management
Certifications
• Azure 900 Certified | Issued by Microsoft
• Azure 204 Certified | Issued by Microsoft
• Azure 400 Certified | Issued by Microsoft
Key Projects
1. Infosys Document Intelligence Platform:
• Developed a personalized document chatbot with a Retrieval-Augmented Generation (RAG) pipeline, supporting multi-modal inputs,
multi-tenancy, and chat history, and achieving 90% accuracy in query responses. Provided AI-generated citations, which reduced manual effort by 40%.
• Ensured the AI solution adhered to responsible and secure AI practices by implementing prompt defense techniques.
• Utilized Azure Document Intelligence for document parsing and analysis.
2. Infosys Cognitive Tagging Platform:
• Integrated a Python-based summarization microservice into ICTP (a Java application) for classification and entity extraction.
• Employed LangChain’s Map Reduce Chain to implement context-aware, sequential summarization techniques for legal contracts and
financial reports.
3. Infosys Quality Automation Group Migration Assistant:
• Developed an AI Assistant to generate documentation for legacy code applications, reducing manual documentation time by 50%.
• Created method, class, module, and application-level summaries, and built an insight graph.
• Supported multiple programming languages, including Python, Java, C++, Markdown, and HTML.
4. Synthetic Data Generation POC for Bank Client:
• Created a Synthetic Data Generator to tackle the challenge of limited test data availability, complying with GDPR and PII restrictions.
• Leveraged Azure Document Intelligence for document parsing and Azure Custom Vision for image-based classification models.
• Generated synthetic images using DALL-E and SDXL, and applied OpenCV to overlay synthetic data onto original templates.
5. Solution Creation Project:
• Worked on deploying Commercial Off-The-Shelf (COTS) products like Alfresco, Nuxeo, and OpenText in Microsoft Azure.
• Assisted in setting up and managing Azure Kubernetes Service (AKS) clusters for containerized deployments.
• Gained hands-on experience with basic Kubernetes configuration and deployment using YAML and CLI.
• Supported application deployment, monitoring, and basic troubleshooting in a cloud-based environment.
6. Intelligent Clause Matching and Gap Analysis Using NLP & LLMs:
• Extracted key clauses from contract and submission documents using spaCy NLP pipelines, enabling structured comparison of legal and
technical content.
• Employed a hybrid approach combining TF-IDF vectorization and semantic embeddings to identify and match clauses with high accuracy.
• Integrated a Large Language Model (LLM) to analyze the priority and criticality of missing clauses, supporting decision-making during
contract reviews.
• Automated the clause comparison workflow, significantly improving the speed and consistency of document analysis.
Awards and Recognitions
• Infosys Insta Award, July 2023
• Infosys WOW Award, Q3 FY24
• Infosys Rise Award, Q3 FY24
Education
B.E. (Bachelor of Engineering), Electrical and Electronics Engineering | Rajarajeshwari College of Engineering (2017-2021) | CGPA: 8.3