AI Mini Project
SUBMITTED BY:
Adi Golandaj
Roll No.: 02, Division: A
This is to certify that the Mini Project report entitled Artificial Intelligence
Tech Assistant has been submitted by Adi Golandaj, Roll No: 02, Division: A, a
bonafide student of this institute, and that the work has been carried out under the
supervision of Prof. Reshma Naiknaware. It is approved for the partial fulfillment
of the requirements for the Third Year degree in Computer Engineering,
Savitribai Phule Pune University.
Place: Pune
Date:
Movie Recommender Computer Engineering
Abstract
In today’s digital ecosystem, where multimedia content is vast and ever-expanding, delivering
personalized experiences has become crucial for user engagement. Recommendation systems
play a vital role in tailoring content to individual preferences, helping users navigate through
massive libraries of options. This project aims to develop a movie recommendation system using
Python’s scikit-learn library. The model utilizes a content-based filtering approach, which
focuses on analyzing the descriptive metadata of movies—including cast, genres, director, and
keywords—rather than relying on collaborative filtering based on user ratings or behavior. This
method ensures relevant recommendations, particularly when user interaction data is limited or
nonexistent.
The dataset used in this project was acquired from a public source and includes multiple features
for each movie entry, such as the title, cast list, genre tags, directing credits, and associated
keywords. These features were chosen due to their strong representation of a movie’s core content
and thematic structure. Before building the recommendation engine, the data underwent
preprocessing to handle missing or null values, which were replaced with empty strings to
maintain uniformity. These selected features were then concatenated into a single textual string
for each movie to form a comprehensive content descriptor.
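The preprocessing step described above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the field names in FEATURES and the sample records are assumptions made for the example, and the report does not state the exact column names of the dataset.

```python
# Hypothetical field names; the report lists cast, genres, director and keywords.
FEATURES = ["keywords", "cast", "genres", "director"]

def combine_features(movie: dict) -> str:
    """Build one text descriptor from the selected metadata fields.

    Missing or None values are first replaced with empty strings,
    then the non-empty fields are joined with single spaces.
    """
    parts = [(movie.get(name) or "") for name in FEATURES]
    return " ".join(part.strip() for part in parts if part.strip())

# Illustrative records, not taken from the project's dataset.
movies = [
    {"title": "Movie A", "keywords": "space war", "cast": "Actor One",
     "genres": "Action SciFi", "director": "Director X"},
    {"title": "Movie B", "keywords": None, "cast": "Actor Two",
     "genres": "Drama"},  # the director field is missing entirely
]

descriptors = [combine_features(m) for m in movies]
```

Treating a missing field as an empty string keeps every movie in the dataset, at the cost of a shorter descriptor for sparsely documented entries.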
To enable machine learning models to interpret the text, the CountVectorizer tool from the
scikit-learn library was used. This tool transforms textual data into a numerical count matrix
that records how often each term appears in each movie’s descriptor. After vectorizing the data,
the cosine similarity metric was applied to measure how closely related any two movies are,
based on the angle between their respective vectors in multi-dimensional space. This metric is
especially effective in text analysis, as it accounts for the direction and composition of word
usage rather than the sheer number of words.
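To make the idea concrete, the sketch below computes term counts and cosine similarity using only the Python standard library. It is a simplified stand-in for scikit-learn's CountVectorizer and cosine similarity, not a reproduction of them: the whitespace tokenizer and the function names count_vector and cosine_similarity are assumptions chosen for the example.

```python
import math
from collections import Counter

def count_vector(text: str) -> Counter:
    """Map each lowercase, whitespace-separated token to its count."""
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty descriptor is treated as similar to nothing
    return dot / (norm_a * norm_b)

v1 = count_vector("action scifi space")
v2 = count_vector("action scifi space action scifi space")  # same words, twice as long
v3 = count_vector("romance drama")

same_direction = cosine_similarity(v1, v2)  # close to 1.0 despite the length difference
no_overlap = cosine_similarity(v1, v3)      # 0.0, no shared terms
```

The first pair shows why the metric suits text: doubling the length of a descriptor leaves the angle unchanged, so only the mix of terms matters.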
The recommendation process is activated when a user inputs the title of a movie they like. The
system identifies the corresponding index of this movie within the dataset, calculates similarity
scores between it and all other movies, sorts these scores in descending order, and retrieves the
top five most similar entries. The result is a list of movie recommendations that share similar
themes, genres, or creative contributors with the selected movie. This ensures that recommendations
are not only relevant but also aligned in tone, style, or subject matter.
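The lookup, scoring, sorting, and selection steps can be sketched as follows. This is a hedged illustration that assumes a precomputed similarity matrix stored as a list of lists; the function name recommend, the sample titles, and the scores are hypothetical. The report does not say whether the selected movie itself is excluded from the results, so the exclusion here is an assumption.

```python
def recommend(title, titles, similarity, top_n=5):
    """Return the top_n titles most similar to `title`, best match first."""
    index = titles.index(title)  # raises ValueError for an unknown title
    scores = enumerate(similarity[index])
    # Drop the query movie itself, then sort by score in descending order.
    ranked = sorted((s for s in scores if s[0] != index),
                    key=lambda s: s[1], reverse=True)
    return [titles[i] for i, _ in ranked[:top_n]]

# Illustrative data only; a real matrix would come from the similarity step.
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
similarity = [
    [1.0, 0.8, 0.1, 0.5],
    [0.8, 1.0, 0.3, 0.2],
    [0.1, 0.3, 1.0, 0.4],
    [0.5, 0.2, 0.4, 1.0],
]

top_two = recommend("Movie A", titles, similarity, top_n=2)
```

The default of five matches the number of recommendations described above; the example asks for two only because the sample catalogue is small.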
Acknowledgement
I would like to take this opportunity to express my heartfelt gratitude to all the individuals,
institutions, and resources that have played a vital role in the successful completion of
this project titled “Movie Recommendation System using Scikit-learn.” This journey has been
a fulfilling learning experience, and it would not have been possible without the unwavering
support, valuable guidance, and numerous contributions from various quarters.
First and foremost, I am deeply grateful to my project supervisor, Prof. Reshma Naiknaware,
for their constant encouragement, expert guidance, and constructive feedback throughout every
stage of this project. Their insightful suggestions and timely interventions helped me gain
clarity and maintain direction, particularly during critical phases of development and analysis.
The mentorship I received was instrumental in shaping the technical foundation and analytical
approach of this work.
I extend my sincere thanks to the creators and maintainers of the comprehensive movie dataset
available on GitHub. This dataset, rich in metadata including keywords, cast information, genres,
and director names, served as the backbone for building and evaluating the content-based
filtering model. Without access to such high-quality and well-structured data, the practical
implementation of this recommendation system would have faced significant limitations.
I am also profoundly thankful to the open-source community for developing and maintaining
the Python libraries that powered this project. Libraries such as Scikit-learn, Pandas, NumPy,
Matplotlib, and others provided robust tools for data preprocessing, feature extraction, model
building, and visualization. Their simplicity, versatility, and extensive documentation enabled
smooth and efficient development, even when dealing with complex data manipulation and
similarity computations.
In addition, I wish to acknowledge the immense support provided by global developer communities
like Stack Overflow, GitHub Discussions, and various technical blogs and forums.
These platforms acted as invaluable repositories of collective knowledge, where I found quick
solutions, relevant code snippets, and in-depth discussions on issues that arose during debugging
and optimization. The spirit of collaboration and knowledge-sharing that exists in these
communities is truly inspiring and greatly benefited this project.
Adi Golandaj
(T.E. COMPUTER ENGG.)
Table of Contents
1 Introduction
1.1 Background
1.2 Problem Statement
1.3 Purpose of the Project
2 Objectives
2.1 Primary Objective
2.2 Specific Objectives
2.3 Expected Outcomes
4 Program Output
5 Future Scope
5.1 Model Enhancement
5.2 System Enhancements
5.3 User Experience Improvements
5.4 Scalability and Deployment
5.5 Security and Privacy
5.6 Integration with Other AI Systems
6 Limitations
6.1 Model Limitations
6.2 Limited Interactivity
6.3 Dependence on External API
6.4 Scalability Challenges
6.5 Security and Privacy Concerns
6.6 Dependence on Structured Input
6.7 Limited Adaptability
7 Conclusion
8 References
Chapter 1
Introduction
1.1 Background
In today’s fast-paced world, the use of Artificial Intelligence (AI) has revolutionized multiple
industries. One such application of AI is in the field of IT support, where AI systems help
troubleshoot common technical problems faced by users. The rise in complexity of everyday
devices and the increasing dependence on technology have resulted in a greater demand for
efficient and accessible support solutions. Traditional IT help desks rely on human intervention
to resolve issues, but this process can be time-consuming and limited by availability. An AI-
driven solution offers an immediate and scalable alternative to provide real-time assistance and
troubleshooting support.
Chapter 2
Objectives
• Efficiently assisting users with common technical problems without the need for human
intervention.
• Reducing response times and enhancing user satisfaction by providing instant support.
The project will also lay the foundation for future developments in AI-driven support systems
for a wide range of applications.
Chapter 3
3.1 Overview
The project aims to develop an AI-driven IT support assistant that utilizes the Zephyr 7B model
hosted on Hugging Face. The system provides real-time troubleshooting solutions to users
by analyzing reported IT issues and generating appropriate responses. The system follows a
conversational structure to simulate an AI helpdesk assistant that can process and respond to
user input effectively.
“You are a helpful IT support assistant. A user has reported the following issue::
user input:”
This format is designed to guide the model in generating a focused response related to the issue
at hand.
3. The model generates a troubleshooting response, which is parsed and displayed to the
user.
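The prompt construction and response parsing described in this section might look like the sketch below. It makes no network call and is not the project's actual code. The template wording is reconstructed loosely from the prompt quoted above, and the response shape (a list holding a dictionary with a generated_text key that may echo the prompt) is an assumption about the hosted text-generation endpoint. The names PROMPT_TEMPLATE, build_prompt, and extract_answer are hypothetical.

```python
# Assumed template, loosely based on the prompt quoted in this section.
PROMPT_TEMPLATE = (
    "You are a helpful IT support assistant. "
    "A user has reported the following issue:\n{issue}\n"
)

def build_prompt(issue: str) -> str:
    """Insert the user's issue into the fixed instruction template."""
    return PROMPT_TEMPLATE.format(issue=issue.strip())

def extract_answer(response, prompt: str) -> str:
    """Return only the model's answer from an assumed response shape.

    Assumes `response` looks like [{"generated_text": "..."}] and that the
    generated text may begin with a copy of the prompt, which is removed.
    """
    text = response[0]["generated_text"]
    if text.startswith(prompt):
        text = text[len(prompt):]
    return text.strip()

prompt = build_prompt("  My PC is overheating ")
# Stand-in for a real API response, used here for illustration only.
fake_response = [{"generated_text": prompt + " Clean the fans and check airflow."}]
answer = extract_answer(fake_response, prompt)
```

Keeping the template in one constant makes it easy to adjust the instruction wording without touching the parsing logic.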
Chapter 4
Program Output
Figure 4.1: Initializing the AI IT Help Desk and showing the welcome message
Figure 4.2: User input for reporting an IT issue (e.g., ‘My PC is overheating’)
Figure 4.4: The Hugging Face access token used for this mini project
Chapter 5
Future Scope
The current implementation of the AI IT Help Desk Assistant is a functional and effective
solution for addressing IT support queries, but there are several areas where improvements
can be made to enhance the system’s capabilities and broaden its applicability. In the future,
the system can be expanded in various ways, from enhancing the AI model to improving user
interaction methods and increasing the scalability of the platform.
• Advanced Model Integration: The system could integrate more advanced models like
GPT-4 or domain-specific models for IT troubleshooting to provide more nuanced and
accurate solutions.
• Custom Fine-Tuning: By fine-tuning the model with data specifically related to IT issues,
the system can generate more precise solutions tailored to common technical challenges.
These system enhancements would allow the AI assistant to handle a larger volume
of requests while maintaining the efficiency of human technicians.
• Voice Interaction Support: Enabling voice-based queries and solutions would make
the system more user-friendly, especially for non-technical users and those who need
hands-free interaction.
• Integration with IT Ticketing Systems: The system could automatically escalate unresolved
issues by creating tickets in IT management platforms, streamlining workflows.
• Personalized Solutions: The system could learn from previous interactions to provide
customized solutions, increasing efficiency and user satisfaction.
• Feedback Integration: Implementing a feedback system would allow users to rate responses,
helping the AI learn from its mistakes and continuously improve.
• Cloud Deployment: Transitioning the system to the cloud would improve scalability
and availability, making it more efficient and capable of handling a larger user base.
• Mobile Application: Developing a mobile app would provide users with the flexibility
to use the AI assistant from their smartphones, extending the reach of the system.
• Data Encryption: End-to-end encryption would ensure that all data exchanged between
users and the AI system is secure.
• User Authentication: Adding user authentication could restrict access to sensitive information
and provide a more secure environment for users.
• Anomaly Detection: By integrating anomaly detection models, the assistant could proactively
address potential IT issues before they escalate.
In conclusion, the future of the AI IT Help Desk Assistant holds significant potential for
growth and improvement. By incorporating advanced models, enhancing user interactions,
expanding scalability, and ensuring robust security, the system can become a more powerful
and versatile tool for IT support across various industries.
Chapter 6
Limitations
While the AI IT Help Desk Assistant offers a useful and efficient solution for addressing IT
issues, there are several limitations that must be acknowledged. These constraints present areas
for improvement and offer insights into the challenges encountered during the development of
the system. Despite its capabilities, the system is not free from shortcomings that may impact
its overall effectiveness.
• Context Understanding: The model sometimes lacks a deep understanding of the context
or complexity of the issues being discussed. This can result in responses that are
overly simplistic or fail to capture the full scope of the problem.
• Dependence on Training Data: Since the model is trained on general data, its ability to
provide IT-specific solutions is limited by the quality and relevance of the data it has been
exposed to. As such, it might not be as effective as a human technician with hands-on
experience.
• Lack of Real-Time Data Access: The assistant does not have direct access to the user’s
system or device to gather real-time data (e.g., system logs, error messages). Without
such access, the AI can only offer general advice, limiting its ability to diagnose and
resolve issues more effectively.
• API Downtime and Reliability: Since the system relies on an external API, any downtime
or interruptions in service could disrupt the AI assistant’s functionality, affecting its
availability.
• API Rate Limiting: There may be restrictions on the number of API calls that can be
made within a specific time frame. This limitation could affect the speed and efficiency
of the assistant, especially during periods of high user demand.
• Latency: The communication with an external server introduces latency, which may
cause slight delays in receiving responses from the AI model. This could impact user
experience, particularly when users are seeking rapid assistance.
• Resource Constraints: Handling a larger volume of users may require additional computational
resources, particularly when multiple requests are being processed simultaneously.
Cloud infrastructure could alleviate this issue, but it would require additional
investment and management.
• Handling Diverse IT Issues: As the user base grows, the diversity and complexity of
IT issues may increase. The current system may not be able to handle every possible
scenario, especially if the issues fall outside the domain of the model’s training data or
the available knowledge base.
• Data Privacy: The system processes user inputs to generate solutions, and while the
data is not stored long-term, there may be concerns regarding the privacy of sensitive
information shared by users, such as personal details or private company data.
• Lack of Authentication: The current system does not include authentication or access
control, meaning that anyone with access to the system can submit requests. This poses
potential security risks, especially in sensitive organizational environments.
• Unstructured Queries: The system might struggle to provide accurate solutions if the
user input is unclear or lacks sufficient detail. For example, vague statements like “My
computer is broken” may result in the assistant offering generic troubleshooting steps
that may not apply to the specific issue.
• Misunderstanding User Intent: The assistant may not always fully understand the
user’s intent, especially when dealing with complex or multi-step IT issues. This could
lead to irrelevant or incomplete solutions being suggested.
• No Continuous Learning: The model does not improve or adapt based on past interactions,
meaning that the quality of responses remains static. A feedback loop that enables
continuous learning would help improve the assistant’s ability to provide accurate solutions.
• No Personalization: The assistant currently lacks the ability to remember user preferences
or customize responses based on past issues, which could limit its effectiveness in
providing personalized solutions.
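One common way to soften the API downtime and rate-limiting limitations listed above is to retry failed requests with exponential backoff. The sketch below is not part of the current system. The function name call_with_retries and its parameters are hypothetical, and it catches every exception only to stay short; a real client would retry only on timeouts or rate-limit errors.

```python
import time

def call_with_retries(request, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call request() until it succeeds or max_attempts is reached.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    The last failure is re-raised so the caller can report it to the user.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except Exception:  # simplification: a real client would be more selective
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)

# Illustration with a stand-in request that fails twice before succeeding.
calls = {"count": 0}

def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("service busy")
    return "ok"

delays = []  # records the waits instead of actually sleeping
result = call_with_retries(flaky_request, sleep=delays.append)
```

Passing the sleep function as a parameter keeps the example fast to run and makes the waiting behaviour easy to inspect.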
In conclusion, while the AI IT Help Desk Assistant provides valuable support for IT troubleshooting,
it has several limitations that need to be addressed in future versions. Improvements
in the areas of model capabilities, interactivity, scalability, and security will help elevate
the system to a more advanced level, ensuring its utility across a wider range of use cases and
environments.
Chapter 7
Conclusion
The AI IT Help Desk Assistant project aimed to develop a practical, text-based solution to
assist users with IT troubleshooting. By leveraging the Hugging Face API, specifically the
zephyr-7b-beta model, the system was able to generate responses for a wide range of user-
reported IT issues. Despite several limitations, the project has demonstrated the potential of
artificial intelligence in automating problem-solving tasks and providing users with immediate,
automated support.
Throughout the development of the system, key challenges such as ensuring accurate responses,
handling a variety of user inputs, and integrating an external API were addressed. The
use of the Hugging Face API has proven effective for generating high-quality responses, though
the system remains dependent on the API’s availability and reliability. Moreover, the project
highlights the importance of structured input for the model to function optimally, while also
underlining the need for future improvements, such as enhanced interactivity and adaptability.
The primary strength of this project lies in its simplicity and scalability. As a prototype,
the AI IT Help Desk Assistant provides a solid foundation for further development. Future
iterations could expand its capabilities by incorporating more specialized models, integrating
multimedia support, and enhancing the personalization of responses. Moreover, implementing
features like real-time system data access, feedback loops, and continuous learning could
significantly improve the system’s performance and user experience.
In conclusion, while this project represents a significant step toward automating IT support
through artificial intelligence, there are several areas for improvement. Future research and
development can address these limitations, ultimately leading to a more robust and versatile IT
help desk assistant capable of providing even more accurate and efficient solutions. The project
offers valuable insights into the practical applications of machine learning and AI in real-world
problem-solving scenarios, particularly in the realm of IT support.
Chapter 8
References
1. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser,
Ł., and Polosukhin, I. (2017). Attention is all you need. Proceedings of the 31st International
Conference on Neural Information Processing Systems, 5998-6008.
2. Devlin, J., Chang, M. W., Lee, K., and Toutanova, K. (2018). BERT: Pre-training of
deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
3. Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I. (2018). Improving
language understanding by generative pre-training. OpenAI Blog.
4. Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., and
Amodei, D. (2020). Language models are few-shot learners. Proceedings of Advances
in Neural Information Processing Systems, 33, 1877-1901.
5. Hugging Face (2021). Hugging Face: A platform for AI and machine learning models.
Retrieved from: https://huggingface.co/
6. Sutskever, I., Vinyals, O., and Le, Q. V. (2014). Sequence to sequence learning with
neural networks. Advances in Neural Information Processing Systems, 27.
7. Kingma, D. P., and Ba, J. (2014). Adam: A method for stochastic optimization. Proceedings
of the 3rd International Conference on Learning Representations.