SURVEY SPARROW: ASSIGNMENT
Problem Statement: Design a system to personalize survey questions
dynamically based on previous user responses, optimizing the survey
experience and response quality.
Dataset: https://www.kaggle.com/datasets/tunguz/big-five-personality-test?resource=download
Five iterations were carried out in total; they are explained below.
Project Overview:
The objective of this project was to design a dynamic survey system that
personalizes the flow of questions based on user responses. By adjusting the
survey path in real-time, the system aims to enhance user engagement and
improve the quality of the data collected. The project was based on the "Big 5
Personality Test" dataset from Kaggle.
Methodology:
1. Data Preprocessing:
o The dataset was first cleaned by checking for null values and
categorizing questions into five personality traits: Openness,
Conscientiousness, Extraversion, Agreeableness, and Neuroticism.
o An initial exploratory analysis was conducted to understand the
distribution of ratings across these traits.
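The preprocessing step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the question columns carry a three-letter trait prefix (EXT, EST, AGR, CSN, OPN) followed by a question number, and the tiny DataFrame and helper names are hypothetical stand-ins for the Kaggle file.

```python
import pandas as pd

# Assumed mapping from column prefix to trait; adjust to the real column names.
TRAIT_PREFIXES = {
    "OPN": "Openness",
    "CSN": "Conscientiousness",
    "EXT": "Extraversion",
    "AGR": "Agreeableness",
    "EST": "Neuroticism",
}


def clean_responses(df: pd.DataFrame) -> pd.DataFrame:
    """Drop every row that contains at least one null value."""
    return df.dropna().reset_index(drop=True)


def group_questions_by_trait(columns) -> dict:
    """Map each trait name to the question columns sharing its prefix."""
    groups = {trait: [] for trait in TRAIT_PREFIXES.values()}
    for col in columns:
        trait = TRAIT_PREFIXES.get(col[:3])
        if trait is not None:
            groups[trait].append(col)
    return groups


def trait_means(df: pd.DataFrame, groups: dict) -> pd.DataFrame:
    """Average rating per trait for each respondent (traits with no columns are skipped)."""
    return pd.DataFrame(
        {trait: df[cols].mean(axis=1) for trait, cols in groups.items() if cols}
    )


# Hypothetical three-respondent sample standing in for the real dataset.
sample = pd.DataFrame(
    {
        "EXT1": [4, 2, None],
        "EXT2": [5, 1, 3],
        "OPN1": [3, 3, 4],
    }
)

print(sample.isnull().sum())  # null count per column
cleaned = clean_responses(sample)
groups = group_questions_by_trait(cleaned.columns)
scores = trait_means(cleaned, groups)
print(scores.describe())  # quick look at the rating distribution per trait
```

Dropping incomplete rows is only one possible choice; imputing missing ratings would keep more respondents at the cost of some added noise.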
2. Clustering and Classification:
o KMeans Clustering: Applied KMeans clustering to segment the
respondents into five distinct personality clusters. This
unsupervised learning method grouped individuals based on their
responses, revealing patterns within the data.
o Random Forest Classification: A Random Forest classifier was
also trained to predict personality clusters with an accuracy of
around 89%. This supervised approach provided a secondary
validation of the clustering results.
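A minimal sketch of the clustering and classification step, using scikit-learn. The response matrix here is random synthetic data (an assumption made so the example runs on its own), so the printed accuracy will not match the roughly 89% reported for the real dataset; the hyperparameters are likewise illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in: 500 respondents x 50 questions, ratings from 1 to 5.
X = rng.integers(1, 6, size=(500, 50)).astype(float)

# Unsupervised step: segment respondents into five clusters.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)

# Supervised step: train a classifier to predict the cluster from the responses.
X_train, X_test, y_train, y_test = train_test_split(
    X, cluster_labels, test_size=0.2, random_state=0
)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Held-out accuracy: {accuracy:.2f}")
```

Because the classifier's targets are the KMeans assignments themselves, the accuracy measures how well the forest reproduces the cluster boundaries on unseen respondents, not agreement with an independent ground truth.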
3. Dynamic Questioning:
o Iteration 1: Implemented basic dynamic questioning by skipping
questions if certain responses met predefined thresholds.
o Iteration 2: Refined the skipping logic based on response
thresholds (e.g., ratings below 2 or above 4), leading to a more
personalized and efficient survey experience.
o Iteration 3: Attempted to integrate NLP models, but the numerical
nature of the dataset limited the effectiveness of this approach.
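The threshold-based skipping described in Iterations 1 and 2 could look something like the sketch below. The report does not spell out which questions are skipped, so this example assumes one plausible rule: once a rating below 2 or above 4 is given for a trait, the remaining questions for that trait are skipped. The function names and the sample questions are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

LOW_THRESHOLD = 2   # ratings below this count as a decisive low answer
HIGH_THRESHOLD = 4  # ratings above this count as a decisive high answer


def is_decisive(rating: int) -> bool:
    """True when a rating falls outside the 2-4 band (1 or 5 on a 1-5 scale)."""
    return rating < LOW_THRESHOLD or rating > HIGH_THRESHOLD


def run_survey(
    questions: List[Tuple[str, str]], ask: Callable[[str], int]
) -> Dict[str, int]:
    """Ask questions in order, skipping the rest of a trait after a decisive rating.

    questions: (question_id, trait) pairs in presentation order.
    ask: callable that returns the 1-5 rating for a question id.
    """
    answers: Dict[str, int] = {}
    settled_traits = set()
    for question_id, trait in questions:
        if trait in settled_traits:
            continue  # assumed rule: trait already settled, skip this question
        rating = ask(question_id)
        answers[question_id] = rating
        if is_decisive(rating):
            settled_traits.add(trait)
    return answers


# Hypothetical question list and scripted responses for demonstration.
questions = [
    ("EXT1", "Extraversion"),
    ("EXT2", "Extraversion"),
    ("OPN1", "Openness"),
    ("OPN2", "Openness"),
]
scripted = {"EXT1": 5, "EXT2": 3, "OPN1": 3, "OPN2": 4}

answers = run_survey(questions, scripted.__getitem__)
print(answers)  # EXT2 is never asked because EXT1 was rated 5
```

In this run the respondent answers three of the four questions; a rating of exactly 2 or 4 does not trigger a skip under the thresholds as stated.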
4. Web Application Development:
o Developed a user-friendly web application using Flask, allowing
users to interact with the dynamic survey system online.
o The application generates a dynamic question flow, minimizes
irrelevant questions, and provides real-time feedback on the user’s
personality cluster, complete with visual analytics.
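To make the web layer concrete, here is a minimal Flask sketch of a dynamic question flow. It is not the project's actual application: the `/next` route, the JSON payload shape, the placeholder question texts, and the skip rule (drop a trait's remaining questions after a rating below 2 or above 4) are all assumptions for illustration. Cluster feedback and visual analytics are left out.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical question bank; the texts are placeholders, not the real items.
QUESTIONS = [
    {"id": "EXT1", "trait": "Extraversion", "text": "Placeholder question 1"},
    {"id": "EXT2", "trait": "Extraversion", "text": "Placeholder question 2"},
    {"id": "OPN1", "trait": "Openness", "text": "Placeholder question 3"},
    {"id": "OPN2", "trait": "Openness", "text": "Placeholder question 4"},
]


def next_question(answers: dict):
    """Return the next unanswered question whose trait is not yet settled."""
    # Assumed rule: a rating below 2 or above 4 settles the whole trait.
    settled = {
        q["trait"]
        for q in QUESTIONS
        if q["id"] in answers and (answers[q["id"]] < 2 or answers[q["id"]] > 4)
    }
    for q in QUESTIONS:
        if q["id"] not in answers and q["trait"] not in settled:
            return q
    return None


@app.route("/next", methods=["POST"])
def next_route():
    """Accept the answers given so far and reply with the next question."""
    payload = request.get_json(silent=True) or {}
    answers = payload.get("answers", {})
    question = next_question(answers)
    if question is None:
        return jsonify({"done": True})
    return jsonify({"done": False, "question": question})
```

Saved as a module, this can be started with Flask's development server (for example `flask --app <module> run`). Posting `{"answers": {"EXT1": 5}}` to `/next` would return OPN1, since the decisive Extraversion rating skips EXT2.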
Findings:
• The dynamic survey system successfully reduces the number of questions
presented to users, improving engagement without compromising the
accuracy of the personality assessment.
• KMeans clustering revealed five distinct personality clusters, each
corresponding to a different combination of personality traits.
• The Random Forest classifier provided robust predictions, reinforcing the
validity of the clustering results.
Recommendations:
• Further Refinement of Dynamic Questioning: Consider incorporating
more sophisticated logic, possibly leveraging reinforcement learning, to
adapt question paths more effectively based on user responses.
• Expand Data Collection: To improve the model’s robustness, additional
data should be collected, particularly across diverse demographics.
• Enhance User Interface: While the current interface is functional, there
is potential to improve user experience through more interactive and
visually appealing designs.
• Explore Advanced Analytics: Investigate the use of more advanced
machine learning models or ensemble methods to potentially increase the
accuracy and interpretability of personality predictions.
Conclusion:
The project demonstrated the potential of dynamic survey systems to enhance
user engagement and data quality. The developed system is a step forward in
creating more personalized and efficient survey experiences, with opportunities
for further improvement and expansion.