An automated Exploratory Data Analysis tool that transforms your raw data into actionable insights within seconds.
- Interactive Data Analysis: Upload your CSV file or use pre-loaded datasets for instant analysis
- Comprehensive Visualizations: Automatic generation of relevant plots and charts
- Statistical Insights: Detailed statistical analysis of your data
- Missing Data Analysis: Visual and numerical representation of missing data patterns
- Correlation Analysis: Interactive correlation heatmaps and scatter plots
- Distribution Analysis: Automated distribution plots for numerical variables
- PCA: Principal Component Analysis with interactive scree plots
- Downloadable Reports: Generate and download comprehensive EDA reports
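
The missing-data and correlation features listed above can be approximated in a few lines of pandas. The sketch below is illustrative only: the `summarize` helper and the sample columns are hypothetical and are not taken from the Auto-EDA source.

```python
import numpy as np
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Return missing-value counts, percentages, and a numeric correlation matrix.

    Hypothetical helper for illustration; not part of Auto-EDA.
    """
    missing_count = df.isna().sum()
    missing_pct = (df.isna().mean() * 100).round(2)
    # Pearson correlation, computed over numeric columns only.
    corr = df.select_dtypes(include="number").corr()
    return {
        "missing_count": missing_count,
        "missing_pct": missing_pct,
        "correlation": corr,
    }


# Made-up sample data with one missing value in each numeric column.
sample = pd.DataFrame(
    {
        "height": [1.0, 2.0, 3.0, np.nan],
        "weight": [2.0, 4.0, np.nan, 8.0],
        "label": ["a", "b", "a", "b"],
    }
)
summary = summarize(sample)
print(summary["missing_count"])
print(summary["correlation"])
```

The correlation matrix is the kind of table a heatmap would be drawn from; how Auto-EDA renders it is not shown here.
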
- Frontend: Streamlit
- Data Processing: Pandas, NumPy
- Visualization: Plotly, Matplotlib, Seaborn
- Machine Learning: Scikit-learn
- Statistical Analysis: SciPy
- Clone the repository
git clone https://github.com/sidesshmore/Auto-EDA.git
cd Auto-EDA
- Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt
- Run the application
streamlit run app.py

The application comes with three pre-loaded datasets:
- Iris Dataset: Perfect for classification tasks
- Titanic Dataset: Great for both classification and EDA
- Boston Housing Dataset: Ideal for regression analysis
- Data Loading: Support for CSV files with automatic type inference
- Basic Statistics: Summary statistics, data types, missing values
- Visualization Suite:
  - Distribution plots
  - Correlation heatmaps
  - Scatter plots
  - Missing data visualizations
  - Categorical data pie charts
- Advanced Analysis:
  - Principal Component Analysis (PCA)
  - Automated report generation
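
For readers curious what sits behind a scree plot, here is a minimal sketch using scikit-learn, which the project lists as a dependency. It assumes the numeric columns are standardized before PCA; the `scree_values` helper and the synthetic data are hypothetical, and the app's actual preprocessing may differ.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def scree_values(df: pd.DataFrame) -> np.ndarray:
    """Explained-variance ratio per principal component (hypothetical helper)."""
    # Keep numeric columns and drop rows with missing values (an assumption).
    numeric = df.select_dtypes(include="number").dropna()
    # Standardize so that columns with large scales do not dominate.
    scaled = StandardScaler().fit_transform(numeric)
    pca = PCA()  # keep all components
    pca.fit(scaled)
    return pca.explained_variance_ratio_


# Synthetic data: two strongly related columns plus independent noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
demo = pd.DataFrame(
    {
        "x": x,
        "x_scaled": 2.0 * x + rng.normal(scale=0.1, size=200),
        "noise": rng.normal(size=200),
    }
)
ratios = scree_values(demo)
print(ratios)
```

Plotting `ratios` against the component index gives the scree plot. Because two of the three columns carry nearly the same signal, the first component accounts for most of the variance.
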
- Visit the application URL or run the app locally
- Choose between demo datasets or upload your own CSV file
- Explore automatically generated visualizations and insights
- Download comprehensive reports for further analysis
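
The upload-and-report flow above can be sketched with pandas alone. This is an illustration under stated assumptions, not the app's implementation: `build_report`, the report layout, and the sample CSV columns are all hypothetical.

```python
import io

import pandas as pd


def build_report(df: pd.DataFrame) -> str:
    """Assemble a plain-text EDA summary (hypothetical format)."""
    sections = [
        f"Rows: {len(df)}, Columns: {df.shape[1]}",
        "Column types:\n" + df.dtypes.to_string(),
        "Missing values:\n" + df.isna().sum().to_string(),
        "Summary statistics:\n" + df.describe().to_string(),
    ]
    return "\n\n".join(sections)


# Stand-in for an uploaded file; read_csv infers each column's type.
csv_text = "age,fare,survived\n22,7.25,0\n38,71.28,1\n,8.05,1\n"
df = pd.read_csv(io.StringIO(csv_text))
report = build_report(df)
print(report)
```

Here `age` is inferred as a float column because it contains a missing value, while `survived` stays an integer. In a Streamlit app, a string like `report` could be passed to `st.download_button`; whether Auto-EDA does this is an assumption.
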