Question:
You have data on customer age and monthly spending at a retail store. How would you
determine if there is a relationship between age and spending? What statistical method would
you use?
Solution:
To determine if there is a relationship between age and monthly spending using statistical
methods, follow these steps:
1. Hypothesis Testing:
  - Null Hypothesis (H_0): There is no linear relationship between age and monthly spending (the population correlation is 0).
  - Alternative Hypothesis (H_1): There is a linear relationship between age and monthly spending (the population correlation is not 0).
2. Pearson Correlation Coefficient:
  - Calculate Pearson's correlation coefficient (r) to measure the linear relationship between age
and spending.
  - Test whether r differs significantly from 0 by computing its p-value.
Sample Data
Here's a small sample dataset:
| Age | Monthly Spending |
|-----|------------------|
| 25  | 200              |
| 30  | 220              |
| 35  | 250              |
| 40  | 270              |
| 45  | 300              |
| 50  | 320              |
| 55  | 350              |
| 60  | 380              |
| 65  | 400              |
| 70  | 420              |
Calculation
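1. Pearson Correlation Coefficient:
  - Compute r as the covariance of age and spending divided by the product of their standard deviations.
2. P-value:
  - Calculate the two-sided p-value to test whether r differs significantly from 0.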
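For reference, the standard definitions behind these two steps can be written out as follows. The symbols x_i (age of customer i), y_i (that customer's monthly spending), and n (number of customers) are notation introduced here for illustration, not taken from the original text.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
t = r\,\sqrt{\frac{n - 2}{1 - r^2}}
```

Under H_0, and assuming the two variables are jointly normal, t follows a Student's t distribution with n - 2 degrees of freedom. The two-sided p-value is the probability of observing a |t| at least as large as the one computed from the sample.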
Let's calculate the Pearson correlation coefficient and the p-value using Python.
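import pandas as pd
from scipy.stats import pearsonr

# Sample data: customer ages and their monthly spending
data = {
    'Age': [25, 30, 35, 40, 45, 50, 55, 60, 65, 70],
    'Monthly Spending': [200, 220, 250, 270, 300, 320, 350, 380, 400, 420]
}
df = pd.DataFrame(data)

# pearsonr returns the correlation coefficient and the two-sided p-value
corr, p_value = pearsonr(df['Age'], df['Monthly Spending'])
corr, p_value  # displayed automatically in a notebook; printed explicitly below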
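As a cross-check, the same quantities can be computed directly from the definition of r and the associated t statistic. This is a minimal sketch that assumes NumPy is available; the variable names (`dx`, `dy`, `r_manual`, `t_stat`, `p_manual`) are illustrative and not part of the original solution.

```python
import numpy as np
from scipy import stats

age = np.array([25, 30, 35, 40, 45, 50, 55, 60, 65, 70], dtype=float)
spending = np.array([200, 220, 250, 270, 300, 320, 350, 380, 400, 420], dtype=float)

# Deviations of each observation from its sample mean
dx = age - age.mean()
dy = spending - spending.mean()

# Pearson's r from its definition
r_manual = np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

# t statistic with n - 2 degrees of freedom, and the two-sided p-value
n = len(age)
t_stat = r_manual * np.sqrt((n - 2) / (1 - r_manual**2))
p_manual = 2 * stats.t.sf(abs(t_stat), df=n - 2)

# Compare with SciPy's built-in result
r_scipy, p_scipy = stats.pearsonr(age, spending)
print(f"manual: r = {r_manual:.4f}, p = {p_manual:.2e}")
print(f"scipy:  r = {r_scipy:.4f}, p = {p_scipy:.2e}")
```

The two rows should agree up to floating-point error, which confirms that `pearsonr` is testing the hypothesis described in the steps above.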
Interpretation
1. Pearson Correlation Coefficient (r):
  - r ranges from -1 to 1.
  - A value close to 1 implies a strong positive relationship.
  - A value close to -1 implies a strong negative relationship.
  - A value close to 0 implies little or no linear relationship, although a nonlinear relationship may still exist.
2. P-value:
  - If the p-value is less than the chosen significance level (typically 0.05), we reject the null hypothesis.
  - This indicates that there is a statistically significant linear relationship between age and monthly spending.
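Based on the sample data and the calculation, we can determine whether there is a significant
relationship between age and monthly spending. The following lines print the results:
# Four decimals keep a near-perfect r from rounding to 1.00;
# scientific notation keeps a tiny p-value from printing as 0.0000
print(f"Pearson's correlation coefficient: {corr:.4f}")
print(f"P-value: {p_value:.2e}")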
Conclusion
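- If the p-value is less than 0.05, r is significantly different from 0, and we can conclude that there is a statistically significant linear relationship between age and monthly spending. The sign and magnitude of r give its direction and strength.
- Otherwise, we fail to reject the null hypothesis: the data do not provide sufficient evidence of a linear relationship, which is not the same as showing that none exists.
- For the sample data above, r is approximately 0.999 and the p-value is far below 0.05, so we reject H_0: in this sample, spending rises with age.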
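The decision rule above can be sketched as a small helper. The function name `interpret_correlation`, its wording, and the example inputs are hypothetical and chosen only to illustrate the two branches; in practice `r` and `p_value` would come from `pearsonr`.

```python
def interpret_correlation(r: float, p_value: float, alpha: float = 0.05) -> str:
    """Return a one-line verdict for a Pearson correlation test (illustrative helper)."""
    if p_value >= alpha:
        # Not enough evidence against H0; this does not prove that no relationship exists
        return "fail to reject H0: no significant linear relationship detected"
    direction = "positive" if r > 0 else "negative"
    return f"reject H0: significant {direction} linear relationship (r = {r:.3f})"


# Made-up inputs that exercise both branches
print(interpret_correlation(0.62, 0.003))
print(interpret_correlation(0.10, 0.41))
```

Keeping `alpha` as a parameter makes the significance level an explicit choice rather than a hard-coded 0.05.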