Stats Solution

Uploaded by

bca40557.21

Question:

You have data on customer age and monthly spending at a retail store. How would you
determine if there is a relationship between age and spending? What statistical method would
you use?

Solution:

To determine if there is a relationship between age and monthly spending using statistical
methods, follow these steps:

1. Hypothesis Testing:
- Null Hypothesis (H_0): There is no linear relationship between age and monthly spending (the
population correlation is 0).
- Alternative Hypothesis (H_1): There is a linear relationship between age and monthly spending
(the population correlation is not 0).

2. Pearson Correlation Coefficient:
- Calculate Pearson's correlation coefficient (r) to measure the linear relationship between age
and spending.
- Test the significance of the correlation coefficient.

Sample Data

Here's a small sample dataset:

| Age | Monthly Spending |
|-----|------------------|
| 25 | 200 |
| 30 | 220 |
| 35 | 250 |
| 40 | 270 |
| 45 | 300 |
| 50 | 320 |
| 55 | 350 |
| 60 | 380 |
| 65 | 400 |
| 70 | 420 |

Calculation

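1. Pearson Correlation Coefficient:
- Compute r = sum((x_i - x_mean)(y_i - y_mean)) / sqrt(sum((x_i - x_mean)^2) * sum((y_i - y_mean)^2)),
where x is age and y is monthly spending.

2. P-value:
- Calculate the p-value to test the significance of the correlation. Under H_0, the statistic
t = r * sqrt(n - 2) / sqrt(1 - r^2) follows a t-distribution with n - 2 degrees of freedom,
where n is the number of customers.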
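Before turning to SciPy, the coefficient can be checked by hand. The sketch below applies the textbook definition of Pearson's r to the sample data using only the standard library. The helper name `pearson_r` is chosen for illustration and is not part of any library.

```python
import math

# Sample data copied from the table above
ages = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
spending = [200, 220, 250, 270, 300, 320, 350, 380, 400, 420]

def pearson_r(x, y):
    """Pearson's r: summed cross-deviations over the product of the spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

r = pearson_r(ages, spending)
print(f"r = {r:.4f}")  # prints "r = 0.9990"
```

Working through the sums once makes it clear that r depends only on how the two variables deviate from their means together, not on their units.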

Let's calculate the Pearson correlation coefficient and the p-value in Python with
scipy.stats.pearsonr.

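import pandas as pd
from scipy.stats import pearsonr

# Sample data: customer age and monthly spending
data = {
    'Age': [25, 30, 35, 40, 45, 50, 55, 60, 65, 70],
    'Monthly Spending': [200, 220, 250, 270, 300, 320, 350, 380, 400, 420]
}
df = pd.DataFrame(data)

# pearsonr returns the correlation coefficient and the two-sided p-value
corr, p_value = pearsonr(df['Age'], df['Monthly Spending'])

# In a notebook, this bare expression displays both values
corr, p_value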
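As a cross-check on the p-value that pearsonr reports, the same two-sided p-value can be derived from the t statistic t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. This is a minimal sketch that assumes SciPy is installed. The variable names `t_stat`, `p_manual`, and `p_scipy` are illustrative.

```python
import math
from scipy.stats import pearsonr, t

# Sample data from the table above
ages = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
spending = [200, 220, 250, 270, 300, 320, 350, 380, 400, 420]

r, p_scipy = pearsonr(ages, spending)

# Convert r into a t statistic with n - 2 degrees of freedom
n = len(ages)
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Two-sided p-value: twice the upper-tail probability of |t|
p_manual = 2 * t.sf(abs(t_stat), df=n - 2)

print(f"t = {t_stat:.2f}")
print(f"p (manual) = {p_manual:.3e}, p (pearsonr) = {p_scipy:.3e}")
```

The two p-values should agree up to floating-point error, which confirms that the significance test is an ordinary t-test on the correlation.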

Interpretation

1. Pearson Correlation Coefficient (r):
- r ranges from -1 to 1.
- A value close to 1 implies a strong positive linear relationship.
- A value close to -1 implies a strong negative linear relationship.
- A value close to 0 implies little or no linear relationship; a nonlinear relationship may
still exist.

2. P-value:
- If the p-value is less than the chosen significance level (typically 0.05), we reject the null
hypothesis.
- This indicates that there is a statistically significant linear relationship between age and
monthly spending.
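The interpretation rules above can be sketched as a small decision helper. The function name `interpret_correlation`, its default `alpha`, and the example inputs are assumptions made for illustration; they are not output from the sample data.

```python
def interpret_correlation(r, p_value, alpha=0.05):
    """Turn r and its p-value into a one-line verdict on H_0."""
    if p_value < alpha:
        # Significant result: report the direction given by the sign of r
        direction = "positive" if r > 0 else "negative"
        return f"Reject H_0: significant {direction} linear relationship (r = {r:.2f})."
    # Not significant: this is absence of evidence, not proof of no relationship
    return "Fail to reject H_0: insufficient evidence of a linear relationship."

# Hypothetical inputs, chosen only to exercise both branches
print(interpret_correlation(0.85, 0.002))
print(interpret_correlation(0.10, 0.78))
```

Keeping the significance level as a parameter makes it explicit that 0.05 is a convention chosen before looking at the data, not a property of the test.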

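Based on the sample data and the calculation, we can determine whether there is a significant
relationship between age and monthly spending. The following statements print the results:

print(f"Pearson's correlation coefficient: {corr:.2f}")
print(f"P-value: {p_value:.4f}")

For this sample, r is approximately 0.999 (printed as 1.00 at two decimal places) and the
p-value is far below 0.05 (printed as 0.0000).

Conclusion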

- If the p-value is less than 0.05, we reject the null hypothesis and conclude that there is a
statistically significant linear relationship between age and monthly spending; the sign and
magnitude of r give its direction and strength.
- Otherwise, we fail to reject the null hypothesis: the data do not provide sufficient evidence
of a linear relationship between the two variables.
