Basic Summation Notation
Below is an expanded and structured guide to the most common statistical formulas, complete
with in‐depth explanations and additional metrics (like effect sizes). Use this as a handy reference
for a wide range of statistical techniques.
1. Sum of values: $\sum_{i=1}^{n} x_i$ (add up all $x_i$).

2. Sum of squares: $\sum_{i=1}^{n} x_i^2$ (add up the squares of each $x_i$).

3. Sum of products: $\sum_{i=1}^{n} x_i y_i$ (add up the products $x_i \cdot y_i$).
Concept: These summations form the building blocks in the formulas for variance, correlation, and
regression.
Sample mean:

$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$
Printed using ChatGPT to PDF, powered by PDFCrowd HTML to PDF API. 1/16
Sample variance:

$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2.$

Interpretation: Measures the average squared deviation from the mean, using $n-1$ in the denominator for an unbiased estimate.

Sample standard deviation: $s = \sqrt{s^2}.$

Interpretation: The typical (average) distance of points from the mean, in the same units as $x$.
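As a quick sanity check, these descriptive formulas translate directly into Python. This is a minimal stdlib-only sketch; the data and function names are illustrative:

```python
import math

def mean(xs):
    # x-bar = (1/n) * sum of x_i
    return sum(xs) / len(xs)

def sample_variance(xs):
    # s^2 = sum((x_i - x-bar)^2) / (n - 1); the n - 1 makes it unbiased
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def sample_sd(xs):
    # s = sqrt(s^2), back in the same units as x
    return math.sqrt(sample_variance(xs))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(mean(data))             # 5.0
print(sample_variance(data))  # 32/7 ≈ 4.571
```

Python's built-in `statistics.mean` and `statistics.stdev` compute the same quantities.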
3. Z‐Test

Hypotheses (two‐sided):

$H_0: \mu = \mu_0, \qquad H_1: \mu \neq \mu_0.$

Test Statistic:

$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}.$
Concept: A z‐test is like a t‐test but uses the population standard deviation σ directly (or uses a
large‐sample approximation).
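A sketch of the z statistic in Python, with a two-sided p-value from the standard-normal CDF; the sample numbers are invented for illustration:

```python
import math

def z_statistic(xbar, mu0, sigma, n):
    # z = (x-bar - mu0) / (sigma / sqrt(n)); sigma is the known population SD
    return (xbar - mu0) / (sigma / math.sqrt(n))

def two_sided_p(z):
    # P(|Z| > |z|) under the standard normal, using the erf-based CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Made-up example: n = 25, sample mean 103, H0: mu = 100, known sigma = 10
z = z_statistic(103, 100, 10, 25)
print(z)               # 1.5
print(two_sided_p(z))  # ≈ 0.134, so H0 is not rejected at alpha = 0.05
```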
4. T‐Tests
Hypotheses:

$H_0: \mu = \mu_0, \qquad H_1: \mu \neq \mu_0.$

Test Statistic:

$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}},$

with df $= n - 1$.
Concept: Compares the observed mean to the hypothesized mean in units of the estimated
standard error s/ n.
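The one-sample t statistic can be sketched the same way (the data are made up; in practice `scipy.stats.ttest_1samp` or similar software also reports the p-value):

```python
import math

def one_sample_t(xs, mu0):
    # t = (x-bar - mu0) / (s / sqrt(n)), with df = n - 1
    n = len(xs)
    xbar = sum(xs) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    return (xbar - mu0) / (s / math.sqrt(n)), n - 1

# Six made-up measurements tested against H0: mu = 5.0
t, df = one_sample_t([5.1, 4.9, 5.3, 5.5, 4.8, 5.2], 5.0)
print(t, df)  # t ≈ 1.265 with df = 5
```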
Hypotheses:

$H_0: \mu_d = 0, \qquad H_1: \mu_d \neq 0.$

Test Statistic:

$t = \frac{\bar{d}}{s_d/\sqrt{n}},$

where $\bar{d}$ and $s_d$ are the mean and standard deviation of the differences.
Concept: Controls for “within‐subject” variability, which often increases power compared to an
independent‐samples test.
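A paired-t sketch: compute the differences first, then treat them as a one-sample problem (the before/after data are invented):

```python
import math

def paired_t(before, after):
    # Per-subject differences d_i = after_i - before_i
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    dbar = sum(d) / n
    sd = math.sqrt(sum((x - dbar) ** 2 for x in d) / (n - 1))
    # t = d-bar / (s_d / sqrt(n)), with df = n - 1
    return dbar / (sd / math.sqrt(n)), n - 1

before = [120, 115, 130, 125, 118]
after = [118, 110, 126, 120, 117]
t, df = paired_t(before, after)
print(t, df)  # t ≈ -4.19 with df = 4
```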
Pooled variance:

$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}.$

Test Statistic:

$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},$

often with $(\mu_1 - \mu_2)_0 = 0$.
Test Statistic:

$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}.$
Concept: The two‐sample t‐test (pooled or Welch) checks if two population means differ, allowing
for either equal or unequal variances.
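Both two-sample variants can be sketched side by side; with equal sample sizes and variances, as in this invented example, the pooled and Welch statistics coincide:

```python
import math

def _mean_var(xs):
    # Returns n, sample mean, and unbiased sample variance
    n = len(xs)
    m = sum(xs) / n
    return n, m, sum((x - m) ** 2 for x in xs) / (n - 1)

def pooled_t(x1, x2):
    # Assumes equal population variances; df = n1 + n2 - 2
    n1, m1, v1 = _mean_var(x1)
    n2, m2, v2 = _mean_var(x2)
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2)), n1 + n2 - 2

def welch_t(x1, x2):
    # No equal-variance assumption; each variance enters separately
    n1, m1, v1 = _mean_var(x1)
    n2, m2, v2 = _mean_var(x2)
    return (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)

g1, g2 = [6.0, 7.0, 8.0, 9.0], [4.0, 5.0, 6.0, 7.0]
print(pooled_t(g1, g2))  # (≈ 2.191, 6)
print(welch_t(g1, g2))   # ≈ 2.191
```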
5. One‐Way ANOVA
Used to compare more than two group means, under an assumption of normality and equal
variances.
Hypotheses:

$H_0: \mu_1 = \mu_2 = \cdots = \mu_a, \qquad H_1:$ at least one $\mu_j$ differs.

Sums of Squares:

$SSB = \sum_{j=1}^{a} n_j (\bar{x}_j - \bar{x})^2, \qquad SSW = \sum_{j=1}^{a} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2.$

Test Statistic:

$F = \frac{MSB}{MSW}, \qquad \text{df} = \left(a - 1,\ \sum_j (n_j - 1)\right).$
Concept: Checks if the variation between group means is significantly larger than random within‐
group variation.
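A one-way ANOVA F ratio, sketched directly from the SSB/SSW definitions (the toy data are chosen so the arithmetic works out exactly):

```python
def one_way_anova_F(groups):
    # groups: list of lists of observations, one inner list per group
    all_x = [x for g in groups for x in g]
    grand = sum(all_x) / len(all_x)
    # Between-groups: SSB = sum_j n_j * (xbar_j - grand mean)^2
    ssb = sum(len(g) * ((sum(g) / len(g)) - grand) ** 2 for g in groups)
    # Within-groups: SSW = sum_j sum_i (x_ij - xbar_j)^2
    ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    a = len(groups)
    df_between, df_within = a - 1, len(all_x) - a
    return (ssb / df_between) / (ssw / df_within), (df_between, df_within)

groups = [[3, 4, 5], [6, 7, 8], [9, 10, 11]]
F, df = one_way_anova_F(groups)
print(F, df)  # 27.0 with df = (2, 6)
```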
Sums of Squares:
SSA: Variation due to factor A (rows).
SSB: Variation due to factor B (columns).
SSE: Residual (error) = SST − SSA − SSB .
Mean Squares: each sum of squares divided by its degrees of freedom (e.g., $MSA = SSA/\mathrm{df}_A$, $MSE = SSE/\mathrm{df}_E$).

Concept: Separates out the effect of each factor while controlling for the other.
3. Chi‐Square Statistic:
$\chi^2 = \sum_{\text{all cells}} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}.$
Concept: Checks whether the pattern of counts deviates from what we’d expect if the two
categorical variables were truly independent.
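A sketch of the chi-square statistic for a contingency table, with expected counts from the independence assumption (row total times column total over the grand total):

```python
def chi_square(observed):
    # observed: 2-D table of counts, as a list of rows
    rows = [sum(r) for r in observed]
    cols = [sum(c) for c in zip(*observed)]
    total = sum(rows)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence
            e = rows[i] * cols[j] / total
            chi2 += (o - e) ** 2 / e
    return chi2

table = [[20, 30], [30, 20]]
print(chi_square(table))  # 4.0
```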
1. Combine the two groups into one set, rank from smallest to largest.
2. Let W = sum of ranks for one group.
3. Compare W to the critical distribution (or compute a p‐value).
Concept: Uses ranks instead of raw values, making it more robust to non‐normal data.
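The rank-sum $W$ from the three steps above can be sketched as follows; average ranks handle ties, and this simple version is fine for small samples:

```python
def rank_sum_W(group1, group2):
    # Step 1: pool both groups and sort smallest to largest
    combined = sorted(group1 + group2)

    def rank(v):
        # Average rank among tied values (ranks are 1-based)
        positions = [i + 1 for i, x in enumerate(combined) if x == v]
        return sum(positions) / len(positions)

    # Step 2: W = sum of the ranks belonging to group1
    return sum(rank(v) for v in group1)

# group1 occupies ranks 1, 3, 5 in the pooled ordering
print(rank_sum_W([1, 3, 5], [2, 4, 6]))  # 9.0
```

Step 3 (significance) still requires tables or software, e.g. `scipy.stats.mannwhitneyu`.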
$r = \frac{n \sum x_i y_i - (\sum x_i)(\sum y_i)}{\sqrt{\left[n \sum x_i^2 - (\sum x_i)^2\right]\left[n \sum y_i^2 - (\sum y_i)^2\right]}}.$
Range: −1 ≤ r ≤ 1.
r ≈ +1 → strong positive association; r ≈ −1 → strong negative; r ≈ 0 → weak linear
relationship.
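Pearson's $r$, computed term by term from the summation formula (toy data lying on an exact line, so $r$ comes out to 1):

```python
import math

def pearson_r(xs, ys):
    # r = [n*Sxy - Sx*Sy] / sqrt([n*Sxx - Sx^2] * [n*Syy - Sy^2])
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    num = n * sxy - sx * sy
    den = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return num / den

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0 (perfectly linear)
```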
1. Slope b:

$b = \frac{n \sum XY - (\sum X)(\sum Y)}{n \sum X^2 - (\sum X)^2}.$

2. Intercept a:

$a = \frac{\sum Y - b(\sum X)}{n}.$

3. Predicted value at $X = x$:

$\hat{Y} = a + b\,x.$

A measure of goodness‐of‐fit:

$R^2 = \frac{SSR}{SST}, \quad \text{where } SSR = \sum (\hat{Y}_i - \bar{Y})^2, \quad SST = \sum (Y_i - \bar{Y})^2.$

Interpretation: $R^2$ is the proportion of the variance in $Y$ explained by the linear model with $X$.
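The slope, intercept, and $R^2$ formulas, sketched on a small invented dataset:

```python
def least_squares(xs, ys):
    # b = [n*SXY - SX*SY] / [n*SXX - SX^2];  a = (SY - b*SX) / n
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    return (sy - b * sx) / n, b

def r_squared(xs, ys, a, b):
    # R^2 = SSR / SST: explained variation over total variation
    ybar = sum(ys) / len(ys)
    ssr = sum((a + b * x - ybar) ** 2 for x in xs)
    sst = sum((y - ybar) ** 2 for y in ys)
    return ssr / sst

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = least_squares(xs, ys)
print(a, b)                     # intercept ≈ 0.05, slope ≈ 1.99
print(r_squared(xs, ys, a, b))  # ≈ 0.997: nearly all variance explained
```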
Standard error of the mean: $SE = \dfrac{s}{\sqrt{n}}.$

Interpretation: How far the sample mean will typically be from the true mean (the smaller, the more precise).

Confidence interval for $\mu$:

$\bar{x} \pm t_{\alpha/2,\,n-1} \times \frac{s}{\sqrt{n}}.$
Interpretation: A range of plausible values for μ. If the same procedure is repeated many
times, ~95% (for α = 0.05) of those intervals would contain the true mean.
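A sketch of the CI computation; note that the critical value must be looked up, so here 2.306 (the tabulated two-sided 95% value for df = 8) is hard-coded rather than computed:

```python
import math

def mean_ci(xs, t_crit):
    # CI = x-bar +/- t_{alpha/2, n-1} * s / sqrt(n)
    n = len(xs)
    xbar = sum(xs) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    half = t_crit * s / math.sqrt(n)
    return xbar - half, xbar + half

data = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3, 9.7, 10.0, 10.2]
lo, hi = mean_ci(data, 2.306)  # t_{0.025, 8} ≈ 2.306 from a t-table
print(lo, hi)  # ≈ (9.89, 10.25)
```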
One‐sample/Paired:

$d = \frac{\bar{x} - \mu_0}{s} \quad \text{or} \quad d = \frac{\bar{d}}{s_d}.$
Two‐sample t‐test:
$d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}.$
One‐Way ANOVA:
$\eta^2 = \frac{SSB}{SST}.$

$\eta^2_{\text{partial}} = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}.$
Concluding Notes
1. Check Assumptions:
Normality for z/t‐tests and ANOVA (or at least approximate normality).
Equal variances for certain t‐tests (pooled) or standard one‐way ANOVA (though it can be
robust).
Random/independent samples whenever required.
2. Tables or Software:
Critical values ($t_{\alpha,\,df}$, $F_{\alpha,\,df_1,df_2}$, $\chi^2_{\alpha,\,df}$) are found in statistical tables or software output.
Feel free to scan for the particular test or concept you need.
1. Summation Notation
You will often see expressions like $\sum_{i=1}^{n} x_i$. This is shorthand for “sum $x_i$ over $i$ from 1 to $n$.”

$\sum_{i=1}^{n} x_i$ means $x_1 + x_2 + \cdots + x_n$.

$\sum_{i=1}^{n} x_i^2$ means $x_1^2 + x_2^2 + \cdots + x_n^2$.

$\sum_{i=1}^{n} x_i y_i$ means $x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$.
Why it matters: These summations are building blocks for almost every other formula in statistics.
2. Descriptive Statistics
2.1 Sample Mean $\bar{x}$

$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$
2.2 Sample Variance $s^2$

$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2.$

Interpretation: Variance represents how spread out your data are. A larger $s^2$ means more spread.

2.3 Sample Standard Deviation $s$

$s = \sqrt{s^2}.$

Interpretation: $s$ is the average distance of data points from the sample mean (roughly).
3. Z‐Test

$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}.$

$\bar{x}$: Sample mean.
$\mu_0$: Hypothesized population mean (under $H_0$).
$\sigma/\sqrt{n}$: The standard error of the mean when $\sigma$ is known.
4. T‐Tests
4.1 One‐Sample t‐Test
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}.$

$\bar{x}$: Sample mean.
$\mu_0$: Hypothesized population mean under the null $H_0$.

4.2 Paired t‐Test

$t = \frac{\bar{d}}{s_d/\sqrt{n}}.$

$d_i$: The difference for the $i$th subject (between two paired measurements).
Interpretation: Measures whether the average difference is significantly different from zero,
controlling for within‐subject variability.
1. Pooled Variance:
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}.$

$s_p^2$: Weighted average of the two group variances (assumes both groups truly have the same population variance).

2. Test Statistic:

$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}.$
$\bar{x}_1, \bar{x}_2$: Sample means in group 1 and group 2.
$\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$: The square‐root term in the standard error of the difference in means.
Interpretation: If ∣t∣ is large, it suggests the two groups have different means. Degrees of freedom
= n1 + n2 − 2.
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}.$
$s_1^2, s_2^2$: Sample variances of group 1 and group 2 (not pooled, because we assume they differ).
$n_1, n_2$: Sample sizes.
The denominator: Standard error for the difference in means under unequal variances.
Degrees of freedom: Calculated via the Welch–Satterthwaite approximation (not simply $n_1 + n_2 - 2$).
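The approximation itself is easy to sketch; with equal variances and equal sample sizes it collapses back to the pooled df (the formula below is the standard Welch–Satterthwaite expression):

```python
def welch_df(s1_sq, n1, s2_sq, n2):
    # df ≈ (s1^2/n1 + s2^2/n2)^2 /
    #      [(s1^2/n1)^2/(n1-1) + (s2^2/n2)^2/(n2-1)]
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Equal variances and sizes recover the pooled df, n1 + n2 - 2 = 18
print(welch_df(4.0, 10, 4.0, 10))
# Unequal variances pull the df below 18
print(welch_df(16.0, 10, 1.0, 10))
```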
5. One‐Way ANOVA

Hypotheses:

$H_0: \mu_1 = \mu_2 = \cdots = \mu_a, \qquad H_1:$ at least one $\mu_j$ differs from the others.
1. Between‐Groups (SSB):
$SSB = \sum_{j=1}^{a} n_j (\bar{x}_j - \bar{x}_{\text{overall}})^2.$

$a$: Number of groups.
$n_j$: Sample size in group $j$.
$\bar{x}_j$: Mean of group $j$.
Interpretation: The variation between group means (how far each group mean is from
the overall mean).
2. Within‐Groups (SSW):
$SSW = \sum_{j=1}^{a} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2.$

$\bar{x}_j$: Mean of group $j$.
Interpretation: The variation within each group (how spread out the data are inside each
group).
3. Total (SST): $SST = SSB + SSW.$

4. F Statistic:

$F = \frac{MSB}{MSW}, \quad \text{with df} = \left(a - 1,\ \sum_j (n_j - 1)\right).$
Interpretation: A large F means that the differences among group means likely exceed what we’d
expect by random chance.
Then:
$F_A = \frac{MSA}{MSE}, \qquad F_B = \frac{MSB}{MSE}.$
Interpretation: Tests if Factor A levels differ on average, and if Factor B levels differ on average,
controlling for each other’s effect.
3. Chi‐Square Statistic:
$\chi^2 = \sum_{\text{all cells}} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}.$
Interpretation: If χ2 is large, there is evidence that row and column variables are not independent.
1. Combine all observations from both groups, rank them from smallest to largest.
2. Let W = sum of ranks for one group (or the smaller group).
3. Use special tables or software to determine significance.
Interpretation: Tests whether the distribution of one group tends to have larger or smaller values
than the other group, without assuming normality.
$r = \frac{n \sum x_i y_i - (\sum x_i)(\sum y_i)}{\sqrt{\left[n \sum x_i^2 - (\sum x_i)^2\right]\left[n \sum y_i^2 - (\sum y_i)^2\right]}}.$
Denominator: Product of the square roots of each variable’s sum‐of‐squares term, ensuring r
is normalized between –1 and +1.
r: The Pearson correlation.
Interpretation: $r \approx +1$ → strong positive association; $r \approx -1$ → strong negative; $r \approx 0$ → weak linear relationship.
1. Slope b:

$b = \frac{n \sum XY - (\sum X)(\sum Y)}{n \sum X^2 - (\sum X)^2}.$

2. Intercept a:

$a = \frac{\sum Y - b(\sum X)}{n}.$

3. Predicted value $\hat{Y}$ at $X = x$:

$\hat{Y} = a + b\,x.$

$R^2 = \frac{SSR}{SST}, \quad \text{where } SSR = \sum (\hat{Y}_i - \bar{Y})^2, \quad SST = \sum (Y_i - \bar{Y})^2.$

$\hat{Y}_i$: Predicted $Y$ value from the model for the $i$th observation.
Interpretation: R2 is the proportion of variance in Y explained by X . Closer to 1 ⇒ better fit.
$\bar{x} \pm t_{\alpha/2,\,n-1} \times \frac{s}{\sqrt{n}}.$

$t_{\alpha/2,\,n-1}$: Critical t‐value from the t distribution for a chosen confidence level (e.g., 95%).
Interpretation: Gives a range of plausible values for μ. If repeated many times, ~95% of such
intervals would contain the true μ (for 95% confidence).
One‐sample/Paired:
$d = \frac{\bar{x} - \mu_0}{s} \quad \text{or} \quad d = \frac{\bar{d}}{s_d}.$
Two‐sample:
$d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}.$

$\bar{x}_1, \bar{x}_2$: Means of the two groups.
d ≈ 0.2 = small,
d ≈ 0.5 = medium,
d ≈ 0.8 = large.
$\eta^2 = \frac{SSB}{SST}.$
Interpretation: Fraction of total variation explained by the factor (like “proportion of variance
accounted for”).
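Both effect sizes can be sketched in a few lines (the toy data and function names are illustrative):

```python
import math

def cohens_d_two_sample(x1, x2):
    # d = (xbar1 - xbar2) / s_p, using the pooled standard deviation
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    v1 = sum((x - m1) ** 2 for x in x1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in x2) / (n2 - 1)
    sp = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / sp

def eta_squared(groups):
    # eta^2 = SSB / SST: share of total variation explained by the factor
    all_x = [x for g in groups for x in g]
    grand = sum(all_x) / len(all_x)
    ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    sst = sum((x - grand) ** 2 for x in all_x)
    return ssb / sst

print(cohens_d_two_sample([6, 7, 8, 9], [4, 5, 6, 7]))  # ≈ 1.55, "large"
print(eta_squared([[3, 4, 5], [6, 7, 8], [9, 10, 11]])) # 0.9
```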
Concluding Remarks
1. Know Your Assumptions
Normality: z‐tests, t‐tests, standard ANOVA typically assume normal distributions (or at
least approximate).
Equal variances: Some t‐tests (the pooled version) and standard one‐way ANOVA assume
equal variances across groups (though they’re somewhat robust if group sizes are
similar).
Independence: Observations typically must be independent unless you specifically use
paired or repeated‐measures designs.
2. Degrees of Freedom (df)
Always keep track of df for each test (e.g., n − 1, n1 + n2 − 2, or (r − 1)(c − 1)).
They dictate which distribution to use for obtaining p‐values and confidence intervals.
3. Effect Size vs. p‐Value
p‐Values tell you whether there’s statistically significant evidence for an effect.
Effect sizes (Cohen's $d$, $\eta^2$, $R^2$) tell you how large that effect is, which can be more meaningful in practical contexts.
By understanding each term in these formulas—what it represents and why it appears—you can more confidently apply and interpret the correct statistical techniques.