PnS
Final Syllabus:
Chapter 8
Geometric Distribution
Multinomial Distribution
Chapter 9
Chapter 10
Simple Linear Regression Model
Correlation
Book Part 2
Simple Regression and Correlation
What is a regression relation?
The relation between the expected value of the dependent variable and the
independent variable is called a regression relation.
Differentiate between simple regression and multiple regression.
The dependence of a variable on a single independent variable is called simple
or two-variable regression.
The dependence of a variable on two or more independent variables is called
multiple regression.
When is the regression said to be linear?
The regression is said to be linear if the dependence is represented by a
straight-line equation. Otherwise, it is said to be curvilinear.
What are the different names for independent and dependent variables?
Dependent Variable: Regressand, Predictand, Response, or Explained variable.
Independent Variable: Regressor, Predictor, Regression variable, or Explanatory
variable.
How many types of relations/models are there?
There are two types of relations/models:
Deterministic: A relation in which you can substitute a value of X in the
equation and completely determine a unique value of Y. For example, F = 32
+ 9/5 C (Celsius to Fahrenheit). Such exact relations cannot be studied by
regression.
Non-Deterministic / Probabilistic: Some relationships are not exact. For
example, you cannot precisely determine a person's weight from his height.
The model has to include a random error term to account for the variation in
Y among observations with the same value of X. These kinds of relations are
called non-deterministic or probabilistic.
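The contrast between the two types can be sketched in Python. The conversion function is exact. The height-weight line, its coefficients and the size of the random error are made-up numbers used only for illustration, not estimates from real data.

```python
import random

def celsius_to_fahrenheit(c):
    # Deterministic: each value of C gives exactly one value of F.
    return 9 * c / 5 + 32

def simulated_weight(height_cm, rng):
    # Probabilistic: a straight-line part plus a random error term.
    # The intercept -100, slope 1.0 and error SD of 5 kg are hypothetical.
    return -100 + 1.0 * height_cm + rng.gauss(0, 5)

rng = random.Random(0)  # fixed seed so the run is repeatable

print(celsius_to_fahrenheit(100))   # always 212.0
print(simulated_weight(170, rng))   # somewhere around 70
print(simulated_weight(170, rng))   # same height, a different weight
```

Calling the first function twice with the same input always gives the same answer. Calling the second twice with the same height does not, which is why such a relation is studied through regression.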
What is a scatter diagram or scatter plot?
If you plot each pair of independent-dependent observations as a point on graph
paper, using the X-axis for the independent variable and the Y-axis for the
dependent variable, such a diagram is called a scatter diagram or a scatter plot.
How to determine if a relation between two variables exists?
If a relation between the two variables exists, the points on a scatter diagram
will show a tendency to cluster around a straight line or some curve, known as
the regression line or regression curve.
What is the principle of least squares?
The principle of least squares consists of determining the values of unknown
parameters (the slope/regression coefficient and Y-intercept) that will minimize the
sum of squares of errors (or residuals), where errors are defined as the differences
between observed values and the corresponding values predicted or estimated by
the fitted model equation.
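The principle can be sketched for the straight line y = a + b x. The code below uses the usual closed-form solution, b = Sxy / Sxx and a = y_bar - b * x_bar. The five data points and the function names are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the fitted line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sum of squares about the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope (regression coefficient)
    a = y_bar - b * x_bar    # Y-intercept
    return a, b

def sum_squared_errors(xs, ys, a, b):
    # Errors are observed Y minus the Y predicted by the line.
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
best = sum_squared_errors(xs, ys, a, b)
print(a, b, best)   # roughly a = 2.2, b = 0.6, sum of squares = 2.4

# Moving either parameter away from the fitted value increases the sum.
print(sum_squared_errors(xs, ys, a + 0.1, b) > best)
print(sum_squared_errors(xs, ys, a, b - 0.1) > best)
```

The last two lines show what "minimize" means here: any other choice of intercept or slope gives a larger sum of squared errors for the same data.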
What is Least Squares?
What is the difference between population data and sample data?
Population data is the complete set of measurements for every individual in a
group, while sample data is a subset of that data.
What are the properties of Least-Squares Regression Line?
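Three standard properties of a least-squares line can be checked numerically: the line passes through the point of means (x_bar, y_bar), the residuals sum to zero, and the residuals multiplied by X also sum to zero. The sketch below uses made-up data and is a check on one example, not a proof.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept for y = a + b*x.
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Property 1: the fitted value at x_bar equals y_bar.
passes_through_means = abs((a + b * x_bar) - y_bar) < 1e-9

# Property 2: the residuals add up to zero (up to rounding).
residual_sum = sum(residuals)

# Property 3: the residuals weighted by X also add up to zero.
cross_sum = sum(x * e for x, e in zip(xs, residuals))

print(passes_through_means, residual_sum, cross_sum)
```

The two sums print as values very close to zero rather than exactly zero because of floating-point rounding.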
The observed values of (X, Y) do not all fall on the regression line but they scatter away
from it. The degree of scatter (or dispersion) of the observed values about the
regression line is measured by what is called the standard deviation of regression or
the standard error of estimate of Y on X.
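A minimal sketch of this measure follows. It assumes the common textbook form with n - 2 in the denominator, s = sqrt(sum of squared residuals / (n - 2)); some texts divide by n instead. The data and the function name are made up for illustration.

```python
import math

def standard_error_of_estimate(xs, ys):
    """Scatter of observed Y about the fitted line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Assumption: divisor n - 2, because two parameters were estimated.
    return math.sqrt(sse / (n - 2))

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

s_yx = standard_error_of_estimate(xs, ys)
print(s_yx)   # about 0.894 for this data
```

A smaller value means the observed points lie closer to the regression line.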
What is the coefficient of determination in linear regression?
The coefficient of determination measures the proportion of variability (changes) of
the values of the dependent variable (Y) explained by its linear relation with the
independent variable (X), and is defined by the ratio of explained variation to the
total variation. Simply put, it tells us what portion of the changes in Y can be
explained by changes in X.
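The ratio can be computed directly from its definition. In the sketch below the explained variation is the sum of squares of the fitted values about the mean of Y, and the total variation is the sum of squares of the observed values about the same mean. The data and the function name are made up for illustration.

```python
def coefficient_of_determination(xs, ys):
    """Explained variation divided by total variation for y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    explained = sum(((a + b * x) - y_bar) ** 2 for x in xs)
    total = sum((y - y_bar) ** 2 for y in ys)
    return explained / total

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_squared = coefficient_of_determination(xs, ys)
print(r_squared)   # about 0.6
```

Here roughly 60% of the variation in Y is explained by its linear relation with X, and the remaining 40% is left in the residuals.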
Define correlation.
Correlation is a measure of the degree to which any two variables vary together.
If both variables tend to increase or decrease together, the correlation is said
to be direct or positive.
If one variable tends to increase while the other variable decreases, the
correlation is said to be negative or inverse.
What is the Pearson Product Moment Correlation Coefficient?
It is the numerical measure of the strength of the linear relationship between
any two variables, also known as the coefficient of simple correlation or total
correlation.
It measures only linear correlation. For example, if the observed values are
spread evenly around a circle, there is a perfect non-linear relationship
between the variables, but the PPMCC will be 0.
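The coefficient can be sketched as r = Sxy / sqrt(Sxx * Syy). The example below compares points on a straight line with eight points spaced evenly around a unit circle. The data sets and the function name are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson product moment correlation coefficient."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Points exactly on a rising line, and on a falling line.
r_up = pearson_r([1, 2, 3, 4], [3, 5, 7, 9])
r_down = pearson_r([1, 2, 3, 4], [9, 7, 5, 3])

# Eight points evenly spaced around a unit circle.
angles = [2 * math.pi * k / 8 for k in range(8)]
circle_x = [math.cos(t) for t in angles]
circle_y = [math.sin(t) for t in angles]
r_circle = pearson_r(circle_x, circle_y)

print(r_up, r_down)   # positive (direct) and negative (inverse) correlation
print(r_circle)       # very close to 0 despite the exact circular relation
```

The circle result is not exactly 0 in floating-point arithmetic, but it is negligibly small, which illustrates that r says nothing about non-linear relationships.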