Statistics for Data Science -1
                                    Continuous Random Variables
                                              Usha Mohan
                                    Indian Institute of Technology Madras
Learning objectives
          1. Define a continuous random variable.
          2. Probability density function and examples.
          3. Cumulative distribution function, graphs, and examples.
          4. Expectation and variance of continuous random variables.
       Probability density function, graph, and examples
          Probability density function
Discrete and Continuous random variables
       Definition
       A random variable that can take on at most a countable number of
       possible values is said to be a discrete random variable.
       Definition
       When the outcomes of a random event are numerical but cannot be
       counted and are infinitely divisible, the random variable is said to
       be a continuous random variable.
Discrete and continuous random variable
          • A discrete random variable is one that has possible values that
            are discrete points along the real number line.
          • A continuous random variable is one that has possible values
            that form an interval along the real number line. In other
            words, a continuous random variable can assume any value
            over an interval or intervals.
Probability density function (pdf)
          • Every continuous random variable X has a curve associated
            with it.
          • The probability distribution curve of a continuous random
            variable is also called its probability density function. It is
            denoted by f(x).
Area under a pdf
          • Consider any two points a and b, where a is less than b.
          • The probability that X assumes a value that lies between a
            and b is equal to the area under the curve between a and b.
            That is,
                \[ P(X \in [a, b]) = P(a \le X \le b) = \int_a^b f(x)\,dx \]
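The area interpretation can be checked numerically. The sketch below approximates the integral with the trapezoidal rule; the density f(x) = e^{-x} for x ≥ 0 and the helper name `area_under_pdf` are assumptions chosen only for illustration, not part of the lecture.

```python
import math


def area_under_pdf(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    if a > b:
        raise ValueError("expected a <= b")
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


def example_pdf(x):
    """Hypothetical density used for illustration: e^{-x} for x >= 0."""
    return math.exp(-x) if x >= 0 else 0.0


# P(1 <= X <= 2) as the area under the curve between 1 and 2.
p = area_under_pdf(example_pdf, 1.0, 2.0)
# Closed-form value of the same integral, used as a sanity check.
exact = math.exp(-1.0) - math.exp(-2.0)
print(f"numerical: {p:.6f}, closed form: {exact:.6f}")
```

The two printed values should agree to several decimal places, which is all the area interpretation claims.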
Properties of pdf
          1. The area under the probability distribution curve of a
             continuous random variable between any two points is
             between 0 and 1.
          2. Total area under the probability distribution curve of a
             continuous random variable is always 1.
          • The area under the graph of the probability density function
            between points a and b is the same regardless of whether the
            endpoints a and b are themselves included:
                                         P(a ≤ X ≤ b) = P(a < X < b)
          • The probability density curve of a random variable X is a
            curve that never goes below the x-axis, that is, f(x) ≥ 0 for all x.
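The properties of a pdf can be verified on a concrete curve. The following is a minimal sketch that assumes a hypothetical triangular density on [0, 2] (not taken from the lecture) and uses a midpoint Riemann sum for the areas.

```python
def tri_pdf(x):
    """Hypothetical triangular density on [0, 2], peaking at x = 1."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0


def integrate(f, a, b, steps=20_000):
    """Midpoint Riemann sum of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h


# The curve never goes below the x-axis (checked on a grid over [-1, 3]).
grid = [-1.0 + 4.0 * i / 1000 for i in range(1001)]
never_negative = all(tri_pdf(x) >= 0.0 for x in grid)

# Total area under the curve is 1.
total_area = integrate(tri_pdf, 0.0, 2.0)

# The area between any two points lies between 0 and 1.
partial_area = integrate(tri_pdf, 0.5, 1.5)

# A single point has zero width, so it contributes zero area. This is
# why including or excluding the endpoints does not change the probability.
point_area = integrate(tri_pdf, 1.0, 1.0)

print(never_negative, round(total_area, 6), round(partial_area, 6), point_area)
```

Any other non-negative function with total area 1 could be substituted for `tri_pdf`; only the integration bounds would need to change.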
Example
       The figure below is a probability density function for the random
       variable that represents the time (in minutes) it takes a repairer to
       service a television. The numbers in the regions represent the areas
       of those regions.
       [Figure: probability density function of the repair time, with the area of each region marked]
       What is the probability that the repairer takes
          1. Less than 20 minutes? 0.29
          2. Less than 40 minutes? 0.56
          3. More than 50 minutes? 0.33
          4. Between 40 and 70 minutes to complete a repair? 0.27
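The four answers are tied together by additivity of area and by the total area being 1. The sketch below starts only from the four stated answers and derives the areas of the intervals between 20, 40, 50, and 70 minutes. These derived values are inferred by arithmetic, not read from the figure, and the regions actually drawn in the figure may be divided differently.

```python
# Answers stated in the example (time in minutes).
p_less_20 = 0.29
p_less_40 = 0.56
p_more_50 = 0.33
p_40_to_70 = 0.27

# Areas of adjacent intervals add up, and the total area is 1.
p_20_to_40 = p_less_40 - p_less_20        # P(20 <= X <= 40)
p_40_to_50 = 1.0 - p_less_40 - p_more_50  # P(40 <= X <= 50)
p_50_to_70 = p_40_to_70 - p_40_to_50      # P(50 <= X <= 70)
p_more_70 = p_more_50 - p_50_to_70        # P(X > 70)

pieces = [p_less_20, p_20_to_40, p_40_to_50, p_50_to_70, p_more_70]
for label, value in zip(
    ["< 20", "20 to 40", "40 to 50", "50 to 70", "> 70"], pieces
):
    print(f"{label:>9}: {value:.2f}")
print(f"    total: {sum(pieces):.2f}")
```

The pieces summing to 1 is a quick consistency check on the stated answers.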
Cumulative distribution function
       For a continuous random variable X,
           \[ F(a) = P(X \le a) = \int_{-\infty}^{a} f(x)\,dx \]
       Since the probability that a continuous random variable X assumes
       a single value is always zero, we have
           \[ P(X < a) = P(X \le a) = \int_{-\infty}^{a} f(x)\,dx \]
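A cdf can be built from a pdf by accumulating area from the left. The sketch below assumes a hypothetical density f(x) = 2x on [0, 1] (not from the lecture), for which the integral gives F(a) = a² on that interval; the function names are illustrative only.

```python
def pdf(x):
    """Hypothetical density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def cdf(a, lower=0.0, upper=1.0, steps=10_000):
    """F(a) = integral of the pdf from -infinity up to a.

    The density is zero outside [lower, upper], so the integral only
    needs to run from `lower` to min(a, upper). Midpoint Riemann sum.
    """
    if a <= lower:
        return 0.0
    b = min(a, upper)
    h = (b - lower) / steps
    return sum(pdf(lower + (i + 0.5) * h) for i in range(steps)) * h


points = [-1.0, 0.0, 0.25, 0.5, 1.0, 3.0]
values = [cdf(a) for a in points]
for a, F in zip(points, values):
    print(f"F({a}) = {F:.4f}")

# Interval probabilities follow from the cdf: P(a <= X <= b) = F(b) - F(a).
p_quarter_to_half = cdf(0.5) - cdf(0.25)
print(f"P(0.25 <= X <= 0.5) = {p_quarter_to_half:.4f}")
```

The printed values never decrease as a grows, start at 0, and level off at 1, which is the typical shape of a cdf graph.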
Expectation and Variance
          • Expected value: \( E(X) = \int_{-\infty}^{\infty} x\, f(x)\,dx \).
          • Variance: \( \mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - E(X))^2 f(x)\,dx \).
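Both integrals can be approximated numerically. This is a minimal sketch assuming the hypothetical density f(x) = 2x on [0, 1], for which the integrals evaluate to E(X) = 2/3 and Var(X) = 1/18; the helper `integrate` is illustrative, not a library function.

```python
def integrate(g, a, b, steps=10_000):
    """Midpoint Riemann sum of g over [a, b]."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h


def pdf(x):
    """Hypothetical density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


# The density is zero outside [0, 1], so the integrals over the whole
# real line reduce to integrals over this interval.
LOW, HIGH = 0.0, 1.0

# E(X) = integral of x f(x) dx
mean = integrate(lambda x: x * pdf(x), LOW, HIGH)

# Var(X) = integral of (x - E(X))^2 f(x) dx
variance = integrate(lambda x: (x - mean) ** 2 * pdf(x), LOW, HIGH)

print(f"E(X)   ~= {mean:.6f}")
print(f"Var(X) ~= {variance:.6f}")
```

The mean is computed first because the variance integrand depends on it.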
Section summary
          • Probability density function and its properties.
          • Cdf, expectation, and variance of continuous random variables.
Introduction
          • A random variable is said to be uniformly distributed over the
            interval [0, 1] if its probability density function is given by
                \[ f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases} \]
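For this density the area between two points inside [0, 1] is simply the length of the interval. The sketch below encodes the pdf, the cdf obtained by integrating it (F(a) = a on [0, 1]), and a simulation check. The function names, the seed, and the sample size are illustrative assumptions.

```python
import random


def uniform_pdf(x):
    """Density of the uniform distribution over [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def uniform_cdf(a):
    """F(a) = P(X <= a): 0 below 0, a on [0, 1], and 1 above 1."""
    return min(max(a, 0.0), 1.0)


def prob_between(a, b):
    """P(a <= X <= b) = F(b) - F(a), for a <= b."""
    return uniform_cdf(b) - uniform_cdf(a)


# Exact probability from the cdf.
p_exact = prob_between(0.2, 0.5)

# Simulation check: the fraction of draws landing in [0.2, 0.5].
rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
hits = sum(1 for _ in range(n) if 0.2 <= rng.random() <= 0.5)
p_simulated = hits / n

print(f"exact: {p_exact:.3f}, simulated: {p_simulated:.3f}")
```

The simulated fraction should land close to the exact value, with the gap shrinking as the number of draws grows.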