1 Introduction

The document provides an overview of statistics, emphasizing its importance in data analysis, decision-making, and its role in artificial intelligence. It distinguishes between descriptive and inferential statistics, detailing methods for data summarization, estimation, hypothesis testing, and types of data. Additionally, it covers scales of measurement, types of data, and the differences between population and sample in statistical studies.

Uploaded by

osama7abx

Introduction

Statistics is the science of gathering data and making sense of it.

Reasons to study Statistics

To be an informed “information consumer”.
To support decision making, including personal decisions.

Statistics is the branch of mathematics that involves collecting, analyzing, presenting, and organizing data.
It provides methodologies for making sense of numerical data and is crucial in various fields for decision-making
and predictions.

Importance of Statistics in AI
Statistics plays a vital role in AI for several reasons:

1. Data Analysis and Interpretation: AI models require vast amounts of data for training. Statistics helps in
analyzing this data to understand patterns, trends, and relationships.

2. Probability Theory: Many AI algorithms, such as Bayesian networks and Markov models, are grounded in
probability theory, which is a key component of statistics.

3. Model Evaluation: Statistical methods are used to evaluate the performance of AI models. Techniques such as
hypothesis testing, confidence intervals, and p-values help determine the significance of results.

4. Data Preprocessing: Statistics aids in data preprocessing tasks like handling missing values, outlier detection,
and normalization, which are crucial for building robust AI models.

5. Feature Selection: Statistical techniques help identify the most relevant features in a dataset, which
improves model performance and reduces complexity.

6. Predictive Modeling: Many AI applications involve predictive modeling, where statistical methods are used to
create models that can make accurate predictions based on historical data.

7. Uncertainty Quantification: In AI, it is often necessary to quantify the uncertainty in predictions.
Statistical methods provide tools to measure and manage this uncertainty.

Types of Statistics: Descriptive and Inferential

Descriptive Statistics

Descriptive statistics involve methods for summarizing, analyzing, and organizing data so that it can be
easily understood.

Purpose :

Descriptive statistics provide simple summaries about the sample and the measures, making large amounts of
data understandable using specialized methods.

These methods include:

Measures of Central Tendency :

Mean: The average of a dataset.

Median: The middle value when the data is ordered.
Mode: The most frequently occurring value in the dataset.

Measures of Dispersion :

Range: The difference between the highest and lowest values.

Variance: The average of the squared differences from the mean.
Standard Deviation: The square root of the variance, indicating the spread of the data around the mean.
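The measures of central tendency and dispersion listed above can be computed with Python's standard `statistics` module. The following is a minimal sketch; the `scores` list is made-up illustration data, not taken from the notes. It uses `pvariance` and `pstdev`, which divide by the number of values and so match the "average of the squared differences" definition given above.

```python
import statistics

# Hypothetical data: exam scores for nine students.
scores = [70, 85, 85, 90, 60, 75, 85, 95, 75]

# Measures of central tendency
mean = statistics.mean(scores)            # arithmetic average
median = statistics.median(scores)        # middle value of the sorted data
mode = statistics.mode(scores)            # most frequently occurring value

# Measures of dispersion
data_range = max(scores) - min(scores)    # highest value minus lowest value
variance = statistics.pvariance(scores)   # mean of squared deviations from the mean
std_dev = statistics.pstdev(scores)       # square root of the variance

print(mean, median, mode, data_range, round(variance, 2), round(std_dev, 2))
```

When the data is a sample rather than a whole population, `statistics.variance` and `statistics.stdev` (which divide by n - 1) are the usual choice.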

Measures of Position :

Percentiles: Values below which a certain percentage of the data falls.

Quartiles: Values that divide the data into four equal parts.
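As a small sketch of the measures of position, the standard library's `statistics.quantiles` returns the cut points that split the data into equal parts. The `values` list is invented for illustration, and the `"inclusive"` interpolation method is only one of several common conventions, so other tools may report slightly different quartiles for the same data.

```python
import statistics

# Hypothetical data set of eleven observations.
values = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 18]

# n=4 returns the three cut points Q1, Q2 (the median) and Q3.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# n=100 returns 99 cut points; index 89 holds the 90th percentile.
p90 = statistics.quantiles(values, n=100, method="inclusive")[89]

# Five-number summary, the quantities a box plot displays.
five_number_summary = (min(values), q1, q2, q3, max(values))

print(five_number_summary, p90)
```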

Visualization Tools :

Histograms: Graphs showing the frequency distribution of a dataset.

Box Plots: Visual representations of the minimum, first quartile, median, third quartile, and maximum of a
dataset.
Scatter Plots: Graphs showing the relationship between two variables.

Inferential Statistics

Inferential statistics involve methods for making predictions or inferences about a population based
on a sample of data drawn from that population.

Purpose :
Inferential statistics allow us to make predictions, decisions, or inferences about a population based on
sample data, helping in generalizing findings and understanding relationships between variables
using specialized methods.

These methods include:

Estimation :

Point Estimation: Providing a single value as an estimate of an unknown population parameter (e.g.,
sample mean as an estimate of population mean).
Interval Estimation: Providing a range of values within which the parameter is expected to lie (e.g., confidence
intervals).
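Point and interval estimation can be illustrated with a short sketch. The `sample` values are made up, and the interval below assumes a normal approximation (critical value about 1.96 for 95% confidence); for a sample this small a t-based critical value would usually be preferred, so treat the result as illustrative only.

```python
import math
import statistics

# Hypothetical sample of ten measurements.
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.0, 12.1]

n = len(sample)

# Point estimate: the sample mean estimates the population mean.
point_estimate = statistics.mean(sample)

# Standard error of the mean, using the sample standard deviation (n - 1).
std_error = statistics.stdev(sample) / math.sqrt(n)

# Interval estimate: approximate 95% confidence interval (normal approximation).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = point_estimate - z * std_error
ci_high = point_estimate + z * std_error

print(f"point estimate = {point_estimate:.3f}")
print(f"95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```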

Hypothesis Testing :

Null Hypothesis (H0): A statement that there is no effect or no difference, which is tested for possible rejection.
Alternative Hypothesis (H1): A statement that indicates the presence of an effect or difference.
p-value: The probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true.
Significance Level (α): The threshold below which the p-value leads to rejecting the null hypothesis, typically set at 0.05 (5%).
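To make the terms above concrete, here is a minimal sketch of an exact two-sided binomial test for a coin. The scenario (16 heads in 20 flips) and the helper name `binomial_two_sided_p` are invented for illustration. H0 is "the coin is fair (p = 0.5)" and H1 is "the coin is not fair".

```python
from math import comb

def binomial_two_sided_p(successes: int, trials: int) -> float:
    """Two-sided p-value under H0: p = 0.5.

    Binomial(trials, 0.5) is symmetric, so the two-sided p-value is
    twice the probability of the more extreme tail (capped at 1).
    """
    k = max(successes, trials - successes)
    tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    return min(1.0, 2 * tail)

alpha = 0.05                              # significance level
p_value = binomial_two_sided_p(16, 20)    # hypothetical: 16 heads in 20 flips
reject_null = p_value < alpha

print(f"p-value = {p_value:.4f}, reject H0: {reject_null}")
```

Here the p-value is about 0.012, which is below α = 0.05, so under these assumptions the null hypothesis of a fair coin would be rejected.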

Regression Analysis :

Simple Linear Regression: Analyzing the relationship between two variables by fitting a linear equation.
Multiple Regression: Analyzing the relationship between one dependent variable and multiple independent
variables.
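Simple linear regression can be sketched with the ordinary least-squares formulas: slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). The function name `fit_line` and the study-hours data are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied (independent) vs. exam score (dependent).
hours = [1, 2, 3, 4, 5]
exam_scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, exam_scores)
predicted_for_6_hours = intercept + slope * 6

print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

Multiple regression follows the same idea with several independent variables, but it is normally fitted with a linear-algebra library rather than by hand.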

Correlation :

Pearson Correlation Coefficient: Measures the linear relationship between two continuous variables.
Spearman Rank Correlation: Measures the strength and direction of the monotonic association between
two ranked variables.
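The difference between the two coefficients shows up on data that is monotonic but not linear. The sketch below implements both by hand; the helper names are hypothetical, and the `ranks` helper assumes there are no tied values (ties would need average ranks).

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical data: y = x squared, which is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

r_pearson = pearson(x, y)     # close to, but below, 1
r_spearman = spearman(x, y)   # exactly 1 because the ranks agree perfectly

print(round(r_pearson, 4), r_spearman)
```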

Data
Definition of Data
Data refers to the raw facts, figures, and information collected for reference, analysis, and processing. It can be
in various forms such as numbers, text, images, and audio.

Types of Data: Qualitative and Quantitative

1. Qualitative (categorical) Data

Qualitative data, also known as categorical data, describes characteristics that cannot be measured
numerically. It is typically non-numeric data, for example text data.

Types of Qualitative Data :

Nominal Data :

Definition: Data that can be categorized but not ranked or ordered.

Examples: Gender (male, female), eye color (blue, green, brown), types of cuisine (Italian, Chinese, Mexican).

Ordinal Data :

Definition: Data that can be categorized and ranked, but the intervals between ranks are not uniform.
Examples: Education levels (high school, bachelor's, master's, doctorate), customer satisfaction ratings
(satisfied, neutral, dissatisfied).

2. Quantitative (numerical) Data

Quantitative data, also known as numerical data, represents quantities and can be measured
and expressed numerically. It allows for mathematical operations and statistical analysis.

Types of Quantitative Data :

Discrete Data :

Definition: Data that can take on only specific, distinct, indivisible values, typically counts
expressed as whole numbers (1, 2, 3, ...).
Examples: Number of students in a class, number of cars in a parking lot, number of books on a shelf.

Continuous Data :

Definition: Data that can take on any value within a range and can, in principle, be measured to
arbitrary precision.
Examples: Distance, temperature, weight, time, velocity.

Continuous vs. Discrete Data

1. Discrete Data

Discrete data consists of unique, separate values.

These values are countable and can take only specific values, typically whole numbers such as {1, 2, 3, 4, ...}.
Discrete data often represent items that can be counted, and there are no intermediate values
between the numbers in the dataset: values such as {3.7, 5.5, 1.2, ...} do not occur.

Examples :

Number of students in a class: You can have 20 or 21 students, but not 20.5 students.
Number of cars in a parking lot: The count is a whole number (5, 10, 15).

Discrete Data Graph :

![[discrete_graph_example.jpg]]

2. Continuous Data

Continuous data consists of data that can take any value within a given range.
Unlike discrete data, continuous data represents measured quantities and can take an infinite number
of values within a given interval.
Continuous data can be divided into finer and finer levels, so it has, in principle, an infinite number of
possible values; in practice, precision is limited by the measuring instrument.

Examples :

Temperature: The temperature can vary continuously, such as 23.4°C, 23.45°C, etc.
Time: Time can be measured with great precision, such as 12.3 seconds, 12.35 seconds, and so on.

Continuous Data Graph

![[continuous_graph.jpg]]

Statistics vs. Parameters and Population vs. Sample

Population vs. Sample

Population :

Definition: A population is the entire set of individuals, items, or things about which data is collected in a
particular study.
It includes every member of the defined group that we are studying or collecting information on.
Examples: All students in a university, all cats in a city, all products sold by a company.

Sample :
Definition: A sample is a subset of the population selected for measurement, observation, or questioning to
provide statistical information about the population.
Examples: 200 students selected from a university, 500 cats selected from a city.

Most common sampling methods :

Random
Stratified
Systematic
Cluster
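The four methods listed above (random, stratified, systematic, cluster) can be sketched with the standard `random` module. Everything about the data here is an assumption for illustration: a population of 200 hypothetical students, a `year` field used as the stratum, and blocks of 10 consecutive ids standing in for classrooms (clusters).

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 200 students, each in study year 1 to 4.
population = [{"id": i, "year": 1 + i % 4} for i in range(200)]

# Simple random sampling: every student has the same chance of selection.
simple_sample = rng.sample(population, 20)

# Systematic sampling: pick every k-th student after a random start.
k = len(population) // 20
start = rng.randrange(k)
systematic_sample = population[start::k]

# Stratified sampling: split into strata (years), then sample within each.
strata = {}
for student in population:
    strata.setdefault(student["year"], []).append(student)
stratified_sample = [s for group in strata.values() for s in rng.sample(group, 5)]

# Cluster sampling: pick whole groups at random and keep all their members.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
cluster_sample = [s for cluster in rng.sample(clusters, 2) for s in cluster]

print(len(simple_sample), len(systematic_sample),
      len(stratified_sample), len(cluster_sample))
```

Each approach yields 20 students here, but stratified sampling guarantees equal representation of every year, while cluster sampling trades some representativeness for convenience.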

Statistics vs. Parameters

Parameter :

Definition: A parameter is a numerical value that describes a characteristic of a population.

Examples: Population mean (μ), population standard deviation (σ), population proportion (P).

Statistic :

Definition: A statistic is a numerical value that describes a characteristic of a sample.

Examples: Sample mean (x̄), sample standard deviation (s), sample proportion (p̂).

| Measure            | Sample Statistic | Population Parameter |
|--------------------|------------------|----------------------|
| Mean               | x̄               | μ                    |
| Standard Deviation | s                | σ                    |
| Variance           | s²               | σ²                   |
| Proportion         | p̂               | P                    |
| Size               | n                | N                    |
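As a rough illustration of the notation above, the sketch below builds a simulated population, computes its parameters (μ, σ, N), then draws a sample and computes the matching statistics (x̄, s, n). The population of heights is synthetic and purely an assumption for the example.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed for reproducibility

# Hypothetical population: 10,000 simulated heights in cm.
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Parameters describe the whole population.
N = len(population)
mu = statistics.mean(population)
sigma = statistics.pstdev(population)   # divides by N

# Statistics describe a sample drawn from that population.
sample = rng.sample(population, 100)
n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)            # divides by n - 1

print(f"mu = {mu:.2f}, x_bar = {x_bar:.2f}")
print(f"sigma = {sigma:.2f}, s = {s:.2f}")
```

The statistics should land close to the parameters without matching them exactly, which is the gap that inferential methods try to quantify.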

Scales of Measurement - Nominal, Ordinal, Interval, & Ratio

Scale Data
1. Nominal Scale

Definition :

The nominal scale is the most basic level of measurement.

It categorizes data without any order or quantitative value.

Characteristics :
Categories Only: Data is grouped into distinct categories or groups.
No Order: Categories do not have a meaningful order or ranking.
Qualitative Data: It represents qualitative attributes.

Examples :

Gender (Male, Female)

Colors (Red, Blue, Green)
Types of animals (Dog, Cat, Bird)

Mathematical Operations :

Mode (most frequent category) is meaningful.

No meaningful arithmetic operations (e.g., addition or subtraction).

2. Ordinal Scale

Definition :

The ordinal scale categorizes data into ordered categories, but the differences between the categories
cannot be measured.

Characteristics :

Order Matters: Data is ranked or ordered, but the distance between ranks is not defined.
Qualitative Data: Represents qualitative attributes with a meaningful order.

Examples :

Satisfaction Ratings (Very Unsatisfied, Unsatisfied, Neutral, Satisfied, Very Satisfied)

Education Level (High School, Bachelor’s, Master’s, PhD)

Mathematical Operations :

Median and mode are meaningful.

No meaningful arithmetic operations (e.g., addition or subtraction) due to non-uniform intervals.

3. Interval Scale

Definition :
The interval scale measures data with both order and exact differences between values, but it does not have
a true zero point (zero is an arbitrary reference point and does not mean the attribute is absent).

Characteristics :

Ordered Data: Data has a meaningful order.

Values Differences: The differences between values are consistent and meaningful.
No True Zero: The zero point does not represent the absence of the attribute.

Examples :

Temperature in Celsius or Fahrenheit (e.g., 10°C, 20°C, 30°C)

IQ Scores

Mathematical Operations :

Mean, median, and mode are meaningful.

Addition and subtraction are meaningful, but multiplication and division are not (e.g., 20°C is not "twice as hot" as
10°C).
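The "twice as hot" point can be checked numerically. In the sketch below, the Celsius values are converted to Kelvin, a temperature scale that does have a true zero; the helper name `celsius_to_kelvin` is just a label for this example.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert Celsius to Kelvin (Kelvin has a true zero)."""
    return celsius + 273.15

# The ratio of the raw Celsius numbers suggests "twice as hot".
ratio_celsius = 20 / 10

# On a scale with a true zero, the ratio is only about 1.035.
ratio_kelvin = celsius_to_kelvin(20) / celsius_to_kelvin(10)

# Differences, by contrast, are the same on both scales.
diff_celsius = 20 - 10
diff_kelvin = celsius_to_kelvin(20) - celsius_to_kelvin(10)

print(ratio_celsius, round(ratio_kelvin, 3), diff_celsius, round(diff_kelvin, 2))
```

The ratio changes when the zero point moves, while the difference does not, which is why differences are meaningful on an interval scale and ratios are not.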

4. Ratio Scale

Definition :

The ratio scale is the highest level of measurement.

It has all the properties of the interval scale, with an additional true zero point (starts from zero) that allows for
the representation of the absence of the attribute.

Characteristics :

Order & Equal Intervals: Data has a meaningful order with consistent intervals.
True Zero: Zero represents the absence of the attribute, making comparisons of absolute magnitudes possible.

Examples :

Height (e.g., 160 cm, 180 cm)

Weight (e.g., 50 kg, 70 kg)

Mathematical Operations :

Mean, median, and mode are meaningful.

Addition, subtraction, multiplication, and division are all meaningful.
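The "Mathematical Operations" notes for the four scales can be collected into a small lookup table. This is only one way to encode the summary; the names `SCALE_OPERATIONS` and `is_meaningful` are hypothetical, and "difference" and "ratio" stand for addition/subtraction and multiplication/division respectively.

```python
# Operations described as meaningful for each scale of measurement.
SCALE_OPERATIONS = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean", "difference"},
    "ratio": {"mode", "median", "mean", "difference", "ratio"},
}

def is_meaningful(scale: str, operation: str) -> bool:
    """Return True if the operation is meaningful on the given scale."""
    return operation in SCALE_OPERATIONS[scale]

# Each scale supports everything the previous one does, plus more.
print(is_meaningful("ordinal", "median"))    # True
print(is_meaningful("interval", "ratio"))    # False
print(is_meaningful("ratio", "ratio"))       # True
```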
