Last modified: September 21, 2024


Introduction to Statistics

Statistics is an empirical science, focusing on data-driven insights for real-world applications. This guide offers a concise exploration of statistical fundamentals, aimed at providing practical knowledge for data analysis and interpretation.

Key Concepts in Statistics

Real-World Importance of Statistics

Applied Statistical Methods

Statistical Tools in Action

Population and Sample

# @ * ! % * # ! @
* ! % # @ ! % @ *
@ # ! % * @ # % #
! % @ * # ! @ * !
% * # @ ! % @ * #

@ !
* %

Illustrative Scenarios

  1. In a poll of 1,200 registered voters, 45% preferred candidate A over candidate B.
     Population: all registered voters in the country.
     Sample: the 1,200 voters polled, with 45% supporting candidate A.

  2. An educational researcher surveyed 100 teachers across 20 schools to study remote learning.
     Population: all teachers involved in remote learning.
     Sample: the group of 100 teachers surveyed from 20 different schools.

  3. Researchers interviewed 250 gym members from a city to estimate how often residents visit gym facilities.
     Population: the total membership of all city gyms.
     Sample: the 250 gym members interviewed for the study.

A representative sample accurately reflects the characteristics of the population, ensuring proportionality in terms of gender, age, or socio-economic status.
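To make the idea of a representative sample concrete, here is a minimal Python sketch. It assumes a made-up population of 10,000 people with a 52/48 gender split (these numbers are illustrative, not from any of the scenarios above) and draws a simple random sample of 500. The helper `proportion` is a hypothetical name introduced for this example.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 people, 52% labelled "F" and 48% labelled "M".
population = ["F"] * 5200 + ["M"] * 4800


def proportion(group, label):
    """Return the share of members in `group` that carry `label`."""
    return sum(1 for member in group if member == label) / len(group)


# Simple random sample without replacement: every member is equally likely
# to be chosen, which tends to preserve the population's proportions.
sample = random.sample(population, k=500)

print(f"Population share of F: {proportion(population, 'F'):.3f}")
print(f"Sample share of F:     {proportion(sample, 'F'):.3f}")
```

The two printed shares should be close but rarely identical. The gap between them is sampling variability, and it generally shrinks as the sample grows.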

Population Distribution (Gender Example)

| F | F | M | M | F | M |

| F | M | F |

Types of Biases

Strategies to Counteract Bias

Variables and Data

Visualization of Data Collection from a Group

Imagine a group of individuals, each with unique attributes to be measured:

O   O   O   O   O
  /|\ /|\ /|\ /|\ /|\
  / \ / \ / \ / \ / \

Each stick figure represents a person, and the data collected could include measurements like weight, height, and gender.

Tabular representation of collected data:

| Name  | Gender | Weight | Height |
|-------|--------|--------|--------|
| Alice | Female | 135    | 5'6"   |
| Bob   | Male   | 180    | 6'0"   |
| Carol | Female | 140    | 5'5"   |
| David | Male   | 175    | 5'11"  |
| Eve   | Female | 150    | 5'7"   |

In this table, the variables being measured are Name (categorical), Gender (categorical), Weight (numerical), and Height (numerical).

Parameter vs. Statistic

Example: Application of Parameters and Statistics

  1. Suppose researchers want to find the average income of all adults in a large city. The population is all adults in the city, and the parameter of interest is the average income.
  2. Since it’s impractical to collect income data from every adult, they take a sample of 500 adults. The average income from this sample is calculated as the statistic.
  3. Using this sample statistic, researchers estimate the population parameter—the average income for all adults in the city.

This process of using a statistic to estimate a parameter is foundational in inferential statistics, allowing researchers to draw conclusions about large populations from manageable samples.
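The income example can be sketched in Python as follows. The population here is simulated, since real income data is not part of this article: the log-normal shape, its settings, and the population size of 100,000 are assumptions chosen only for illustration. The sample size of 500 matches the example.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: simulated incomes for 100,000 adults.
# The log-normal distribution and its settings are illustrative assumptions.
population_incomes = [random.lognormvariate(10.8, 0.6) for _ in range(100_000)]

# Parameter: the true population mean (normally unknown in practice).
parameter = statistics.mean(population_incomes)

# Statistic: the mean of a simple random sample of 500 adults.
sample_incomes = random.sample(population_incomes, k=500)
statistic = statistics.mean(sample_incomes)

print(f"Population mean (parameter): {parameter:,.0f}")
print(f"Sample mean (statistic):     {statistic:,.0f}")
```

In a real study only the second number would be available. The simulation computes both so that the estimate can be compared with the value it is meant to approximate.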

Classification of Variables

Variables are broadly categorized into two types: Numerical and Categorical.

                   All Variables
                   /            \
            Numerical       Categorical
           /        \
   Discrete  Continuous

Numerical Variables

Categorical Variables

Data Table Example with Variable Types

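| Name  | Age | Height (inches) | Income ($) | Education Level | Marital Status |
|-------|-----|-----------------|------------|-----------------|----------------|
| Alice | 28  | 64              | 50000      | High School     | Married        |
| Bob   | 35  | 70              | 75000      | Bachelor's      | Single         |
| Carol | 42  | 62              | 60000      | Master's        | Married        |
| David | 31  | 68              | 80000      | Ph.D.           | Single         |
| Eve   | 26  | 66              | 45000      | Associate's     | Married        |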
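One practical reason to classify variables is that the type determines which summary makes sense: numerical variables can be averaged, while categorical variables are usually counted. The sketch below encodes the table above and applies that rule. The `variable_types` mapping is one reading of the table, and `summarize` is a hypothetical helper written for this example. Name is left out because it only identifies each person.

```python
from collections import Counter
from statistics import mean

# The five rows from the table above.
rows = [
    {"Name": "Alice", "Age": 28, "Height (inches)": 64, "Income ($)": 50000,
     "Education Level": "High School", "Marital Status": "Married"},
    {"Name": "Bob", "Age": 35, "Height (inches)": 70, "Income ($)": 75000,
     "Education Level": "Bachelor's", "Marital Status": "Single"},
    {"Name": "Carol", "Age": 42, "Height (inches)": 62, "Income ($)": 60000,
     "Education Level": "Master's", "Marital Status": "Married"},
    {"Name": "David", "Age": 31, "Height (inches)": 68, "Income ($)": 80000,
     "Education Level": "Ph.D.", "Marital Status": "Single"},
    {"Name": "Eve", "Age": 26, "Height (inches)": 66, "Income ($)": 45000,
     "Education Level": "Associate's", "Marital Status": "Married"},
]

# Assumed classification of each column (an interpretation of the table).
variable_types = {
    "Age": "numerical",
    "Height (inches)": "numerical",
    "Income ($)": "numerical",
    "Education Level": "categorical",
    "Marital Status": "categorical",
}


def summarize(rows, variable_types):
    """Average numerical variables and count categories for categorical ones."""
    summary = {}
    for variable, kind in variable_types.items():
        values = [row[variable] for row in rows]
        if kind == "numerical":
            summary[variable] = mean(values)
        else:
            summary[variable] = dict(Counter(values))
    return summary


summary = summarize(rows, variable_types)
for variable, result in summary.items():
    print(f"{variable}: {result}")
```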

Explanation of Variables in the Table:

Explanatory and Response Variables

Explanatory Variable (Independent Variable):

Response Variable (Dependent Variable):

Practical Illustration:

Observational Studies and Experiments

The table below compares observational studies and experiments:

| Aspect | Observational Studies | Experiments |
|--------|-----------------------|-------------|
| Purpose | Observe and collect data on naturally occurring events without intervention. | Investigate cause-and-effect relationships by actively manipulating variables. |
| Control | Limited control over variables; focus on observing existing conditions. | High level of control, including manipulation of independent variables and control groups. |
| Causation | Can identify associations or correlations, but cannot establish causation. | Can establish causation by manipulating variables and observing effects. |
| Examples | Cross-sectional studies, cohort studies, case-control studies, surveys. | Clinical trials, laboratory experiments, field experiments. |
| Ethics | Generally less intrusive, often not requiring consent for public or existing data. | Requires informed consent, with strict ethical considerations for human or animal subjects. |
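The "Control" row mentions control groups. One common way to form them in an experiment is random assignment, which the sketch below illustrates. The article does not prescribe a specific procedure, so treat this as an assumption: `assign_groups` is a hypothetical helper and the participant labels are invented for the example.

```python
import random


def assign_groups(participants, seed=None):
    """Randomly split participants into a treatment group and a control group."""
    rng = random.Random(seed)      # local generator, so global state is untouched
    shuffled = list(participants)  # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return {"treatment": shuffled[:midpoint], "control": shuffled[midpoint:]}


# Hypothetical participant labels.
participants = [f"participant_{i}" for i in range(1, 21)]
groups = assign_groups(participants, seed=1)

print("Treatment:", groups["treatment"])
print("Control:  ", groups["control"])
```

Because chance alone decides who receives the treatment, the two groups tend to be similar in other respects, which is what lets an experiment attribute differences in the response to the manipulated variable.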
