How to understand the Normal distribution?
The normal distribution was developed by Abraham de Moivre in 1733 as a limiting form of the binomial distribution. It was rediscovered by the German mathematician Gauss in 1809 and again by the French mathematician Laplace in 1812. The normal distribution is also known as the Gaussian distribution or the normal curve. It is a perfectly symmetrical, bell-shaped distribution in which the mean, median, and mode are equal.
The normal distribution is a continuous probability distribution and one of the most important and widely used theoretical distributions in statistical work. It is characterized by a symmetric, bell-shaped probability density function (PDF) and is mainly used to model continuous random variables in the natural and social sciences, such as the height, weight, and intelligence scores of a group of students.
Here are the key characteristics and properties of the normal distribution:
Symmetry: The normal distribution is symmetric around its mean (average), which is located at the peak of the bell curve. This means that the probability of observing values below the mean is equal to the probability of observing values above the mean.
Bell-Shaped Curve: The PDF of the normal distribution forms a smooth, bell-shaped curve. This curve reaches its peak at the mean and gradually tapers off as values move away from the mean in both directions.
Mean, Median, and Mode: In a normal distribution, the mean, median, and mode (the most frequent value) are all equal and located at the center of the distribution.
Parameterization: The normal distribution is fully described by two parameters: the mean (μ) and the standard deviation (σ). The mean determines the center of the distribution, while the standard deviation controls the spread or dispersion of the data points around the mean. Larger standard deviations result in wider, flatter curves.
Empirical Rule (68-95-99.7 Rule): A fundamental property of the normal distribution is that approximately 68% of data falls within one standard deviation of the mean (μ ± σ), approximately 95% falls within two standard deviations (μ ± 2σ), and approximately 99.7% falls within three standard deviations (μ ± 3σ). This rule is useful for interpreting data within the context of a normal distribution.
Probability Density Function: The probability density function of the normal distribution is given by the formula:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Where:
μ (mu) is the mean.
σ (sigma) is the standard deviation.
π (pi) is approximately 3.14159.
e is the base of the natural logarithm, approximately 2.71828.
x is the value of the random variable.
Standardization: To work with the normal distribution, data is often standardized by subtracting the mean and dividing by the standard deviation. This transformation results in a standard normal distribution with a mean of 0 and a standard deviation of 1.
Applications: The normal distribution is used in various fields, including statistics, natural sciences, social sciences, economics, and engineering. It is employed to model and analyze a wide range of phenomena, such as heights and weights of populations, test scores, measurement errors, and financial returns.
Central Limit Theorem: One of the most important concepts related to the normal distribution is the Central Limit Theorem. It states that the distribution of the sample mean of a sufficiently large number of independent, identically distributed random variables approaches a normal distribution, regardless of the original distribution of the variables. This theorem is foundational in statistical inference and hypothesis testing.
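The Central Limit Theorem can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the exponential source distribution, the sample sizes, the seed, and the helper names `sample_means` and `skewness` are choices made for this example, not part of the original text. It draws many samples from a strongly skewed distribution and shows that the skewness of the sample means shrinks toward 0 (the value for a normal distribution) as the sample size grows.

```python
import numpy as np

def sample_means(n_per_sample, n_samples=20_000, seed=0):
    """Draw n_samples samples of size n_per_sample from an exponential
    distribution (mean 1, variance 1) and return the mean of each sample."""
    rng = np.random.default_rng(seed)
    draws = rng.exponential(scale=1.0, size=(n_samples, n_per_sample))
    return draws.mean(axis=1)

def skewness(values):
    """Moment coefficient of skewness, mu3 / mu2**1.5 (0 for a symmetric distribution)."""
    centered = values - values.mean()
    return np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5

# The exponential distribution itself is heavily right-skewed (n = 1);
# the distribution of the sample mean becomes more symmetric as n grows.
for n in (1, 5, 30, 200):
    means = sample_means(n)
    print(f"n={n:>3}  mean={means.mean():.3f}  std={means.std():.3f}  "
          f"skewness={skewness(means):.3f}")
```

The standard deviation of the sample means should also shrink roughly like 1/sqrt(n), which is the standard error that appears in the confidence-limit discussion.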
In summary, the normal distribution is a fundamental probability distribution characterized by its bell-shaped curve, symmetry, and parameterization by mean and standard deviation. It is a versatile and widely used tool for modeling and analyzing continuous data and plays a crucial role in various statistical and scientific applications.
Properties of Normal Distribution
The normal distribution has the following main properties:
a) Normal distribution is a continuous probability distribution.
b) Normal distribution is perfectly symmetrical and bell-shaped.
c) Normal distribution has only one mode, so it is a unimodal distribution.
d) In a normal distribution, mean, median, and mode are equal (mean X = M = Z).
e) The total area of the normal distribution is considered as 1. The ordinate at the mean of the distribution divides the total area of the curve into two equal parts. So the area of the left-hand and right-hand sides of the curve from the mean ordinate is 0.5 each.
f) The normal curve is asymptotic to the baseline: it comes very close to the baseline on both sides but never touches it.
g) The normal distribution has two parameters, namely the mean and the standard deviation. The entire distribution is determined by these two parameters.
h) The height (ordinate) of the normal curve is maximum at the mean.
i) In a normal distribution the upper and lower quartiles (Q3 and Q1) are equidistant from the median (Q3 - Median = Median - Q1).
j) In a normal distribution the quartile deviation is approximately 2/3 of the standard deviation (Q.D ≈ 0.6745 × S.D ≈ 2 × S.D / 3).
k) The mean deviation is approximately 4/5 of the standard deviation (M.D ≈ 0.7979 × S.D ≈ 4 × S.D / 5).
l) One of the most important properties of a normal curve is the area relationship property. With the help of the mean and standard deviation, we can measure the area under the normal curve. The main relationships are:
1) mean X ± 1 S.D covers 68.27% of the area. It means the area under the normal curve between mean X - 1 S.D and mean X + 1 S.D is 0.6827.
2) mean X ± 2 S.D covers 95.45% of the area. It means the area under the normal curve between mean X - 2 S.D and mean X + 2 S.D is 0.9545.
3) mean X ± 3 S.D covers 99.73% of the area. It means the area under the normal curve between mean X - 3 S.D and mean X + 3 S.D is 0.9973. This property is used to determine the confidence limits on the basis of standard error for a sample.
4) mean X ± 1.96 S.D covers 95% of the area. This property is used for testing a null hypothesis (H0).
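The area relationships listed above can be checked numerically. The following is a minimal sketch that uses the cumulative distribution function of the standard normal distribution from SciPy; the helper name `area_within` is introduced here only for illustration.

```python
from scipy.stats import norm

def area_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return norm.cdf(k) - norm.cdf(-k)

# Multiples of the standard deviation taken from the list above.
for k in (1, 2, 3, 1.96):
    print(f"mean X ± {k} S.D covers {area_within(k):.4f} of the area")
```

Because the result depends only on the number of standard deviations from the mean, the same areas hold for any normal distribution, whatever its mean and standard deviation.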
The normal distribution may be used as a limiting form of the binomial distribution under the following conditions:
a) The number of trials (n) is very large (n tends to infinity).
b) Neither p nor q is very small.
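A rough way to see both conditions at work is to compare binomial probabilities with the normal density that has mean np and standard deviation square root(npq). The sketch below is one possible illustration; the helper name `max_pmf_gap` and the particular values of n and p are assumptions chosen for the example.

```python
import numpy as np
from scipy.stats import binom, norm

def max_pmf_gap(n, p):
    """Largest absolute gap between the Binomial(n, p) probabilities and the
    normal density with mean n*p and standard deviation sqrt(n*p*q)."""
    q = 1 - p
    k = np.arange(n + 1)                       # all possible numbers of successes
    exact = binom.pmf(k, n, p)                 # exact binomial probabilities
    approx = norm.pdf(k, loc=n * p, scale=np.sqrt(n * p * q))
    return np.max(np.abs(exact - approx))

# Condition a): the gap shrinks as n grows.
# Condition b): for the same n, a very small p gives a worse fit than p = 0.5.
for n in (10, 100, 1000):
    print(f"n={n:>4}  gap(p=0.5)={max_pmf_gap(n, 0.5):.6f}  "
          f"gap(p=0.02)={max_pmf_gap(n, 0.02):.6f}")
```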
Constants of Normal Distribution
The normal distribution has the following main constants (here n, p, and q = 1 - p are the parameters of the binomial distribution that the normal curve approximates):
a) The mean of the normal distribution is mean X (mean X = np)
b) The standard deviation of the normal distribution is S.D = square root(npq)
c) Moments about the mean (mu):
mu1 = 0
mu2 = variance = (S.D)^2
mu3 = 0
mu4 = 3(S.D)^4
d) Moment Coefficient of Skewness (B1)
B1 = (mu3)^2/(mu2)^3 = 0 (No skewness, it is always symmetrical)
e) Moment Coefficient of Kurtosis (B2)
B2 = (mu4)/(mu2)^2 = 3 (It is always mesokurtic).
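The moment values above (mu1 = 0, mu2 = (S.D)^2, mu3 = 0, mu4 = 3(S.D)^4) can be verified by numerical integration. This is a sketch under stated assumptions: the mean of 10 and S.D of 2 are arbitrary example values, the helper name `central_moment` is hypothetical, and the integral is truncated at 12 standard deviations on each side, beyond which the density is negligible.

```python
from scipy.integrate import quad
from scipy.stats import norm

def central_moment(r, mean, sd):
    """r-th moment about the mean: integral of (x - mean)^r * f(x) dx,
    evaluated numerically over mean ± 12 standard deviations."""
    integrand = lambda x: (x - mean) ** r * norm.pdf(x, mean, sd)
    value, _abs_error = quad(integrand, mean - 12 * sd, mean + 12 * sd)
    return value

mean, sd = 10.0, 2.0                       # arbitrary example parameters
mu2 = central_moment(2, mean, sd)          # expected (S.D)^2
mu3 = central_moment(3, mean, sd)          # expected 0
mu4 = central_moment(4, mean, sd)          # expected 3 * (S.D)^4

print("B1 =", mu3 ** 2 / mu2 ** 3)         # moment coefficient of skewness
print("B2 =", mu4 / mu2 ** 2)              # moment coefficient of kurtosis
```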
Probability under Normal Curve
The normal distribution is defined and given by the following probability density function:
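$$P(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Here: x = Value of continuous random variable
μ (mu) = Mean of Normal random variable
σ (S.D) = Standard deviation of Normal random variable
e = 2.7183
π (pi) = 3.1416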
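Written out in full, the density is 1 / (S.D × square root(2π)) multiplied by e raised to the power -(x - mu)^2 / (2 × (S.D)^2). The sketch below translates that expression directly into Python and compares it with SciPy's built-in `norm.pdf`; the function name `normal_pdf` and the example inputs are assumptions made for illustration.

```python
import math
from scipy.stats import norm

def normal_pdf(x, mu, sd):
    """Normal density at x for mean mu and standard deviation sd."""
    coefficient = 1.0 / (sd * math.sqrt(2.0 * math.pi))   # 1 / (S.D * sqrt(2*pi))
    exponent = -((x - mu) ** 2) / (2.0 * sd ** 2)         # -(x - mu)^2 / (2 * S.D^2)
    return coefficient * math.exp(exponent)

# Compare the hand-written formula with the library implementation.
for x, mu, sd in [(0.0, 0.0, 1.0), (1.5, 1.0, 2.0), (-3.0, 0.0, 1.0)]:
    print(x, mu, sd, normal_pdf(x, mu, sd), norm.pdf(x, mu, sd))
```

Note that the returned value is a density (the height of the curve), not a probability; probabilities come from areas under the curve.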
In practice, we are more interested in areas under the normal curve, because the area between two values gives the probability that the variable falls between those values. This probability can be read from a table of areas under the standard normal curve.
For this purpose, we have to convert the given information on the X-scale into Z-scale information by the formula:
Z = (X-mean X)/S.D
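In code, the area table can be replaced by the cumulative distribution function of the standard normal distribution. The following is a minimal sketch of that workflow; the helper names `z_score` and `probability_between`, and the example of heights with mean 170 and S.D 10, are hypothetical and chosen only to illustrate the conversion.

```python
from scipy.stats import norm

def z_score(x, mean, sd):
    """Convert an X-scale value to the Z-scale: Z = (X - mean X) / S.D."""
    return (x - mean) / sd

def probability_between(a, b, mean, sd):
    """Probability that a normal variable lies between a and b,
    computed as the area under the standard normal curve between the Z values."""
    return norm.cdf(z_score(b, mean, sd)) - norm.cdf(z_score(a, mean, sd))

# Hypothetical example: heights with mean 170 and S.D 10.
print("Z for X = 185:", z_score(185, 170, 10))
print("P(160 < X < 180):", probability_between(160, 180, 170, 10))
```

Here 160 and 180 lie one standard deviation on either side of the mean, so the result should agree with the 68.27% area relationship.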
The importance and uses of the normal distribution:
a) Basis of sampling theory and parametric tests.
b) Useful to set up the range or confidence limits.
c) Importance in statistical quality control.
d) Study of natural phenomena.
e) Useful in a situation where the size of a sample is very large.
f) Approximation to the binomial and Poisson distributions.
Applying the Normal Distribution in Python
# Importing libraries
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats
# Generate data for the plot
data = np.linspace(-3, 3, num = 1000)
# Define the mean and standard deviation of the normal distribution
mean = 0
std = 1
# Generate the function of the normal distribution
pdf = stats.norm.pdf(data, mean, std)
# Plot the normal distribution plot
plt.plot(data, pdf, '-', color = 'black', lw = 2)
plt.axvline(mean, color = 'black', linestyle = '--')
plt.grid()
plt.show()