Mean
The mean is one of the most common measures of central tendency. For a collection of numbers, it represents the center point of the collection. It is also called the average or the expected value and is denoted by “μ”.
Calculation of Mean
Let's say the data points in a collection are given as [x₁, x₂, x₃, …, xₙ]
where:
xᵢ : represents the data point at the i-th position
n : represents the total number of data points
μ = (x₁ + x₂ + x₃ + … + xₙ)/n = Σxᵢ/n
For example:
In a class, a group of 5 students scored 50, 80, 92, 98 and 33. The mean of their scores is
μ = (50 + 80 + 92 + 98 + 33)/5 = 353/5 = 70.6
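The same calculation can be sketched in Python. This is a minimal illustration, not a prescribed method: the variable names are made up for the example, and the standard-library function `statistics.mean` is used only as a cross-check on the manual formula.

```python
from statistics import mean

# Scores of the 5 students from the example above
scores = [50, 80, 92, 98, 33]

# Mean by the formula: sum of all data points divided by their count
manual_mean = sum(scores) / len(scores)

# Cross-check with the standard library
library_mean = mean(scores)

print(f"manual:  {manual_mean:.1f}")   # manual:  70.6
print(f"library: {library_mean:.1f}")  # library: 70.6
```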
Why use the mean
- The mean indicates the region where most values in the distribution fall, and is referred to as the central location of the distribution
- We can also think of it as the middle point among the observations
Why not to use the mean
- The mean is very sensitive to outliers in the dataset and fails to measure the central tendency correctly when outliers are present
- It also fails if the data is skewed in one direction
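The sensitivity to outliers can be seen with a small sketch. The scores below are invented purely for illustration and are not taken from any dataset in this article. A single very low score is enough to pull the mean below every one of the typical scores.

```python
# Hypothetical scores, chosen only to illustrate the effect of one outlier
typical_scores = [70, 72, 75, 78, 80]
scores_with_outlier = typical_scores + [5]  # one unusually low score

mean_typical = sum(typical_scores) / len(typical_scores)
mean_with_outlier = sum(scores_with_outlier) / len(scores_with_outlier)

print(f"without outlier: {mean_typical:.2f}")       # without outlier: 75.00
print(f"with outlier:    {mean_with_outlier:.2f}")  # with outlier:    63.33

# The new mean is lower than every typical score, so it no longer
# describes where most of the values lie.
print(mean_with_outlier < min(typical_scores))      # True
```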
Let's take an example:
In the following graph
x-axis : represents the marks scored by students
y-axis : represents the number of students who scored those marks
In the above example, the average of the given data is 77.41, which does not reflect the central tendency because the outliers pull the mean away from the central point.
As the distribution becomes more skewed, the mean is drawn further away from the center.
The mean is a reliable indicator of central tendency only when the distribution is roughly symmetric.
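A rough sketch of this drift, using made-up numbers: the three lists below are hypothetical and differ only in their largest value, and `middle_value` is a helper defined here for the illustration (it picks the middle element of the sorted list). As the tail is stretched, the gap between the mean and the middle value grows.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

def middle_value(values):
    """Middle element of the sorted values (lists here have odd length)."""
    ordered = sorted(values)
    return ordered[len(ordered) // 2]

# Hypothetical data: same first four values, increasingly long right tail
datasets = {
    "symmetric": [60, 70, 80, 90, 100],
    "mild skew": [60, 70, 80, 90, 200],
    "strong skew": [60, 70, 80, 90, 400],
}

# Gap between the mean and the middle value for each dataset
gaps = {name: mean(data) - middle_value(data) for name, data in datasets.items()}

for name, gap in gaps.items():
    print(f"{name}: mean={mean(datasets[name]):.1f}, gap={gap:.1f}")
# symmetric: mean=80.0, gap=0.0
# mild skew: mean=100.0, gap=20.0
# strong skew: mean=140.0, gap=60.0
```

Only in the symmetric list does the mean coincide with the middle value.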