Summarizing Data

Measuring Spread: the Range
When a dataset is large, it is useful to find a small set of values that describes its overall behavior.[br][br]The first indicator is the [i][b]range[/b][/i] of the dataset, obtained by subtracting its lowest value from its highest value. The range describes how spread out the data are.[br][br][i]Example[/i]: The range of the set [math]\left\{2,3,23,44,56,122\right\}[/math] is [math]122-2=120[/math].[br][br]
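The range calculation can be sketched in Python as follows. This is a minimal illustration, not part of the original activity; the function name data_range is an assumption chosen for this example.

```python
def data_range(values):
    """Return the range: highest value minus lowest value."""
    if not values:
        raise ValueError("the range of an empty dataset is not defined")
    return max(values) - min(values)


# Dataset from the example above
print(data_range([2, 3, 23, 44, 56, 122]))  # 120
```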
Measuring the Central Tendency: the Averages
The [i][b]measures of central tendency[/b][/i] (averages) summarize a dataset with a single calculated value that conveys information about the typical values of the data and their distribution.[br][br]The most common [i][b]measures of central tendency[/b][/i] of a dataset are the ([i]arithmetic[/i]) [i][b]mean[/b][/i], the [i][b]median[/b][/i] and the [i][b]mode[/b][/i].[br]
Mean
The (arithmetic) [i][b]mean[/b][/i] is the sum of all values in the dataset, divided by the number of values.[br]If the dataset contains [i]qualitative data[/i], the mean cannot be calculated.[br][br][i]Example 1[/i]: The mean of the set [math]\left\{2,3,23,44,56,122\right\}[/math] is [math]m=\frac{2+3+23+44+56+122}{6}=\frac{250}{6}\approx41.7[/math].[br][br][i]Example 2[/i]: The mean of the set [math]\left\{dog,cat,cat,cat,dog,monkey\right\}[/math] is not defined.[br][br]
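As a minimal sketch, the mean of Example 1 can be computed in Python like this. The function name mean is an assumption for illustration; Python's standard library also provides statistics.mean for the same purpose.

```python
def mean(values):
    """Return the arithmetic mean: sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty dataset is not defined")
    return sum(values) / len(values)


# Dataset from Example 1: 250 / 6
m = mean([2, 3, 23, 44, 56, 122])
print(round(m, 1))  # 41.7
```

Calling the function on a qualitative dataset such as ["dog", "cat"] raises a TypeError, which mirrors Example 2: the mean is not defined for such data.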
Median
The [i][b]median[/b][/i] is the middle value of a dataset whose values have been arranged in increasing order.[br][br]- If the ordered dataset contains an [b][i]odd[/i][/b] number of values, the [i][b]median[/b][/i] is the [i][b]middle value[/b][/i] in the list.[br]- If the ordered dataset contains an [i][b]even[/b][/i] number of values, the [i][b]median[/b][/i] is the (arithmetic) [i][b]mean[/b][/i] of the two middle values.[br][br]If the dataset contains [i][b]qualitative data[/b][/i], the median cannot be calculated.[br][br][i]Example 1[/i]: Calculate the median of the set [math]\left\{1,3,4,7,5,4,6,7,8,9,11\right\}[/math].[br]- The set contains 11 values (odd).[br]- Sort the values in increasing order: [math]\left\{1,3,4,4,5,6,7,7,8,9,11\right\}[/math][br]- The median is [math]6[/math], which is the middle (6th) element of the list above.[br][br][i]Example 2[/i]: Calculate the median of the set [math]\left\{2,3,23,44,56,122\right\}[/math].[br]- The set is already ordered and contains 6 values (even).[br]- The two values in the middle of the list are [math]23[/math] and [math]44[/math].[br]- The median of the given set is the mean of these values: [math]med=\frac{23+44}{2}=\frac{67}{2}=33.5[/math][br][br][i]Example 3[/i]: The median of the set [math]\left\{dog,cat,cat,cat,dog,monkey\right\}[/math] is not defined.[br]
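The odd and even cases described above can be sketched in Python as follows. The function name median is an assumption for this illustration; the standard library offers statistics.median with the same behavior on numeric data.

```python
def median(values):
    """Return the median of a numeric dataset."""
    ordered = sorted(values)  # arrange the data in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty dataset is not defined")
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: take the middle value
        return ordered[mid]
    # Even number of values: take the mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 3, 4, 7, 5, 4, 6, 7, 8, 9, 11]))  # 6
print(median([2, 3, 23, 44, 56, 122]))             # 33.5
```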
Mode
The [i][b]mode[/b][/i] is the value with the highest frequency in a dataset (i.e. the value that occurs most often).[br]It is not necessarily unique, and it is also defined for qualitative datasets.[br][br][i]Example 1[/i]: The mode of the set [math]\left\{2,3,4,4,4,4,4,5,6,7,7,8,9,9,321,321\right\}[/math] is [math]4[/math].[br][br][i]Example 2[/i]: The set [math]\left\{1,1,1,2,2,3,3,3,4,5\right\}[/math] has two modes: [math]1[/math] and [math]3[/math].[br][br][i]Example 3[/i]: The mode of the set [math]\left\{dog,cat,cat,cat,dog,monkey\right\}[/math] is [math]cat[/math].[br][br]
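Because the mode may not be unique, a sketch in Python can return a list of all values that share the highest frequency. The function name modes is an assumption for this illustration; it uses collections.Counter from the standard library to count occurrences.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)  # maps each value to how often it occurs
    if not counts:
        return []
    highest = max(counts.values())
    # Keep all values tied for the highest frequency, in first-seen order
    return [value for value, count in counts.items() if count == highest]


print(modes([2, 3, 4, 4, 4, 4, 4, 5, 6, 7, 7, 8, 9, 9, 321, 321]))  # [4]
print(modes([1, 1, 1, 2, 2, 3, 3, 3, 4, 5]))                        # [1, 3]
print(modes(["dog", "cat", "cat", "cat", "dog", "monkey"]))         # ['cat']
```

Unlike the mean and the median, this works unchanged on qualitative data, since counting occurrences does not require the values to be numbers.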
