Visualising Categorical Data
Frequency
- The number of times a particular value or category occurs is known as the frequency. This is often used as the basis for displaying and analysing categorical data.
Example
In the following dataset of colours:
Red Red Blue Red
The frequency of each colour is:
Red: 3
Blue: 1
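As a minimal sketch, the frequencies above can be tallied in Python with `collections.Counter` from the standard library. The variable names `colours` and `frequency` are chosen here for illustration only.

```python
from collections import Counter

# The example dataset of colours from the text
colours = ["Red", "Red", "Blue", "Red"]

# Counter tallies how many times each category occurs
frequency = Counter(colours)

for colour, count in frequency.items():
    print(f"{colour}: {count}")
```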
Percentage
- The proportion of the total data points that belong to a particular group, expressed out of 100, is known as the percentage.
- This can be calculated using the formula:
p(\text{group } A) = \frac{\text{data points in group } A}{\text{total number of data points}} \times 100
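The formula can be sketched as a small Python function. The name `percentage` and its parameters are illustrative assumptions, and the guard against an empty dataset is an addition that the formula itself does not mention.

```python
def percentage(group_count, total_count):
    """Percentage of data points in a group: (group / total) * 100."""
    if total_count == 0:
        # Assumption: an empty dataset has no meaningful percentage
        raise ValueError("total_count must be greater than zero")
    return group_count / total_count * 100


# Using the colour dataset (3 Red and 1 Blue out of 4 data points)
red_pct = percentage(3, 4)
blue_pct = percentage(1, 4)
print(red_pct, blue_pct)
```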
Frequency Table
- A frequency table shows the frequency of each category, and the corresponding percentage.
- The last row in a frequency table contains the total values for the corresponding columns. Each entry in the total row is equal to the sum of the above entries.
Example
| Colour | Frequency (number) | Frequency (percentage) |
|---|---|---|
| Red | 3 | 75 |
| Blue | 1 | 25 |
| Total | 4 | 100 |
Note: each value in the total row should always be the sum of the numbers above it. This fact can be used to check that all values in the table are correct.
Note: due to rounding, the sum of the individual percentages may be close to, but not exactly, 100%. This is acceptable, and the actual sum (i.e. not 100) should be used in the total row.
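One possible way to build such a table in Python is sketched below. The function name `frequency_table`, the `decimals` parameter, and the three-category dataset `["A", "B", "C"]` are hypothetical, introduced only to illustrate the total row and the rounding note.

```python
from collections import Counter


def frequency_table(values, decimals=1):
    """Return rows of (category, number, percentage) plus a total row."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = [
        (category, number, round(number / total * 100, decimals))
        for category, number in counts.items()
    ]
    # Each total is the sum of the entries above it, including the
    # percentage column, so rounding error is carried into the total.
    total_row = (
        "Total",
        sum(number for _, number, _ in rows),
        round(sum(pct for _, _, pct in rows), decimals),
    )
    return rows + [total_row]


table = frequency_table(["Red", "Red", "Blue", "Red"])
for label, number, pct in table:
    print(f"{label:<6} {number:>6} {pct:>10}")

# Hypothetical dataset where rounding shows up: three equal categories
thirds = frequency_table(["A", "B", "C"])
print(thirds[-1])
```

With three equal categories each percentage rounds to 33.3, so the total row reports 99.9 rather than 100, as the rounding note describes.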
Bar Chart
- Bar charts for categorical data have the categories listed along the x-axis, and the frequency along the y-axis. A bar is drawn above the name of each category reaching to the corresponding frequency.
Example
Note: to make the bar chart easier to read, you should always leave a gap between the bars of adjacent categories.
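The layout described above can be sketched as a text-only bar chart using nothing beyond the Python standard library. The function `text_bar_chart` and its `bar_width` and `gap` parameters are hypothetical names. The sketch assumes a non-empty set of whole-number frequencies and category names short enough to fit under their bars.

```python
def text_bar_chart(frequencies, bar_width=3, gap=2):
    """Return the lines of a vertical text bar chart.

    Categories run along the x-axis, frequency runs up the y-axis,
    and `gap` blank columns separate adjacent bars.
    """
    tallest = max(frequencies.values())
    cell = bar_width + gap
    lines = []
    # Draw from the highest frequency level down to 1
    for level in range(tallest, 0, -1):
        row = "".join(
            ("#" * bar_width if count >= level else " " * bar_width) + " " * gap
            for count in frequencies.values()
        )
        lines.append(f"{level:>2} | {row}".rstrip())
    # x-axis line, then the category names under their bars
    lines.append("   +" + "-" * (cell * len(frequencies)))
    labels = "".join(f"{name:<{cell}}" for name in frequencies)
    lines.append("     " + labels.rstrip())
    return lines


chart = text_bar_chart({"Red": 3, "Blue": 1})
print("\n".join(chart))
```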
Two-way Frequency Tables
- Two-way frequency tables are used to show relationships between two categorical datasets.
- The categories of one dataset are listed along the top row, and the categories of the other are listed down the first column.
- Each element of the table gives the number of data points belonging to both the category listed above it and the category listed to its left.
- As with a regular frequency table, the sum of each column is shown along the bottom row. In a two-way frequency table, the sum of each row is also shown in the last column, and the overall total (i.e. the total number of data points) is shown in the bottom-right corner.
Example
| | Big | Small | Total |
|---|---|---|---|
| Red | 2 | 1 | 3 |
| Blue | 0 | 1 | 1 |
| Total | 2 | 2 | 4 |
Note: the totals along the bottom row and the totals down the last column should each sum to the value in the bottom-right corner. Use this to check the table is correct.
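A two-way table and its consistency check can be sketched in Python as follows. The individual `(colour, size)` observations are hypothetical, chosen only so that the counts match the example table. The function name `two_way_table` is likewise an assumption.

```python
from collections import Counter

# Hypothetical observations chosen to reproduce the example table
observations = [
    ("Red", "Big"),
    ("Red", "Big"),
    ("Red", "Small"),
    ("Blue", "Small"),
]


def two_way_table(pairs, row_labels, column_labels):
    """Count each (row, column) pair and compute row, column and overall totals."""
    counts = Counter(pairs)
    # Counter returns 0 for combinations that never occur (e.g. Blue and Big)
    table = {r: {c: counts[(r, c)] for c in column_labels} for r in row_labels}
    row_totals = {r: sum(table[r].values()) for r in row_labels}
    column_totals = {c: sum(table[r][c] for r in row_labels) for c in column_labels}
    grand_total = len(pairs)
    # The check from the note: both sets of totals must sum to the overall total.
    # This fails if a pair uses a label missing from row_labels or column_labels.
    assert sum(row_totals.values()) == grand_total
    assert sum(column_totals.values()) == grand_total
    return table, row_totals, column_totals, grand_total


table, row_totals, column_totals, grand_total = two_way_table(
    observations, ["Red", "Blue"], ["Big", "Small"]
)
print(table, row_totals, column_totals, grand_total)
```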
Segmented Bar Chart
- Unlike regular bar charts, a segmented bar chart consists of a single bar, made up of segments representing the frequency of each group.
- Segmented bar charts can show either the numerical frequency or the percentage frequency.
Example
Note: when creating a segmented bar chart, you should always include a legend showing what each segment represents.
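As a rough sketch, the segments of a single bar can be computed by stacking each group's frequency (or percentage) on top of the previous one. The function `segments`, its `as_percentage` flag, and the one-letter `symbols` used for the text legend are all hypothetical.

```python
def segments(frequencies, as_percentage=False):
    """Return (category, start, end) for each segment of a single stacked bar."""
    total = sum(frequencies.values())
    result = []
    start = 0
    for category, count in frequencies.items():
        size = count / total * 100 if as_percentage else count
        # Each segment begins where the previous one ended
        result.append((category, start, start + size))
        start += size
    return result


frequencies = {"Red": 3, "Blue": 1}
by_count = segments(frequencies)
by_percent = segments(frequencies, as_percentage=True)

# A text-only rendering with a legend; the symbols are illustrative
symbols = {"Red": "R", "Blue": "B"}
bar = "".join(symbols[category] * count for category, count in frequencies.items())
legend = ", ".join(f"{symbol} = {category}" for category, symbol in symbols.items())
print(bar)
print("Legend:", legend)
```

The same function covers both forms mentioned above: numerical frequency by default, and percentage frequency when `as_percentage=True`.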