## Cumulative frequency screencast

The YouTube video above explains the idea of cumulative frequency. It is the first fully scripted screencast, and I think it sounds like it is being read out! I think I’ll go back to the cue card idea and extemporise around the slides.

This screencast is aimed at GCSE Maths students working at intermediate level on the data handling module of a modular GCSE. I have left out questions like “what percentage of the seedlings were greater than 50 mm in height?” and I have used intervals that make finding the upper class boundary easy. Both of these points will need to be explored in a handout and in revision sessions.

### The script

This screencast assumes that you know what a grouped frequency distribution is, and that you are familiar with terms like range and median.

Slide 1: I will show how to find the cumulative frequencies from a frequency table and plot these against the upper class boundary for each interval in the table. Then I will use the cumulative frequency curve to find the median and the upper and lower quartiles. Finally, I will calculate the interquartile range (IQR) and explain what the IQR tells you about the distribution.

Slide 2: The cumulative frequency is a running total of the frequency. Let’s look at some example data…

Slide 3: This frequency table shows the results from some cress seedlings grown on soil. To find the cumulative frequencies just…

Slide 4: ...add a row for the cumulative frequencies…

Slide 5: The first cumulative frequency is seven…

Slide 6: The next cumulative frequency is seven plus twelve, giving nineteen.

Slide 7: and then the next cumulative frequency is seven plus twelve plus fifteen, giving thirty-four. You can add the fifteen in the third column to the cumulative frequency in the previous column, nineteen, to get the same result…

Slide 8: and finally just add three to the thirty-four to get thirty-seven. So now you have all the cumulative frequencies. As a check, the last cumulative frequency should equal the total of the frequencies.
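
If you want to check the running total away from the slides, here is a minimal Python sketch. The frequencies 7, 12, 15 and 3 are the ones read out in the script; the variable names are only illustrative.

```python
from itertools import accumulate

# Frequencies for the four height intervals, as read out in the script
frequencies = [7, 12, 15, 3]

# Each cumulative frequency is the previous running total plus the next frequency
cumulative = list(accumulate(frequencies))

print(cumulative)  # [7, 19, 34, 37]

# Check: the last cumulative frequency should equal the total of the frequencies
assert cumulative[-1] == sum(frequencies)
```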

Slide 9: Next we need to identify the upper class boundaries for the cumulative frequency diagram. These are what we plot the cumulative frequencies against.

Slide 10: The upper class boundary of the first interval is the largest value that can be allocated to the first interval.

Slide 11: Because of the way we have written the intervals, forty-five is the largest value in the first interval.

Slide 12: fifty is the upper class boundary for the second interval…

Slide 13: ... and fifty-five for the third…

Slide 14: ... and sixty for the last interval. Other ways of writing the intervals can make it harder to find the upper class boundaries – there are some examples in the handout.

Slide 15: ...Now we need to plot the cumulative frequency against the upper class boundaries.

Slide 16: The data we are plotting looks like this (you don’t need to draw a new table when you are doing exam questions). The upper class boundary is plotted on the horizontal axis and the cumulative frequency on the vertical axis.

Slide 17: A suitable graph scale might look like this. You plot each point just as you would on a scatter diagram.

Slide 18: To plot the first point, go along to 45mm on the horizontal axis and up to 7 on the vertical axis. I’m using huge blobs for clarity; you would be using small neat crosses!

Slide 19: along to 50mm and up to 19…

Slide 20: along to 55mm and up to 34…

Slide 21: along to 60mm and up to 37..

Slide 22: and then we add a point at 40mm and zero cumulative frequency. I chose 40mm as this is 45 – 5, one interval width of 5mm ‘before’ the first upper class boundary. This extra point gives us a starting value for the curve.
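
The full list of points to plot, including the extra starting point, can be built as in the sketch below. It assumes the upper class boundaries and cumulative frequencies from the slides above; the names `upper_boundaries`, `interval_width` and `points` are illustrative only.

```python
# Upper class boundaries (mm) and cumulative frequencies from the slides
upper_boundaries = [45, 50, 55, 60]
cumulative = [7, 19, 34, 37]

# The starting point sits one interval width before the first upper class boundary
interval_width = 5
start_point = (upper_boundaries[0] - interval_width, 0)

# Each point is (upper class boundary, cumulative frequency)
points = [start_point] + list(zip(upper_boundaries, cumulative))

print(points)  # [(40, 0), (45, 7), (50, 19), (55, 34), (60, 37)]
```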

Slide 23: Finally you draw a smooth curve through the points – don’t join the points dot-to-dot.

Slide 24: I have removed the huge blobs I used to represent the points to show the curve more clearly in what follows. You can see the smooth S-shaped curve clearly – most data will give you a curve like this.

Slide 25: Once you have your cumulative frequency curve, you can use it to read off values including the median.

Slide 26: The median is the value of the ‘middle’ data item. 37 divided by 2 is 18.5….

Slide 27: So we find 18.5 on the cumulative frequency axis..

Slide 28: and then draw a line across to the curve

Slide 29: Draw a line down and read off the height value from the horizontal axis

Slide 30: This value is the median. I estimate the median to be 49mm

Slide 31: Remember, it is the value on the horizontal axis corresponding to half the total frequency that gives the median. Don’t just write down half the frequency as the median!
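
As a rough cross-check on reading from the graph, the sketch below joins the plotted points with straight lines and reads off the height at a cumulative frequency of 18.5. This is straight-line interpolation, not the smooth hand-drawn curve the screencast uses, so it gives about 49.8mm where the curve was read as 49mm. The helper `read_off` is a hypothetical name introduced for this example.

```python
def read_off(target, points):
    """Estimate the height at a cumulative frequency using straight lines between points."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1:
            # Move along the segment in proportion to how far target lies between y0 and y1
            return x0 + (target - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("target is outside the plotted range")

# (upper class boundary in mm, cumulative frequency), including the (40, 0) start point
points = [(40, 0), (45, 7), (50, 19), (55, 34), (60, 37)]

median_position = 37 / 2  # half the total frequency
median_estimate = read_off(median_position, points)

print(round(median_estimate, 1))  # 49.8
```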

Slide 32: The quartiles can give you some information about how spread out your data is, and tell you something about the middle 50% of your data items.

Slide 33: The lower quartile is the value that the lowest 25% of your data reach up to.

Slide 34: The upper quartile is the value that the top 25% of your data reach down to.

Slide 35: For example, using our cress data…

Slide 36: A cumulative frequency of 9.25 is what you use to find the lower quartile value and a cumulative frequency value of 27.75 is used to find the value of the upper quartile
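
The two positions on the cumulative frequency axis come from a quarter and three quarters of the total frequency, 37. A small sketch of the arithmetic, with illustrative variable names:

```python
total = 37  # total frequency, the last cumulative frequency in the table

lower_quartile_position = total / 4      # one quarter of the way up the axis
median_position = total / 2              # half way up
upper_quartile_position = 3 * total / 4  # three quarters of the way up

print(lower_quartile_position, median_position, upper_quartile_position)  # 9.25 18.5 27.75
```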

Slide 37: Next, look at the blank cumulative frequency curve…

Slide 38: To find the lower quartile just find 9.25 on the vertical cumulative frequency axis

Slide 39: ... and draw along to the curve…

Slide 40: Draw a line down to the horizontal axis and read off the value of the lower quartile

Slide 41: I make it to be 45mm

Slide 42: In the same way, to find the upper quartile…

Slide 43: find 27.75 on the vertical cumulative frequency axis

Slide 44: and read off the corresponding upper quartile value…

Slide 45: I make this to be 52mm

Slide 46: The interquartile range tells you how spread out your data is – how ‘wide’ your frequency distribution would be if you plotted a histogram or frequency bar chart.

Slide 47: The interquartile range is just the difference between the quartiles

Slide 48: Recapping our values so far, you can see that the interquartile range is 52 – 45 = 7mm
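
The same subtraction as a short sketch, using the quartile values read from the curve in the slides above (45mm and 52mm). These are graph estimates, so your own readings may differ by a millimetre or so.

```python
# Quartile values in mm, as read off the cumulative frequency curve in the script
lower_quartile = 45
upper_quartile = 52

# The interquartile range is the difference between the quartiles
interquartile_range = upper_quartile - lower_quartile

print(interquartile_range)  # 7
```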

Slide 49: The interquartile range can tell you how the middle 50% of your distribution are spread out. We know that the heights of the middle 50% of the cress seedlings fall within a range of 7mm, from 45mm to 52mm. You can also use the interquartile range to compare distributions for spread. There will be more about comparing distributions in a later screencast.

Slide 50: If you are asked to draw a cumulative frequency chart in the exam,

Slide 51: ...you calculate a running total of the frequencies, plot these against the upper class boundaries, and draw a smooth curve through the points. Then you can read off the median, lower quartile and upper quartile and, finally, find the interquartile range for the data. The interquartile range tells you how spread out the data is.