Unit: Data Handling & Analysis
Chapter: Grouping Data
Reference: What is Grouping Data, When to Use Grouped Data, Raw vs Grouped Data, Class Intervals, Lower- and Upper-Class Limits, Class Size (Width), Frequency Distribution Table for Grouped Data, Tally Marks for Grouped Data, Choosing Class Intervals, Solved Examples, Odd-One-Out Problems, Common Mistakes
After studying this chapter, you should be able to understand:
- What is Grouping Data and When to Use It
- How to Create Class Intervals
- How to Make a Grouped Frequency Distribution Table
- How to Choose Appropriate Class Intervals
Introduction to Grouping Data
Definition
Grouping data means organizing raw data into class intervals (ranges of values) rather than listing each distinct value separately. This is useful when the data has many different values, making an ungrouped frequency table too long and hard to read.
When we group data, we essentially ask:
"How can we summarize this large set of numbers into meaningful groups that still tell us about the data?"
Once grouped, we can see patterns and calculate statistics more easily.
Importance of Grouping Data
- Makes large data sets easier to understand at a glance
- Reveals patterns that might be hidden in raw data
- Essential for creating histograms and frequency polygons
- Helps identify the most common range of values
- Foundation for advanced statistical analysis
Example
Raw test scores of 50 students range from 52 to 98. Listing each score individually would make a long table. Grouping into intervals like 50-59, 60-69, 70-79, 80-89, 90-99 creates a five-row table in which a pattern, for example most students scoring in the 70-89 range, is easy to see.
Subtopics
1. When to Use Grouped Data
Use grouped data when:
- The data has many distinct values (as a rough guide, more than about 10 to 15 different values)
- The data is continuous (like heights, weights, temperatures, times)
- You want to see the overall distribution without focusing on every single value
Do NOT use grouped data when:
- The data has only a few distinct values (like dice rolls 1-6)
- You need to know every exact value in the data set
2. Raw Data vs Grouped Data
Raw Data (Ungrouped): 5, 7, 8, 5, 9, 7, 6, 8, 5, 7, 9, 6, 8, 7, 5
Grouped Data:
5-6: 6 scores
7-8: 7 scores
9-10: 2 scores
The grouped version loses some exact information (we don't know exactly how many 5s vs 6s) but gives a clearer overall picture.
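The comparison above can be reproduced with a minimal Python sketch. The variable names are illustrative, and the three intervals are simply the ones used in this example, with both limits counted as inside the interval.

```python
from collections import Counter

raw = [5, 7, 8, 5, 9, 7, 6, 8, 5, 7, 9, 6, 8, 7, 5]

# Ungrouped view: one count per distinct value.
ungrouped = Counter(raw)

# Grouped view: one count per interval (both limits inclusive).
intervals = [(5, 6), (7, 8), (9, 10)]
grouped = {
    f"{low}-{high}": sum(1 for x in raw if low <= x <= high)
    for low, high in intervals
}

print(sorted(ungrouped.items()))
print(grouped)
```

The ungrouped count still separates the 5s from the 6s; the grouped count keeps only their combined total.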
3. Class Intervals
A class interval is a range of values grouped together. Each interval has:
Lower Class Limit: The smallest value that can belong to the interval
Upper Class Limit: The largest value that can belong to the interval
Example: In the interval 60-69
Lower limit = 60, Upper limit = 69
Rules for Good Class Intervals:
- Intervals must not overlap (60-69 and 70-79, not 60-70 and 70-80)
- Intervals should have the same width (size) throughout the table
- Choose 5 to 10 intervals for most data sets
- Every data value must fit into exactly one interval
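As a rough sketch, three of these rules (equal width, no overlap, every value in exactly one interval) can be checked automatically. The function name check_intervals is hypothetical, and the sketch assumes whole-number data with both limits inclusive; it does not check the 5-to-10 guideline.

```python
def check_intervals(intervals, data):
    """Return a list of rule violations for whole-number class intervals.

    `intervals` is a list of (lower, upper) pairs with both limits inclusive.
    """
    problems = []
    ordered = sorted(intervals)

    # Rule: equal width (upper - lower + 1 for whole numbers).
    widths = {upper - lower + 1 for lower, upper in ordered}
    if len(widths) > 1:
        problems.append("unequal widths")

    # Rule: no overlap between neighbouring intervals.
    for (_, prev_upper), (next_lower, _) in zip(ordered, ordered[1:]):
        if next_lower <= prev_upper:
            problems.append(f"overlap at {next_lower}")

    # Rule: every data value fits into exactly one interval.
    for value in data:
        hits = sum(1 for lower, upper in ordered if lower <= value <= upper)
        if hits != 1:
            problems.append(f"value {value} fits {hits} intervals")

    return problems


print(check_intervals([(60, 69), (70, 79)], [60, 65, 70, 79]))  # no problems
print(check_intervals([(60, 70), (70, 80)], [60, 65, 70, 79]))  # 70 is shared
```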
4. Class Size (Width)
For whole-number data, the class size is the difference between the upper and lower limits of an interval, plus 1. For example, the interval 60-69 has class size 69 - 60 + 1 = 10. More simply, it is the number of values each interval covers.
Formula: Class Size = (Range of data) / (Number of intervals) – then round up to a convenient number
Example: Data ranges from 20 to 79. Range = 59. With 6 intervals, class size ≈ 59/6 ≈ 9.8 → round up to 10
Intervals: 20-29, 30-39, 40-49, 50-59, 60-69, 70-79
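For continuous data (decimals): Class size = difference between lower limits of consecutive intervals. Example: 0-10, 10-20 has class size 10. Here each interval is read as including its lower limit but not its upper limit (0 up to, but not including, 10), so a value of exactly 10 goes into 10-20 and the intervals do not overlap.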
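The whole-number formula can be sketched in Python as follows. The function names class_size and whole_number_intervals are illustrative, and the sketch assumes the intervals start at the smallest value and that rounding up to the next whole number is "convenient" enough.

```python
import math


def class_size(minimum, maximum, num_intervals):
    """Range divided by the number of intervals, rounded up to a whole number."""
    return math.ceil((maximum - minimum) / num_intervals)


def whole_number_intervals(minimum, maximum, num_intervals):
    """Build inclusive (lower, upper) intervals starting at the minimum."""
    size = class_size(minimum, maximum, num_intervals)
    return [
        (minimum + i * size, minimum + (i + 1) * size - 1)
        for i in range(num_intervals)
    ]


print(class_size(20, 79, 6))              # 59 / 6 rounded up
print(whole_number_intervals(20, 79, 6))
```

One caveat: if the range divides exactly by the number of intervals (for example, data from 0 to 10 with 5 intervals), the last interval built this way stops one short of the maximum, so always check that the largest value is covered.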
5. Making a Grouped Frequency Table
Steps:
Step 1: Find the range of the data (max – min)
Step 2: Decide how many intervals you want (usually 5 to 10)
Step 3: Calculate class size = range ÷ number of intervals (round up)
Step 4: Write the intervals starting from the smallest value (or slightly below)
Step 5: Go through the data one value at a time and make a tally mark in the row of the interval that contains it
Step 6: Count the tally marks to find the frequency for each interval
Example – Create a grouped frequency table:
Data: 12, 15, 18, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55
Range = 55 – 12 = 43
Choose 5 intervals → class size = 43/5 = 8.6 → round up to 9
Intervals: 12-20, 21-29, 30-38, 39-47, 48-56
Count frequencies:
12-20: 12,15,18 → 3
21-29: 22,25,28 → 3
30-38: 31,34,37 → 3
39-47: 40,43,46 → 3
48-56: 49,52,55 → 3
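The six steps can be traced in a short Python sketch using the data from this example. It assumes whole-number data and inclusive limits; counting replaces the tally marks of Steps 5 and 6.

```python
import math

data = [12, 15, 18, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55]
num_intervals = 5                                  # Step 2

data_range = max(data) - min(data)                 # Step 1: 55 - 12 = 43
size = math.ceil(data_range / num_intervals)       # Step 3: 8.6 rounds up to 9

# Step 4: inclusive whole-number intervals starting at the smallest value.
start = min(data)
intervals = [(start + i * size, start + (i + 1) * size - 1)
             for i in range(num_intervals)]

# Steps 5 and 6: count how many values fall in each interval.
table = {interval: 0 for interval in intervals}
for value in data:
    for low, high in intervals:
        if low <= value <= high:
            table[(low, high)] += 1
            break

for (low, high), freq in table.items():
    print(f"{low}-{high}: {freq}")
```

A useful final check is that the frequencies add up to the number of data values, here 15.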
6. Tally Marks for Grouped Data
Use tally marks to count how many data points fall into each interval. Each mark represents one data point. Every fifth mark is drawn diagonally across the previous four.
Example: If 8 scores fall into the 70-79 interval, the tally marks would look like: |||| ||| where the first four strokes are crossed by a diagonal fifth stroke (one group of 5 and three more)
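A small Python sketch can turn a frequency into tally marks. Plain text cannot draw the diagonal stroke across the other four, so this sketch uses "/" after four strokes as a stand-in; that notation and the function name tally are assumptions made for illustration.

```python
def tally(count):
    """Render a count as tally marks, using '||||/' for a crossed group of five.

    The '/' stands in for the diagonal fifth stroke, which plain text
    cannot draw across the other four.
    """
    groups, remainder = divmod(count, 5)
    parts = ["||||/"] * groups
    if remainder:
        parts.append("|" * remainder)
    return " ".join(parts)


print(tally(8))   # one group of five and three more
```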
Solved Examples
Example 1 – Creating a Grouped Frequency Table:
The heights (in inches) of 25 students are:
62, 65, 68, 70, 72, 65, 68, 71, 74, 65, 68, 70, 73, 66, 69, 72, 65, 67, 70, 73, 66, 69, 71, 74, 68
Group the data into intervals of 3 inches starting from 62.
Solution:
Intervals: 62-64, 65-67, 68-70, 71-73, 74-76
Count frequencies:
62-64: 62 → 1
65-67: 65,65,65,65,66,66,67 → 7
68-70: 68,68,68,68,69,69,70,70,70 → 9
71-73: 71,71,72,72,73,73 → 6
74-76: 74,74 → 2
Check: 1 + 7 + 9 + 6 + 2 = 25, which matches the number of students.
Answer: Frequency table with intervals 62-64(1), 65-67(7), 68-70(9), 71-73(6), 74-76(2)
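A hand count like this is easy to get wrong, so it can help to double-check it. The sketch below is one possible way, assuming equal-width whole-number intervals: integer division finds the interval for each height directly, and the final check confirms that the frequencies add up to 25.

```python
heights = [62, 65, 68, 70, 72, 65, 68, 71, 74, 65, 68, 70, 73,
           66, 69, 72, 65, 67, 70, 73, 66, 69, 71, 74, 68]

start, size = 62, 3   # intervals of 3 inches starting from 62

# (h - start) // size gives the position of the interval containing h:
# 0 for 62-64, 1 for 65-67, and so on.
counts = {}
for h in heights:
    index = (h - start) // size
    low = start + index * size
    label = f"{low}-{low + size - 1}"
    counts[label] = counts.get(label, 0) + 1

for label in sorted(counts):
    print(label, counts[label])

# Check: the frequencies must add up to the number of students.
assert sum(counts.values()) == len(heights) == 25
```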
Example 2 – Choosing Class Intervals:
Data ranges from 15 to 84. You want 7 intervals. What class size should you use? What are the intervals?
Solution: Range = 84 – 15 = 69
Class size = 69 ÷ 7 ≈ 9.86 → round up to 10
Intervals: 15-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84
Answer: Class size = 10; intervals as above
Example 3 – Finding the Missing Frequency:
A grouped frequency table for test scores has intervals 50-59, 60-69, 70-79, 80-89, 90-99. Frequencies are: 4, 7, ?, 6, 3. Total students = 25. Find the missing frequency.
Solution: Sum of known frequencies = 4 + 7 + 6 + 3 = 20
Missing frequency = 25 – 20 = 5
Answer: 5
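The same subtraction can be written as a short Python sketch. The function name missing_frequency is illustrative, and marking the unknown entry with None is an assumption of this sketch.

```python
def missing_frequency(frequencies, total):
    """Return the single missing frequency, marked as None in the list."""
    if frequencies.count(None) != 1:
        raise ValueError("exactly one frequency must be missing")
    known = sum(f for f in frequencies if f is not None)
    missing = total - known
    if missing < 0:
        raise ValueError("known frequencies exceed the total")
    return missing


print(missing_frequency([4, 7, None, 6, 3], 25))   # 25 - 20
```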
Common Mistakes to Avoid
Mistake 1 – Creating overlapping intervals
Intervals like 10-20 and 20-30 share the value 20. Where does 20 go?
Correct understanding: For whole-number data, use 10-19, 20-29 to avoid overlap.
Mistake 2 – Using inconsistent class sizes
Using intervals 0-9, 10-19, 20-31 (the last interval has a different width).
Correct understanding: All intervals should have the same width.
Mistake 3 – Leaving gaps between intervals
Intervals like 10-19 and 30-39 have a gap (20-29 not included).
Correct understanding: Intervals should cover the entire data range without gaps.
Mistake 4 – Choosing too few or too many intervals
2 intervals lose too much information; 20 intervals defeat the purpose of grouping.
Correct understanding: Use 5 to 10 intervals for most data sets.
Mistake 5 – Forgetting to include all data points
If data goes to 100 and intervals stop at 99, the value 100 is excluded.
Correct understanding: Ensure the intervals cover the entire range from min to max.
Mistake 6 – Confusing class size with number of intervals
Class size = width of each interval; number of intervals = how many intervals there are.
Correct understanding: They are different concepts. Class size × number of intervals is roughly the range, and it must be large enough to cover every data value.
Quick Reference Summary
Grouped Data: Data organized into class intervals (ranges)
When to Use: Large data sets with many distinct values
Class Interval: A range of values (e.g., 60-69)
Lower Limit: Smallest value in the interval
Upper Limit: Largest value in the interval
Class Size (Width): Difference between lower limits of consecutive intervals (or upper – lower + 1 for whole numbers)
Rules: No overlap, equal width, cover all data, 5-10 intervals
Grouped Frequency Table: Shows intervals and how many data points fall in each