FREQUENCY DISTRIBUTIONS
PURPOSE
The purpose of a frequency distribution is to summarize and organize a set of data. Presenting data in a frequency distribution makes inspection of the data set much more manageable than presenting the entire set of raw data. A frequency distribution can be considered a type of descriptive statistic.
BACKGROUND
Frequency distribution: A distribution showing the number of observations associated with each score value in a set of data that is quantitative in nature. Frequency distributions make it easy to see trends in data, particularly when two different data sets are compared.
Qualitative data can be organized using the same basic idea, with categories instead of scores.
Frequency distributions can be structured as a table or a graph, provided you present:
FREQUENCY DISTRIBUTION TABLE
In the simplest form, the table consists of two columns, one for the score value (X) and a second indicating the number of observations for the score value (f). Note: Σf = N, where N is the total number of observations. The X values should be:
To find the sum of the entire set of scores, you can multiply the score value, X, by the number of times the score value was observed, f, and sum these products.
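As a minimal sketch of the table and the sum described above, the following Python builds the X and f columns and checks that Σf = N and that Σ(f · X) equals the sum of the raw scores. The scores are made up for illustration, and listing X from highest to lowest is an assumed convention, not something stated here.

```python
from collections import Counter

# Hypothetical sample of quiz scores (not from the original text).
scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8]

# Count how many times each score value X occurs (f).
freq = Counter(scores)

# Assumed convention: list the observed X values from highest to lowest.
table = [(x, freq[x]) for x in sorted(freq, reverse=True)]

print("X    f")
for x, f in table:
    print(f"{x:<4} {f}")

n = sum(f for _, f in table)          # sum of f equals N
total = sum(x * f for x, f in table)  # sum of X found as the sum of f * X

print("N =", n)
print("Sum of scores =", total)
```

Multiplying each X by its f before summing gives the same total as adding the raw scores one by one, which is useful when only the table is available.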
REAL LIMITS AND FREQUENCY DISTRIBUTIONS
Real (exact) limits: The boundaries of a continuous variable that generally extend from one-half of the smallest unit of measurement below the value of the score to one-half unit above.
For example, a measurement of 6'1" (if measured to the nearest inch) actually represents any height from 6'0.5" to 6'1.5". If measured to the nearest foot, 6' would lie between 5'6" and 6'6". The smaller value in each pair is the lower real limit and the larger value is the upper real limit.
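The height example can be checked with a short sketch. The function name `real_limits` is hypothetical, and heights are converted to inches so the arithmetic stays simple.

```python
def real_limits(score, unit):
    """Return (lower, upper) real limits for a score measured to the nearest `unit`."""
    half = unit / 2
    return score - half, score + half

# 6'1" is 73 inches, measured to the nearest inch.
lower_in, upper_in = real_limits(73, 1)
print(lower_in, upper_in)   # 72.5 73.5, i.e. 6'0.5" to 6'1.5"

# 6' is 72 inches, measured to the nearest foot (12 inches).
lower_ft, upper_ft = real_limits(72, 12)
print(lower_ft, upper_ft)   # 66.0 78.0, i.e. 5'6" to 6'6"
```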
THE RELATIVE FREQUENCY DISTRIBUTION
Relative frequency distribution: A distribution that indicates the category or score value and the proportion or percentage of the total number of cases associated with the category or score value. In other words, it tells the proportion (or percent) of scores at each X value.
The relative frequencies, expressed as a proportion (p), are found using the equation:
p = f/N
Where:
f is the frequency associated with each category or score value
N is the total number of observations
p can be converted to a percent by multiplying p by 100.
Relative frequencies are particularly helpful when comparing frequency distributions in which the number of cases differs. The term "relative" indicates that proportions or percentages are reported in addition to the number of cases.
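A small sketch of p = f/N, using two made-up groups of different sizes, shows why relative frequencies make such comparisons easier. The function name and the data are assumptions for illustration only.

```python
from collections import Counter

def relative_frequencies(scores):
    """Map each score value X to its proportion p = f / N."""
    freq = Counter(scores)
    n = len(scores)
    return {x: f / n for x, f in freq.items()}

group_a = [1, 2, 2, 3]                # N = 4
group_b = [1, 1, 2, 2, 2, 2, 3, 3]    # N = 8

p_a = relative_frequencies(group_a)
p_b = relative_frequencies(group_b)

for x in sorted(p_a):
    # Multiply p by 100 to express it as a percent.
    print(f"X = {x}: group A {p_a[x] * 100:.0f}%, group B {p_b[x] * 100:.0f}%")
```

The raw frequencies differ (for X = 2, f is 2 in one group and 4 in the other), yet the proportions are identical, so the two distributions have the same shape.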
THE CUMULATIVE FREQUENCY AND PERCENTAGE DISTRIBUTIONS
Cumulative distribution: A distribution that indicates the category or score value and the cumulative frequency (cum f) or proportion/percentage (cum p/cum %) of the total number of cases below the upper real limit of the associated category or score value.
Cumulative distributions are particularly helpful when determining percentile ranks and percentile points.
Percentile point: A point in a distribution below which a specific percentage of scores fall.
Percentile rank: The percentage of scores in a distribution that fall below a specific score.
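The cumulative columns can be sketched as running totals of f. The data below are hypothetical whole-number scores, so the upper real limit of each X is assumed to be X + 0.5.

```python
from collections import Counter
from itertools import accumulate

scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]   # hypothetical, N = 10

freq = Counter(scores)
n = len(scores)

xs = sorted(freq)                       # score values, lowest to highest
fs = [freq[x] for x in xs]              # frequency of each value
cum_f = list(accumulate(fs))            # running total of f
cum_pct = [100 * c / n for c in cum_f]  # running total as a percent of N

print("X   f   cum f   cum %")
for x, f, cf, cp in zip(xs, fs, cum_f, cum_pct):
    print(f"{x:<3} {f:<3} {cf:<7} {cp:.0f}")
```

Reading the row for X = 3: cum % is 60, so 60% of the scores fall below the upper real limit of 3.5. In the terms defined above, 3.5 has a percentile rank of 60%, and 3.5 is the percentile point for 60%.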
FREQUENCY DISTRIBUTION GRAPHS
THE HISTOGRAM
A histogram is a pictorial representation of a frequency distribution in which the scores (X) are plotted on the X-axis of a graph and the frequency (or relative frequency) of occurrences is plotted on the Y-axis. A Histogram is used when the X scores are quantitative, or continuous.
When creating a Histogram:
1. Create a Frequency Distribution of the scores of interest.
2. The X-axis
a. Determine a suitable scale for the horizontal axis and determine the number of squares needed.
b. Try not to break the X-axis, but if you do, use proper notation.
3. The Y-axis
4. Create bars for each score value (X).
a. Height = frequency/relative frequency.
b. No gaps between bars (except for intervals where f = 0). Bars should touch.
5. Label histogram with title. Be sure to label X and Y-axes.
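One way to see why histogram bars touch is to compute each bar's edges before drawing anything. The sketch below assumes whole-number scores and assumes each bar spans the real limits of its score value (X - 0.5 to X + 0.5); the data are made up, and the text output only stands in for a drawn graph.

```python
from collections import Counter

scores = [2, 3, 3, 4, 4, 4, 6]   # hypothetical whole-number scores
freq = Counter(scores)

# One bar per score value across the full range, including values with f = 0.
bars = []
for x in range(min(scores), max(scores) + 1):
    bars.append({"left": x - 0.5, "right": x + 0.5, "height": freq[x]})

# Adjacent bars share an edge, so they touch; a bar with f = 0 has no height,
# which is the only place a gap appears.
for bar in bars:
    print(f"{bar['left']:>4} to {bar['right']:<4} {'#' * bar['height']}")
```

The same list of edges and heights could be handed to any plotting tool; the axis labels and title from step 5 would still need to be added there.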
THE BAR CHART
Create a bar chart when the data are qualitative (categorical). You create a bar chart exactly as you would a histogram, with one exception: a bar chart contains spaces between the bars, whereas a histogram does not. The spaces signal that the categories are separate and do not represent amounts of an underlying continuous variable.
THE FREQUENCY POLYGON
A frequency polygon is a pictorial representation of a frequency distribution in which the scores (X) are plotted on the X-axis of the graph and the frequency (or relative frequency) of occurrences is plotted on the Y-axis. However, the frequencies at each value of X are represented as dots connected by a line as opposed to bars (as in a Histogram).
SHAPES OF FREQUENCY DISTRIBUTIONS
SYMMETRICAL -
ASYMMETRICAL - Skewed (Positive or Negative), J-shaped, or perhaps Bimodal/Trimodal
STEM AND LEAF PLOTS
Stem and leaf plots are a different pictorial representation of a set of data. To create a stem and leaf plot:
The 'leaves' should increase from left to right.
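As a rough sketch of one common way to build such a plot, the code below assumes two-digit whole-number scores, uses the tens digit as the stem and the ones digit as the leaf, and sorts the data so the leaves increase from left to right. The function name and data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole numbers by tens digit (stem) with ones digits (leaves) in increasing order."""
    plot = defaultdict(list)
    for v in sorted(values):          # sorting first keeps leaves in increasing order
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

data = [23, 31, 25, 47, 38, 31, 42, 29]
plot = stem_and_leaf(data)

for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
# 2 | 3 5 9
# 3 | 1 1 8
# 4 | 2 7
```

Unlike a histogram, the plot keeps every original score recoverable: stem 3 with leaves 1 1 8 stands for 31, 31, and 38.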