FREQUENCY DISTRIBUTIONS

PURPOSE

The purpose of a frequency distribution is to summarize and organize a set of data. Presenting data in a frequency distribution makes inspection of the data set much more manageable than presenting the entire set of raw data. A frequency distribution can be considered a type of descriptive statistic.

BACKGROUND

Frequency distribution: A distribution showing the number of observations associated with each score value in a set of data that is quantitative in nature. Frequency distributions make it easy to see trends in data, particularly when two different data sets are compared.

Qualitative data can be organized using the same basic idea, with categories instead of scores.

Frequency distributions can be structured as a table or a graph, provided you present:

  1. The score values that make up the original scale of measurement
  2. A record of the frequency of observations in each category.

FREQUENCY DISTRIBUTION TABLE

In its simplest form, the table consists of two columns: one for the score value (X) and a second for the number of observations at that score value (f). Note: Σf = N, where N is the total number of observations. The X values should be:

  1. Continuous
  2. Descending

To find the sum of the entire set of scores, you can multiply the score value, X, by the number of times the score value was observed, f, and sum these products.
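As a minimal sketch of the table described above, the following Python builds the X and f columns and checks both identities (Σf = N, and ΣfX equal to the sum of the raw scores). The scores are invented for illustration, and reading "continuous" as "list every score value in the range, including those with f = 0" is an assumption.

```python
from collections import Counter

# Hypothetical quiz scores, invented for illustration.
scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8]

counts = Counter(scores)

# List every X value from highest to lowest, including values with f = 0,
# so that the X column is continuous and descending.
table = [(x, counts[x]) for x in range(max(scores), min(scores) - 1, -1)]

n = sum(f for _, f in table)           # sum of f, which should equal N
sum_x = sum(x * f for x, f in table)   # sum of f * X, which should equal the sum of all scores

print(" X  f")
for x, f in table:
    print(f"{x:>2}  {f}")
print("N =", n, " sum of X =", sum_x)
```

Here the score 5 was never observed, so it appears in the table with f = 0 rather than being skipped.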

REAL LIMITS AND FREQUENCY DISTRIBUTIONS

Real (exact) limits: The boundaries of a continuous variable that generally extend from one-half of the smallest unit of measurement below the value of the score to one-half unit above.

For example, a measurement of 6'1" (if measured to the nearest inch) actually represents any value from 6'0.5" to 6'1.5". If measured to the nearest foot, 6' would lie between 5'6" and 6'6". Every score therefore has a lower real limit and an upper real limit.
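The real limits can be computed directly from the definition. The sketch below assumes heights are expressed in inches; the function name real_limits is hypothetical.

```python
def real_limits(score, unit):
    """Return (lower, upper) real limits for a score measured to the nearest `unit`."""
    half = unit / 2
    return score - half, score + half

# 6'1" is 73 inches, measured to the nearest inch.
print(real_limits(73, 1))

# 6' is 72 inches, measured to the nearest foot (12 inches).
print(real_limits(72, 12))
```

The second call gives limits of 66 and 78 inches, which correspond to 5'6" and 6'6".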

THE RELATIVE FREQUENCY DISTRIBUTION

Relative frequency distribution: A distribution that indicates the category or score value and the proportion or percentage of the total number of cases associated with the category or score value. In other words, it tells the proportion (or percent) of scores at each X value.

The relative frequencies, expressed as a proportion (p), are found using the equation:

p = f/N

Where:

f is the frequency associated with each category or score value

N is the total number of observations

p can be converted to a percent by multiplying p by 100.

Relative frequencies are particularly helpful when comparing frequency distributions in which the number of cases differs. The term relative indicates that each frequency is expressed as a proportion or percentage of the total number of cases rather than as a raw count alone.
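A short sketch of p = f/N follows. The two classes are invented for illustration and differ in size, which is the situation where relative frequencies are most useful; the function name relative_frequencies is hypothetical.

```python
from collections import Counter

def relative_frequencies(scores):
    """Map each score value to (f, p, percent), where p = f / N."""
    n = len(scores)
    counts = Counter(scores)
    return {x: (f, f / n, 100 * f / n)
            for x, f in sorted(counts.items(), reverse=True)}

# Hypothetical data: two classes of different sizes.
small_class = [3, 3, 2, 1]
large_class = [3, 3, 3, 3, 2, 2, 1, 1]

for name, data in (("small", small_class), ("large", large_class)):
    print(name)
    for x, (f, p, pct) in relative_frequencies(data).items():
        print(f"  X={x}  f={f}  p={p:.2f}  %={pct:.0f}")
```

The raw frequencies differ between the two classes, but the proportions at each score value are identical, so the distributions have the same relative shape.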

THE CUMULATIVE FREQUENCY AND PERCENTAGE DISTRIBUTIONS

Cumulative distribution: A distribution that indicates the category or score value and the cumulative frequency (cum f) or proportion/percentage (cum p/cum %) of the total number of cases below the upper real limit of the associated category or score value.

Cumulative distributions are particularly helpful when determining Percentile Ranks and Percentile Points.

Percentile point: A point in a distribution below which a specific percentage of scores fall.

Percentile rank: The percentage of scores in a distribution that fall below a specific score.
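The cumulative columns can be sketched as follows, assuming whole-number scores (so the upper real limit of X is X + 0.5). The data and the function name cumulative_table are invented for illustration.

```python
from collections import Counter

def cumulative_table(scores):
    """Rows of (X, f, cum f, cum %), listed from the highest X to the lowest."""
    n = len(scores)
    counts = Counter(scores)
    rows, running = [], 0
    # Accumulate from the lowest score upward, then reverse for display.
    for x in range(min(scores), max(scores) + 1):
        running += counts[x]
        rows.append((x, counts[x], running, 100 * running / n))
    return rows[::-1]

# Hypothetical quiz scores.
scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8]

print(" X  f  cum f  cum %")
for x, f, cum_f, cum_pct in cumulative_table(scores):
    print(f"{x:>2}  {f}  {cum_f:>5}  {cum_pct:5.1f}")
```

In this example the row for X = 8 has cum f = 6 and cum % = 60, meaning 60% of the scores fall below 8.5, the upper real limit of 8. In the terms defined above, 8.5 is the percentile point for the 60th percentile, and its percentile rank is 60.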

FREQUENCY DISTRIBUTION GRAPHS

THE HISTOGRAM

A histogram is a pictorial representation of a frequency distribution in which the scores (X) are plotted on the X-axis of a graph and the frequency (or relative frequency) of occurrences is plotted on the Y-axis. A Histogram is used when the X scores are quantitative, or continuous.

When creating a Histogram:

  1. Create a frequency distribution of the scores of interest.
  2. The X-axis
      a. Determine a suitable scale for the horizontal axis and determine the number of squares needed.
      b. Try not to break the X-axis, but if you do, use proper notation.
  3. The Y-axis
      a. Displays information about frequency.
      b. Determine the length of the Y-axis by multiplying the number of squares on the X-axis by 0.75 to get the number of squares for the Y-axis.
      c. Try not to break the Y-axis, but if you do, use proper notation.
  4. Create bars for each score value (X).
      a. Height = frequency or relative frequency.
      b. No gaps between bars (except at intervals where f = 0). Bars should touch.
  5. Label the histogram with a title. Be sure to label the X- and Y-axes.

THE BAR CHART

Create a bar chart when the data are qualitative (categorical). You create a bar chart exactly as you would a histogram, with one exception: a bar chart contains spaces between the bars, whereas a histogram does not. The spaces signify that the categories are separate and do not represent amounts of an underlying continuous variable.

THE FREQUENCY POLYGON

A frequency polygon is a pictorial representation of a frequency distribution in which the scores (X) are plotted on the X-axis of the graph and the frequency (or relative frequency) of occurrences is plotted on the Y-axis. However, the frequencies at each value of X are represented as dots connected by a line as opposed to bars (as in a Histogram).

  1. Create a Frequency Distribution of the scores of interest.
  2. The X-axis - completed as for a histogram, with the following caveat - create an X value for the score above the highest and the score below the lowest actual X scores.
  3. The Y-axis - completed as for a histogram.
  4. Create dots for each score value - the height should be equal to frequency or relative frequency.
  5. Connect dots with straight lines. Connect the dots above the highest and below the lowest score values (X) to the X-axis at the score values created in step 2.
  6. Label frequency polygon with title. Be sure to label X and Y-axes.
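The points of a frequency polygon, including the two extra X values from step 2, can be sketched as below. The sketch assumes whole-number scores and only computes the (X, f) coordinates; drawing and labeling the graph are left to whatever plotting tool is used. The function name polygon_points and the data are hypothetical.

```python
from collections import Counter

def polygon_points(scores):
    """(X, f) points for a frequency polygon, with zero-frequency anchors at both ends."""
    counts = Counter(scores)
    lo, hi = min(scores), max(scores)
    # Step 2: extend the X-axis one score value below the lowest
    # and one above the highest observed score.
    return [(x, counts[x]) for x in range(lo - 1, hi + 2)]

# Hypothetical quiz scores.
scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8]

for x, f in polygon_points(scores):
    print(x, f)
```

Because the first and last points have f = 0, connecting the dots in order brings the line down to the X-axis at both ends, as step 5 requires.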

SHAPES OF FREQUENCY DISTRIBUTIONS

SYMMETRICAL - Normal, Bell-shaped, Rectangular (perhaps Bimodal/Trimodal).

ASYMMETRICAL - Skewed (Positive or Negative), J-shaped, or perhaps Bimodal/Trimodal.

STEM AND LEAF PLOTS

Stem and leaf plots are a different pictorial representation of a set of data. To create a stem and leaf plot:

  1. Place lowest scores on top
  2. Create a 'stem' that contains the first digit, or digits
  3. Create a 'leaf' that contains the last digit, or digits

The 'leaves' should increase from left to right.
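The steps above can be sketched in a few lines. This version assumes non-negative whole-number scores, uses the last digit as the leaf and all preceding digits as the stem, and omits stems that have no leaves; the function name stem_and_leaf and the data are invented for illustration.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Return the lines of a stem and leaf plot for non-negative whole-number scores."""
    stems = defaultdict(list)
    for s in sorted(scores):
        stems[s // 10].append(s % 10)   # stem = leading digit(s), leaf = last digit
    # Lowest stem on top; leaves already ascend because the scores were sorted first.
    return [f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
            for stem, leaves in sorted(stems.items())]

# Hypothetical scores.
for line in stem_and_leaf([23, 31, 25, 47, 31, 38, 29, 40]):
    print(line)
```

For these scores the plot has three rows: stem 2 with leaves 3 5 9, stem 3 with leaves 1 1 8, and stem 4 with leaves 0 7.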