Frequency Distributions Reporting Using Advantage VISION:Excel

Document ID : KB000055671
Last Modified Date : 14/02/2018
Show Technical Document Details

Suppose a survey of the bodyweight of 100 people was taken and recorded in the table below. In this case the bodyweight range is the value, and the count is the number of times a bodyweight value falls into a specified range. A range is commonly called an interval when the sections are of equal length.

Body Weight RangeTally and Frequency
95 - 11417
115 - 13413
135 - 15416
155 - 17412
175 - 19416
195 - 21412
215 - 23414

Data presented in the form of a Frequency Distribution table makes analysis of the data much easier than studying a set of raw data.

There are a few variations of the Frequency Distribution table. The format of such tables differs according to the nature of the data and the type of analysis desired. The table below is a version of the table above with the interval percentages inserted. This new column or table is called the Relative Frequency Distribution. The Relative Frequency P is defined as P = F/N, where F is the frequency associated with each interval or count, and N is the total number in the set of data. In this example N is equal to 100.

Body Weight RangeFrequencyRelative Frequency
95 - 11417.17
115 - 13413.13
135 - 15416.16
155 - 17412.12
175 - 19416.16
195 - 21412.12
215 - 23414.14

There are also graphical tools used to analyze Frequency Distributions. The standard graphic forms include bar charts, frequency polygons, and histograms. Below is an example of the Bodyweight distribution as an Advantage VISION:Excel histogram;

Figure 1

In this histogram the vertical axis represents the frequency or count and the horizontal axis represents the ranges of bodyweights. The appearance of a histogram will vary depending on the nature of the data. This appearance or graphical shape of a distribution is called the Kurtosis. This describes the flatness, or peakedness of a distribution when compared to another. The variations of Kurtosis are called Leptokurtic, Mesokurtic, and Platykurtic.

There are also descriptive measures used to explain the distribution of data. The mean, standard deviation, standard error of the mean and coefficient of variation can all be used to summarize and describe the distribution of data.
The Mean or average, assuming a set of numeric values, is Defined as M = T/N, where M is the mean, T the total sum of the distribution, and N is the total items in the set of data.

The Standard Deviation is basically a measure of how spread out the values in a dataset are. The Standard Deviation is defined as follows:

Figure 2

Where S.D. is the standard deviation of the distribution, Xi is the value of the ith item, M is the mean of the distribution, and N is the number of items in the set of data.

The Standard Error of the Mean gives an idea of how much variation you would expect to find if you took repeated samples from the same population. The Standard Error of the Mean is defined as the Standard Deviation divided by the square root of the number of items in the set of data. See the formula below.

Figure 3

The Coefficient of Variation is a measure of variability in relation to the mean. The Coefficient of Variation is defined as the Standard Deviation divided by the Mean of the population. See below.

Figure 4

With Advantage VISION:Excel, Frequency distributions tables, histograms, and the formulas explained above, can be produced with relative ease. The choices of Frequency Distributions with Advantage VISION:Excel are Equal Interval, Logarithmic, and Alphanumeric. With the example of the 100 bodyweight sample and distribution table with intervals of 20 pounds, the Equal Interval option of Advantage VISION:Excel would be best for this distribution. The following is a diagram of the syntax for an Advantage VISION:Excel Equal Interval Frequency distribution;

         FREQ nn EQUAL
              DISTFIELD dataname START nnnnnnnnnnn
              INTERVALSIZE nnnnnnnnnnn INTERVALNO nn
              [REPTITLE dataname] [STDREPT {YES | NO}]
              [HISTOGRAM {NO | COUNT | VALUE | BOTH}]

Applying this syntax to the bodyweight example, below is the VISION:Excel code to produce an Equal Interval Frequency distribution table and histogram. This example uses a card input file that includes 100 randomly chosen values between 94 and 235. Following the code is the output including the frequency distribution table and the histogram.

With Advantage VISION:Excel, you can make the use of very complex Statistical methods easy!

               FILE CARDFILE F 80 
                  BODYWEIGHT 3 NU
 
                SORT CARDFILE USING BODYWEIGHT 
 
                FREQ 07 EQUAL
                DISTFIELD BODYWEIGHT
                HISTOGRAM COUNT
                START                95
                INTERVALSIZE  20
                INTERVALNO     7
 
                FREQUENCY 07
                                             100 BODYWEIGHT SAMPLE
  6/14/05                                         FREQUENCY DISTRIBUTION REPORT                    PAGE   1
                                                          APPLICATION 07
------------ R A N G E ------------   COUNT    %  OF       TOTAL    %  OF          STANDARD  SQUARE ROOT
   L O W            H I G H         OF ITEMS   ITEMS       OF ITEMS  TOTAL   MEAN  DEVIATION RANGE X COUNT
     95                114             17       17.00       1790    11.00     105      6           18
    115                134             13       13.00       1637    10.06     126      6           34
    135                154             16       16.00       2324    14.28     145      6           51
    155                174             12       12.00       1984    12.19     165      5           66
    175                194             16       16.00       2946    18.10     184      6           83
    195                214             12       12.00       2440    14.99     203      6           98
    215                234             14       14.00       3159    19.40     226      6           114
TOTAL                                 100      100.00       16280   100.00
MEAN =                          163             STANDARD ERROR OF THE MEAN =    4.100
STANDARD DEVIATION =             41             COEFFICIENT OF VARIATION =      0.252
MINIMUM VALUE =                  95             MAXIMUM VALUE =                   234
NUMBER OF ZERO ITEMS              0


                      COUNT  HISTOGRAM
   ---------------------------------------------------
      +++++++++++++++++
      +++++++++++++
      ++++++++++++++++
      ++++++++++++
      ++++++++++++++++
      ++++++++++++
      ++++++++++++++