Frequency table construction

3. Frequency table with R

To build the frequency table, we will use the "fdth" package. If it is not installed, you can install it either from the menu Tools -> Install Packages, or by entering the following command:
> install.packages("fdth")

Once it is installed, we load the package:
> library(fdth)

We can now obtain a frequency table with 5 classes with:
> print(fdt(df$Age,5))

where the second argument (5) is the number of intervals that we want. The result is the following:
 Class limits  f   rf rf(%) cf cf(%)
     [9.9,14)  4 0.08     8  4     8
      [14,19) 15 0.30    30 19    38
      [19,23) 19 0.38    38 38    76
      [23,28)  9 0.18    18 47    94
      [28,32)  3 0.06     6 50   100

The first column shows the class intervals, and the second gives the absolute frequency (f). The third column gives the relative frequency (rf), and the fourth gives the relative frequency as a percentage. Finally, the last two columns show the cumulative absolute frequency (cf) and the cumulative relative frequency as a percentage. Each interval includes its lower limit but not its upper limit, and the limits shown are rounded for display.
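To make the meaning of each column concrete, the following minimal sketch rebuilds the same six columns with base R only. The vector `ages` and the names `breaks`, `classes` and `freq_table` are hypothetical and chosen for illustration; the tutorial's `df$Age` data are not reproduced here, and this is not how fdth computes its table internally.

```r
# Hypothetical ages; not the tutorial's df$Age data.
ages <- c(12, 15, 17, 18, 21, 22, 22, 24, 27, 31)

# Left-closed, right-open classes of width 5: [10,15), [15,20), ...
breaks <- seq(10, 35, by = 5)
classes <- cut(ages, breaks = breaks, right = FALSE)

f  <- as.vector(table(classes))  # absolute frequency
rf <- f / sum(f)                 # relative frequency
cf <- cumsum(f)                  # cumulative absolute frequency

freq_table <- data.frame(
  class  = levels(classes),
  f      = f,
  rf     = rf,
  rf_pct = 100 * rf,             # relative frequency in percent
  cf     = cf,
  cf_pct = 100 * cf / sum(f)     # cumulative relative frequency in percent
)
print(freq_table)
```

The last value of `cf` always equals the number of observations, and the last value of `cf_pct` is always 100, which is a quick check that no observation fell outside the classes.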
You can also specify the lower limit of the first interval (with "start="), the upper limit of the last interval (with "end="), and the interval width (with "h="). For instance, to get 5 intervals of width 5, the command is the following:
> print(fdt(df$Age,start=10,end=35,h=5))

This is the result:
 Class limits  f   rf rf(%) cf cf(%)
      [10,15)  4 0.08     8  4     8
      [15,20) 19 0.38    38 23    46
      [20,25) 19 0.38    38 42    84
      [25,30)  7 0.14    14 49    98
      [30,35)  1 0.02     2 50   100
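The relation between "start=", "end=" and "h=" can be checked with a short sketch: the class limits run from start to end in steps of h, so the number of classes is (end - start) / h. The variable names and the simulated `ages` vector below are assumptions for illustration only; the optional call to fdt() runs only if the fdth package is available.

```r
# Class limits implied by start = 10, end = 35, h = 5
start <- 10
end   <- 35
h     <- 5
class_limits <- seq(start, end, by = h)
n_classes <- length(class_limits) - 1
print(class_limits)
print(n_classes)

# Optional: pass the same limits to fdt() on made-up data.
if (requireNamespace("fdth", quietly = TRUE)) {
  set.seed(1)
  ages <- sample(10:34, 50, replace = TRUE)  # hypothetical data
  print(fdth::fdt(ages, start = start, end = end, h = h))
}
```

Choosing "end=" so that end - start is a multiple of h keeps all classes the same width.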

A similar command gives a frequency table for a categorical variable. Suppose that we have data on the preferred colors of 30 people. The data are in the following Excel file:
Preferred colors
Import the data as explained on the previous page.
To get a frequency table for the categorical variable of this data set, we enter:
> print(fdt_cat(color_eng$Color))

This produces the following frequency table:
 Category f   rf rf(%) cf  cf(%)
   Purple 9 0.30 30.00  9  30.00
      Red 9 0.30 30.00 18  60.00
   Yellow 8 0.27 26.67 26  86.67
    Black 3 0.10 10.00 29  96.67
     Blue 1 0.03  3.33 30 100.00

As you can see, we get the absolute and relative frequencies, as well as the cumulative values, for the categorical variable. By default, the categories are ordered from highest to lowest absolute frequency. If you want the categories in alphabetical order, you can add "sort=FALSE":
> print(fdt_cat(color_eng$Color, sort=FALSE))

and you get:
 Category f   rf rf(%) cf  cf(%)
    Black 3 0.10 10.00  3  10.00
     Blue 1 0.03  3.33  4  13.33
   Purple 9 0.30 30.00 13  43.33
      Red 9 0.30 30.00 22  73.33
   Yellow 8 0.27 26.67 30 100.00
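As a cross-check, both orderings can be reproduced with base R. The sketch below rebuilds a color vector from the counts shown in the tables above (the order of the individual answers in the Excel file is unknown, so it is an assumption here), and the helper `make_table` is a hypothetical name, not part of fdth.

```r
# Rebuild 30 answers from the counts in the tables above.
# The order of individual answers is an assumption.
colors <- rep(c("Purple", "Red", "Yellow", "Black", "Blue"),
              times = c(9, 9, 8, 3, 1))

# Hypothetical helper: turn a one-way table into a frequency table.
make_table <- function(counts) {
  f <- as.vector(counts)
  data.frame(
    Category = names(counts),
    f        = f,
    rf       = round(f / sum(f), 2),
    rf_pct   = round(100 * f / sum(f), 2),
    cf       = cumsum(f),
    cf_pct   = round(100 * cumsum(f) / sum(f), 2)
  )
}

counts <- table(colors)                     # alphabetical order
by_name <- make_table(counts)
by_freq <- make_table(sort(counts, decreasing = TRUE))

print(by_freq)  # highest to lowest absolute frequency
print(by_name)  # alphabetical order
```

Only the cumulative columns depend on the ordering; the absolute and relative frequencies of each category are the same in both tables.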
