Module 2: Writing functions in R - PowerPoint PPT Presentation

1 / 20
About This Presentation
Title:

Module 2: Writing functions in R

Description:

Var: variance. log (natural log) exp(a) : e^a. sqrt (square root) 2005 ... v1=var(x); v2=var(y); tau1=mean(abs(x-md1)); tau2=mean(abs(y-md2)); del1=(m1-md1)/tau1; ... – PowerPoint PPT presentation

Number of Views:27
Avg rating:3.0/5.0
Slides: 21
Provided by: edith5
Category:

less

Transcript and Presenter's Notes

Title: Module 2: Writing functions in R


1
Module 2 Writing functions in R
  • Why do we write functions?
  • How do we write functions? Basic Syntax
  • Argument of the function
  • Output of the function
  • Examples ( and what we learned about R while
    writing those functions)
  • Confidence interval for the mean absolute
    deviation.
  • Confidence interval for the ratio of two mean
    absolute deviations
  • Calculating the periodogram
  • Plotting the cumulative periodogram
  • Estimating the spectrum from the autocorrelations

2
Why do we write functions?
  • The two main motivations are
  • Sometimes we need to calcuate the same thing for
    several data sets.
  • What we want to calculate is not in the
    commercial software yet.

3
Basic syntax
  • namelt-function(x,y)
  • commands
  • output
  • You decide the name of the function
  • Inside the parentheses you can write all the
    input you want and give them the name you want
  • We need to type after every command
  • All the set of commands goes inside brackets
  • The name of whatever you want as output goes at
    the end of the function
  • Any comment line starts with
  • At the beginning usually we include a description
    of the function starting with

4
Example 1 calculating a confidence interval for
the mean absolute deviation
  • In a sample, MAD is calculated as the average of
    the distances of the values to the median.
  • In Bonett Seier (2003), Confidence Intervals
    for Mean Absolute Deviations The American
    Statistical Association, Vol 57 4 the following
    formula for the confidence interval for the
    population mean absolute deviation was derived

5
  • Function citau

6
Function argument
  • We need to indicate where (x) are the
    observations and what is the confidence we want
    to work with (z is the critical value).
  • So, citau is a function of x and z.
  • citault-function(x,z)

7
Calculations
  • mdmedian(x)
  • mmean(x)
  • vvar(x)
  • taumean(abs(x-md))
  • del(m-md)/tau
  • nlength(x)
  • cn/(n-1)
  • gamv/(tau2)
  • varlnt(delgam-1)/n
  • setausqrt(varlnt)
  • forlowlog(ctau)-zsetau
  • foruplog(ctau)zsetau
  • lowerendexp(forlow)
  • upperendexp(forup)
  • We use the following commands
  • length number of observations
  • mean
  • median
  • Var variance
  • log (natural log)
  • exp(a) ea
  • sqrt (square root)
  • / -

8
The output
  • We need to create an object with the results we
    want as output. In the example of the confidence
    interval we would like to have the lower end ,
    the point estimate, and upper end of the
    interval. So we will create an object that we
    will call ci
  • cilt- c(lowerend, tau,upperend)
  • We could put a name to each element of the output
    introducing a vector of names such as
  • alt-c(lower-end, point-estimate,upper end)
  • names(ci)a
  • The last expression of the function should be the
    name of the object we want as output, in this
    case
  • ci

9
Executing the function
  • Imagine we have a set of 100 observations from a
    random sample of a population. The data are in
    the file WTAM1G.dat . First we read the data
    with
  • xlt-scan(awtam1g.dat)
  • To calculate the 95 confidence interval, z1.96
  • Once we have copied and pasted the function citau
    into R, we simply write
  • citau(x,1.96)
  • and obtain
  • 0.6204765 0.7347000 0.8876145
  • If we wanted 90 confidence we would write
  • citau(x,1.645)

10
Example 2 confidence interval for the ratio of
two mean absolute deviation.
  • Two populations can be compared in terms of their
    variability by comparing the variances ( Levene
    and Barlett tests) or their mean absolute
    deviations. In Bonett Seier (2003),
    Confidence Intervals for Mean Absolute
    Deviations The American Statistical Association,
    Vol 57 4, a formula for the confidence interval
    of the ratio of two MADs was derived. If the
    confidence interval covers the value 1, that
    means we would reject the hypothesis that the two
    population mean absolute deviations are equal.

11
  • ratauslt-function(x,y,z)
  • md1median(x)
  • md2median(y)
  • m1mean(x)
  • m2mean(y)
  • v1var(x)
  • v2var(y)
  • tau1mean(abs(x-md1))
  • tau2mean(abs(y-md2))
  • del1(m1-md1)/tau1
  • del2(m2-md2)/tau2
  • n1length(x)
  • n2length(y)
  • c1n1/(n1-1)
  • c2n2/(n2-1)
  • gam1v1/(tau12)
  • gam2v2/(tau22)
  • varlnt1(del1gam1-1)/n1
  • varlnt2(del2gam2-1)/n2
  • forratlog((c1tau1)/(c2tau2))
  • seratsqrt(varlnt1varlnt2)
  • forlowforrat-zserat
  • forupforratzserat
  • ratauexp(forrat)
  • lowerendexp(forlow)
  • upperendexp(forup)
  • cilt- c(exinf,ratau,exsup) ci

12
Example 3. Plotting the periodogram of a time
series.
  • The main commponent of this program is the Fast
    Fourier Transform fft
  • funcion para calcular el periodograma
  • perioplotlt-function(x)
  • adjxx-mean(x) substracts
    the mean of the series
  • tffft(adjx) calculates
    finite Fourier transform
  • nflength(tf) n2nf/21 decides
    the number of frequencies
  • pritflt-tfc(1n2) takes the
    elements of the Fourier transform
  • intensitylt-(abs(pritf2))/nf calculates
    the ordinates of periodogram
  • nyquist1/2 pfreqlt-seq(0,nf/2,by1)
    preparation for frequencies
  • freqlt-pfreq/(length(pfreq)-1)nyquist
    calculates frequencies
  • plot(freq,intensity,type"l")

13
Using the function perioplot
  • First we need to copy and paste the function
    perioplot into R, read the data file and then
    execute the function
  • tempmarlt-scan(atempsea.dat)
  • perioplot(tempmar)

14
Example 4. Use of the cusum function to write the
cumulative periodogram.
  • The function cusum calculates the cumulative sum
    of the elements of a vector.
  • periocumlt-function(x)
  • adjxx-mean(x)
  • tffft(adjx) nflength(tf) n2nf/21
  • pritflt-tfc(1n2)
  • intensitylt-(abs(pritf2))/nf
  • nyquist1/2 pfreqlt-seq(0,nf/2,by1)
  • freqlt-pfreq/(length(pfreq)-1)nyquist
  • cumintlt-cumsum(intensity)/(max(cumsum(intensity)))
  • plot(freq,cumint,type"l")

15
.
  • Assuming that we have previously read the data of
    the sea temperature, to execute the function we
    just write periocum(tempmar) to obtain the plot
    at the left. The one of the right was obtained
    with a function that already comes with R.
    cpgram(tempmar)

16
Example 5. Estimating the spectrum based on the
autocorrelations
  • We include this example to mention that generally
    it is better to avoid the loops like do while
    if we can substitute them by matrix calculations
  • The formula we want to calculate is
  • We will be using the operators outer and .

17
  • The input au contains the autocorrelations of
    the series. Notice how matrix operations are
    used instead of loops.
  • estspeclt-function(au)
  • Mlt-length(au) counts how many
    autocorrelations (M) we read
  • jlt-seq(1,M,by1) creates subindeces j1M
  • lam0.5(1cos(jpi/M)) calculates Tukeys
    weights
  • wlt-seq(0,pi,bypi/50) calculates angular
    frequencies
  • lact(lam)au multiplies each weight by the
    corresponding autocorrelation
  • flt-function(j,w) cos(jw)
  • zlt-outer(j,w,f)
    calculates cos j w for all values of w and j
  • szlt-lacz obtains the
    sum of weights correlations cos jw
  • hlt-(1/(2pi))(12sz) calculates
    h(w)
  • plot(w,h ,type"l",mainEstimated spectral
    density )

18
  • We read the first 30 autocorrelations of the sea
    temperatures
  • ault-c(0.827,0.537,0.241,0.003,-0.155,-0.223,
    -0.196, -0.087,0.075,0.253,,0.379,0.402,0.306,0.12
    4,-0.084,-0.256,-0.380,-0.429,-0.382,-0.247,-0.055
    ,0.141,0.288,0.352,0.289,0.123,-0.076,-0.261,-0.39
    0,-0.444)
  • Then we write
  • estspec(au)
  • to obtain the plot at the right.
  • Note.- R has also a function specpgram that
    estimates the spectrum using a different method.

19
Practicing writing functions
  • 1) a)Use the functions cuartil y maximo. max(),
    min(), median(), y quantile( , ) to write a new
    function that calculates the 5 number summay for
    the data set x and callc the new function
    fivenum )
  • xlt-c(0,2,9,1,8,7,5,5,6,2,5,1,9,4,8)
  • b) To check if the function is correctly written,
    write
  • fivenum(x) the results should be 0.0 2.0 5.0
    7.5 9.0
  • c) There are functions already in R to calculate
    the 5 number summary
  • Quantile (x) and summary(x) , use them with the
    previous data set and compare the results with
    those of your function fivenum, what would you
    add to your function to put labels as the
    function summary?

20
  • Exercise 2.
  • Write a function histolog to calculate the
    natural logarithm of the observations and plot
    the histogram of the transformed data.
  • Exercise 3.
  • Write a function (you decide the name) to
    calculate the number of observations, the mean ,
    median and the absolute value of the difference
    between the mean and the median. Apply the
    function to a data set.
  • Exercise 4.
  • Write a function (you decide the name) to read
    the observations to two variables. The output
    should be the mean of each varaible and the
    correlation between the two. Apply the function
    to the observations to two variables.
Write a Comment
User Comments (0)
About PowerShow.com