Title: R objects
1R objects
- All R entities exist as objects
- They can all be operated on as data
- We will cover
- Vectors
- Factors
- Lists
- Data frames
- Tables
- Indexing
- R packages and datasets
2Vectors
- Think of a vector as being equivalent to a single column of numbers in a spreadsheet
- You can create a vector using the c( ) function (concatenate) as follows
- x <- c( )
- For example
- x <- c(1,2,4,8) creates a column of the numbers 1, 2, 4, 8
3Vectors
- Other ways of creating columns of numbers (vectors)
- The seq function
- seq(1,10,1) gives 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
- seq(1,4,0.5) gives 1, 1.5, 2, 2.5, 3, 3.5, 4
- x:y
- 1:10 gives 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
- 2 * 1:10 gives 2, 4, 6, 8, 10, 12, 14, 16, 18, 20
- The rep function
- rep(2,4) gives 2, 2, 2, 2
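The four ways of building a vector listed on these slides can be tried together in one short script. This is a minimal sketch; the variable names `a`, `b`, `d` and `e` are only illustrative.

```r
# c() concatenates its arguments into one vector
a <- c(1, 2, 4, 8)

# seq(from, to, by) builds a regular sequence
b <- seq(1, 4, 0.5)      # 1.0 1.5 2.0 2.5 3.0 3.5 4.0

# from:to is shorthand for a sequence in steps of 1;
# the colon binds tighter than *, so this is 2 * (1:10)
d <- 2 * 1:10

# rep(value, times) repeats a value
e <- rep(2, 4)           # 2 2 2 2

print(a)
print(b)
print(d)
print(e)
```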
4Indexing
Referencing (indexing) specific cells in a column
Example: if x is the vector 1, 2, 5 then
x[1] = 1, x[2] = 2, x[3] = 5
x[1:2] = 1, 2 (first two listed items in x)
x[2:3] = 2, 5 (2nd and 3rd listed items in x)
x[x>2] = 5 (use of > and < characters)
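The indexing forms on this slide can be checked directly at the prompt. The sketch below uses the same vector; the helper name `keep` is an assumption added to show that a comparison produces a logical vector, which is what does the selecting.

```r
x <- c(1, 2, 5)

print(x[1])        # first element
print(x[1:2])      # first two elements
print(x[2:3])      # second and third elements
print(x[x > 2])    # only the elements greater than 2

# The comparison itself returns TRUE/FALSE for each entry
keep <- x > 2
print(keep)
print(x[keep])     # same result as x[x > 2]
```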
5Performing simple operations on vectors
- In R, when you carry out simple operations (+ - * /) on vectors that have the same number of entries, R just performs the normal operations on the numbers in the vector, entry by entry
- If the vectors don't have the same number of entries, then R will cycle through the vector with the smaller number of entries
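A short sketch of both cases, with made-up vectors (`a`, `b` and `short` are illustrative names, not from the slides):

```r
a <- c(1, 2, 3, 4)
b <- c(10, 20, 30, 40)
short <- c(1, 2)

# Same length: the operation is applied entry by entry
same_len <- a + b        # 11 22 33 44

# Different lengths: the shorter vector is reused as 1 2 1 2
recycled <- a * short    # 1 4 3 8

print(same_len)
print(recycled)
```

If the longer length is not a multiple of the shorter one, R still recycles but issues a warning.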
6Performing simple operations on vectors
Example
7Performing simple operations on vectors
Examples
8Performing simple operations on vectors
Example
9Performing simple operations on vectors
Vectors (columns of numbers) can be assigned by
putting together other vectors, for example
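As an illustration of putting vectors together, c( ) accepts existing vectors as well as single values. The names `first`, `second` and `combined` are assumptions for this sketch.

```r
first  <- c(1, 2, 3)
second <- c(10, 20)

# c() flattens its arguments into one longer vector
combined <- c(first, second, 99)

print(combined)           # 1 2 3 10 20 99
print(length(combined))   # 6
```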
10Functions
- R functions take arguments (information that you put into the function which goes between the brackets) and can perform a range of tasks
- In the case of the help function the task is to display information from the R documentation files
- A comprehensive list of R functions can be obtained from the R reference manual under the help menu
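To make the idea of arguments concrete, the sketch below calls seq( ) twice, once with arguments given by position and once by name. The names `s1` and `s2` are illustrative.

```r
# Arguments can be matched by position...
s1 <- seq(1, 4, 0.5)

# ...or by name, in which case the order does not matter
s2 <- seq(by = 0.5, to = 4, from = 1)

print(identical(s1, s2))

# args() shows which arguments a function accepts
print(args(sd))

# help(seq) or ?seq opens the documentation page for a function
```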
11Simple statistic functions
R comes with some useful functions
sqrt( ) square root
mean( ) arithmetic mean
hist( ) calculating and plotting histograms
R also comes with pre-loaded datasets, which we'll discuss later.
12Basic statistic functions on vectors
> X1 <- c(1.1, 4.3, 5, 2, 1, 4, 9.5)
> sum(X1)        # sum
[1] 26.9
> mean(X1)       # mean
[1] 3.842857
> median(X1)     # median
[1] 4
> var(X1)        # variance
[1] 8.762857
> sd(X1)         # standard deviation
[1] 2.960212
> summary(X1)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   1.550   4.000   3.843   4.650   9.500
> quantile(X1)
   0%   25%   50%   75%  100%
 1.00  1.55  4.00  4.65  9.50
13Mixing vectors and scalars
- R has the very convenient feature of having operators that work with vectors
- It is even possible to mix vectors and scalars
- For example
> X1 <- c(1.1, 4.3, 5, 2, 1, 4, 9.5)
> X1 + 1
[1]  2.1  5.3  6.0  3.0  2.0  5.0 10.5
> X1 * 2
[1]  2.2  8.6 10.0  4.0  2.0  8.0 19.0
14Vectors to record data
> x <- c(45,43,46,48,51,46,50,47,46,45)
> length(x)
[1] 10
> x <- c(x,48,49,51,50,49)      # append values to x
> length(x)
[1] 15
> x[16] <- 41                   # add to a specified index
> length(x)
[1] 16
> mean(x)
[1] 47.1875
> x[17:20] <- c(40,38,35,40)    # add to many specified indices
> length(x)
[1] 20
> mean(x)
[1] 45.4
15Factors
- A factor is a vector that encodes information about the group to which a particular observation belongs
- Categorical data is often used to classify data into various levels or factors
- Making a factor is easy, using the factor( ) function
16Factors smoking survey example
A survey asks people if they smoke or not. The data is Yes, No, No, Yes, Yes
> x <- c("Yes","No","No","Yes","Yes")
> x                # print out values in x
[1] "Yes" "No"  "No"  "Yes" "Yes"
> factor(x)        # print out value in factor(x)
[1] Yes No  No  Yes Yes
Levels: No Yes
Notice that the levels are printed.
Notice the difference in how R treats factors with this example
17Factors student height example
Suppose the recorded heights of South African and British students are as follows
heights <- c(1.7,1.95,1.63,1.54,1.29)
You make a new vector, fac_heights, to record the nationality that each observation pertains to
fac_heights <- factor(c("GB", "SA", "GB", "GB", "SA"))
Useful when testing for differences between groups
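One way to see why the factor is useful for group comparisons is to summarise the heights within each level. The sketch below uses tapply( ) for this, which is an assumption about how one might proceed, not something the slides prescribe; `group_means` is an illustrative name.

```r
heights <- c(1.7, 1.95, 1.63, 1.54, 1.29)
fac_heights <- factor(c("GB", "SA", "GB", "GB", "SA"))

# Apply mean() to the heights within each level of the factor
group_means <- tapply(heights, fac_heights, mean)
print(group_means)

# Number of observations in each group
print(table(fac_heights))
```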
18Factors gender survey example
Consider a survey that has data on 691 females and 692 males
> gender <- c(rep("female",691), rep("male",692))    # create vector
> gender <- factor(gender)                           # change vector to factor
- Once stored as a factor, the space required for storage is reduced
- Values female and male are the levels of the factor
- > levels(gender)    # assumes gender is a factor
- [1] "female" "male"
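The storage point can be explored by looking at what a factor actually holds: one integer code per observation plus the set of levels. This is a hedged sketch; `gender_chr` and `gender_fac` are illustrative names, and the sizes reported by object.size( ) depend on the R version and platform, so no particular numbers are assumed here.

```r
gender_chr <- c(rep("female", 691), rep("male", 692))
gender_fac <- factor(gender_chr)

# Internally the factor is a vector of integer codes...
print(head(as.integer(gender_fac)))
# ...that point into the levels
print(levels(gender_fac))

# Reported memory use of the two representations
print(object.size(gender_chr))
print(object.size(gender_fac))
```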
19Lists
A set of objects (e.g. vectors) can be combined under a single name as a list (similar to a spreadsheet in Excel)
Example
x <- c(1, 7, 8, 9, 10)
y <- c("red", "yellow", "blue", "green")
example_list <- list(size = x, colour = y)
Note: vectors can consist of characters (i.e. letters/words) instead of numbers, but never numbers AND characters; if you mix them, R converts the numbers to character strings
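A brief sketch of how the list from this slide can be used once it exists, and of what happens when numbers and characters are mixed in one vector. The name `mixed` is an assumption added for illustration.

```r
x <- c(1, 7, 8, 9, 10)
y <- c("red", "yellow", "blue", "green")
example_list <- list(size = x, colour = y)

# Each element keeps its own type and length
print(example_list$size)          # access by name with $
print(example_list[["colour"]])   # access by name with [[ ]]
print(example_list[[1]][2])       # second entry of the first element

# Mixing numbers and characters in c() gives a character vector
mixed <- c(1, "red")
print(class(mixed))
```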
20Data frames
- The function data.frame( )
- This is a special kind of list, in which the entries in a specific position in the elements of the list correspond to one another
- Each element of the list has the same length
- It is a rectangular table, with rows and columns
21Data frames
- Example 1
- Simple data frames can be created
- Enter the following information at the prompt line
- h <- c(150, 170, 168, 179, 130)
- w <- c(65, 70, 72, 80, 51)
- patient_data <- data.frame(weight = w, height = h)
- Type in patient_data to see what's just been created
22Access of elements in data frames
- Individual elements can be accessed using a pair of square brackets and by specifying their index, or name
- Here are some ways to access a cell, row or column
- patient_data$height accesses a column
- patient_data[ , i] accesses the ith column
- patient_data[i, ] accesses the ith row
- patient_data$height[i] accesses the ith cell in the height column
- patient_data[i, j] accesses the cell in the ith row and jth column
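The access forms above can be tried on the patient data from Example 1. A minimal sketch, rebuilding the data frame so that it runs on its own:

```r
h <- c(150, 170, 168, 179, 130)
w <- c(65, 70, 72, 80, 51)
patient_data <- data.frame(weight = w, height = h)

print(patient_data$height)       # the height column as a vector
print(patient_data[, 2])         # the same column, by position
print(patient_data[3, ])         # third row, returned as a one-row data frame
print(patient_data$height[3])    # third cell of the height column
print(patient_data[3, 2])        # row 3, column 2: the same cell
```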
23Data frames
- More complex tables can be created
- Data within each column must have the same type
(e.g., number, text), but different columns may
have different types like a spreadsheet, as in
the example
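As a sketch of a data frame whose columns have different types, the values below are hypothetical and chosen only for illustration; the name `patients` and its columns are assumptions.

```r
# Hypothetical values, for illustration only
patients <- data.frame(
  name   = c("A", "B", "C"),
  height = c(150, 170, 168),
  smoker = c(TRUE, FALSE, FALSE)
)

# str() shows the type of each column
str(patients)

print(class(patients$height))   # "numeric"
print(class(patients$smoker))   # "logical"
```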
24Data frames
Accessing specific cells, or data
Note "" is a shortcut; the minus sign "-" means "not" (the given index is excluded).
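The minus-sign rule, together with selecting a range of rows using the colon operator, can be sketched on the patient data from the earlier slide. The data frame is rebuilt here so the example is self-contained.

```r
h <- c(150, 170, 168, 179, 130)
w <- c(65, 70, 72, 80, 51)
patient_data <- data.frame(weight = w, height = h)

print(patient_data[2:4, ])     # rows 2, 3 and 4
print(patient_data[-1, ])      # every row except the first
print(patient_data[, -2])      # every column except the second (here, a plain vector)
print(h[-c(1, 5)])             # a vector without its first and last entries
```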
25Tables
- We often view categorical data with tables
- The table function allows us to look at tables
- Its simplest usage is table(x) where x is a
categorical variable
26Tables
Example: smoking survey
A survey asks people if they smoke or not. The data is Yes, No, No, Yes, Yes
> x <- c("Yes","No","No","Yes","Yes")
> table(x)
x
 No Yes
  2   3
The table command simply adds up the frequency of each unique value of the data
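The result of table( ) can be stored and queried like any other object. The sketch below is one way to do that; `counts` is an illustrative name, and prop.table( ) is added here as an assumption about a likely next step.

```r
x <- c("Yes", "No", "No", "Yes", "Yes")
counts <- table(x)

print(names(counts))         # the distinct values found in x
print(counts[["Yes"]])       # the count for one value
print(prop.table(counts))    # proportions instead of counts
```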
27R packages and datasets
- View a list of installed R packages: library()
- Access datasets with the data function
- data( ) provides a list of all the datasets
- data(Titanic) loads the Titanic dataset
- summary(Titanic) provides summary information about the Titanic dataset
- attributes(Titanic) provides more information
- Titanic (the dataset name on its own) will display the data
- List all datasets in a package, e.g., data(package = "datasets")
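A short sketch of inspecting the Titanic dataset once it is loaded. The use of apply( ) to collapse the table is an assumption about how one might explore it, not part of the slides.

```r
data(Titanic)                      # from the built-in datasets package

print(class(Titanic))              # the data is stored as a table
print(dim(Titanic))                # one entry per dimension
print(names(dimnames(Titanic)))    # what each dimension represents
print(sum(Titanic))                # total number of people counted

# Collapse over Sex and Age to get survival counts by Class
print(apply(Titanic, c(1, 4), sum))
```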
28Working through some examples
- List preloaded datasets in R: data( )
- Display the women dataset: women
- Now let's access specific data
- Access data from each column
- women$height or women[ ,1]
- women$weight or women[ ,2]
- Access data from individual rows
- women[1, ] or women[10, ], etc.
- Try it.
29Working through some examples
- Now that you can access sample data, let's work with it
- Get the mean weight and height of the women in our example
- Remember the help function: help(mean)
- Also, R can show an example: example(mean)
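One possible solution to the exercise, as a sketch: take the mean of each column separately, or use colMeans( ) to get both at once.

```r
data(women)

print(mean(women$height))    # mean of the height column
print(mean(women$weight))    # mean of the weight column

# Both column means in a single call
print(colMeans(women))
```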
30Common useful functions
print() prints a single R object
cat() prints multiple objects, one after the other
length() number of elements in a vector, or of a list
mean()
median()
range()
unique() gives the vector of distinct values
sort() sort elements into order
order() x[order(x)] orders elements of x
rev() reverse the order of vector elements
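Several of these functions are easiest to understand side by side on one small vector. The values in `x` below are made up for illustration.

```r
x <- c(3, 1, 2, 3, 1)

print(unique(x))       # 3 1 2
print(sort(x))         # 1 1 2 3 3
print(order(x))        # 2 5 3 1 4: positions of the entries in ascending order
print(x[order(x)])     # same result as sort(x)
print(rev(x))          # 1 3 2 1 3
print(range(x))        # 1 3

# cat() prints several objects one after the other
cat("length:", length(x), "mean:", mean(x), "\n")
```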