R Fundamentals
Object Types
- Variables
- Variables are named objects; objects result from the evaluation of an expression
- Names can be assigned with an equals sign (=) or the assignment operator (<-, alt+minus shortcut)
- For example, with x <- 2+2, the expression 2+2 evaluates to a length-one vector holding value 4, to which we have bound the name x
- Many ways to check the “type” of data in an R object
- My preferred method is to assess the structure of an object using str()
- Additional options include mode() / typeof() / class(), among others
- You can check data types or convert between data types
- For example, using is.character() to check or as.character() to convert
- Special “values” in R include:
- TRUE and FALSE are logical values (coerced to 1 and 0, respectively, when used in mathematical operations)
- NA indicates a placeholder for a missing value
- Inf and NaN result from, eg, dividing by zero (1/0 is Inf, 0/0 is NaN)
- NULL indicates a placeholder for a missing vector object
- It is helpful to think that everything in R is either an object or a function (and functions themselves are objects)
- The expression 2+2 is actually a call to the +() function, as in `+`(2,2)
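The points above can be tried directly at the console. This is a minimal sketch rather than a complete tour; the names x, y, and x_chr are chosen only for illustration.

```r
# Assignment: both forms bind a name to the value of an expression
x <- 2 + 2
y = 2 + 2
identical(x, y)            # TRUE

# Inspecting an object
str(x)                     # num 4
typeof(x)                  # "double"
class(x)                   # "numeric"
length(x)                  # 1, ie a length-one vector

# Checking and converting types
is.character(x)            # FALSE
x_chr <- as.character(x)   # "4"
is.character(x_chr)        # TRUE

# Special values
sum(c(TRUE, FALSE, TRUE))  # 2, since TRUE counts as 1 and FALSE as 0
1 / 0                      # Inf
0 / 0                      # NaN
is.na(c(1, NA, 3))         # FALSE TRUE FALSE
length(NULL)               # 0

# Operators are functions too
`+`(2, 2)                  # 4
```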
Vectors
- R automatically performs vector recycling when operating on multiple vectors of different lengths
- This makes a lot of sense when one of the vectors has length-one
- It can lead to nice “tricks” in very special circumstances
- However, for vectors longer than length one, recycling is unusual behavior in general; R only warns when the longer length is not a multiple of the shorter
- Vectors can be subset using square brackets [] by:
- Position number, or a vector of position numbers (indexing starts at 1, not 0 as in Python or C++)
- Logical vectors, often of the same length as the vector to be subset
- Element names, if the elements of the vector are named
- Logical tests are extremely common
- Use double-equals == to test for equality
- Use != to indicate not-equal-to
- R heavily utilizes vectorized functions
- For example, if x=c(4,9,16) then sqrt(x) returns (2,3,4)
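A short sketch of recycling, subsetting, logical tests, and vectorized functions follows. The vectors v and scores (and the element names a, b, c) are made up for illustration.

```r
# Recycling: the shorter vector is repeated to match the longer one
v <- c(1, 2, 3, 4)
v * 10                        # 10 20 30 40
recycled <- v + c(0, 100)     # 1 102 3 104, recycled silently

# Subsetting a named vector three ways
scores <- c(a = 4, b = 9, c = 16)
scores[c(1, 3)]               # by position: elements a and c
scores[c(TRUE, FALSE, TRUE)]  # by logical vector: elements a and c
scores["b"]                   # by name: element b

# Logical tests return logical vectors, which can be used to subset
scores == 9                   # FALSE TRUE FALSE
scores != 9                   # TRUE FALSE TRUE
big <- scores[scores > 5]     # elements b and c

# Vectorized function: applied to every element at once
roots <- sqrt(scores)         # 2 3 4 (names are kept)
```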
Matrices
- Matrices are vectors with a dimension attribute
- Matrix math, ie Linear Algebra, is common in statistical applications
- Standard operators (+, -, *, /) are element-wise
- Matrix multiplication uses the %*% operator
- Transpose via t()
- Invert via solve() or a Choleski decomposition
- Subset a matrix with square brackets []
- Uses form [row index, column index]
- Blank index indicates “all”
- You can “sweep” a function over the rows or columns
- Simple functions are built-in, such as rowSums() and colMeans()
- Alternatively, use apply()
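The matrix operations above, sketched on a small 2x2 example. The matrix A is made up for illustration and is deliberately symmetric and positive definite, which the Choleski route requires; chol2inv(chol(A)) is one way to invert through that decomposition, shown here as an assumption about what the note has in mind.

```r
# matrix() fills column by column, so A has rows (2, 1) and (1, 3)
A <- matrix(c(2, 1, 1, 3), nrow = 2)
dim(A)                          # 2 2

elementwise <- A * A            # rows (4, 1) and (1, 9)
product <- A %*% A              # rows (5, 5) and (5, 10)
t(A)                            # transpose (A is symmetric, so unchanged)

# Two routes to the inverse
A_inv <- solve(A)
A_inv_chol <- chol2inv(chol(A)) # assumes A is symmetric positive definite
A %*% A_inv                     # identity matrix, up to rounding

# Subsetting with [row index, column index]; a blank index means "all"
A[1, ]                          # first row: 2 1
A[, 2]                          # second column: 1 3
A[2, 1]                         # single element: 1

# Sweeping a function over rows or columns
rowSums(A)                      # 3 4
colMeans(A)                     # 1.5 2.0
row_max <- apply(A, 1, max)     # 2 3 (1 = rows, 2 = columns)
```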
Lists
- Lists are generalized vectors
- Each element of a vector holds a value; each element of a list “holds” an R object
- You can nest lists; ie a list element could be another list
- Most algorithmic functions in R return lists, eg, lm(), kmeans(), prcomp()
- List subsetting / extracting elements
- Single brackets [] return a list object
- Double brackets [[]] or dollar sign $ extract a list element
- It is common to apply a function to each object in a list
- lapply() returns a list of results, sapply() returns a (simplified) vector
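A brief sketch of the list ideas above. The lists person and groups hold made-up illustrative data, and the lm() fit uses the cars dataset that ships with R.

```r
# A list can hold objects of different types, including another list
person <- list(name = "Ada",
               scores = c(4, 9, 16),
               address = list(city = "London"))
str(person)

# Single brackets return a list; double brackets or $ extract the element
sub_list <- person["scores"]     # a list of length one
sub_elem <- person[["scores"]]   # the numeric vector itself
person$scores                    # same as person[["scores"]]
person$address$city              # reaching into a nested list: "London"

# Model fits are lists underneath
fit <- lm(dist ~ speed, data = cars)
is.list(fit)                     # TRUE
names(fit)                       # includes "coefficients", "residuals", ...
fit$coefficients

# Applying a function to each element
groups <- list(a = 1:3, b = 4:6)
means_list <- lapply(groups, mean)  # list with a = 2, b = 5
means_vec <- sapply(groups, mean)   # named numeric vector with a = 2, b = 5
```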