Introduction
Building on my previous R programming post, I'll continue with the core of R programming: data collections.
Data Collections
Frequently, your program will require that you store multiple data items together. This might be because you have a group of data that should be referenced together, or even to reduce the number of variables you have to define. Regardless of why, there are four data collections that you can utilize in R: Vectors, Matrices, Lists, and DataFrames.
Vectors
The most basic object in R is known as a vector, which contains objects of the same class. Let's try creating vectors of different classes. We can create a vector using c():
a <- c(1.8, 4.5) # numeric
b <- c(1 + 2i, 3 - 6i) # complex
d <- c(23L, 44L) # integer (the L suffix is required; c(23, 44) would be numeric)
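To check which class a vector actually ended up with, you can call class() on it. The snippet below is a small self-contained sketch; the names nums, cplx, ints, and mixed are made up for illustration and are not part of the original examples.

```r
nums <- c(1.8, 4.5)         # numeric (double)
cplx <- c(1 + 2i, 3 - 6i)   # complex
ints <- c(23L, 44L)         # integer: the L suffix requests integer storage

class(nums)   # "numeric"
class(cplx)   # "complex"
class(ints)   # "integer"

# Without the L suffix, whole numbers are still stored as numeric
class(c(23, 44))   # "numeric"

# Mixing classes in one vector coerces every element to a common class
mixed <- c(1, "a", TRUE)
class(mixed)  # "character"
```

The last example is why a vector can only hold one class: R silently converts the elements rather than raising an error.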
Challenge
Using the variable vec1, create a vector with 5 numerical values.
vec1 <- c(1,2,3,4,5)
print(vec1)
[1] 1 2 3 4 5
Matrices
When a vector is given rows and columns (the dimension attribute), it becomes a matrix. Like a vector, a matrix consists of elements of the same class, such as the following:
my_matrix <- matrix(1:6, nrow=3, ncol=2)
print(my_matrix)
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
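The link between vectors and matrices can be seen directly by assigning a dimension attribute to a plain vector. This is a minimal sketch, with v and m as illustrative names that do not appear in the original post.

```r
v <- 1:6
is.matrix(v)        # FALSE: just a vector so far

dim(v) <- c(3, 2)   # give it 3 rows and 2 columns
is.matrix(v)        # TRUE

# matrix() fills column by column by default; byrow = TRUE fills row by row
m <- matrix(1:6, nrow = 3, ncol = 2, byrow = TRUE)
m[2, 1]             # 3
dim(m)              # 3 2
```

Note that the matrix printed above was filled column by column, which is why 1, 2, 3 run down the first column.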
Challenge
Create two vectors with the values 1 to 5 and 10.5 to 12.5, respectively. Then concatenate these two vectors into one vector, named vec1. What is its class? Call the class() function on vec1 and assign the result to the variable class1.
Change the 4th element of the above vector to the word 'four' and assign the result to the vector vec2. Did this change the class? Call the class() function on vec2 and assign the result to the variable class2.
Using the rep() function, create a vector that repeats the values 1 2 3 twice. Assign this vector to the variable vec3. (Result: 1 2 3 1 2 3)
Create a 3 by 4 matrix where each row has the same value. Assign this matrix to the variable matrix1. (hint: use the rep() function)
Create a 4 by 3 matrix where each row has the same value. Assign this matrix to the variable matrix2. (hint: use the rep() function)
Lists
Lists are present in R, as well as in most other programming languages. A list is a data structure that can hold any number of other data structures of any type. For example, if you have a vector, a dataframe, and a character object, you can put all of them into one list object.
Constructing a List
To begin constructing a list, we'll create three variables with different data types. Since lists support mixed types, we'll use these to add to a list.
vec <- 1:4
num <- 17
char <- "Hello!"
Then you can add all three objects to one list using the list() function:
list1 <- list(vec, num, char)
print(list1)
[[1]]
[1] 1 2 3 4
[[2]]
[1] 17
[[3]]
[1] "Hello!"
You can also turn an object into a list by using the as.list() function. When it is applied to a vector, every element of the vector becomes a separate component of the list.
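As a quick illustration of as.list(), the sketch below converts the same kind of vector used above. The name vec_list is hypothetical and only used here.

```r
vec <- 1:4
vec_list <- as.list(vec)

length(vec_list)    # 4: one component per element of the vector
class(vec_list)     # "list"
vec_list[[2]]       # 2
```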
Manipulating a List
We can put names on the components of a list using the names() function, which is useful for extracting components. We could have also named the components when we created the list.
names(list1) <- c("Numbers", "Some.data", "Letters")
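Naming the components at creation time looks like the sketch below. It rebuilds the three variables so that it runs on its own; named_list is an illustrative name, not one used elsewhere in this post.

```r
vec <- 1:4
num <- 17
char <- "Hello!"

# name = value pairs inside list() set the component names directly
named_list <- list(Numbers = vec, Some.data = num, Letters = char)

names(named_list)    # "Numbers" "Some.data" "Letters"
named_list$Letters   # "Hello!"
```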
Extracting Components
The first way you can extract an object from the list is by using the [[ ]] operator.
list1[[3]]
'Hello!'
It's also possible to extract components using the component’s name, as shown below:
list1$Letters
'Hello!'
Subsetting a List
If you want to take a subset of a list, you can use the [ ] operator and c() to choose the components:
list1[c(1, 3)]
$Numbers
1 2 3 4
$Letters
'Hello!'
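The difference between [ ] and [[ ]] is easy to miss: single brackets always return a list, while double brackets return the component itself. A small sketch, assuming a list shaped like the one built above (it is recreated here so the snippet runs alone; sub and comp are illustrative names):

```r
list1 <- list(Numbers = 1:4, Some.data = 17, Letters = "Hello!")

sub <- list1["Numbers"]      # single brackets: a list of length 1
comp <- list1[["Numbers"]]   # double brackets: the integer vector itself

class(sub)    # "list"
class(comp)   # "integer"
sum(comp)     # 10
```

This is why arithmetic works on list1[["Numbers"]] but fails on list1["Numbers"].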
We can also add a new component to the list or replace a component using the $ or [[ ]] operators, such as the following two examples:
list1$newthing <- lm(mpg ~ wt, data = mtcars) # a fitted model, using R's built-in mtcars dataset
list1[[5]] <- "new component"
Finally, we can delete a component of a list by setting it equal to NULL:
list1$Letters <- NULL
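To see how adding, replacing, and deleting affect a list, here is a short sketch on a throwaway list. The name items and its contents are assumptions made for the example.

```r
items <- list(Numbers = 1:4, Letters = "Hello!")
length(items)                    # 2

items$extra <- "new component"   # add a component by name
length(items)                    # 3

items[["Numbers"]] <- 10:12      # replace an existing component
items$Letters <- NULL            # delete a component

names(items)                     # "Numbers" "extra"
```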
Describing Lists
Now we'll go over ways in which we can extract list properties.
Class
We can check the class of the list itself, as well as the class of one of its components:
class(list1)
'list'
class(list1[[1]])
'integer'
Size
You can find the size of a list, meaning its number of components, with the length() function, like in the following:
length(list1)
5
Converting
Finally, we can convert a list into a matrix, dataframe, or vector in a number of different ways. The first, most basic way is to use unlist(), which just turns the whole list into one long vector:
unlist(list1)
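As a rough sketch of these conversions, the example below uses a small list of two equal-length vectors. The list scores and the result names are made up for illustration; as.data.frame() and do.call(cbind, ...) are just two of several possible ways to convert a list.

```r
scores <- list(a = 1:3, b = c(2.5, 3.5, 4.5))

# unlist() flattens everything into one vector and builds names from the components
flat <- unlist(scores)
names(flat)    # "a1" "a2" "a3" "b1" "b2" "b3"
class(flat)    # "numeric": the integers are coerced to match the doubles

# as.data.frame() turns each component into a column
as_df <- as.data.frame(scores)
dim(as_df)     # 3 2

# do.call(cbind, ...) binds the components together as matrix columns
as_mat <- do.call(cbind, scores)
dim(as_mat)    # 3 2
```

The data frame and matrix conversions shown here rely on the components having the same length.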
Challenge
Create a new vector that performs the operation 2x^2 for x from 0 to 6. Assign this vector to the variable f.
Create a new vector that contains the value 0 repeated 5 times. Assign this vector to the variable r.
Create a list with the vectors f and r, as well as the element 'hello'. Assign this list to the variable list1.
DataFrame
DataFrames are used to store tabular data. A DataFrame is similar to a matrix in that it has rows and columns, but it differs in that its elements do not all have to be the same class. Under the hood, a DataFrame is a list of equal-length vectors, one per column, and each of those vectors can have a different class. This means that a DataFrame as a whole behaves like a list of its columns.
df <- data.frame(name = c("ash","jane","paul","mark"), score = c(67,56,87,91))
print(df)
  name score
1  ash    67
2 jane    56
3 paul    87
4 mark    91
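To see that each column keeps its own class, you can inspect the columns individually. The sketch below recreates the same data so that it runs on its own; stringsAsFactors = FALSE is added here as an assumption so the name column stays a character vector on R versions older than 4.0 as well.

```r
df <- data.frame(name = c("ash", "jane", "paul", "mark"),
                 score = c(67, 56, 87, 91),
                 stringsAsFactors = FALSE)

sapply(df, class)            # name: "character", score: "numeric"
is.list(df)                  # TRUE: a data frame is a list of columns

df$score                     # 67 56 87 91
mean(df$score)               # 75.25

# Rows can be filtered with a logical condition on a column
df[df$score > 80, "name"]    # "paul" "mark"
```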
DataFrame objects are incredibly useful when working with tabular or relational data, such as the contents of a CSV file. You'll see just how useful they become soon enough!
Challenge
Using the variable df1, create a 3x3 dataframe using three lists.
To summarize:
Structure | Multidimensional | Multiple Types |
---|---|---|
Vector | Not Capable | Not Capable |
Matrix | Capable | Not Capable |
List | Not Capable | Capable |
DataFrame | Capable | Capable |
Final Words
If you liked any of this material, feel free to check out the GitHub here and stay tuned for more posts by me! If you have any solutions or questions about the challenge questions, drop a comment and I'll get back to you.