For this assignment you will practice creating, examining, and combining different data structures in R. Unlike the other assignments, this one takes a worksheet format with built-in error checking. Each time you complete an answer and knit the document, your answer should be checked automatically. Do not remove “NULL” from an answer until you are ready to have it checked, or you will get an error! For this reason, you should re-knit every time you answer a question, so that if something goes wrong, you know the last edit caused the issue.
You will need to do the following:
#); remove and replace the comments with your code:
# DO NOT EDIT”; these are used to test your code.
>) in designated locations to answer questions.
Example for Code Questions:
# Ex 1) Create a vector of the numbers 1, 5, 3, 2, 4.
example <- NULL
# You would write this:
example <- c(1, 5, 3, 2, 4)

# Ex 2) Add the number 10 to the sixth position of example2
example2 <- example
example2 <- NULL
# You would write:
example2[6] <- 10
Example for Text Questions: 1) “How do atomic vectors differ from lists?”
An atomic vector is a one-dimensional object in which every element has the same data type. Lists are also one-dimensional, but their elements can have different data types.
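To make the contrast in this sample answer concrete, here is a minimal sketch using made-up values. The object names `atomic_vec` and `mixed_list` are hypothetical and not part of the assignment. It shows that `c()` coerces mixed inputs to a single type, while `list()` keeps each element's own type.

```r
# Hypothetical toy objects, not part of the graded worksheet.
atomic_vec <- c(1, "a", TRUE)      # c() coerces everything to one type
mixed_list <- list(1, "a", TRUE)   # list() keeps each element's own type

class(atomic_vec)          # "character"
sapply(mixed_list, class)  # "numeric" "character" "logical"
```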
# 1) Use "seq()" to create a vector of 10 numbers evenly spaced from 0 to 15.
vec_num <- NULL

# 2) Use ":" to create an integer vector of the numbers 11 through 20.
vec_int <- NULL

# 3) Use "LETTERS" and "[ ]" to create a vector of the last 10 capital letters.
vec_cha <- NULL

# 4) Use letters and "[ ]" to create a factor variable using the first ten lower case letters.
vec_fac <- NULL

# 5) Use "c()" to combine "vec_cha" and "vec_fac" into "vec_let". Do not convert it to a factor!
vec_let <- NULL

# 6) Use "c()" and "[ ]" to combine the first 4 elements of "vec_num" with the last
#    4 elements of "vec_int" to create "vec_ni".
vec_ni <- NULL

# 7) Use "rev()" to reverse the order of "vec_fac".
fac_vec <- NULL
How’d you do?
No code entered for vec_num yet.
No code entered for vec_int yet.
No code entered for vec_cha yet.
No code entered for vec_fac yet.
No code entered for vec_let yet.
No code entered for vec_ni yet.
No code entered for fac_vec yet.
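The worksheet's actual checking code is not shown here. As a rough illustration of why leaving `NULL` in place is safe until you are ready, a check of this kind could look like the sketch below. The function `check_answer()` and the "correct" and "not correct" messages are hypothetical names invented for this example.

```r
# Hypothetical helper; the real worksheet checker may work differently.
check_answer <- function(answer, name, expected) {
  # An untouched placeholder is reported, not treated as a wrong answer.
  if (is.null(answer)) {
    return(paste("No code entered for", name, "yet."))
  }
  if (isTRUE(all.equal(answer, expected))) {
    paste(name, "is correct!")
  } else {
    paste(name, "is not correct yet.")
  }
}

toy <- NULL
check_answer(toy, "toy", expected = 1:3)  # "No code entered for toy yet."
toy <- 1:3
check_answer(toy, "toy", expected = 1:3)  # "toy is correct!"
```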
vec_fac, what class of vector would you get? Why?
new_vec <- c(TRUE, FALSE, TRUE, TRUE)
# 1) Use matrix() to create a matrix with 10 rows and four columns filled with NA
mat_empty <- NULL

# 2) Assign "vec_num" to the first column of "mat_1" below.
mat_1 <- mat_empty # DO NOT EDIT THIS LINE; add code below it.
mat_1 <- NULL

# 3) Assign "vec_int" to the second column of "mat_2" below
mat_2 <- mat_1 # DO NOT EDIT THIS LINE; add code below it.
mat_2 <- NULL

# 4) Assign "vec_cha" and "vec_fac" to the third and fourth columns of "mat_3" using one assignment operator.
mat_3 <- mat_2 # DO NOT EDIT THIS LINE; add code below it.
mat_3 <- NULL

# 5) Select the fourth row from "mat_3" and assign it to the object "row_4" as a vector.
row_4 <- NULL

# 6) Assign the element in the 6th row and 2nd column of "mat_3" to "val_6_2" as a numeric value.
val_6_2 <- NULL

# 7) Use "cbind()" to combine "vec_num", "vec_int", "vec_cha", and "vec_fac" into "mat_4".
mat_4 <- NULL

# 8) Next, first transpose mat_4, then select only the first four columns and assign to mat_t
mat_t <- NULL

# 9) Then use rbind() to add the rows from mat_3 to this (mat_t first, mat_3 second) and assign this combination to mat_big.
mat_big <- NULL
How’d you do?
No code entered for mat_empty yet.
No code entered for mat_1 yet.
No code entered for mat_2 yet.
No code entered for mat_3 yet.
No code entered for row_4 yet.
No code entered for val_6_2 yet.
No code entered for mat_4 yet.
No code entered for mat_t yet.
No code entered for mat_big yet.
mat_4? What about colnames()? What about rownames()? Can you guess why you get all these results?
mat_letters <- matrix(letters, ncol=2)
"m" in the first column and "z" in the second. What would be an easy way to make the matrix go in alphabetical order left to right, top to bottom?
math_mat <- matrix(1:5, nrow=5, ncol=5)
math_vec <- 1:5
math_vec. When you add math_mat + math_vec, what happens?
math_mat %*% math_vec and from math_mat * math_vec. Can you tell what is happening?
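As a smaller illustration of the operators in these questions, the sketch below uses a hypothetical 2 x 2 matrix `toy_mat` and vector `toy_vec` (neither is part of the assignment). It contrasts element-wise arithmetic, where the vector is recycled down each column, with the matrix product `%*%`.

```r
# Hypothetical toy objects, not part of the graded worksheet.
toy_mat <- matrix(1:4, nrow = 2)   # columns are (1, 2) and (3, 4)
toy_vec <- 1:2

toy_mat + toy_vec    # element-wise sum; toy_vec is recycled down each column
toy_mat * toy_vec    # element-wise product, same recycling
toy_mat %*% toy_vec  # matrix product: a 2 x 1 matrix
```

Comparing the printed shapes is a quick way to tell the operations apart: the element-wise results keep the dimensions of `toy_mat`, while the matrix product has one column.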
# 1) Use "list()" to create a list that contains "vec_num" and "row_4", and assign the names
#    "vec_num" and "row_4" to these two elements of "list_1".
list_1 <- NULL

# 2) Using "$", extract "row_4" from "list_1" and assign it to the object "row_4_2".
row_4_2 <- NULL

# 3) Create another list that contains "val_6_2" and "mat_big".
list_2 <- NULL

# 4) Combine list_1 and list_2 together using "c()" and assign them to "list_3"
list_3 <- NULL

# 5) Use "unlist()" to turn "list_3" into a vector and assign it to "vector_3"
vector_3 <- NULL

# 6) Use "as.list()" to convert "vector_3" into a list and assign it to "list_big"
list_big <- NULL

# 7) Now copy "list_3" as "list_4" and use "[[ ]]" to assign "list_3" as the last (fifth) element of "list_4";
#    that is, you should have a list object with five elements named "list_4" that contains the same four
#    elements as "list_3" plus a fifth element that is -all- four elements of "list_3" as one object.
list_4 <- NULL

# 8) Select the third element (that is, the sub-element) of the fifth element of "list_4" and assign it
#    to element_5_3 using "[[ ]]".
element_5_3 <- NULL

# 9) Lastly, repeat the previous assignment of the third element of the fifth element, but
#    extract the element as a list rather than scalar using "[ ]" and assign it to "list_5_3".
list_5_3 <- NULL
How’d you do?
No code entered for list_1 yet.
No code entered for row_4_2 yet.
No code entered for list_2 yet.
No code entered for list_3 yet.
No code entered for vector_3 yet.
No code entered for list_big yet.
No code entered for list_4 yet.
No code entered for element_5_3 yet.
No code entered for list_5_3 yet.
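The list questions above rely on the difference between "[[ ]]" and "[ ]". The sketch below shows that difference on a made-up nested list. The names `toy_list`, `inner_value`, and `inner_list` are hypothetical and not part of the assignment.

```r
# Hypothetical toy list, not part of the graded worksheet.
toy_list <- list(nums = 1:3, word = "hello", nested = list(a = 10, b = "x"))

inner_value <- toy_list[["nested"]][["a"]]  # the element itself (a number)
inner_list  <- toy_list[["nested"]]["a"]    # a list of length 1 holding it

class(inner_value)  # "numeric"
class(inner_list)   # "list"
```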
Many functions in R produce lists as output because they produce objects with different types of data and of different lengths. For instance, consider the linear regression saved to lm.output below. Don’t worry if you are not familiar with regression; we’re just concerned with what the function produces!
lm.output <- lm(mpg ~ wt, data=mtcars)
lm.output
##
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
##
## Coefficients:
## (Intercept)           wt
##      37.285       -5.344
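To see what "produces a list" means in practice, the sketch below fits a different model on the built-in `cars` data and inspects the result. The object name `toy_fit` is hypothetical; the same inspection functions work on any fitted `lm` object.

```r
# Hypothetical example on a different built-in data set.
toy_fit <- lm(dist ~ speed, data = cars)

is.list(toy_fit)               # TRUE: the fitted model is stored as a list
names(toy_fit)                 # component names, including "coefficients"
toy_fit$coefficients           # named numeric vector: (Intercept) and speed
toy_fit$coefficients["speed"]  # a single coefficient, selected by name
```

Calling `str()` on the fitted object gives a fuller, if longer, view of every component.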
wt? Remember to call on them from the lm.output object for your answer!
# 1) Use "data.frame()" to combine "vec_num" (first column) and "vec_int" (second column) into "df_1".
df_1 <- NULL

# 2) Use "$" to extract "vec_num" from "df_1", reverse it with "rev()", and assign it as the vector "vec_num_2".
vec_num_2 <- NULL

# 3) Use "$" to add "vec_num_2" to "df_2" as a new column with the name "number_vector".
df_2 <- df_1 # DO NOT EDIT THIS LINE; add code below it.
df_2 <- NULL

# 4) Combine "df_2" with itself using "rbind()" to create "df_3"
df_3 <- NULL

# 5) Create a new data frame "df_4" using "data.frame()" that contains the following named columns (in order):
#    "y" that contains 20 numbers evenly spaced from 31 to 125
#    "x" that has 20 numbers between 0 and 10 generated using "runif()" (get help with ?runif)
#    "color" that consists of 20 values sampled from "col_vec" below using "sample()".
col_vec <- colors() # DO NOT EDIT THIS LINE; add code below it.
df_4 <- NULL

# This code here should produce a plot of those values with those colors!
if(is.null(df_4)==FALSE) df_4 %>% ggplot(aes(x, y, color=color)) + geom_point() + theme(legend.position="none")

# 5) Use "cbind()" to combine "df_4" and "df_2" into "df_5".
df_5 <- NULL

# 6) Now use "data.frame()" to combine "df_4" with "df_2"
df_6 <- NULL
How’d you do?
No code entered for df_1 yet.
No code entered for vec_num_2 yet.
No code entered for df_2 yet.
No code entered for df_3 yet.
No code entered for df_4 yet.
No code entered for df_5 yet.
No code entered for df_6 yet.
vec_cha to df_1 to make a new data frame and then use lapply(df_cha, class) to get a list that indicates the class of each column in df_1 (you’ll need to run this yourself in the console or turn eval=TRUE):
df_cha <- data.frame(df_1, vec_cha)
lapply(df_cha, class)
Do you see anything unusual about df_cha$vec_cha, say if we get class(vec_cha)? Get help on data frames with ?data.frame and explain in a sentence what you could do to modify this behavior.
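What you see for this question depends on your R version: before R 4.0.0, data.frame() converted character columns to factors by default, while from 4.0.0 onward it keeps them as character. The sketch below sets the "stringsAsFactors" argument explicitly so the result is the same either way. The objects `toy_chr`, `df_factor`, and `df_char` are hypothetical and not part of the assignment.

```r
# Hypothetical toy objects, not part of the graded worksheet.
toy_chr <- c("a", "b", "c")

df_factor <- data.frame(id = 1:3, toy_chr, stringsAsFactors = TRUE)
df_char   <- data.frame(id = 1:3, toy_chr, stringsAsFactors = FALSE)

class(df_factor$toy_chr)  # "factor"
class(df_char$toy_chr)    # "character"
```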
rownames() on df_1. How does this compare to the behavior of these functions on lists and matrices?
Similarly, how do the results of dim() differ between data frames, lists, matrices, and vectors?
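One way to explore this question is to build one small object of each kind and call dim() on it. The sketch below does that with made-up objects whose names are hypothetical; try the same calls on your own worksheet objects before writing your answer.

```r
# Hypothetical toy objects, not part of the graded worksheet.
toy_vec  <- 1:6
toy_list <- list(1, "a")
toy_mat  <- matrix(1:6, nrow = 2)
toy_df   <- data.frame(x = 1:3, y = c("a", "b", "c"))

dim(toy_vec)    # NULL: plain vectors carry no dim attribute
dim(toy_list)   # NULL: neither do plain lists
dim(toy_mat)    # 2 3 (rows, columns)
dim(toy_df)     # 3 2 (rows, columns)
length(toy_df)  # 2: a data frame's length is its number of columns
```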
Using the console, use == to test to see if data.frame() in questions 5 and 6 produce the same result. Optional: If you want to make this cleaner, integrate either any() functions, which could allow you to show this result in-line in your answer.
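Comparing two data frames with == returns a logical matrix with one entry per cell, which is hard to read in-line. The sketch below shows how a summary function collapses that matrix to a single TRUE or FALSE. Using all() alongside any() is an assumption about what the question intends, and the data frames `toy_a`, `toy_b`, and `toy_c` are hypothetical.

```r
# Hypothetical toy data frames, not part of the graded worksheet.
toy_a <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))
toy_b <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))
toy_c <- data.frame(x = 1:3, y = c(2.5, 3.5, 9))

toy_a == toy_b       # element-wise logical matrix
all(toy_a == toy_b)  # TRUE: every cell matches
all(toy_a == toy_c)  # FALSE: at least one cell differs
any(toy_a != toy_c)  # TRUE: another way to detect a difference
```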
As a final question, which won’t be graded strictly, use between three sentences and a paragraph to describe the differences and similarities between (this is mainly to help you think them through!):