Assignment 1

Create a new repository on Bitbucket, called “cinf401-assignment-1”, and add me (joshuaeckroth) as a “reader”. Then create a new project in RStudio with the same name and a new R Markdown file with the same name. See the RStudio workflow for instructions.

Complete the tasks below. Write your answers, intermixed with R code and plots (if/when appropriate), in the single R Markdown file. Commit and push your changes to Bitbucket so I can retrieve them on my own machine. I will regenerate your final report by “knitting” your R Markdown file. See the syllabus for the general grading rubric.

Task 1

Find two projects that purport to require “big data analysis.” We looked at three case studies in the big data notes. Answer these questions about the projects:

Task 2

Part A

Create a vector of numeric values. Include at least seven values, two of which must be NA. Then show how to “normalize” the vector so that its magnitude (its L2-norm, computed while ignoring the NA values) is 1.0. Refer to MathWorld for definitions of normalized vector and L2-norm.
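As a rough illustration of the idea, here is a minimal sketch that assumes “ignoring NA” means skipping NA entries when computing the norm and leaving them as NA in the result. The values and the names `v`, `l2`, and `v_unit` are hypothetical, chosen only for this example.

```r
# Hypothetical example values; two of the seven entries are NA.
v <- c(3, NA, 4, 12, NA, 0, 5)

# L2-norm: square root of the sum of squares, skipping NA entries.
l2 <- sqrt(sum(v^2, na.rm = TRUE))

# Dividing by the norm rescales every non-NA entry; NA entries stay NA.
v_unit <- v / l2

# Check: the magnitude of the result, again skipping NA, should be 1
# (up to floating-point error).
sqrt(sum(v_unit^2, na.rm = TRUE))
```

Without `na.rm = TRUE`, `sum()` would return NA as soon as any entry is NA, so the whole normalized vector would become NA.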

Part B

Create a data frame with columns “A”, “B”, “C”, and “D”, satisfying the following properties:

Insert five rows of data. Print/show the data frame.

Next, set the column names of the data frame to “W”, “X”, “Y”, and “Z”. Print the data frame again.
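One possible way to build and rename such a data frame is sketched below. The column contents are hypothetical and are not meant to reflect the properties this assignment requires; only the mechanics of `data.frame()`, `print()`, and `colnames()` are being shown.

```r
# Hypothetical contents: five rows, four columns named A through D.
df <- data.frame(
  A = 1:5,
  B = c(10.5, 20, 30.25, 40, 50),
  C = c("a", "b", "c", "d", "e"),
  D = c(TRUE, FALSE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)
print(df)

# Replace all four column names at once; order matches the existing columns.
colnames(df) <- c("W", "X", "Y", "Z")
print(df)
```

Assigning to `colnames(df)` changes only the labels; the data in each column is untouched.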

Update column X (previously called B) so that all of its numeric values are multiplied by 2. Then print only rows 2 through 5 (inclusive) of column X.
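The update and the row selection might look like the sketch below. It assumes, for illustration only, a data frame whose column X is entirely numeric; the contents are hypothetical.

```r
# Hypothetical data frame that already uses the W, X, Y, Z column names.
df <- data.frame(
  W = 1:5,
  X = c(10.5, 20, 30.25, 40, 50),
  Y = c("a", "b", "c", "d", "e"),
  Z = c(TRUE, FALSE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Multiply every value in column X by 2, in place.
df$X <- df$X * 2

# Rows 2 through 5 of column X only; selecting one column returns a vector.
x_rows <- df[2:5, "X"]
print(x_rows)
```

Here `df[2:5, "X"]` indexes rows first and columns second; `df$X[2:5]` would give the same values.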

CINF 401 material by Joshua Eckroth is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. Source code for this website is available on GitHub.