# Assignment 2

Write all of your code in an RMarkdown file. Show every step of your process.

Grab the first table of Foreign Terrorist Organizations (the one titled “Designated Foreign Terrorist Organizations”) entirely from within R; do not copy-paste the content or download it ahead of time. Consider using the readHTMLTable function in the XML package. Note that readHTMLTable cannot open https: connections, so you’ll need to do some googling to work around this.
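One common workaround, sketched here under assumptions, is to fetch the page as text with a package that does speak https and then hand that text to the XML package. The helper name `fetch_first_table` and the choice of httr are illustrative assumptions, not a required approach; the toy HTML string stands in for a real page so the parsing step can be tried offline.

```r
library(XML)  # htmlParse(), readHTMLTable()

# Hypothetical helper (not called below): download the page over https with
# httr, then parse the returned text locally instead of letting
# readHTMLTable() open the connection itself.
fetch_first_table <- function(url) {
  response  <- httr::GET(url)
  html_text <- httr::content(response, as = "text", encoding = "UTF-8")
  doc       <- htmlParse(html_text, asText = TRUE)
  readHTMLTable(doc, stringsAsFactors = FALSE)[[1]]
}

# Offline demonstration on a made-up page (toy values, not real data).
toy_html <- "<html><body><table>
<tr><th>Date</th><th>Name</th></tr>
<tr><td>1/1/2001</td><td>Group B</td></tr>
<tr><td>2/2/2002</td><td>Group A</td></tr>
</table></body></html>"

toy_doc     <- htmlParse(toy_html, asText = TRUE)
toy_tables  <- readHTMLTable(toy_doc, stringsAsFactors = FALSE)
first_table <- toy_tables[[1]]  # readHTMLTable() returns a list of tables
str(first_table)
```

Passing `stringsAsFactors = FALSE` keeps the text columns as character vectors, which tends to make later sorting and date handling simpler.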

Produce a data frame sorted by organization name, i.e., the following:

Produce another data frame that shows the number of terrorist organizations that began in each year. Use the same column names shown below, and ensure it is ordered by year.

Note: For full credit, you must not manually modify any of the data. Use only R functions/features to manipulate the data. You should never type “2007”, or “al-Nusrah Front”, for example. You are allowed to rewrite the column names using colnames(). And as stated above, for full credit you must download the HTML page from R itself, and not save any intermediate CSV files.
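The sorting and counting steps can be sketched on toy data as follows. The column names (`Date`, `Name`, `Year`, `Count`), the `%m/%d/%Y` date format, and the values are all assumptions made for illustration; the real table may differ.

```r
# Toy stand-in for the scraped table (made-up values).
orgs <- data.frame(
  Date = c("10/8/1997", "5/15/2012", "10/8/1997"),
  Name = c("Group C", "Group A", "Group B"),
  stringsAsFactors = FALSE
)

# Sort rows by organization name.
sorted_orgs <- orgs[order(orgs$Name), ]

# Derive the year from the date string rather than typing any year by hand.
orgs$Year <- as.integer(format(as.Date(orgs$Date, format = "%m/%d/%Y"), "%Y"))

# Count organizations per year, rename the columns, and order by year.
year_counts <- as.data.frame(table(orgs$Year), stringsAsFactors = FALSE)
colnames(year_counts) <- c("Year", "Count")
year_counts$Year <- as.integer(year_counts$Year)
year_counts <- year_counts[order(year_counts$Year), ]

print(sorted_orgs)
print(year_counts)
```

Here `table()` does the counting and `colnames()` does the renaming, so no data value is entered manually.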

Go to the IPEDS site and follow the directions in the data sources cookbook, in the IPEDS section, to grab CSV datasets with the following qualities:

- Final release data
- US Institutions (should be about 7500 of them)
- Variables: Graduation rate with Bachelor degree within 4 years, total (all students), all years

You’ll get a CSV file with several columns (institution fields, plus a column for each year). Cast/melt/merge/munge until you have a data frame that looks like this, with the exact column names shown (the rows may appear in a different order, that’s no problem):
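As a rough sketch of the wide-to-long reshaping this step asks for, the example below melts a toy data frame with one column per year. The column names (`UnitID`, `Institution`, `Rate2010`, `Year`, `GradRate`) and all values are hypothetical; the actual IPEDS file and the required output names will differ.

```r
library(reshape2)  # melt()

# Toy stand-in for the downloaded CSV (made-up values).
wide <- data.frame(
  UnitID      = c(1, 2),
  Institution = c("College A", "College B"),
  Rate2010    = c(40, 55),
  Rate2011    = c(42, NA),
  stringsAsFactors = FALSE
)

# One row per institution-year instead of one column per year.
long <- melt(wide,
             id.vars       = c("UnitID", "Institution"),
             variable.name = "Year",
             value.name    = "GradRate")

# The melted Year column holds the old column names; keep only the digits.
long$Year <- as.integer(sub("^Rate", "", as.character(long$Year)))

# Drop institution-years with no reported rate.
long <- long[!is.na(long$GradRate), ]

summary(long)
```

Calling `summary()` at the end gives a quick check that the `Year` column spans every year present in the original columns.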

Running summary on the data frame will summarize each column (mean, max, min, etc.). Do this so we can be sure you’ve combined all the data (see how all the years are represented):

Now, make a data frame with the average 4-year graduation rate, across all years, for each school:

Next show Stetson’s individual year rates:

Finally, show Stetson’s average rate. It should be 54.27 :(
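The per-school average and the single-school lookups can be sketched with base R on toy data. The school names, column names, and rates below are made up for illustration and are not IPEDS values.

```r
# Toy long-format data (hypothetical names and values).
long <- data.frame(
  Institution = c("College A", "College A", "College B", "College B"),
  Year        = c(2010L, 2011L, 2010L, 2011L),
  GradRate    = c(40, 44, 55, 57),
  stringsAsFactors = FALSE
)

# Average rate across all years, one row per school.
avg_by_school <- aggregate(GradRate ~ Institution, data = long, FUN = mean)

# Individual year rates for one school, matched by name.
is_target  <- grepl("College A", long$Institution, fixed = TRUE)
one_school <- long[is_target, c("Year", "GradRate")]

# That school's average, pulled from the summary data frame.
one_avg <- avg_by_school[
  grepl("College A", avg_by_school$Institution, fixed = TRUE), ]

print(avg_by_school)
print(one_school)
print(one_avg)
```

Using `grepl()` rather than an exact match is a design choice for names that may carry extra text such as a campus suffix; an exact `==` comparison works when the name is known precisely.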

Find your own two data sets, from different sources, that may help you answer this question, and merge them with merge. Then create a summary data frame (using melt/dcast and/or aggregate). This summary should lead to an answer to your question.
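A minimal sketch of the merge-then-summarize pattern, assuming two toy data sets that share a hypothetical key column `State`. All names and values are invented for illustration; your own data sets will have different keys and measures.

```r
# First toy source: one row per school (made-up values).
schools <- data.frame(
  School = c("S1", "S2", "S3", "S4"),
  State  = c("AA", "AA", "BB", "CC"),
  Rate   = c(50, 60, 40, 70),
  stringsAsFactors = FALSE
)

# Second toy source: one row per state (made-up values).
states <- data.frame(
  State  = c("AA", "BB", "CC"),
  Region = c("R1", "R1", "R2"),
  stringsAsFactors = FALSE
)

# Join the two sources on their shared key column.
merged <- merge(schools, states, by = "State")

# Summary data frame: mean rate per region.
by_region <- aggregate(Rate ~ Region, data = merged, FUN = mean)

print(merged)
print(by_region)
```

By default `merge()` keeps only keys present in both data frames; the `all`, `all.x`, and `all.y` arguments change that if unmatched rows should be retained.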