To perform this analysis, you will need a CSV file with a unique ID for each of your alumni, their major, and a binary variable (1 or 0) indicating whether or not they are a donor. You can also add any other variables that you want to include.
First, load the ggplot2 library, which we will use for plotting (run install.packages("ggplot2") first if you don’t have this package yet). Also, set the working directory to the folder where this CSV file is located:
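A minimal sketch of this setup step. The folder name is a hypothetical placeholder, not a path from the original post, so swap in your own:

```r
# Install ggplot2 only if it is missing, then load it.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
library(ggplot2)

# Hypothetical folder; replace with the folder that holds your CSV file.
data_dir <- "~/alumni-analysis"

# Only change directory if the folder exists, so the script does not stop with an error.
if (dir.exists(data_dir)) {
  setwd(data_dir)
}
getwd()  # confirm where R will look for files
```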
Then, read the file into R:
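As a sketch, the read step might look like the following. The column names (id, major, donor, affinity_score) and the values are assumptions made up for illustration, and the example writes a tiny stand-in file so that it runs without your real data:

```r
# Write a tiny stand-in CSV so the example runs anywhere (made-up values).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,major,donor,affinity_score",
  "1,Biology,1,42",
  "2,History,0,17",
  "3,Biology,0,8"
), csv_path)

# With your own data, replace csv_path with your file name.
alumni <- read.csv(csv_path, stringsAsFactors = FALSE)

# Check that the columns came in with the types you expect.
str(alumni)
```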
Use the summary function to look for any outliers:
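For example, with a small made-up data frame standing in for the real file (the column names are assumptions), a maximum far above the third quartile is the kind of thing to look for:

```r
# Toy stand-in data; in practice this is the data frame read from your CSV.
alumni <- data.frame(
  id = 1:6,
  major = c("Biology", "History", "Biology", "English", "History", "Biology"),
  donor = c(1, 0, 0, 1, 0, 1),
  affinity_score = c(12, 8, 15, 10, 9, 250)  # 250 is a deliberate outlier
)

# Summary of every column at once.
summary(alumni)

# Summary of the single column we care about for the x-axis.
score_summary <- summary(alumni$affinity_score)
score_summary
```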
I want to use affinity score or engagement score for my x-axis to see the impact of affinity on giving, so I am going to trim my dataset. This keeps outliers from heavily skewing the results and also improves readability.
Next, let’s create a table of all the majors in the dataset to see which ones are most represented, and then pull a top-ten list.
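One way to trim is to keep only the rows at or below a chosen cutoff. The cutoff of 100 and the column name affinity_score are assumptions for illustration; pick a threshold based on what the summary of your own data shows:

```r
# Toy stand-in data (made-up values).
alumni <- data.frame(
  id = 1:6,
  major = c("Biology", "History", "Biology", "English", "History", "Biology"),
  donor = c(1, 0, 0, 1, 0, 1),
  affinity_score = c(12, 8, 15, 10, 9, 250)
)

# Hypothetical cutoff; choose yours from the summary output.
cutoff <- 100

# Keep only rows whose score is at or below the cutoff.
trimmed <- alumni[alumni$affinity_score <= cutoff, ]

# Compare row counts before and after to see how much was removed.
c(before = nrow(alumni), after = nrow(trimmed))
```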
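A sketch of the counting step, using base R's table, sort, and head on made-up data (the major column name is an assumption):

```r
# Toy stand-in data (made-up values).
alumni <- data.frame(
  id = 1:6,
  major = c("Biology", "History", "Biology", "English", "History", "Biology")
)

# Count alumni per major, largest group first.
major_counts <- sort(table(alumni$major), decreasing = TRUE)

# Keep the ten most common majors (fewer if the data has fewer than ten).
top_majors <- head(major_counts, 10)
top_majors
```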
From this table, pull a few majors to compare. I will select three.
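The selection can be sketched with %in%. The three majors named here are hypothetical picks, not the ones used in the original analysis:

```r
# Toy stand-in data (made-up values).
alumni <- data.frame(
  id = 1:6,
  major = c("Biology", "History", "Physics", "English", "History", "Biology"),
  donor = c(1, 0, 0, 1, 0, 1)
)

# Hypothetical picks; substitute majors from your own top-ten table.
selected_majors <- c("Biology", "History", "English")

# Keep only the rows belonging to the selected majors.
alumni_subset <- alumni[alumni$major %in% selected_majors, ]
table(alumni_subset$major)
```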
Then, use a faceted plot to compare giving between majors as well as the relative impact of engagement.
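A possible version of the faceted plot is sketched below. The data are simulated, the column names are assumptions, and the logistic trend line is my own choice of smoother rather than something stated in the original post:

```r
library(ggplot2)

# Simulated stand-in data: three majors, 40 alumni each (made-up values).
set.seed(1)
n <- 120
alumni_subset <- data.frame(
  major = rep(c("Biology", "English", "History"), each = n / 3),
  affinity_score = runif(n, 0, 100)
)
# Simulate donor status so that higher scores make giving more likely.
alumni_subset$donor <- rbinom(n, 1, plogis(-2 + 0.04 * alumni_subset$affinity_score))

p <- ggplot(alumni_subset, aes(x = affinity_score, y = donor)) +
  # Jitter vertically so the 0/1 points do not sit on top of each other.
  geom_jitter(height = 0.05, width = 0, alpha = 0.4) +
  # Logistic fit per panel: estimated probability of giving by score.
  geom_smooth(method = "glm", formula = y ~ x,
              method.args = list(family = "binomial"), se = FALSE) +
  # One panel per major.
  facet_wrap(~ major) +
  labs(x = "Affinity score", y = "Donor (1 = yes, 0 = no)")
print(p)
```

Comparing the height and steepness of the curve across panels gives a rough read on how giving and the effect of engagement differ by major.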