Last week we held our first-ever Software Carpentry bootcamp for the R statistical programming language. Rather than teach to a room of novice learners, we pitched this bootcamp to a much smaller crowd of potential helpers and instructors. We did this for two reasons:
To provide our newest instructor, Scott Ritchie, a chance to get feedback on his teaching.
While the Python teaching materials are quite mature, this is not the case for the R materials. Although there have been roughly 30 R bootcamps taught in the past year, the materials for these have come from each individual instructor, rather than from a unified set.
Currently there are two sets of materials in the Software Carpentry lessons repository:
the novice R materials, which are the culmination of several months of hard work by a few members of the SWC R community (particularly John Blischak) to translate the Python materials to R,
We chose to work from the intermediate materials for two reasons:
We felt that the material is more representative of R code you would encounter and write in the wild.
Most researchers we’ve talked to at the University of Melbourne who want to learn more R already use it for their research, and have to use it because of some specialised package useful to their research question. This means they are able to cover slightly more difficult material than those attending a bootcamp without any prior programming experience.
Many on the R-discuss mailing list have also recently expressed they would not be using the novice materials, for reasons similar to #1 above, and there are plans for a conference call in the next few weeks to discuss the path forward for creating appropriate novice materials.
We spent a full day going through the intermediate materials. The lessons covered can be found here on the bootcamp site, and feedback on the material and teaching can be found on the corresponding etherpad. In summary, we found the material became too technical too quickly, lacked research context, and lacked sufficiently spaced challenges.
At the end of the day, we sat down and brainstormed ways to transform this material into something suitable for novices, but also representative of real-world R code. We came up with a list of challenges to re-mold the intermediate materials around, splitting them across two afternoons (we find that splitting bootcamps across afternoons helps attendees retain knowledge and avoids scheduling conflicts).
Afternoon 1: Understanding Data Types
Confusion over data types is one of the biggest struggles for novice R users, so we still want to provide a strong foundation by creating lessons around them.
The goal of afternoon 1 is to teach attendees how to read in various types of data, understand what those data structures mean, extract a useful subset of the data, and visualise it.
Lesson 1: Read in some small research datasets for each data type (matrix, data.frame, list) so that the instructor can explain the different data types.
Challenge 1: (a) Create vectors of various types, (b) combine them into a matrix, data.frame, and list. What happened? Is it what you expected?
Lesson 2: Seeking help and collaborating. Teach attendees how to save/write objects.
Challenge 2: Save the object you created for challenge 1, and share it with the person next to you. Load in their data.
Lesson 3: Subsetting data
Challenge 3: current exercises for intermediate materials “R Basics”
Lesson / Challenge 4: Plotting data
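As a rough sketch of what Challenges 1 to 3 might look like in practice (the vectors, values, and object names below are made up for illustration and are not taken from the bootcamp materials):

```r
# Challenge 1 (a): create vectors of various types (hypothetical example data).
ids    <- c(1L, 2L, 3L)        # integer vector
weight <- c(5.2, 6.1, 4.8)     # double (numeric) vector
label  <- c("a", "b", "c")     # character vector

# Challenge 1 (b): combine them three ways and see what happens.
# A matrix holds a single type, so every value is coerced to character.
m <- cbind(ids, weight, label)
# A data.frame keeps one type per column.
df <- data.frame(ids, weight, label, stringsAsFactors = FALSE)
# A list lets each element keep its own type.
l <- list(ids = ids, weight = weight, label = label)

typeof(m)           # "character"
sapply(df, class)   # one class per column
sapply(l, class)    # one class per list element

# Challenge 2: save an object to a file, then load it back in.
# tempfile() is used here only so the example is self-contained.
path <- tempfile(fileext = ".rds")
saveRDS(df, path)
df2 <- readRDS(path)

# Challenge 3: subset rows by a condition and columns by name.
heavy <- df[df$weight > 5, c("ids", "weight")]
```

Using `saveRDS()`/`readRDS()` is one option among several; `save()`/`load()` or `write.csv()`/`read.csv()` would serve the same exercise, depending on what the lesson wants to emphasise.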
Afternoon 2: Wrangling and Exploring Data
The goal of the second afternoon is to dive deeper into data exploration, introducing the split-apply-combine (a.k.a. MapReduce) pattern of problem solving, which includes writing your own functions.
Lesson 5: The apply lesson from the intermediate materials, with the split-apply-combine image from the alternate lesson.
Challenge 5: Applying some basic function to groups within some loaded data.
Lesson 6: Writing your own functions (and using them inside apply).
Challenge 6: Write a function to do x, and apply it to the groups as before.
Challenge 8: “Package Speed dating”: instructors + helpers break off into small groups to show off useful/cool packages related to their area of expertise.
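To make the split-apply-combine idea behind Challenges 5 and 6 concrete, here is a minimal sketch. It assumes the built-in `iris` dataset as a stand-in for "some loaded data", and the `spread` function is a hypothetical example of "a function to do x":

```r
# iris ships with R (datasets package); it stands in for the loaded research data.

# Split: break one column into groups, one per species.
groups <- split(iris$Sepal.Length, iris$Species)

# Apply + combine: run a basic function on each group;
# sapply() collects the results into a named vector.
group_means <- sapply(groups, mean)

# Challenge 6: a function of our own (hypothetical, for illustration).
# It returns the width of the range of a numeric vector.
spread <- function(x) {
  max(x) - min(x)
}

# Apply the custom function to the same groups as before.
group_spread <- sapply(groups, spread)

# tapply() does the split, apply, and combine steps in a single call.
group_spread_tapply <- tapply(iris$Sepal.Length, iris$Species, spread)
```

Doing the split explicitly first, then showing `tapply()`, keeps the three steps visible before collapsing them into one call.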
Another thought we had was creating an R cheatsheet, featuring common “gotchas” and protips, pointing out useful functions, and default arguments to common functions that may trip you up (e.g. stringsAsFactors).
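As an example of the kind of entry such a cheatsheet could contain, the sketch below illustrates the stringsAsFactors gotcha. The data are made up, and the argument is set explicitly because its default depends on the R version (TRUE before R 4.0.0, FALSE from 4.0.0 onwards):

```r
# The same character data, read in with and without conversion to factors.
as_factor <- data.frame(id = c("10", "20", "30"), stringsAsFactors = TRUE)
as_text   <- data.frame(id = c("10", "20", "30"), stringsAsFactors = FALSE)

class(as_factor$id)   # "factor"
class(as_text$id)     # "character"

# The gotcha: converting a factor straight to numeric returns the
# internal level codes, not the values you see printed.
codes <- as.numeric(as_factor$id)                  # 1 2 3

# Converting via character first recovers the actual values.
values <- as.numeric(as.character(as_factor$id))   # 10 20 30
```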
We will be sharing our experience and thoughts with the R Software Carpentry community, and actively contributing to the next iteration of R novice materials. We plan to run a larger R bootcamp in late November / early December with the updated materials. Watch this space for dates!