One of the challenges of teaching applied data science courses is managing individual students’ local computing environments. This is especially difficult in massive open online courses (MOOCs), where students come from across the globe and vary in socioeconomic status and access to computational resources. Providing these students with a fully hosted, cloud-based programming environment helps ensure equity of access to data science education and reduces the technical burden for students and instructors alike. In developing the Coursera Clinical Data Science Specialization (a series of six online courses that teach students practical tools for working with electronic health record (EHR) data), we needed a solution that could scale to support the computational needs of hundreds to thousands of learners. We deployed a Google Compute Engine virtual machine that can grow with the course: additional hard drive space, CPU, and RAM capacity can be added with just a few clicks. Students access course data through Google BigQuery and perform all computation on an RStudio Server instance hosted on the Google virtual machine. In addition to supporting individual students’ education, logging all programming activities completed by the learners lets us better understand how students are applying the educational content presented in the course.


David Mayer, University of Colorado School of Medicine
Seth Russell, University of Colorado School of Medicine
Chan Voong, University of Colorado School of Medicine
Michael Kahn, University of Colorado School of Medicine
Laura Wiley (Presenter), University of Colorado School of Medicine

Presentation Materials:

None yet.