In a data-driven world, it’s increasingly important for students to know how to generate, analyze and leverage information for problem-solving, no matter what career lies ahead. With a goal of better preparing students for their work at Temple and beyond, CST launched a new course last year, SCTC 1013: Elements of Data Science for the Physical and Life Sciences, open to all CST students and a requirement for all biology students, natural science majors, TUteach majors, and students within all four environmental science concentrations.
The course focuses on basic computer programming in Python and statistical inference through hands-on data projects in biology, ecology, environmental science, genomics, chemistry and physics. Last fall, more than 300 students took the course across five sections, taught by four instructors with 10 undergraduate course assistants helping guide students in their learning.
Elements of Data Science was inspired by Data 8, a class at the University of California, Berkeley, that is a requirement for all students across schools and majors. At Temple, the course was also developed as a web-based interactive textbook with live coding (using Jupyter Notebook) that students can access through laptops, iPads and other devices.
“We retooled the course, really looking at developing problem solving, critical thinking and even algorithmic thinking,” says Jonathan Smith, director of Data Science First Year and an associate professor of instruction in the Department of Chemistry. “I came to this as a chemist, and one of the biggest challenges students face is reading a detailed problem, whether on a test or in a problem set, and translating it into what they actually have to do to get to an answer. That’s often a big inhibitor.”
The course gives students not only the data science tools they may need to solve problems but also a methodology for applying those tools, a learning process supported by the course assistants. Learning how to think through steps and work systematically is a skill set that applies to all CST majors, Smith says.
“One of the beauties of coding is that it's unforgiving,” Smith says. “You can come up with a solution to a problem in a few different ways, but the coding process lets you know if you've done something wrong nearly immediately. That's an important part of the mindset that students need to develop. We want to introduce students to working with data in a robust way—first, how do you bring in a data set, but then how do you think about it statistically using probability, even simulation. Then ultimately toward the end of the course, we give them an introduction to machine learning and artificial intelligence.”
Smith says more than 80 percent of CST students report little to no coding experience before taking the course. “One thing I love about the class is that regardless of experience or skill level with math, students can be successful,” says Susan Jansen Varnum, professor of chemistry and senior associate dean for undergraduate affairs, science education, and community engagement. “The class builds quantitative skills, analytical reasoning, and logic, as a core foundation.”
Students start with the basics of coding, looking at variables and data types, and then work up from smaller problems to larger data sets, practicing analysis and visualization. Each week, students meet for hands-on lab sessions, which steadily become more challenging over the course of 14 weeks, and the transformation instructors see over that time can be dramatic.
“By the end of the class, students are running a pretty sophisticated analysis of data sets, developing the code to analyze and visualize that data. And they're presenting group projects on novel data sets that they have an interest in, whether from their potential major or otherwise,” Smith says.
Perhaps most of all, students come away with a firm grasp on the value of data science and how to apply it.
“This course covers a lot of ground; it mostly aims to give students a glimpse into what data science can offer them,” says course assistant Liam Mackay, a mathematics and computer science major with a minor in data science. “Data is everywhere. Every company works with data; every industry collects, interprets and stores data in some way.”
In a research-heavy field like biology, Mackay explains, it's even more apparent. “Bio labs will measure results and perform analysis on the data, and research labs need to perform analysis that is so computationally extensive that they require special high-performance computers,” he says. “In the last lab of this class, we do a demonstration of molecular simulations using RDKit, a Python library designed to make these kinds of simulations easy and accessible.”
Varnum, who views data science as a critical tool in confronting nearly every global challenge we face, says she sees its utility even in places where one might not expect it. “My colleague from the College of Liberal Arts uses data analytics to understand literature better. We already know how data science is moving marketing. We need to encourage students across the entire university to learn about data science.”
It will also be important for faculty to get up to speed on the discipline. Indeed, Smith has offered the class to faculty and will continue training new faculty as they arrive.
For first-year student Noah Peles, the class helped him with coding, problem solving and active reading skills. Peles says that by the end of the course he was thinking in new ways, so much so that it may influence his course of study at Temple. “I am now considering choosing data science as a major.”