The University of Toronto’s Student Newspaper Since 1880

The research to data science pipeline: why you should pursue it and how to start

From academia to the tech world
Free resources to learn data science skills include blogs like Towards Data Science and online courses. BJ FARMER CITOC/CC FLICKR

Several years ago, I reconnected with an acquaintance who was working toward an academic career as an ornithologist, someone who studies birds. Since completing their doctorate, they have also started a parallel career as a programmer.

This got me thinking about the anecdotes I’ve heard about graduates of STEM programs who moved out of research and into the tech world. I was particularly interested in one career path: data science.

An analysis of thousands of résumés stored on the job search website Indeed shows that data scientists are more likely to hold advanced degrees and to come from academia than people in similar tech jobs, like data analysts and software engineers. This suggests that many STEM graduates wind up doing big data work, which is why data science is worth exploring as a potential career option before you hit the job market.

What do data scientists do, exactly?

The terminology can be confusing. A data scientist is distinct from but similar to a data analyst, which can in turn be distinct from a data engineer. The difference lies in the level and purpose of the programming you do.

Some data scientists deal with very large datasets from which useful information can be extracted using automated statistical processes. This is sometimes called ‘data mining.’ However, data scientists may also do predictive work, building models that forecast future outcomes.

It’s a job that requires both rigorous coding knowledge and statistical literacy. Data scientists also spend a lot of time collecting and cleaning data, as well as presenting their findings with visualizations.

Why are STEM graduates gravitating toward data science?

The Indeed study found that data scientists are more likely to transition directly from academia than some other tech workers. Moreover, data scientists come from more varied academic backgrounds. Only about 30 per cent of the résumés analyzed listed a degree in computer science or data science. The rest came from different fields, including about 60 per cent from business, economics, or other STEM programs.

Writing for Nature, Netflix Senior Data Scientist Grace Tang described her transition from having a neuroscience PhD to becoming a data scientist as a natural fit for someone with her statistical and communications skills.

Speaking to Symmetry Magazine, ex-physicist and data scientist Jamie Antonelli said, “The world and your career opportunities are so much broader than what they are inside academia. You have highly valued tech skills, and you can take your favorite part of your job and find someone that will pay you to do just that.” 

How do I get started?

If you’re really set on data science, U of T offers a data science specialist program. But even outside of that program, there are a number of undergraduate courses you can take to learn some of the basic programming and statistical skills of data science.

They include MAT245: Mathematical Methods in Data Science, STA302: Methods of Data Analysis I, STA303: Methods of Data Analysis II, and CSC311: Introduction to Machine Learning. Another such course, CSC321: Introduction to Neural Networks and Machine Learning, is not offered in the current course calendar, but its online course page from a previous session is still available.

There are lots of free resources online for learning how to code in popular languages like R and Python, but there are also specific resources aimed at potential data scientists. 

The blog Towards Data Science maintains a host of tutorials aimed at varying skill levels, including tips on how to nail a job interview. Mathematician Joseph Misiti lists dozens of free books and machine learning frameworks on his GitHub. If you’re looking to find a community to learn from, PyData is an international network of data science enthusiasts with a Toronto chapter.

Wherever you choose to learn, it’s clear that data science is going to appeal to STEM graduates for quite some time.