As many U of T students were wrapping up classes in March, first-year engineering student Hannah Le and her team won the third Pioneer Tournament — a worldwide competition that rewards participants for developing innovative ideas — for their project that used machine learning to identify and understand human biomarkers that predispose individuals to certain diseases.

Competition participants submit their projects online and post weekly progress updates. Each project then earns points awarded by contestants, who vote on the updates. After three weeks, a project becomes eligible to win a weekly prize, which goes to the team that has earned the most points by the end of that week. A project that places as a finalist for three weeks wins its team a larger award.

Le and her teammates Samarth Athreya, 16, and Ayaan Esmail, 14, earned a top spot on the leaderboard in March and were awarded $7,000 by Pioneer to put toward their project.

How the team got together

“Samarth, Ayaan and I met each other at an organization called The Knowledge Society in 2017,” wrote Le to The Varsity. The Knowledge Society is a startup incubator that exposes high school students to emerging technologies, such as artificial intelligence (AI), virtual reality, and brain-computer interfaces.

When the three innovators met, Esmail was working on a project that could accurately pinpoint and target cancer cells, while Athreya was working with machine learning models. With Le’s interest in genetics, the three decided to team up and investigate whether metabolic data could be used to predict the onset of a disease.

“I became incredibly curious on how we can decode the 3 billion letters [of DNA] in every cell of our body to increase human lifespan and healthspan,” wrote Le.

“Inspired by my grandmother who passed away due to cancer, I started asking myself the question: [could] there possibly be a way for us to predict the onset of cancer before it happens, instead of curing it?”

How Le’s team developed a model for predicting the risk of cancer development

At its core, the team’s AI platform uses a patient’s biological information to predict their risk of developing certain forms of cancer.

Metabolites are molecules that play a key role in maintaining cellular function, and some studies have shown that high levels of certain metabolites can signal the progression of lung cancer. To develop and test their model, however, the team needed a large amount of metabolic data, and that requirement proved to be a limitation.

“To overcome such [a] limitation, we had the fortune to reach out to mentors such as the Head of Innovation at JLABS, [a Johnson & Johnson incubator], for further guidance and advice,” wrote Le. “As our team cultivates a stronger database, we would be able to produce more reliable results.”

“As teenagers we were far from experts [in] the field but we were really hungry to learn,” added Le.

As participants in the Pioneer Tournament, Le and her team were able to select a board of virtual advisors to provide guidance for their project.

“I recalled contacting Josh Tobin at OpenAI to ask him about the use of synthetic data in genomics research,” wrote Le. “[That enabled] us to understand both the strengths and weaknesses of such [an] approach, allowing us to pivot on what models to implement.”

The competition as a learning experience

Le remembers the Pioneer Tournament as an exciting chance to learn about different machine learning models and what made them effective, as well as about the projects that fellow participants were working on, all while attending courses at U of T.

“First year was an interesting journey of challenging course content, intertwined with unexpected personal growth,” wrote Le. “I learned how to strike a balance between working on personal projects, meeting interesting people, while completing my school work.”

And while Le is intrigued by the intersection of machine learning and genomics, she wrote, “I hope to keep an open mind and continue to be curious about the world around me.”