GSoC 2026
Open Genome Informatics

Open Genome Informatics — Project Ideas

Got an idea for GSoC 2026?

Then please post it. You can either

  • Add it here, by directly editing this page. Just copy, paste and update the template below. This requires that you create a fork of this repo and then make a pull request with the changes.

Projects can use a broad set of skills, technologies, and domains, such as GUIs, database integration and algorithms.

Students are also encouraged to propose their own ideas related to our projects. If you have strong computer skills and have an interest in biology or bioinformatics, you should definitely apply! Do not hesitate to propose your own project idea: some of the best applications we see are by students who go this route. As long as it is relevant to one of our projects, we will give it serious consideration. Creativity and self-motivation are great traits for open-source programmers.

*Brief explanation:* MP-BioPath is a computational tool designed to predict the effects of perturbations on biological pathways. Using Reactome’s pathway models, MP-BioPath employs an optimization model. Our objective is to develop pipelines and tools that integrate MP-BioPath results with genomic data.

*Expected results:* We aim to develop tools and pipelines capable of handling diverse genomic datasets. We also anticipate generating novel, biologically significant insights.

*Project Home Page URL:* Reactome, MP-BioPath

*Project paper reference and URL:* “Evaluating the predictive accuracy of curated biological pathways in a public knowledgebase”

*Knowledge prerequisites:* Python, R, Julia

*Skill level:* Medium

*Project Time:* 175 hours (approximately 10 weeks)

*Mentors:* Adam Wright adam.wright@oicr.on.ca

*Brief explanation:* Cancer patients are often left on their own to find clinical trials of cutting-edge therapies. This project seeks to develop an LLM-driven chatbot and interactive map that lets patients describe their situation and find nearby clinical trial sites that they may be eligible for.

*Expected results:* As a result of this project, patients will be able to more effectively discover clinical trials, learn more about them, and contact the study doctors to seek enrollment.

*Project Home Page URL:* There is no project web page at the moment, but you can get an idea of the type of underlying database we will be using at the Cancer Trials Canada website.

*Knowledge prerequisites:* SQL, Python, React (TypeScript), familiarity with Chainlit (LLM) and Mapbox (Geomapping) APIs

*Skill level:* Medium

*Project Time:* 175 hours (approximately 10 weeks)

*Mentors:* Lincoln Stein lincoln.stein@gmail.com, Shraddha Pai spai@oicr.on.ca.
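The "find nearby clinical trial sites" step in the chatbot idea above can be illustrated with a small distance filter. This is only a sketch under stated assumptions: the site records, their coordinates, and the function names are hypothetical, and it uses plain great-circle (haversine) distance rather than the Mapbox APIs the project lists. Eligibility matching is not covered.

```python
import math

# Hypothetical trial-site records; names and coordinates are illustrative only.
SITES = [
    {"name": "Site A (Toronto)", "lat": 43.6532, "lon": -79.3832},
    {"name": "Site B (Ottawa)", "lat": 45.4215, "lon": -75.6972},
    {"name": "Site C (Vancouver)", "lat": 49.2827, "lon": -123.1207},
]

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_sites(lat, lon, sites, max_km):
    """Return (distance_km, site) pairs within max_km, nearest first."""
    hits = [(haversine_km(lat, lon, s["lat"], s["lon"]), s) for s in sites]
    return sorted((h for h in hits if h[0] <= max_km), key=lambda h: h[0])


if __name__ == "__main__":
    # A patient located in Hamilton, Ontario, searching within 500 km.
    for dist, site in nearby_sites(43.2557, -79.8711, SITES, max_km=500):
        print(f"{site['name']}: {dist:.0f} km")
```

In a real application the site list would come from the trials database via SQL, and the chatbot would pass the patient's location and search radius to a function like `nearby_sites`.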

*Brief explanation:* Reactome houses a meticulously curated repository of human biological pathways. This project focuses on building a retrieval-augmented generation (RAG) chat application for intuitive interaction with the Reactome web portal. The aim is for the application to interpret user queries and use a large language model (LLM) to explore pathway structures and generate comprehensive, insightful responses.

*Expected results:* Expected outcomes include the application’s ability to handle a diverse range of user queries and to accommodate an increased number of use cases. The application is also expected to use the LLM’s reasoning capabilities to provide responses tailored to each user’s inquiry.

*Project Home Page URL:* Reactome

*Project paper reference and URL:*

*Knowledge prerequisites:* Python, RAG

*Skill level:* Medium

*Project Time:* 175 hours (approximately 10 weeks)

*Mentors:* Adam Wright adam.wright@oicr.on.ca
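For applicants unfamiliar with RAG, the retrieval step in the idea above can be sketched in a few lines. This is a toy illustration, not the project's implementation: the pathway summaries are made up, and bag-of-words cosine similarity stands in for the embedding model and vector store a real system would likely use. The function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical pathway summaries standing in for Reactome content.
DOCS = {
    "apoptosis": "Apoptosis is programmed cell death triggered by caspase activation.",
    "glycolysis": "Glycolysis converts glucose to pyruvate and produces ATP.",
    "dna_repair": "DNA repair pathways fix damaged DNA, including double strand breaks.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=2):
    """Return up to k document ids with non-zero similarity, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), doc_id)
              for doc_id, text in docs.items()]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:k] if score > 0]


def build_prompt(query, docs, k=2):
    """Assemble the prompt that would be sent to the LLM."""
    context = "\n".join(f"[{d}] {docs[d]}" for d in retrieve(query, docs, k))
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")


if __name__ == "__main__":
    print(build_prompt("How does glucose become pyruvate?", DOCS))
```

The retrieved passages are placed in the prompt so that the LLM answers from curated content rather than from memory alone; the call to the LLM itself is deliberately left out.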

*Brief explanation:* Reactome provides users with various interfaces for computationally accessing its curated biological pathways, including analysis tools and the React-to-Me chat interface. A Reactome MCP (Model Context Protocol) server would make the website more computationally accessible by providing access to the tools through React-to-Me and other LLM-based chat interfaces.

*Expected results:* Expected outcomes include the application’s ability to run Reactome analysis tools through the React-to-Me chat interface. Other features of Reactome, including our REST APIs, should be made accessible to LLMs through the MCP.

*Project Home Page URL:* React-to-Me

*Project paper reference and URL:*

*Knowledge prerequisites:* Python, RAG, MCP

*Skill level:* Medium

*Project Time:* 175 hours (approximately 10 weeks)

*Mentors:* Adam Wright adam.wright@oicr.on.ca
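To give a rough sense of what exposing a tool to an LLM involves, here is a minimal sketch of a tool registry and dispatcher. It is loosely modelled on the JSON-RPC message shapes MCP uses (`tools/list` and `tools/call`), but it is not an MCP server: a real one would be built on an MCP SDK and handle transport, initialization, and error cases properly. The `search_pathways` tool and its canned data are hypothetical and do not call any Reactome API.

```python
import json

# Registry of tools: name -> description, JSON Schema for inputs, callable.
TOOLS = {}


def tool(name, description, input_schema):
    """Decorator that registers a function as a callable tool."""
    def register(func):
        TOOLS[name] = {
            "description": description,
            "inputSchema": input_schema,
            "func": func,
        }
        return func
    return register


@tool(
    name="search_pathways",
    description="Search pathway names (hypothetical stub with canned data).",
    input_schema={
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
)
def search_pathways(query):
    # Canned data; a real tool would query a Reactome service here.
    canned = ["Apoptosis", "Glycolysis", "DNA Repair"]
    return [name for name in canned if query.lower() in name.lower()]


def _error(request, code, message):
    return {"jsonrpc": "2.0", "id": request.get("id"),
            "error": {"code": code, "message": message}}


def handle(request):
    """Dispatch one JSON-RPC style request and return the response dict."""
    method = request.get("method")
    if method == "tools/list":
        result = {"tools": [
            {"name": n, "description": t["description"],
             "inputSchema": t["inputSchema"]}
            for n, t in TOOLS.items()
        ]}
    elif method == "tools/call":
        params = request.get("params", {})
        entry = TOOLS.get(params.get("name"))
        if entry is None:
            return _error(request, -32602, "Unknown tool")
        output = entry["func"](**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": json.dumps(output)}]}
    else:
        return _error(request, -32601, "Method not found")
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": result}


if __name__ == "__main__":
    call = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
            "params": {"name": "search_pathways",
                       "arguments": {"query": "gly"}}}
    print(json.dumps(handle(call), indent=2))
```

The point of the pattern is that each tool carries a machine-readable description and input schema, so a chat interface can discover what is available and decide when to call it.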

*Brief explanation:* This project will develop and validate a real-time artificial intelligence (AI) application that continuously monitors major social media platforms (TikTok, YouTube, Instagram, X, and Reddit) to identify emerging health-related trends involving ear, nose, and throat (ENT) issues among children and adolescents. The system will classify and rank viral behaviors based on engagement metrics and notify pediatric otolaryngologists about potentially harmful trends or misinformation. The goal is to explore how automated social media surveillance can support early awareness and clinical decision-making in pediatric otolaryngology.

*Expected results:* Expected outcomes include:

  • A working prototype of a real-time social media monitoring pipeline.
  • Automated collection of public data via platform APIs.
  • NLP/LLM-based classification and ranking of pediatric ENT-related trends.
  • A reporting dashboard visualizing trends, engagement metrics, and risk flags for clinicians.
  • Evaluation of model performance (precision, recall, accuracy) and preliminary assessment of clinical usefulness with pediatric otolaryngologists.
  • Documentation and open-source code suitable for further research and extension.

*Project Home Page URL:* Host lab webpage; no specific project page yet.

*Project paper reference and URL:* No existing paper yet; this project will contribute to future publications on AI-driven social media surveillance in pediatric otolaryngology.

*Knowledge prerequisites:* Programming languages: Python (for AI/NLP and data pipelines), JavaScript/TypeScript (for frontend). Experience with:

  • REST APIs and social media data extraction
  • NLP and/or LLM integration
  • Basic machine learning workflows
  • Full-stack development (backend services + frontend dashboards)

*Skill level:* Advanced

*Project Time:* 350 hours (approximately 12 weeks)

*Mentors:* Melanie Courtot, OICR and UoT, mcourtot@oicr.on.ca; Jochen Weile, OICR, jweile@oicr.on.ca
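Two pieces of the social media monitoring idea above, ranking posts by engagement and evaluating a classifier with precision, recall, and accuracy, can be sketched as follows. Everything here is an assumption for illustration: the `Post` fields, the scoring formula, and the weights are hypothetical, and a real pipeline would take its data from platform APIs and its labels from clinicians.

```python
from dataclasses import dataclass


@dataclass
class Post:
    post_id: str
    views: int
    likes: int
    comments: int
    shares: int


# Hypothetical weights; real values would need tuning and validation.
WEIGHTS = {"likes": 1.0, "comments": 2.0, "shares": 3.0}


def engagement_score(post):
    """Weighted interactions per view (0.0 when a post has no views)."""
    if post.views == 0:
        return 0.0
    weighted = (WEIGHTS["likes"] * post.likes
                + WEIGHTS["comments"] * post.comments
                + WEIGHTS["shares"] * post.shares)
    return weighted / post.views


def rank_posts(posts):
    """Return posts ordered from highest to lowest engagement score."""
    return sorted(posts, key=engagement_score, reverse=True)


def precision_recall_accuracy(predicted, actual):
    """Compute metrics from parallel lists of booleans (True = ENT-related)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy


if __name__ == "__main__":
    posts = [
        Post("a", views=1000, likes=100, comments=10, shares=5),
        Post("b", views=200, likes=50, comments=5, shares=10),
        Post("c", views=0, likes=0, comments=0, shares=0),
    ]
    print([p.post_id for p in rank_posts(posts)])
    print(precision_recall_accuracy([True, True, False, False],
                                    [True, False, True, False]))
```

Normalizing by views is one possible design choice: it favours posts that provoke a strong reaction relative to their reach, whereas ranking on raw counts would favour posts that are already widely seen.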

*Brief explanation:* Brief description of the idea, including any relevant links, etc.

*Expected results:* Describe the outcome of the project idea.

*Project Home Page URL:* If there is one.

*Project paper reference and URL:* Is there a paper about the project this effort will be a part of?

*Knowledge prerequisites:* Programming language(s) to be used, plus any other particular computer science skills needed.

*Skill level:* Basic, Medium or Advanced.

*Project Time:* 90, 175 or 350 hours. Projects are a standard 10 weeks long and no longer than 12 weeks.

*Mentors:* Name + contact details of the lead mentor, name + contact details of 1 or 2 backup mentors.
