Project Ideas
If you are a project maintainer and consider mentoring during the GSoC 2025 cycle, please, submit your ideas below using the template.
Google summer of code timeline.
You can find the project ideas from previous year here.
NOTE: Please note that GSoC is a program known for its strict deadlines. In addition to responding to your mentee on time, you will be required to submit evaluations on time. Failures to meet the deadlines might affect CNCF's future participation in GSoC.
Template
#### CNCF Project Name
##### Project Title
- Description:
- Expected Outcome:
- Recommended Skills:
- Expected project size: # one of small (~90 hour projects), medium (~175 hour projects) and large (~350 hour projects)
- Mentor(s): #For GSoC, it is **required** to have at least 2 mentors with 1 being a primary mentor.
- Jane Doe (@jane-github, jane@email.address) - primary
- John Doe (@john-github, john@email.address)
- Upstream Issue (URL):
Ideas
<!-- TOC -->- etcd
- Harbor
- Jaeger
- KCL
- Konveyor
- KubeArmor
- KubeBuilder
- KubeStellar
- Kubewarden
- Lima
- LitmusChaos
- Meshery
- Open Cluster Management
- ORAS
- The Update Framework (TUF)
- Vitess
- WasmEdge
etcd
etcd cache
- Description: Develop a generic, high-performance caching library for etcd, inspired by the Kubernetes watch cache, to facilitate building scalable and efficient etcd-based applications.
- Expected Outcome:
- A well-tested and performant library providing core caching primitives similar to Kubernetes' watch cache, significantly reducing etcd load and latency for generic etcd use cases.
- The library will offer feature parity with Kubernetes watch cache, including support for:
- Caching watch events and demultiplexing requests.
- Caching non-consistent list requests using a B-tree structure, updated via watch events.
- Handling requests during cache initialization and re-initialization.
- Custom encoders/decoders for data serialization.
- Custom indexing for optimized lookups.
- Consistent reads.
- Exact stale reads via B-tree snapshots.
- Comprehensive documentation, examples, benchmarks, and metrics to enable easy adoption and monitoring. This includes e2e and robustness tests.
- Recommended Skills: Go, Distributes Systems
- Expected project size: small
- Mentor(s):
- Marek Siarkowicz (@serathius, siarkowicz@google.com) - primary
- Madhav Jivrajani (@MadhavJivrajani, madhav.jiv@gmail.com)
- Upstream Issue (URL): https://github.com/etcd-io/etcd/issues/19371
Harbor
Enhance Harbor Satellite for Artifact Replication from Remote Registry to Edge
- Description: The Harbor Satellite project aims to enable decentralized artifact replication in edge computing environments. This project currently focuses on Use Case #1, where Harbor Satellite will pull images from a central Harbor registry and store them in a local OCI-compliant registry for use by edge devices. The solution is designed for environments with limited or intermittent internet connectivity, ensuring continuous access to required artifacts by local edge devices even when connectivity is unavailable.
- Expected Outcome:
- Enhance Harbor Satellite to enable reliable artifact replication from a central Harbor registry to a local OCI-compliant registry at the edge.
- Implement secure synchronization between central and local registries, especially in air-gapped environments.
- Optimize configuration management for edge container runtimes to pull images from the local registry.
- Provide clear documentation and setup guides for deploying Harbor Satellite in edge environments.
- Recommended Skills: Go, OCI-Registries, Distribution-spec
- Expected project size: medium (~175 hour projects)
- Mentor(s):
- Vadim Bauer (@vad1mo, vb@container-registry.com) - primary
- Orlin Vasilev (@OrlinVasilev, orlin@orlix.org)
- Prasanth Baskar (@bupd, bupdprasanth@gmail.com)
- Upstream Issue (URL): https://github.com/goharbor/harbor/issues/21605
Jaeger
Service performance analysis on top of Elasticsearch / OpenSearch data
- Description: Jaeger is an open-source, distributed tracing platform designed to monitor and troubleshoot transactions in distributed systems. In its basic deployment it allows collecting tracing data, storing it in a database, and querying & analyzing individual traces in the UI. This workflow is great for deep-diving into individual requests, but it does not answer some higher level questions like "which endpoints in my service are the slowest?" To address those questions Jaeger has a special feature called SPM (Service Performance Management), which allows the user to see the trends of services' and endpoints' performance and to drill down into the outliers. However, this feature requires a more complicated deployment where a special real-time processor is running and extracting metrics from the traces and storing those metrics in a Prometheus-compatible remote storage. Some of the storage backends supported by Jaeger, such as Elasticsearch & OpenSearch, can provide the same aggregate answers directly from the trace data, which can significantly simplify the deployment. This project aims to enable this integration.
- Expected Outcome:
- Support SPM functionality directly in Elasticsearch / OpenSearch backends by implementing the metrics query API
- Enhance existing e2e integration tests to continuously test this new capability
- Recommended Skills: Go, basic familiarity with Elasticsearch
- Expected project size: large (~350 hour projects)
- Mentors:
- Yuri Shkuro (@yurishkuro, github@ysh.us) - primary
- Jonah Kowall (@jkowall, jkowall@kowall.net)
- Upstream Issue (URL): https://github.com/jaegertracing/jaeger/issues/6641
KCL
KCL OCI third-party dependency management enhancement
-
Description: KCL is an open-source constraint-based record & functional language mainly used in configuration and policy scenarios. KPM is a package management tool for the KCL language that supports the management of KCL packages in the OCI registry and Git Repo. This topic only applies to third-party dependencies from the OCI registry. Use the layering mechanism in OCI to help KPM implement dependency management of KCL third-party dependencies.
-
Expected Outcome:
- Refactor the current KPM dependency management module with the OCI's layered mechanism.
-
Recommended Skills: Go, OCI
-
Expected project size: medium (~175 hour projects)
-
Mentor(s):
- Zhe Zong (@zong-zhe, zongzhe1024@163.com)
- Heipa (@He1pa, he1pa404@gmail.com)
-
Upstream Issue (URL): https://github.com/kcl-lang/kpm/issues/598
Knative Functions
Dynamic AI Agent Callbacks
- Description: Knative Functions is well-suited for AI agent integration. The serverless nature and isolated runtime environment of Functions make them ideal for creating lightweight, purpose-built services that can be dynamically invoked and even created by agents.
- Expected Outcome: This project would be a combination of research and practicum. First, an analysis of current AI agent interaction patterns, including emergent protocols and available frameworks. Second, the development of a Proof-of-concept integration between Functions and AI agents. This would involve at a minimum invocation, with a stretch goal of implementation and deployment by the agent based on a human prompt.
- Recommended Skills:
- Strong language and communication skills, with the ability to both research deeply and communicate clearly.
- Experience with AI/ML agents and desire to learn about programmatic LLM integrations.
- Familiarity with the Go programming language (ideal) or Python (secondarily), and web services.
- Familiarity with kubernetes, serveless, and microservices a plus.
- Expected project size: Large
- Mentor(s):
- Luke Kingland @lkingland (kingland AT redhat DOT com) - primary
- Aleksander Slominski @aslom (aslomins AT redhat DOT com)
- Upstream Issue (URL): https://github.com/knative/func/issues/2690
Konveyor
Extend usage of Konveyor AI to detect and update deprecated Kubernetes API usage in golang applications
-
Description: Konveyor is an application modernization platform that helps organizations migrate legacy applications to Kubernetes at scale. As part of this effort, you will contribute to Konveyor AI (Kai), an intelligent code assistant that automates source code updates using data from static code analysis and changelog histories. Your work will focus on applying Generative AI techniques to detect and update deprecated Kubernetes APIs in Golang applications. You’ll build a tool that uses a LLM to generate Konveyor static code analysis rules from published documentation such as the Kubernetes deprecated API guide. Additionally, you’ll create workflows to identify deprecated API usage in legacy applications and automate code suggestions for updates — all powered by Konveyor AI.
-
Expected Outcome:
- Develop a prototype tool to convert Kubernetes API deprecation documentation into static code analysis rules.
- Collaborate with the Konveyor AI team to extend support for Golang applications, identify issues, and contribute improvements.
- Demonstrate Konveyor AI’s ability to detect and suggest fixes for deprecated API usage in Golang projects.
-
Recommended Skills: Golang, Python, Kubernetes, Generative AI
-
Expected project size: # Large (~350 hours)
-
Mentor(s):
- John Matthews (@jwmatthews, jwmatthews@gmail.com) - primary
- Savitha Raghunathan (@savitharaghunathan, saveetha13@gmail.com)
-
Upstream Issue (URL): https://github.com/konveyor/kai/issues/644
KubeArmor
Improve KubeArmor Observability Spectrum
-
Description: KubeArmor is a security enforcement system that provides runtime protection for Kubernetes workloads. To enhance observability, this task involves exposing key Prometheus metrics related to KubeArmor’s policy enforcement. These metrics will provide insights into security policy activity and alerting within a Kubernetes cluster.
For starters, the following metrics can be started with:
- Number of Policies Applied
- Number of Alerts Triggered
- List of Active Policies
- Policy Status (Active/Inactive)
-
Expected Outcome: Prometheus metrics are successfully integrated into KubeArmor, allowing users to monitor policy enforcement and security events effectively. The metrics should be accessible via a Prometheus endpoint and conform to best practices for Prometheus metric exposition.
-
Recommended Skills: Go, Prometheus, Kubernetes.
-
Expected project size: 175 hrs
-
Mentor(s):
- Rishabh Soni (@rootxrishabh, risrock02@gmail.com)
- Prateek Nandle (@Prateeknandle, prateeknandle@gmail.com)
- Barun Acharya (@daemon1024, barun1024@gmail.com)
- Nishant Singh (@tesla59, talktonishantsingh.ns@gmail.com)
-
Upstream Issue (URL): https://github.com/kubearmor/KubeArmor/issues/1902
KubeBuilder
Automating Operator Maintenance: Driving Better Results with Less Overhead
-
Description:
Code-generation tools like Kubebuilder and Operator-SDK have transformed cloud-native application development by providing scalable, community-driven frameworks. These tools simplify complexity, accelerate results, and enable developers to create tailored solutions while avoiding common pitfalls, laying a strong foundation for innovation.
However, as these tools evolve with ecosystem changes and new features, projects risk becoming outdated. Manual updates are time-consuming, error-prone, and make it challenging to maintain security, adopt advancements, and stay aligned with modern standards.
This project introduces an automated solution for Kubebuilder, with potential applications for similar tools or those built on its foundation. By streamlining maintenance, projects remain aligned with modern standards, improve security, and adopt the latest advancements. It fosters growth and innovation across the ecosystem, letting developers focus on what matters most: building great solutions.
Note that the initial idea is to solve this with 3-way Git merges. However, users will face conflicts, and in the first phase, we want to study whether AI could help resolve these conflicts in a future phase to achieve this goal. -
Expected Outcome
- Conduct research on 3-Way Merge & Advanced Merge Options in Git.
- Conduct research on how AI could help resolve conflicts. If open-source solutions are available and align with the proposal, include them for consideration in a second phase.
- Develop a Proof of Concept (POC) implementing a GitHub Action that automatically creates a Pull Request (PR) in a mock repository, demonstrating the feasibility of the proposed solution.
- Successfully complete the proposal for PEP.
- Introduce a new Kubebuilder Plugin that scaffolds the GitHub Action based on the POC. This plugin will be released as an alpha feature, allowing users to opt-in for automated updates. The initial solution does not need to have AI, but AI integration could be a future enhancement if feasible.
-
Recommended Skills
- Golang
- GitHub Actions
- Software Automation
- CI/CD
- Git
- IA
-
Expected project size: Large (~350 hour projects)
-
Mentor(s)
- Camila Macedo (@camilamacedo86, camilamacedo86@gmail.com) - Primary
- Varsha Prasad (@varshaprasad96, varshaprasad1507@gmail.com)
- TianYi(Tony) (@Kavinjsir)
-
Upstream Issue: WIP - Proposal: Automating Operator Maintenance: Driving Better Results with Less Overhead
KubeStellar
Automating Benchmarking of KubeStellar Data-Plane
-
Description: KubeStellar (KS) is an open-source Kubernetes multi-cluster workload configuration management system that can be used to manage AI workloads in multi-cluster environments. Hence, understanding KS performance is crucial especially for AI use-case scenarios.
This project aims to extend the performance monitoring & analysis tools for KS and develop an automated system to measure and provide real-time performance values and statistics of the KubeStellar data plane with focus on enhance the efficiency and repeatability of our measurements (e.g., down-syncing and up-syncing latencies, etc.). This framework will provide a dashboard of these results in which all KS data-plane metrics are represented, analyzed and charted visually to give better insight to the performance characteristics of KubeStellar data-plane. We will explore different AI use-cases to evaluate this framework and the performance of KS when managing resource intensive AI workloads.
-
Expected Outcome:
- A fully automated framework that can reliably assess the performance of KubeStellar data plane across various scenarios including AI use-cases.
- Dashboards facilitating KubeStellar Performance profiles and results which are analyzed and charted visually to give better insights to performance results.
- Dashboards added as a plugin to the KubeStellar UI.
- Design the solution and implement it, introduce it the KubeStellar community. The implementation needs to be well scalable and applicable across various scenarios.
-
Recommended Skills:
- Understanding of performance analysi
- Kubernetes and container orchestration
- Cloud native AI/ML apps deployment
- Python, Go (for Kubernetes integrations)
- Experience with logging/monitoring tools (Prometheus, OpenTelemetry)
- Familiarity with KubeStellar (preferred but not required)
-
Expected Project Size: Large (~350 hours)
-
Mentor(s):
- Braulio Dumba (@dumb0002, brauliodumba@gmail.com) - Primary Mentor
- Jim Cadden (@jimcadden, jcadden@ibm.com) - Secondary Mentor
- Andy Anderson (@clubanderson, andy@clubanderson.com) - Secondary Mentor
-
Upstream Issue (URL): https://github.com/kubestellar/kubestellar/issues/2853
Kubewarden
Allow policies to be written using JavaScript
-
Description: Kubewarden is a Policy Engine powered by WebAssembly policies that enforces security and compliance in Kubernetes clusters. Its policies can be written in CEL, Rego (OPA & Gatekeeper flavours), Rust, Go, YAML, and others.
Kubewarden does not have a JavaScript SDK yet. Recent work done inside of the Bytecode Alliance made possible to compile Javascript code into WebAssembly . This means It's now possible to create such a SDK. This task consists on writing a JavaScript SDK that provides an idiomatic way to write Kubewarden policies.
-
Expected Outcome: A new JavaScript SDK is created. The SDK API is documented, and the policy tutorial as well.
-
Recommended Skills: JavaScript, Kubernetes.
-
Expected project size: Large
-
Mentor(s):
- Victor Cuadrado (@viccuad, vcuadradojuan@suse.com) - primary
- Flavio Castelli (@flavio, fcastelli@suse.com)
- José Guilherme Vanz (@jvanz, jguilhermevanz@suse.com)
- Fabrizio Sestito (@fabriziosestito, fabrizio.sestito@suse.com)
-
Upstream Issue (URL): https://github.com/kubewarden/community/issues/37
Elevate our .NET SDK into a first class citizen
-
Description: Kubewarden is a Policy Engine powered by WebAssembly policies that enforces security and compliance in Kubernetes clusters. Its policies can be written in CEL, Rego (OPA & Gatekeeper flavours), Rust, Go, YAML, and others.
Kubewarden has a .NET SDK that allows policy authors to write policies in C#. Starting with .NET 8, a big chunk of the work from https://github.com/dotnet/dotnet-wasi-sdk made its way upstream. This means it's a good time to revisit Kubewarden's .NET SDK for policies. This task consists on bringing our .NET SDK up to standard with the rest of our SDKs such as the Go or Rust ones.
-
Expected Outcome: Our .NET SDK has been ported to .NET 9, and supports the same capabilities as our other SDKs. The SDK API is documented, and the policy tutorial as well.
-
Recommended Skills: C#, .NET, Kubernetes.
-
Expected project size: medium
-
Mentor(s):
- Victor Cuadrado (@viccuad, vcuadradojuan@suse.com) - primary
- Flavio Castelli (@flavio, fcastelli@suse.com)
- José Guilherme Vanz (@jvanz, jguilhermevanz@suse.com)
- Fabrizio Sestito (@fabriziosestito, fabrizio.sestito@suse.com)
-
Upstream Issue (URL): https://github.com/kubewarden/policy-sdk-dotnet/issues/47
Lima
VM plugin subsystem
- Description: Lima (https://lima-vm.io) is a project that provides Linux virtual machines with a focus on running containers.
Lima supports several VM backends via built-in drivers:
qemu,vz(Apple Virtualization.framework), andwsl2(seelima/pkg/driverutil/instance.go). The idea for GSoC is to make a plugin subsystem that decouples the built-in VM drivers into separate binaries that communicate with the main Lima binary via some RPC (probably gRPC). This idea will improve the maintainability of the code base, and also help supporting additional VM backends (e.g.,vfkitand cloud-based drivers). - Expected Outcome:
- Design the plugin subsystem and its RPC (probably gRPC)
- Migrate the existing built-in VM drivers to the new plugin subsystem
- Implement additional VM plugins if the time allows
- Recommended Skills: Go, gRPC, QEMU, macOS
- Expected project size: medium (~175 hour projects)
- Mentor(s):
- Akihiro Suda (@AkihiroSuda, suda.kyoto@gmail.com) - primary
- Anders Björklund (@afbjorklund, anders.f.bjorklund@gmail.com)
- Upstream Issue (URL): https://github.com/lima-vm/lima/discussions/2007
LitmusChaos
Terraform Support for LitmusChaos
-
Description: LitmusChaos is an open-source Chaos Engineering platform that helps teams uncover weaknesses and potential outages in their applications by running controlled chaos experiments. However, before injecting chaos, several prerequisite steps must be completed, including user and project creation, connecting target infrastructure, and setting up experiments. To streamline this process, developers and SREs often seek automation, especially when integrating chaos testing into CI/CD pipelines. This Google Summer of Code (GSoC) project proposes developing a Terraform provider for LitmusChaos, enabling users to automate these essential setup steps and seamlessly manage chaos experiments through Terraform.
-
Expected Outcome:
- LitmusChaos will have a terraform provider supporting user, project, infrastructure, and experiment resource operations along with proper documentation and usage scripts.
- A stretch goal for the mentee would be to become an official maintainer of the Litmus Terraform provider project.
-
Recommended Skills: Golang, Terraform
-
Expected project size: large (~175 hour projects)
-
Mentor(s):
- Saranya Jena (@Saranya-jena, saranya.jena@harness.io)
- Sarthak Jain (@SarthakJain26, sarthak.jain@harness.io)
-
Upstream Issue (URL): https://github.com/litmuschaos/litmus/issues/5042
Meshery
Extensibility of Meshery Dashboard: Enhance meshery dashboard with extensible widgets
Current Behavior
-Description: All dashboard widgets in Meshery UI are currently defined and bundled directly within the UI, limiting extensibility and modularity. Expected Behavior
-Expected Behavior: Widgets should be independent extension points, similar to other UI extensions.
New widgets should be developed, including: - Metrics widget - Resource Aggregation Charts widget - Beyond widgets, general improvements are needed for the dashboard and resource tables, including: Deep linking between resources Support for ad-hoc actions
- Recommended Skills: Golang, Kubernetes,React, Azure, well-written and well-spoken English
- Expected project size: large (~350 hour project)
- Mentor(s):
- Lee Calcote (@leecalcote, leecalcote@gmail.com)
- Edward Corley (@codeSafari10, codeSafari10@gmail.com)
- Upstream Issue: https://github.com/meshery/meshery/issues/14084
Support for Azure in Meshery
-
Description: Enhance Meshery's existing orchestration capabilities to include support for Azure. The Azure Service OperatorAzure Service Operator (ASO) provides a wide variety of Azure Resources via Kubernetes custom resources. as first-class Meshery Models. This involves enabling Meshery to manage and orchestrate Azure services and their resources, similar to how it handles other Kubernetes resources. The project will also include generating support for Azure services and their resources in Meshery's Model generator.
-
Expected Outcome: Meshery will be able to orchestrate and manage all Azure services supported by ASO. This includes the ability to discover, configure, deploy, and operate the lifecycle of Azure services through Meshery. The Meshery Model generator will be updated to automatically generate models for Azure services, simplifying their integration and management within Meshery. This will be an officially supported feature of Meshery.
-
Recommended Skills: Golang, Kubernetes, Azure, well-written and well-spoken English
-
Expected project size: large (~175 hour project)
-
Mentor(s):
- Lee Calcote (@leecalcote, leecalcote@gmail.com)
- Mia Grenell (@miacycle, mia.grenell2337@gmail.com)
-
Upstream Issue: https://github.com/meshery/meshery/issues/11244
Distributed client-side inference (policy evaluation) with WebAssembly (WASM) and OPA in Meshery
-
Description: Meshery's highly dynamic infrastructure configuration capabilities require real-time evaluation of complex policies. Policies of various types and with a high number of parameters need to be evaluted client-side. With policies expressed in Rego, the goal of this project is to incorporate use of the https://github.com/open-policy-agent/golang-opa-wasm project into Meshery UI, so that a powerful, real-time user experience is possible.
-
Expected Outcome: The goal of this project is to enhance Meshery's infrastructure configuration capabilities by incorporating real-time policy evaluation using the golang-opa-wasm project. This project will integrate the capabilities of golang-opa-wasm into the Meshery UI, enabling users to experience the existing, powerful, server-side policy evaluation client-side.
-
Recommended Skills: WebAssembly, Golang, Open Policy Agent, well-written and well-spoken English
-
Expected project size: large (~350 hour project)
-
Mentor(s): Lee Calcote (@leecalcote, leecalcote@gmail.com), James Horton (@hortison, james.horton2337@gmail.com)
-
Upstream Issue: https://github.com/meshery/meshery/issues/13555
Open Cluster Management
Privacy-preserving and efficient AI model training across multi-cluster
-
Description: Open Cluster Management (OCM) streamlines multi-cluster workload management through APIs that align with SIG-Multicluster standards. Beyond traditional workload orchestration, OCM enables scalable AI training and inference across distributed environments.
As machine learning (ML) expands across clusters, data privacy becomes a critical concern. ML models rely on vast datasets, making it essential to safeguard sensitive information across clusters without compromising model performance.
This project integrates Federated Learning (FL) into OCM, enabling privacy-preserving, collaborative model training without transferring raw data between clusters. Instead, training occurs locally where the data resides, ensuring compliance, enhancing efficiency, and reducing bandwidth and storage costs.
By leveraging OCM's Placement, ManifestWork, and other APIs. we standardize FL workflows and seamlessly integrate frameworks like Flower and OpenFL through a unified interface. This approach harnesses OCM's capabilities to deliver scalable, cost-efficient, and privacy-preserving AI solutions in multi-cluster environments.
-
Expected Outcome:
- Comprehensive Documentation:
- Define the scenarios addressed by the prototype, highlighting its purpose and value.
- Provide an intuitive and architectural comparison between Federated Learning (FL) and OCM, mapping FL terminology to OCM APIs to showcase OCM's native support for FL.
- Illustrate the complete Federated Learning workflow within Open Cluster Management.
- Extended Prototype (or CRD) Support:
- Enable model aggregation persistence in AWS S3 (currently supports only native PVC).
- Extend compatibility to support additional Federated Learning frameworks like OpenFL and NVIDIA FLARE (currently supports Flower). This involves understanding how OpenFL, NVIDIA FLARE, and other frameworks function, containerizing them, and integrating them into the prototype.
- Enhanced Observability and Metrics for Federated Learning API:
- Define key metrics, expose them via API and ensure extensibility for future enhancements.
- Track resource usage (optional), monitoring GPU, CPU, and memory usage for client training is beneficial but not essential.
- Comprehensive Documentation:
-
Recommended Skills: Golang, Kubernetes, Federated Learning, Open Cluster Management, Scheduling
-
Expected project size: large (~350 hour projects)
-
Mentor(s):
- Meng Yan (@yanmxa, myan@redhat.com) - primary
- Qing Hao (@haoqing0110, qhao@redhat.com)
-
Upstream Issue (URL): open-cluster-management-io/ocm#825
ORAS
Enhance Java ORAS SDK
- Description: The ORAS project aims to enhance its Java SDK to support a broader range of features from the OCI Distribution spec. This involves implementing missing functionality, improving existing features, and expanding the SDK’s overall capabilities.
- Expected Outcome:
- Implement missing features from the OCI Distribution and Image Specifications, such as chunked uploads and the Referrers API
- Improve existing features, robustness and tests to ensure full compatibility with the OCI Distribution and Image Specifications.
- Enhance documentation and provide more comprehensive examples.
- Add support for additional authentication methods, including using credentials from docker config.json
- Recommended Skills: java, oci
- Expected project size: medium (~175 hour projects)
- Mentor(s):
- Valentin Delaye (@jonesbusy, jonesbusy@gmail.com) - primary
- Feynman Zhou (@FeynmanZhou, zpf0610@gmail.com)
- Upstream Issues: https://github.com/oras-project/oras-java/issues
The Update Framework (TUF)
Snapshot Merkle trees
- Description: The TUF snapshot role is responsible for consistency proofs in a TUF repository. In certain high-volume repositories, the related snapshot metadata file can become prohibitively large, and thus impose a significant overhead for TUF clients. TAP 16 proposes a method for reducing the size of snapshot metadata a client must download without significantly weakening the security properties of TUF. In this project you will add TAP 16 support to python-tuf.
- Expected Outcome: Snapshot Merkle trees are implemented in python-tuf Metadata API and
ngclient - Recommended Skills: Python, data structures (merkle trees)
- Expected project size: medium (~175 hour projects)
- Mentor(s):
- Lukas Pühringer (@lukpueh) - primary
- Justin Cappos (@JustinCappos)
- Upstream Issue (URL): https://github.com/theupdateframework/taps/issues/134
Vitess
Enhancements for FAQ Chatbot for Vitess
-
Description: Vitess is a distributed database system built on MySQL. Developers often need to search through documentation, Slack discussions, and GitHub issues to find answers. We are starting a project to implement an AI-powered FAQ chatbot using Retrieval-Augmented Generation, integrating vector search with an LLM (such as OpenAI, DeepSeek, GPT-4, Mistral, Llama 3). The chatbot will be available via a CLI and Slack bot for developer support.
In the next phase, which will be implemented in this Summer Of Code (SOC) project, we will be adding more features like:
- Content filtering for chatbot safety and response validation
- Fine-tuning the model for improved accuracy
- Pipelines for refreshing data from new/updated docs
- Caching previous replies to reduce LLM costs
- Rate-limiting
- Better benchmarking for iterative improvements
- User feedback integration to improve relevancy
-
Expected Outcome: Improved chatbot that provides accurate Vitess-related answers via CLI and Slack, using indexed documentation and discussions for retrieval.
-
Recommended Skills: golang, python, LLM APIs, vector databases
-
Expected project size: large (~350 hour projects)
-
Mentor(s):
- Rohit Nayak (@rohit-nayak-ps, rohit@planetscale.com)
- Manan Gupta (@GuptaManan100, manan@planetscale.com)
-
Upstream Issue: https://github.com/vitessio/vitess/issues/17690
WasmEdge
Virtual filesystem security for WasmEdge plug-ins with exporting WASI APIs
- Description: The WASI proposal defines the variety of rules to guarantee the virtual filesystem security and isolation when executing WASM binaries. However, besides using WASI directly in WASM, developers can also implement the host functions to access the filesystem in their guest programming language. This will break the sandbox of WebAssembly. In this program, our goal is to export the WASI APIs in WasmEdge, and use the APIs in WasmEdge plug-ins to ensure the filesystem security and WebAssembly isolation.
- Expected Outcome:
- Export needed WASI APIs in WasmEdge internal to provide the functions of checking and accessing host filesystem.
- Apply the APIs in some WasmEdge plug-ins which accessing the filesystem, such as WASI-NN.
- Implement test suites to verify the above behaviors.
- Recommended Skills:
- C++
- WebAssembly
- Expected project size: Large (~350 hour projects)
- Mentor(s):
- YiYing He (@q82419 , yiying@secondstate.io) - Primary
- Shen-Ta Hsieh (@ibmibmibm , beststeve@secondstate.io)
- Upstream Issue (URL): https://github.com/WasmEdge/WasmEdge/issues/4012
Port WasmEdge and the WASI-NN ggml backend to the s390x platform
- Description: WasmEdge provides cross-platform support for amd64 and arm64 for executing AI/LLM applications. We would like to support as many new hardware platforms as possible, so developers and users will no longer need to worry about the actual hardware. All they need to do is develop their AI agent or LLM applications once and deploy their services anywhere. For more information, please check the upstream issue.
- Expected Outcome:
- Make the WasmEdge toolchain support the s390x platform, including the interpreter and the AOT mode.
- Ensure the WASI-NN ggml plugin can execute without any issues on the s390x platform.
- Implement test suites to verify the above behaviors.
- Write a document discussing the compilation, installation, execution, and verification of the work.
- Recommended Skills:
- C++
- s390x
- LLVM
- Expected project size: Large (~350 hour projects)
- Mentor(s):
- Hung-Ying Tai (@hydai, hydai@secondstate.io) - Primary
- dm4 (@dm4, dm4@secondstate.io)
- Upstream Issue (URL): https://github.com/WasmEdge/WasmEdge/issues/4010
Use Runwasi with WasmEdge runtime to test multiple WASM apps as cloud services
- Description: With WasmEdge serving as one of Runwasi’s standard runtimes, and as our C++-implemented library continues to evolve, we also need a verification process integrated into Runwasi to streamline and validate the stability of both container and cloud environments.
- Expected Outcome:
- A concise GitHub workflow demonstrates Runwasi end-to-end testing on Kubernetes.
- Need to design an interactive application scenario that supports multiple nodes
- Try to incorporate the use of the WasmEdge plugin into this scenario
- Document
- A concise GitHub workflow demonstrates Runwasi end-to-end testing on Kubernetes.
- Recommended Skills:
- Rust
- C++
- GDB
- git / github workflow
- shell script
- Expected project size: Large (~350 hour projects)
- Mentor(s):
- Vincent (@CaptainVincent, vincent@secondstate.io) - Primary
- yi (@0yi0 yi@secondstate.io)
- Upstream Issue (URL): https://github.com/WasmEdge/WasmEdge/issues/4011