
Distributing protobuf contracts at scale: Why we built a custom tool


Sampath Shetty, Nilesh Kevlani | 14 Apr, 2026


Background

At ShareChat, we recently overhauled the architecture of our data ingestion system. As part of this exercise, we also moved 100% of our event data contracts from JSON to Protobuf. This meant onboarding ~1000 protobuf contracts for existing event types, followed by 4-8 new contracts per week as new event types are created.

In this article, we discuss the existing solutions for distributing protobuf contracts, their limitations, and how we addressed those limitations with an in-house tool.

Existing solutions for distributing protobuf contracts

Some of the well-known industry approaches for maintaining and distributing protobuf contracts are:

  • Central repository with builds: Protobuf contract definitions live in a central repository, which also builds the language-specific libraries and distributes them.
  • Central repository with protobuf contract: Protobuf contract definitions live in a central repository, and users generate the language-specific files in their own repositories.

Let’s discuss them one by one.

Central repository with builds

This is the approach we used at ShareChat. The centralized library approach, where all schemas are compiled into a single artifact (JAR, Go module, etc.), was never going to scale for our trajectory.

We maintained the protobuf files in a central Git repository. Developers only committed new proto files, and the CI/CD system built and published libraries in multiple languages for users to import into their services. Using a contract was easy too: add the library as a dependency and import the contracts in your language of choice.

But with 1000+ contracts and new ones being onboarded frequently, this approach ran into several issues.

  • Whenever a new contract was added (4-8 times a week) or an existing contract was changed (10-15 times a week), a new library version had to be released.
  • At this rate, we would end up releasing roughly 700-1000 library versions every year for each language we publish the library in.
  • Even a service that wants to consume or publish a single event has to import the entire library with its 1000+ contracts.
  • Maintaining compatibility with all contract users is hard. Two examples:
      • In the Go build of the protobuf contracts, we used an add-on plugin for Go code generation. The code it generated used the any identifier, which was introduced in Go 1.18 as an alias for interface{}. As a result, no Go service on a version older than 1.18 could import the library.
      • In the central repository, we built the Java libraries with Java 17. A service running on an older Java version cannot import this library.

Central repository with protobuf contract

In this approach, protobuf contracts are kept in a central repository, and the language-specific files are generated in the service that uses these contracts.

The workflow to build the contract on the service side will look like this:
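As a minimal sketch of what such a service-side build step can boil down to, the Python snippet below assembles a protoc invocation. The flags --proto_path and --<lang>_out are standard protoc flags; the directory names and the proto file path are hypothetical and only serve the illustration.

```python
import shlex


def build_protoc_command(proto_root, proto_files, out_dir, lang="go"):
    """Assemble a protoc invocation for one target language.

    --proto_path and --<lang>_out are standard protoc flags; the
    directory layout passed in is purely illustrative.
    """
    return [
        "protoc",
        f"--proto_path={proto_root}",
        f"--{lang}_out={out_dir}",
        *[f"{proto_root}/{name}" for name in proto_files],
    ]


# Hypothetical paths, for illustration only.
command = build_protoc_command("contracts", ["events/like.proto"], "gen")
print(shlex.join(command))
```

A real pipeline would pass this list to subprocess.run(command, check=True). For Go output, protoc additionally needs the protoc-gen-go plugin on the PATH, which is exactly the kind of setup every service has to repeat in this model.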


The benefit of this approach is that users can choose how they want to build the language-specific files for their service.

But it also means every team has to redo a lot of work: setting up pipelines, understanding what each protoc argument does, and ensuring that the versions of protoc and its plugins are consistent across build environments, development environments, and so on.


The Solution: EMS CLI

To overcome the issues mentioned above, we implemented EMS CLI, our Event Management System command-line tool, for distributing contracts. It borrows ideas from “Central repository with protobuf contract” and abstracts away many of the complexities.

The core idea is simple: treat an Event as the fundamental unit of dependency, not a file or a repository.

How It Works

Step 1: Initialize the workspace

First, initialize the workspace with your target language and toolchain versions:


This creates ems-config.yaml with your compile settings:

The versions specified here ensure every developer and CI runner uses the exact same toolchain.


Step 2: Add event dependencies

Developers declare the Event Meta ID (e.g., 1542445319) instead of file paths. This ID is a stable reference to a business concept, assigned during event creation and discoverable via the EMS UI.



This updates the config with the new dependency:
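The exact layout of the file is specific to EMS CLI. As a rough, hypothetical model of what such a config has to capture (a target language, pinned toolchain versions, and the declared events), consider the Python sketch below. Every field name and placeholder value is an assumption made for illustration, not the real ems-config.yaml schema.

```python
from dataclasses import dataclass, field


@dataclass
class WorkspaceConfig:
    """Hypothetical in-memory model of a workspace config."""

    language: str
    protoc_version: str
    # Plugin name -> pinned plugin version (assumed structure).
    plugin_versions: dict = field(default_factory=dict)
    # Event Meta ID -> pinned commit SHA (assumed structure).
    events: dict = field(default_factory=dict)

    def add_event(self, event_meta_id, commit_sha):
        """Declare one event dependency, or re-pin it if already declared."""
        self.events[event_meta_id] = commit_sha


# Placeholder versions and SHA; only the Event Meta ID comes from the article.
config = WorkspaceConfig(
    language="go",
    protoc_version="<protoc-version>",
    plugin_versions={"protoc-gen-go": "<plugin-version>"},
)
config.add_event(1542445319, "<commit-sha>")
```

Keying dependencies by Event Meta ID rather than by file path mirrors the idea that the event, not the file, is the unit of dependency.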


Step 3: Download proto files

The CLI resolves the Event Meta ID through our internal EMS Registry, which knows the mapping between an ID and its storage location.


This fetches only the proto files you declared into the local .proto/ directory.
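The resolution step can be pictured as a lookup from Event Meta ID to storage location, followed by a fetch of only the declared files. The sketch below uses an in-memory dictionary as a stand-in for the EMS Registry; the registry contents, the storage paths, and the function name are all hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical stand-in for the registry: Event Meta ID -> storage location.
REGISTRY = {
    1542445319: "events/example_event.proto",
    1000000001: "events/other_event.proto",
}


def plan_downloads(declared_ids, registry, dest=".proto"):
    """Map each declared event's storage location to its local target path.

    Only IDs that were explicitly declared are resolved, so undeclared
    events in the registry are never fetched.
    """
    plan = {}
    for event_id in declared_ids:
        if event_id not in registry:
            raise KeyError(f"unknown Event Meta ID: {event_id}")
        source = registry[event_id]
        plan[source] = str(PurePosixPath(dest) / PurePosixPath(source).name)
    return plan


plan = plan_downloads([1542445319], REGISTRY)
```

Here the second registry entry is left untouched because no dependency declares it.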

Step 4: Compile to generate code

The CLI handles all the complexity of invoking protoc with the right flags for your target language.

Reproducible Builds Through Version Pinning

A common issue with protobuf compilation is environment drift: different developers or CI runners produce slightly different output because of mismatched protoc or plugin versions.


EMS CLI solves this by managing the entire toolchain. When you run ems-cli deps compile, the CLI:

  1. Downloads the exact protoc binaries specified during init (cached in ~/.ems-cli)
  2. Downloads the exact plugin binaries specified (cached in ~/.ems-cli)
  3. Invokes compilation with consistent flags

Every developer and CI runner produces identical output because everyone uses the same pinned versions from the shared config.
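The download-once, reuse-afterwards behaviour described in the steps above can be sketched as a version-keyed cache. This is not the EMS CLI implementation: the directory layout under the cache root and the download callback are assumptions made for illustration.

```python
from pathlib import Path


def ensure_tool(name, version, cache_root, download):
    """Return the cached path of a pinned tool, downloading it only if missing.

    The cache key includes the version, so two projects pinning different
    versions of the same tool never overwrite each other's binaries.
    """
    target = Path(cache_root) / name / version / name
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        # `download` is a caller-supplied function (hypothetical signature).
        download(name, version, target)
    return target
```

In practice cache_root would point at a per-user directory such as ~/.ems-cli, and download would fetch the release binary for the current platform.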


Tracking Schema Freshness

We use GitHub commit SHAs to version proto files, which gives us fine-grained control and a full audit trail. Once dependencies are set up, teams need visibility into whether their pinned versions are current.

The status command shows this at a glance:

For debugging questions such as “Which version introduced that field?”, developers can audit an event’s commit history:
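Conceptually, a freshness check compares the pinned commit SHA with the event's commit history. The sketch below assumes the history is available as a list of SHAs ordered newest first; the function name and the short placeholder SHAs are hypothetical.

```python
def commits_behind(pinned_sha, history):
    """Return how many commits are newer than the pinned one.

    `history` is a list of commit SHAs ordered newest first, so the index
    of the pinned SHA equals the number of commits it is behind.
    """
    if pinned_sha not in history:
        raise ValueError(f"pinned SHA {pinned_sha} not found in history")
    return history.index(pinned_sha)


# Placeholder SHAs, newest first.
history = ["c3", "b2", "a1"]
status = {sha: commits_behind(sha, history) for sha in history}
```

A result of 0 means the pinned version is current; anything larger means newer schema versions exist for that event.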


Multi-Language Support

One of the biggest wins was unifying the workflow across our polyglot stack. In the old "library" model, a schema change required releasing a Go library, a Java JAR, and a Node package.

With EMS CLI, the workflow is identical regardless of the language:


The CLI handles the language-specific flags (--go_out, --java_out) and plugin binaries (protoc-gen-go, protoc-gen-grpc-java) under the hood. A developer just runs ems-cli deps compile.
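One simple way to keep the user-facing workflow identical across languages is a lookup table from language to protoc output flags. The sketch below is an assumption about how such a dispatch could look, not the EMS CLI source. --go_out, --java_out and --python_out are standard protoc flags, and --grpc-java_out is the flag served by the protoc-gen-grpc-java plugin; whether gRPC stubs are generated by default is also an assumption.

```python
# Language -> protoc output flags (illustrative subset of languages).
LANGUAGE_FLAGS = {
    "go": ["--go_out"],
    "java": ["--java_out", "--grpc-java_out"],
    "python": ["--python_out"],
}


def output_flags(language, out_dir):
    """Return the protoc output flags for one target language."""
    try:
        flags = LANGUAGE_FLAGS[language]
    except KeyError:
        raise ValueError(f"unsupported language: {language}") from None
    return [f"{flag}={out_dir}" for flag in flags]
```

Supporting a new language then means adding one table entry, while the command a developer runs stays unchanged.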

The Workflow

The CLI is designed to be non-intrusive and scriptable for CI/CD.


Generated code is treated as an ephemeral artifact. We do not commit it to Git; instead, we regenerate it on demand during the build process, ensuring the code always matches the pinned schema versions.

Summary

Moving from a centralized library to a granular package manager was essential for our scale. By treating event schemas as first-class dependencies, we’ve achieved:

  • Decoupling: Teams move at their own pace without "library lock-step."
  • Granularity: Services only pay the compile-time cost for the events they actually use.
  • Consistency: A managed toolchain eliminates environment-specific bugs.
  • Multi-Language: Unified workflow across Go, Java, Python, JavaScript, and TypeScript.
  • Zero Setup: No need to install protoc or plugins; everything is managed automatically.

The pattern of Event-Centric Management has transformed our schema distribution from a manual bottleneck into a reliable, automated infrastructure component.


