Last modified: January 24, 2026
Protocol Buffers (often referred to as protobuf) is a language-neutral, platform-independent method for serializing structured data. Originally created at Google, it is well suited to efficient data interchange between services, to storing information in a compact binary format, and to maintaining backward and forward compatibility across different versions of the data schema.
ASCII DIAGRAM: Flow of Working with Protobuf
+-----------+         +---------+         +----------------------+
|  .proto   |   Use   | Protoc  |   Gen   |   Language Classes   |
| (Schema)  +-------->+ Compiler+-------->+ (Java, Python, etc.) |
+-----------+         +----+----+         +----------+-----------+
                           |                         |
                           | (Serialize/             | (Deserialize/
                           |  Deserialize)           |  Manipulate)
                           v                         v
                +---------------------+   +---------------------+
                |  In-memory Objects  |   |  In-memory Objects  |
                +---------------------+   +---------------------+
                           ^                         ^
                           |      (Binary Data)      |
                           +<----------------------->+
The Protobuf compiler (protoc) generates data model classes for your target programming language. Each field’s number identifies it in the binary encoding, so it should not be changed once deployed.
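To make the role of field numbers concrete, here is a minimal Java sketch of how the wire format combines a field number with a wire type into the tag that precedes each encoded value. The class and method names (FieldTag, makeTag) are hypothetical and exist only for illustration; real code would rely on the protoc-generated classes rather than computing tags by hand.

```java
/** Illustrative only: shows how a field number becomes part of the encoded bytes. */
final class FieldTag {
    // Two of the wire types defined by the Protobuf encoding.
    static final int WIRE_VARINT = 0;           // int32, int64, bool, enum, ...
    static final int WIRE_LENGTH_DELIMITED = 2; // string, bytes, nested messages

    /** Packs a field number and wire type into a tag: (fieldNumber << 3) | wireType. */
    static int makeTag(int fieldNumber, int wireType) {
        return (fieldNumber << 3) | wireType;
    }

    /** Recovers the field number from a tag. */
    static int fieldNumber(int tag) {
        return tag >>> 3;
    }

    /** Recovers the wire type from a tag (the low three bits). */
    static int wireType(int tag) {
        return tag & 0x7;
    }

    public static void main(String[] args) {
        // Field numbers below mirror a schema with "string name = 1;" and "int32 id = 2;".
        System.out.printf("name tag: 0x%02X%n", makeTag(1, WIRE_LENGTH_DELIMITED)); // 0x0A
        System.out.printf("id tag:   0x%02X%n", makeTag(2, WIRE_VARINT));           // 0x10
        // Renumbering id from 2 to 5 changes the tag, so existing readers
        // would no longer recognize those bytes as the id field.
        System.out.printf("id tag if renumbered to 5: 0x%02X%n", makeTag(5, WIRE_VARINT)); // 0x28
    }
}
```

Because the number, not the field name, is what ends up in the bytes, renaming a field is safe for the wire format while renumbering it is not.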
Generated Code
protoc converts .proto definitions into classes in various languages (Java, Python, C++, Go, etc.). These classes provide getters, setters, and builder patterns to manipulate field values.
Serialization
A simple .proto file might define a Person message with nested fields and an AddressBook that holds multiple Person messages:
syntax = "proto3";

message Person {
  string name = 1;
  int32 id = 2;
  string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    string number = 1;
    PhoneType type = 2;
  }

  repeated PhoneNumber phones = 4;
}

message AddressBook {
  repeated Person people = 1;
}
Person has the fields name, id, email, and repeated phones.
PhoneNumber is defined inside Person.
AddressBook holds multiple Person objects via a repeated field.
Compile the schema with protoc:
protoc --java_out=. addressbook.proto
Output classes (for example, in Java) will include Person, Person.PhoneNumber, Person.PhoneType, and AddressBook.
Usage in Java (example):
// Build an immutable Person with one nested PhoneNumber.
Person person = Person.newBuilder()
    .setName("Alice")
    .setId(123)
    .setEmail("alice@example.com")
    .addPhones(
        Person.PhoneNumber.newBuilder()
            .setNumber("555-1234")
            .setType(Person.PhoneType.HOME)
    )
    .build();

// Serialization: encode the message into Protobuf's binary format.
byte[] data = person.toByteArray();

// Deserialization: parseFrom throws InvalidProtocolBufferException on malformed input.
Person parsedPerson = Person.parseFrom(data);
System.out.println(parsedPerson.getName()); // "Alice"
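For intuition about what toByteArray() produces, the following sketch hand-encodes a Person that has only name and id set, using just the Java standard library. WireSketch and its helper methods are hypothetical names, not part of the Protobuf API, and the varint writer assumes non-negative values. Treat it as an illustration of the wire format under those assumptions, not a replacement for the generated code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Illustrative only: writes two Person fields in Protobuf's binary layout by hand. */
final class WireSketch {
    /** Writes a non-negative int as a base-128 varint (7 bits per byte, low group first). */
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write(value);
    }

    /** Writes a string field: tag (wire type 2), byte length, then UTF-8 bytes. */
    static void writeString(ByteArrayOutputStream out, int fieldNumber, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (fieldNumber << 3) | 2);
        writeVarint(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    /** Writes a non-negative int32 field: tag (wire type 0), then the value as a varint. */
    static void writeInt32(ByteArrayOutputStream out, int fieldNumber, int value) {
        writeVarint(out, (fieldNumber << 3) | 0);
        writeVarint(out, value);
    }

    /** Encodes name = 1 and id = 2, matching the field numbers in the Person schema. */
    static byte[] encodePerson(String name, int id) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, 1, name);
        writeInt32(out, 2, id);
        return out.toByteArray();
    }

    /** Formats bytes as space-separated uppercase hex. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: 0A 05 41 6C 69 63 65 10 7B
        System.out.println(hex(encodePerson("Alice", 123)));
    }
}
```

The nine bytes are the tag 0x0A, the length 5, the five UTF-8 bytes of "Alice", the tag 0x10, and the value 123 (0x7B). No field names appear in the output.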
Fast to parse compared to JSON or XML due to the binary encoding approach.
Language-Neutral
Protobuf supports many languages and platforms, making it flexible for cross-language communication.
Backward/Forward Compatibility
Each field’s unique numeric tag enables easy evolution of the schema.
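One way to see why numeric tags make schema evolution workable is that a reader can skip any field number it does not recognize, because the wire type in the tag tells it how many bytes to step over. The sketch below plays the part of an older reader that knows only name = 1 and id = 2. The class name SkipUnknownSketch and the extra field 9 in the sample bytes are made up for illustration, and the code omits the bounds checks a real parser would need.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: an "old" reader that ignores fields added by a newer schema. */
final class SkipUnknownSketch {
    private final byte[] buf;
    private int pos;

    // The only fields this reader knows about.
    String name = "";
    int id;

    private SkipUnknownSketch(byte[] buf) {
        this.buf = buf;
    }

    /** Reads a base-128 varint at the current position and advances past it. */
    private long readVarint() {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = buf[pos++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            shift += 7;
        }
    }

    /** Steps over a field's payload using only its wire type. */
    private void skip(int wireType) {
        switch (wireType) {
            case 0: readVarint(); break;  // varint
            case 1: pos += 8; break;      // fixed 64-bit
            case 2: {                     // length-delimited
                int len = (int) readVarint();
                pos += len;
                break;
            }
            case 5: pos += 4; break;      // fixed 32-bit
            default: throw new IllegalArgumentException("unsupported wire type " + wireType);
        }
    }

    /** Parses known fields and silently skips everything else. */
    static SkipUnknownSketch parse(byte[] data) {
        SkipUnknownSketch p = new SkipUnknownSketch(data);
        while (p.pos < data.length) {
            int tag = (int) p.readVarint();
            int fieldNumber = tag >>> 3;
            int wireType = tag & 0x7;
            if (fieldNumber == 1 && wireType == 2) {
                int len = (int) p.readVarint();
                p.name = new String(data, p.pos, len, StandardCharsets.UTF_8);
                p.pos += len;
            } else if (fieldNumber == 2 && wireType == 0) {
                p.id = (int) p.readVarint();
            } else {
                p.skip(wireType); // unknown to this reader: step over it
            }
        }
        return p;
    }

    public static void main(String[] args) {
        // name = "Alice" (field 1), a hypothetical newer string field 9 = "x", id = 123 (field 2).
        byte[] fromNewerWriter = {
            0x0A, 0x05, 'A', 'l', 'i', 'c', 'e',
            0x4A, 0x01, 'x',
            0x10, 0x7B
        };
        SkipUnknownSketch p = parse(fromNewerWriter);
        System.out.println(p.name + " " + p.id); // prints: Alice 123
    }
}
```

The same mechanism works in the other direction: a newer reader that receives older bytes simply never sees the added field and falls back to its default value.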
Schema-Driven
Often used with gRPC, which builds on HTTP/2 and Protobuf for efficient remote procedure calls.
Persistent Storage
Helps with metadata storage, saving configurations, or logging events with minimal space overhead.
Mobile and IoT
| Aspect | Protobuf | JSON |
|---|---|---|
| Encoding | Binary | Text (UTF-8, etc.) |
| Readability | Not human-readable | Human-readable (plain text) |
| Size & Performance | Smaller, faster to parse | Larger, slower to parse |
| Schema Definition | Required (.proto files) | Not required (schemaless) |
| Evolution | Facilitated by numeric tags (forward/backward) | Relies on optional fields or manual versioning |
| Tooling | Protobuf compiler needed, specialized libraries | Widespread support, easy debugging with text format |
Choose JSON if easy debugging, simplicity, or direct human editing is a priority.
Choose Protobuf if efficiency, strict schema, or large-scale message passing is crucial.
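Part of the size difference in the comparison above comes from how integers are stored. JSON writes a number as decimal text, while Protobuf's binary format stores most integer fields as variable-length integers (varints) that take one byte per 7 bits of value. The following sketch, with the hypothetical class name VarintSketch, encodes a few sample values and prints both sizes. It covers unsigned values only and is meant as a rough illustration, not a benchmark.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative only: compares varint byte counts with decimal text lengths. */
final class VarintSketch {
    /** Encodes a value as a base-128 varint, treating the long as unsigned. */
    static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    /** Decodes a single varint that starts at the first byte of the array. */
    static long decode(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) break; // high bit clear: last byte
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        long[] samples = {1L, 127L, 128L, 300L, 1_000_000L};
        for (long v : samples) {
            System.out.printf("%d: varint %d byte(s), decimal text %d char(s)%n",
                    v, encode(v).length, Long.toString(v).length());
        }
        // 300 encodes as the two bytes AC 02; 1000000 takes 3 bytes versus 7 characters of text.
    }
}
```

The gap widens further once field names are counted, since JSON repeats each key as text in every object while Protobuf writes only the numeric tag.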
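Keep naming in your .proto files consistent with your project’s conventions. Note that the Protobuf style guide recommends snake_case field names regardless of the target language; protoc then generates accessors in each language's idiom (for example, camelCase getters and setters in Java).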