Why gRPC
REST + JSON works well for public APIs, but causes friction in internal microservices:
- Schema-less: the contract is implicit and breaks on refactor
- Text-based JSON: slower to parse, more bytes on the wire
- No streaming: real-time events require polling, SSE, or WebSocket
- No code generation: you write clients by hand for each language
gRPC (developed at Google, open-sourced in 2015) addresses all four:
- Contract via .proto: strong typing, schema-driven
- Protobuf wire format: binary, compact, fast
- HTTP/2 transport: multiplexing, binary framing, [[http2-internals|streaming]]
- Code generation: one .proto produces clients in 11 languages
- Built-in deadlines, retries, load balancing
Where it is used:
- Internal microservices (Google, Netflix, Square, Lyft)
- Service mesh (Istio Pilot talks to Envoy over gRPC)
- Kubernetes (kubelet talks to the CRI runtime over gRPC)
- CockroachDB, etcd, dgraph for internal RPC
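gRPC is not suited to the public web. Browsers speak HTTP/2, but browser
APIs such as fetch() do not give JavaScript access to HTTP/2 frames or
trailers, so a page cannot make a raw gRPC call directly.
gRPC-Web solves this with a client library plus a proxy (for example Envoy) that translates to regular gRPC.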
.proto - the contract language
syntax = "proto3";
package myservice;
option go_package = "github.com/me/myservice/pb";
service UserService {
  rpc GetUser(GetUserRequest) returns (User);
  rpc ListUsers(ListUsersRequest) returns (stream User);
  rpc CreateUsers(stream CreateUserRequest) returns (CreateUsersResponse);
  rpc Chat(stream ChatMessage) returns (stream ChatMessage);
}
message GetUserRequest {
  string user_id = 1;
}
message User {
  string id = 1;
  string email = 2;
  int64 created_at = 3;
  repeated string roles = 4;
}
Each field has:
- Type (string/int32/int64/bool/bytes/message/enum, optionally wrapped in repeated or map)
- Tag (1, 2, 3; the wire format uses the tag, not the field name)
- Label and options (the optional label; options such as deprecated and json_name)
Backward compatibility: adding a field with a new tag is safe; reusing an old tag breaks existing clients.
Code generation
Each language has a protoc plugin:
# Go
protoc --go_out=. --go-grpc_out=. user.proto
# Python
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. user.proto
# TypeScript (ts-proto plugin, assuming it is installed locally via npm)
protoc --plugin=./node_modules/.bin/protoc-gen-ts_proto --ts_proto_out=. user.proto
The generator produces:
- Data structures for each message (GetUserRequest, User, ...)
- Stub for the client (UserServiceClient)
- Interface for the server (UserServiceServer - implement this)
The output is idiomatic code, not reflection-based, so it is fast.
Four RPC types
1. Unary - like REST
rpc GetUser(GetUserRequest) returns (User);
The client sends one request and receives one response. The large majority of gRPC calls in microservices are unary.
resp, err := client.GetUser(ctx, &pb.GetUserRequest{UserId: "u123"})
2. Server streaming
rpc ListUsers(ListUsersRequest) returns (stream User);
The client sends one request; the server sends multiple responses over the same stream. The stream closes when the server is done.
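stream, err := client.ListUsers(ctx, &pb.ListUsersRequest{})
if err != nil { return err }
for {
    user, err := stream.Recv()
    if err == io.EOF { break }   // the server finished the stream
    if err != nil { return err } // any other error aborts the loop
    handle(user)
}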
Use cases: pagination, exports, server-sent events.
3. Client streaming
rpc CreateUsers(stream CreateUserRequest) returns (CreateUsersResponse);
The client sends multiple requests; the server replies once after the client closes the stream. Use cases: bulk upload, file streaming.
4. Bidirectional streaming
rpc Chat(stream ChatMessage) returns (stream ChatMessage);
Both sides send independently. Similar to WebSocket, but with schema-typed messages. Use cases: chat, real-time collaboration, Envoy xDS streaming config updates.
Wire format - HTTP/2 + protobuf
A gRPC call on the wire:
HEADERS
:method = POST
:scheme = https
:path = /myservice.UserService/GetUser
:authority = api.example.com
content-type = application/grpc
grpc-encoding = gzip
grpc-accept-encoding = identity, gzip
user-agent = grpc-go/1.50.0
DATA
[00] [00 00 00 24] [<protobuf-encoded GetUserRequest>]
 |        |                        |
 |        |                        +-- 36-byte payload
 |        +-- length (big-endian uint32)
 +-- compressed-flag (0=no, 1=yes)
HEADERS (trailers)
grpc-status = 0
grpc-message =
- Path = /<package>.<Service>/<Method>
- content-type = application/grpc (or application/grpc+proto)
- DATA frame carries a gRPC message frame: 1-byte compressed-flag + 4-byte length + payload (protobuf bytes)
- trailers (HEADERS after DATA) carry grpc-status and grpc-message
Status codes (grpc-status) are separate from HTTP codes:
- 0 = OK
- 1 = CANCELLED
- 2 = UNKNOWN
- 3 = INVALID_ARGUMENT
- 4 = DEADLINE_EXCEEDED
- 5 = NOT_FOUND
- 13 = INTERNAL
- 14 = UNAVAILABLE (retryable)
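The HTTP/2 :status is normally 200 even when the gRPC call fails; the outcome is reported in grpc-status. A non-200 status usually comes from a proxy or load balancer in the path rather than from the gRPC server itself.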
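Since the HTTP status says little about the outcome, a client has to look at grpc-status from the trailers. The sketch below covers only the codes listed above (the full set runs from 0 to 16); describeStatus and callFailed are hypothetical helper names, not part of any gRPC library.

```go
package main

import "fmt"

// statusNames holds only the codes listed in the text above.
var statusNames = map[int]string{
	0:  "OK",
	1:  "CANCELLED",
	2:  "UNKNOWN",
	3:  "INVALID_ARGUMENT",
	4:  "DEADLINE_EXCEEDED",
	5:  "NOT_FOUND",
	13: "INTERNAL",
	14: "UNAVAILABLE",
}

// describeStatus returns the symbolic name for a grpc-status value,
// or a placeholder for codes missing from the table.
func describeStatus(code int) string {
	if name, ok := statusNames[code]; ok {
		return name
	}
	return fmt.Sprintf("CODE(%d)", code)
}

// callFailed treats a call as failed unless both layers report success.
func callFailed(httpStatus, grpcStatus int) bool {
	return httpStatus != 200 || grpcStatus != 0
}

func main() {
	// HTTP says 200, yet the call failed with UNAVAILABLE.
	fmt.Println(describeStatus(14), callFailed(200, 14)) // UNAVAILABLE true
}
```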
TLS - mutual auth and certs
gRPC normally runs over TLS (grpc.WithTransportCredentials). For
microservices, mTLS with [[tls-certificates|client certificates]] is
the standard:
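creds := credentials.NewTLS(&tls.Config{
    Certificates: []tls.Certificate{clientCert}, // client certificate presented to the server
    RootCAs:      caPool,                        // CAs trusted to sign the server certificate
})
conn, err := grpc.Dial("api:443", grpc.WithTransportCredentials(creds))
if err != nil {
    log.Fatalf("dial: %v", err)
}
In a service mesh (Istio, Linkerd), mTLS is handled by the sidecar; the app sends plaintext to localhost:5500.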
Deadlines, metadata, retry
Deadlines
Every client RPC should set a deadline (an absolute point in time after which the call is cancelled); without one, a call can wait indefinitely. Pass it through context:
ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
defer cancel()
client.GetUser(ctx, ...)
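The deadline is propagated to the server via the grpc-timeout header, which carries the time remaining.
When it expires, the server-side context is cancelled, so the handler can stop work and the call fails with DEADLINE_EXCEEDED instead of wasting
resources. If the handler passes the same context to its own outgoing calls, the deadline propagates downstream as well.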
Metadata - custom headers
md := metadata.Pairs("auth-token", "secret123", "x-request-id", "abc")
ctx = metadata.NewOutgoingContext(ctx, md)
On the server:
md, _ := metadata.FromIncomingContext(ctx)
tokens := md.Get("auth-token") // returns []string; empty if the key is absent
Retry config
Configured via service config (JSON in connection options):
{
  "methodConfig": [{
    "name": [{"service": "myservice.UserService"}],
    "retryPolicy": {
      "maxAttempts": 3,
      "initialBackoff": "0.1s",
      "maxBackoff": "1s",
      "backoffMultiplier": 2,
      "retryableStatusCodes": ["UNAVAILABLE"]
    }
  }]
}
Load balancing
gRPC supports client-side LB via a resolver + LB policy:
- The resolver gets the list of backends (DNS, etcd, xDS)
- The LB policy picks a backend for each RPC (round_robin, pick_first)
The standard Kubernetes pattern is a headless Service + DNS resolver. A ClusterIP Service breaks gRPC LB: gRPC uses one long-lived HTTP/2 connection, so kube-proxy balances at the TCP level only, and all requests land on the same pod.
The alternative is xDS-based balancing (via Envoy/gRPC xDS API), a proxyless service mesh.
gRPC vs REST/JSON - comparison
| Property | REST/JSON | gRPC |
|---|---|---|
| Contract | OpenAPI (optional) | .proto (required) |
| Wire format | text JSON | binary protobuf |
| Wire size | baseline | 30-60% smaller |
| Parse speed | slower (~5x) | faster |
| Streaming | via SSE/WebSocket | first-class |
| Browser support | native | via gRPC-Web proxy |
| Schema evolution | manual | controlled via tag |
| Tooling | curl, Postman | grpcurl, BloomRPC |
| Use case | public API | internal microservices |
A common architecture: REST for the external API, gRPC for internal
services. A gateway translates between them (grpc-gateway).
gRPC and Kubernetes Ingress
The default NGINX Ingress supports gRPC via an annotation:
metadata:
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
Envoy/Istio has native gRPC support and is more flexible for:
- RPC path matching (/myservice.UserService/*)
- Per-method routing and rate limiting
- Retries on specific status codes
See [[kubernetes-services-and-ingress|k8s services and ingress]] for the fundamentals.
Troubleshooting
- UNAVAILABLE: connection refused: the server is not listening or there is a network issue. Check with kubectl exec ... grpcurl host:port list.
- All requests hit one pod: a ClusterIP service breaks gRPC LB. Switch to a headless service + DNS resolver.
- UNIMPLEMENTED: the method is not registered on the server. Verify both sides use the same .proto.
- Deadline immediately exceeded: the context.WithTimeout deadline had already passed before the call was made. Check your context chain.
- INTERNAL: stream terminated by RST_STREAM: an HTTP/2-level error, usually a proxy timeout (Envoy idle timeout is 1h, NGINX is 60s). Use tcpdump + tshark -Y http2.
- max-message size exceeded: the default gRPC receive limit is 4 MB. Raise it with grpc.MaxRecvMsgSize(64<<20) on the server, or grpc.MaxCallRecvMsgSize on the client.
- HPACK errors: see http2-internals, often a bug in the server library.
Useful tools
- grpcurl: curl for gRPC, works with server-reflection or a .proto file
- buf: modern Protobuf builder and linter (replaces protoc)
- BloomRPC, Postman: GUI clients for interactive testing
- ghz: load testing (like hey/wrk for gRPC)
- server-reflection: the server describes its own RPCs so grpcurl works without a .proto file