Istio

What is Istio?

Istio is an open-source service mesh platform that provides a uniform way to secure, connect, and observe microservices.

Why Use Istio?

In a microservices architecture, applications are decomposed into many small services that communicate over the network. This introduces challenges:

Service-to-service communication: Managing secure, reliable communication between hundreds of services
Observability: Understanding traffic flow, latency, and failures across services
Security: Enforcing authentication, authorization, and encryption between services
Traffic management: Implementing advanced routing, load balancing, and resilience patterns
Policy enforcement: Applying consistent policies across all services

Istio solves these challenges by providing a transparent infrastructure layer that sits between your services and the network, handling cross-cutting concerns without requiring changes to application code.

Key Benefits

Traffic control: Fine-grained control over traffic routing and behavior
Security: Automatic mTLS, authentication, and authorization
Observability: Metrics, logs, and distributed tracing out-of-the-box
Resilience: Circuit breakers, retries, timeouts, and fault injection
Policy enforcement: Centralized policy management and rate limiting

Service Mesh Fundamentals

What is a Service Mesh?

A service mesh is a dedicated infrastructure layer for managing service-to-service communication.

┌───────────────────────────────────────────────────────────────┐  
│                        Service Mesh                           │  
│                                                               │  
│  ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐     │  
│  │Service A│    │Service B│    │Service C│    │Service D│     │  
│  │  ┌───┐  │    │  ┌───┐  │    │  ┌───┐  │    │  ┌───┐  │     │  
│  │  │App│  │    │  │App│  │    │  │App│  │    │  │App│  │     │  
│  │  └───┘  │    │  └───┘  │    │  └───┘  │    │  └───┘  │     │  
│  │  ┌───┐  │    │  ┌───┐  │    │  ┌───┐  │    │  ┌───┐  │     │  
│  │  │ P │<─┼────┼─>│ P │<─┼────┼─>│ P │<─┼────┼─>│ P │  │     │  
│  │  └───┘  │    │  └───┘  │    │  └───┘  │    │  └───┘  │     │  
│  └─────────┘    └─────────┘    └─────────┘    └─────────┘     │  
│       ↑              ↑              ↑              ↑          │  
│       └──────────────┴──────────────┴──────────────┘          │  
│                  Proxy Network (Data Plane)                   │  
│                           ↕                                   │  
│                  ┌──────────────────┐                         │  
│                  │  Control Plane   │                         │  
│                  │   (istiod)       │                         │  
│                  └──────────────────┘                         │  
└───────────────────────────────────────────────────────────────┘  
  
P = Proxy (Envoy)

The service mesh consists of:

Data Plane: Network of proxies that handle all inter-service communication
Control Plane: Manages and configures the proxies

Istio Architecture

Istio’s architecture is divided into two main components:

High-Level Architecture

┌─────────────────────────────────────────────────────────────────┐  
│                          Istio Mesh                             │  
│                                                                 │  
│  ┌───────────────────────────────────────────────────────────┐  │  
│  │                    Control Plane                          │  │  
│  │                                                           │  │  
│  │              ┌───────────────────────┐                    │  │  
│  │              │      istiod           │                    │  │  
│  │              │  ┌─────────────────┐  │                    │  │  
│  │              │  │ Pilot           │  │  Configuration     │  │  
│  │              │  │ - Service Disc. │  │  & Service         │  │  
│  │              │  │ - Traffic Mgmt  │  │  Discovery         │  │  
│  │              │  └─────────────────┘  │                    │  │  
│  │              │  ┌─────────────────┐  │                    │  │  
│  │              │  │ Citadel (CA)    │  │  Certificate       │  │  
│  │              │  │ - mTLS/PKI      │  │  Management        │  │  
│  │              │  └─────────────────┘  │                    │  │  
│  │              │  ┌─────────────────┐  │                    │  │  
│  │              │  │ Galley          │  │  Config            │  │  
│  │              │  │ - Validation    │  │  Validation        │  │  
│  │              │  └─────────────────┘  │                    │  │  
│  │              └───────────────────────┘                    │  │  
│  │                         ↓                                 │  │  
│  └─────────────────────────┼─────────────────────────────────┘  │  
│                            ↓                                    │  
│  ┌───────────────────────────────────────────────────────────┐  │  
│  │                    Data Plane                             │  │  
│  │                                                           │  │  
│  │  ┌──────────┐      ┌──────────┐      ┌──────────┐         │  │  
│  │  │  Pod A   │      │  Pod B   │      │  Pod C   │         │  │  
│  │  │ ┌──────┐ │      │ ┌──────┐ │      │ ┌──────┐ │         │  │  
│  │  │ │ App  │ │      │ │ App  │ │      │ │ App  │ │         │  │  
│  │  │ └──────┘ │      │ └──────┘ │      │ └──────┘ │         │  │  
│  │  │ ┌──────┐ │      │ ┌──────┐ │      │ ┌──────┐ │         │  │  
│  │  │ │Envoy │<┼──────┼>│Envoy │<┼──────┼>│Envoy │ │         │  │  
│  │  │ │Proxy │ │      │ │Proxy │ │      │ │Proxy │ │         │  │  
│  │  │ └──────┘ │      │ └──────┘ │      │ └──────┘ │         │  │  
│  │  └──────────┘      └──────────┘      └──────────┘         │  │  
│  └───────────────────────────────────────────────────────────┘  │  
└─────────────────────────────────────────────────────────────────┘

Control Plane (istiod)

In modern Istio (1.5+), the control plane is consolidated into a single binary called istiod, which includes:

Pilot

Service Discovery: Maintains a registry of all services and their endpoints
Traffic Management: Converts high-level routing rules into Envoy configurations
Configuration Distribution: Pushes configurations to all Envoy proxies
Supports: A/B testing, canary deployments, traffic splitting, circuit breakers, retries, timeouts

Citadel (Certificate Authority)

Certificate Management: Issues and rotates X.509 certificates for workloads
Identity: Provides strong identity to each service
mTLS: Enables automatic mutual TLS encryption between services
SPIFFE: Implements SPIFFE standard for service identity

Galley

Configuration Validation: Validates user-authored Istio configuration
Configuration Ingestion: Processes and distributes configuration to istiod
Abstraction: Isolates istiod from underlying platform (Kubernetes, VMs)

Mixer (Deprecated)

Note: Mixer has been deprecated and removed in Istio 1.7+
Previously handled:
- Access control and policy checks
- Telemetry data collection
- These functions are now handled by Envoy proxies directly (via WASM extensions)

Data Plane

The data plane consists of Envoy proxies deployed alongside each service:

Envoy Proxy: High-performance C++ proxy originally built by Lyft
Sidecar Pattern: In sidecar mode, each pod gets an Envoy container
Traffic Interception: All inbound/outbound traffic goes through the proxy
Capabilities:
- Dynamic service discovery
- Load balancing
- TLS termination
- HTTP/2 and gRPC proxying
- Circuit breakers
- Health checks
- Staged rollouts with percentage-based traffic splits
- Fault injection
- Rich metrics

Deployment Modes

Istio supports multiple deployment modes to fit different use cases and requirements.

1. Sidecar Mode (Traditional)

In sidecar mode, Istio deploys an Envoy proxy container alongside each application pod:

┌────────────────────────────────────────┐  
│            Kubernetes Pod              │  
│                                        │  
│  ┌──────────────┐    ┌──────────────┐  │  
│  │              │    │              │  │  
│  │ Application  │◄───┤ Envoy Proxy  │  │  
│  │  Container   │    │  (istio-proxy)  │  
│  │              │    │              │  │  
│  └──────────────┘    └───────┬──────┘  │  
│                              │         │  
└──────────────────────────────┼─────────┘  
                               │  
                        All traffic flows  
                        through proxy

Characteristics:

Each pod gets its own Envoy proxy sidecar container
Proxy intercepts all inbound and outbound traffic using iptables rules
Full Layer 7 (HTTP/gRPC) capabilities per pod
Pros: Complete feature set, mature, well-tested
Cons: Higher resource overhead (one proxy per pod)

Use cases:

Production environments requiring full L7 features
Applications needing advanced traffic management
When resource overhead is acceptable

2. Ambient Mode (Sidecarless)

Ambient mode is a newer deployment model that reduces resource overhead by eliminating per-pod sidecars:

┌────────────────────────────────────────────────────────────────┐  
│                         Kubernetes Node                        │  
│                                                                │  
│  ┌──────┐  ┌──────┐  ┌──────┐  ┌──────┐                        │  
│  │ Pod  │  │ Pod  │  │ Pod  │  │ Pod  │  (No sidecars!)        │  
│  │      │  │      │  │      │  │      │                        │  
│  └───┬──┘  └───┬──┘  └───┬──┘  └───┬──┘                        │  
│      │         │         │         │                           │  
│      └─────────┴─────────┴─────────┘                           │  
│                      │                                         │  
│              ┌───────▼────────┐                                │  
│              │    ztunnel     │  Layer 4 (per-node)            │  
│              │  (L4 Proxy)    │  - mTLS                        │  
│              └───────┬────────┘  - Basic routing               │  
│                      │                                         │  
└──────────────────────┼─────────────────────────────────────────┘  
                       │  
                       ▼  
           ┌───────────────────────┐  
           │  Waypoint Proxy       │  Layer 7 (per-namespace)  
           │  (Optional)           │  - Advanced routing  
           │  - Full L7 features   │  - Traffic policies  
           └───────────────────────┘

Ambient mode has two components:

ztunnel (Zero Trust Tunnel)

Runs as a DaemonSet (one per node)
Handles Layer 4 traffic (TCP)
Provides:
Mutual TLS (mTLS) encryption
Basic authentication and authorization
Telemetry at L4
Lightweight and efficient

Waypoint Proxies

Optional per-namespace Envoy proxies
Provide Layer 7 (HTTP/gRPC) features
Deploy only when you need advanced capabilities:
Complex routing rules
Request-level policies
HTTP header manipulation
Fault injection
Advanced observability

Characteristics:

No sidecar containers in application pods
Significantly reduced resource consumption
Gradual adoption of L7 features (opt-in per namespace)
Pros: Lower resource overhead, simpler upgrades
Cons: Newer (less mature), limited L7 features without waypoint proxies

Use cases:

Large-scale deployments where resource efficiency is critical
Environments with many simple services
Gradual migration from no mesh to full mesh

Comparison: Sidecar vs Ambient

Aspect	Sidecar Mode	Ambient Mode
Resource overhead	High (proxy per pod)	Low (shared proxies)
L7 features	Always available	Opt-in via waypoint
Maturity	Stable, production-ready	Newer (Istio 1.15+)
Upgrade complexity	Rolling pod restarts	Simpler (node-level)
Best for	Feature-rich environments	Large-scale, cost-sensitive

Core Components

Istio uses several Custom Resource Definitions (CRDs) to configure service mesh behavior. Understanding these resources is essential for effective traffic management.

Configuration Resources Overview

External Traffic  
       ↓  
┌──────────────┐  
│   Gateway    │  ← Defines ports/hosts for mesh entry  
└──────┬───────┘  
       ↓  
┌──────────────┐  
│VirtualService│  ← Routing rules (where to send traffic)  
└──────┬───────┘  
       ↓  
┌──────────────┐  
│DestinationRule  ← Policies (how to handle traffic)  
└──────┬───────┘  
       ↓  
┌──────────────┐  
│   Service    │  ← Kubernetes Service  
└──────┬───────┘  
       ↓  
┌──────────────┐  
│     Pod      │  ← Application workload  
└──────────────┘

1. VirtualService (Traffic Routing)

VirtualService defines routing rules that specify how requests are routed to services within the mesh.

Key capabilities:

Route traffic based on HTTP headers, URI paths, source labels
Split traffic across multiple service versions (for canary deployments)
Add timeouts, retries, and fault injection
Redirect and rewrite URLs

Example use cases:

Route 90% of traffic to v1 and 10% to v2 (canary testing)
Route requests with header user: premium to a special backend
Add automatic retries on connection failures

apiVersion: networking.istio.io/v1beta1  
kind: VirtualService  
metadata:  
  name: reviews-route  
spec:  
  hosts:  
  - reviews.default.svc.cluster.local  
  http:  
  - match:  
    - headers:  
        user-type:  
          exact: premium  
    route:  
    - destination:  
        host: reviews  
        subset: v2  
  - route:  
    - destination:  
        host: reviews  
        subset: v1  
      weight: 90  
    - destination:  
        host: reviews  
        subset: v2  
      weight: 10  
    retries:  
      attempts: 3  
      retryOn: "5xx,reset,connect-failure"  
    timeout: 5s

2. DestinationRule (Traffic Policies)

DestinationRule defines policies that apply to traffic after routing has occurred. These are policies for the “real” destination.

Key capabilities:

Define service subsets (versions) based on labels
Configure load balancing algorithms
Set up connection pool settings
Enable/configure mutual TLS
Configure circuit breakers and outlier detection

Example use cases:

Define subsets for different versions (v1, v2, v3)
Use least-connection load balancing
Enable circuit breaker to prevent cascading failures

apiVersion: networking.istio.io/v1beta1  
kind: DestinationRule  
metadata:  
  name: reviews-destination  
spec:  
  host: reviews  
  trafficPolicy:  
    loadBalancer:  
      simple: LEAST_CONN  
    connectionPool:  
      tcp:  
        maxConnections: 100  
      http:  
        http1MaxPendingRequests: 50  
        http2MaxRequests: 100  
    outlierDetection:  
      consecutiveErrors: 5  
      interval: 30s  
      baseEjectionTime: 30s  
  subsets:  
  - name: v1  
    labels:  
      version: v1  
  - name: v2  
    labels:  
      version: v2  
    trafficPolicy:  
      loadBalancer:  
        simple: ROUND_ROBIN

Load Balancing Algorithms

Istio supports multiple load balancing strategies:

Algorithm	Description	Use Case
`ROUND_ROBIN`	Distributes requests evenly in rotation	Default, works well for homogeneous backends
`LEAST_CONN`	Sends to backend with fewest active connections	Backends with varying load capacity
`LEAST_REQUEST`	Sends to backend with fewest active requests	HTTP/2 and gRPC workloads
`RANDOM`	Randomly selects a backen	Simple, low-overhead distribution
`PASSTHROUGH`	Forwards without load balancin	Direct connection scenarios
`CONSISTENT_HASH`	Hash-based distribution (sticky sessions)	Session affinity requirements

Connection Pool Settings

LoadBalancerSettings options:

simple: Standard algorithms (ROUND_ROBIN, LEAST_CONN, etc.)
consistentHash: Hash-based routing for session affinity
localityLbSetting: Locality-aware load balancing (prefer local endpoints)
warmupDurationSecs: Gradually increase traffic to new endpoints instead of sending full load immediately

3. Gateway (Mesh Entry/Exit Points)

Gateway configures a load balancer operating at the edge of the mesh for receiving incoming or outgoing HTTP/TCP connections.

Key capabilities:

Define external entry points (ingress) or exit points (egress)
Configure ports, protocols, and TLS settings
Attach to specific gateway deployments using selectors
Support for mutual TLS (mTLS) authentication

Example use cases:

Expose services to external clients via HTTPS
Configure mTLS for client certificate authentication
Set up egress gateway for controlled external API access

apiVersion: networking.istio.io/v1beta1  
kind: Gateway  
metadata:  
  name: my-gateway  
  namespace: istio-system  
spec:  
  selector:  
    istio: ingressgateway  # Selects the ingress gateway pods  
  servers:  
  - port:  
      number: 443  
      name: https  
      protocol: HTTPS  
    hosts:  
    - "myapp.example.com"  
    tls:  
      mode: SIMPLE  
      credentialName: myapp-tls-cert

4. ServiceEntry (External Services)

ServiceEntry enables adding external services (outside the mesh) into Istio’s internal service registry.

Key capabilities:

Add external APIs or databases to the mesh
Apply mesh policies to external services
Control and monitor traffic to external endpoints

Example use cases:

Integrate external payment APIs with mesh policies
Apply retries and timeouts to external database connections
Monitor traffic to third-party services

apiVersion: networking.istio.io/v1beta1  
kind: ServiceEntry  
metadata:  
  name: external-payment-api  
spec:  
  hosts:  
  - api.payment-provider.com  
  ports:  
  - number: 443  
    name: https  
    protocol: HTTPS  
  location: MESH_EXTERNAL  
  resolution: DNS

5. Sidecar (Proxy Configuration)

Sidecar resource fine-tunes the configuration of sidecar proxies attached to workloads.

Key capabilities:

Limit the set of services a sidecar can reach
Optimize resource usage by reducing configuration size
Control inbound and outbound traffic behavior

Example use cases:

Reduce memory footprint in large meshes
Restrict which services a workload can communicate with
Improve proxy startup time

apiVersion: networking.istio.io/v1beta1  
kind: Sidecar  
metadata:  
  name: default  
  namespace: my-app  
spec:  
  egress:  
  - hosts:  
    - "./*"  # Only allow traffic within same namespace  
    - "istio-system/*"

Traffic Management

Traffic management is one of Istio’s core features, enabling sophisticated control over service-to-service communication.

Request Routing

Control where traffic goes based on various criteria:

Path-based routing:

http:  
- match:  
  - uri:  
      prefix: /api/v1  
  route:  
  - destination:  
      host: service-v1  
- match:  
  - uri:  
      prefix: /api/v2  
  route:  
  - destination:  
      host: service-v2

Header-based routing:

http:  
- match:  
  - headers:  
      x-api-version:  
        exact: "2.0"  
  route:  
  - destination:  
      host: service-v2

Traffic Splitting (Canary Deployments)

Gradually shift traffic from old version to new version:

http:  
- route:  
  - destination:  
      host: reviews  
      subset: v1  
    weight: 80  
  - destination:  
      host: reviews  
      subset: v2  
    weight: 20

Deployment strategy:

Deploy v2 alongside v1
Route 10% → v2, 90% → v1
Monitor metrics and errors
Gradually increase v2 traffic: 25%, 50%, 75%, 100%
Decommission v1

Timeouts and Retries

Timeouts prevent requests from hanging indefinitely:

http:  
- route:  
  - destination:  
      host: my-service  
  timeout: 5s

Retries automatically retry failed requests:

http:  
- route:  
  - destination:  
      host: my-service  
  retries:  
    attempts: 3  
    perTryTimeout: 2s  
    retryOn: "5xx,reset,connect-failure,refused-stream"

Circuit Breaking

Prevent cascading failures by limiting connections to unhealthy services:

trafficPolicy:  
  connectionPool:  
    tcp:  
      maxConnections: 100  
    http:  
      http1MaxPendingRequests: 50  
      maxRequestsPerConnection: 5  
  outlierDetection:  
    consecutiveErrors: 5  
    interval: 30s  
    baseEjectionTime: 30s  
    maxEjectionPercent: 50

How it works:

Service starts experiencing errors
After 5 consecutive errors, Istio ejects the endpoint for 30s
Gradual recovery: endpoint gets limited traffic to test health
If healthy, fully restored; if not, ejected again

Fault Injection (Chaos Engineering)

Test application resilience by injecting faults:

Delay injection (simulate slow networks):

http:  
- fault:  
    delay:  
      percentage:  
        value: 10  
      fixedDelay: 5s  
  route:  
  - destination:  
      host: my-service

Abort injection (simulate service failures):

http:  
- fault:  
    abort:  
      percentage:  
        value: 20  
      httpStatus: 503  
  route:  
  - destination:  
      host: my-service

Traffic Mirroring (Shadowing)

Send copy of live traffic to a test service without affecting production:

http:  
- route:  
  - destination:  
      host: service-v1  
  mirror:  
    host: service-v2  
  mirrorPercentage:  
    value: 50

Use cases:

Test new version with real traffic without risk
Compare performance between versions
Validate refactored services

Security

Istio provides multiple layers of security for microservices.

Mutual TLS (mTLS)

Automatic mTLS encrypts all service-to-service communication and provides strong identity.

┌──────────┐                                    ┌──────────┐  
│ Service A│                                    │ Service B│  
│  ┌────┐  │                                    │  ┌────┐  │  
│  │App │  │                                    │  │App │  │  
│  └──┬─┘  │                                    │  └─┬──┘  │  
│     │    │                                    │    │     │  
│  ┌──▼──┐ │  1. Establish mTLS connection      │  ┌─▼───┐ │  
│  │Envoy│─┼────────────────────────────────────┼─►│Envoy│ │  
│  │     │ │  2. Verify certificates (both ways)│  │     │ │  
│  │     │◄┼────────────────────────────────────┼──│     │ │  
│  │     │ │  3. Encrypted communication        │  │     │ │  
│  └─────┘ │◄────────────────────────────────►  │  └─────┘ │  
└──────────┘                                    └──────────┘  
     │                                                 │  
     └────────────── Citadel (CA) ─────────────────────┘  
              (Issues & rotates certificates)

Configuration modes:

apiVersion: security.istio.io/v1beta1  
kind: PeerAuthentication  
metadata:  
  name: default  
  namespace: my-namespace  
spec:  
  mtls:  
    mode: STRICT  # Options: STRICT, PERMISSIVE, DISABLE

STRICT: Only accept mTLS connections
PERMISSIVE: Accept both mTLS and plaintext (for migration)
DISABLE: Disable mTLS

Authorization Policies

Control who can access what services:

apiVersion: security.istio.io/v1beta1  
kind: AuthorizationPolicy  
metadata:  
  name: frontend-policy  
  namespace: default  
spec:  
  selector:  
    matchLabels:  
      app: frontend  
  action: ALLOW  
  rules:  
  - from:  
    - source:  
        principals: ["cluster.local/ns/default/sa/api-gateway"]  
    to:  
    - operation:  
        methods: ["GET", "POST"]  
        paths: ["/api/*"]

Common patterns:

Allow only specific services to call an API
Restrict HTTP methods (e.g., only GET and POST)
Deny access to admin endpoints except from specific namespaces

Request Authentication (JWT)

Validate JWT tokens from external identity providers:

apiVersion: security.istio.io/v1beta1  
kind: RequestAuthentication  
metadata:  
  name: jwt-auth  
spec:  
  selector:  
    matchLabels:  
      app: api-service  
  jwtRules:  
  - issuer: "https://auth.example.com"  
    jwksUri: "https://auth.example.com/.well-known/jwks.json"

Use cases:

Validate OAuth2/OIDC tokens
Enforce authentication for external API calls
Extract user identity from JWT claims

Observability

Istio automatically generates telemetry for all traffic in the mesh without requiring application changes.

Three Pillars of Observability

┌───────────────────────────────────────────────────────────┐  
│                    Observability Stack                    │  
│                                                           │  
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐        │  
│  │   Metrics   │  │    Logs     │  │   Traces    │        │  
│  │             │  │             │  │             │        │  
│  │ Prometheus  │  │    Fluentd  │  │    Jaeger   │        │  
│  │   Grafana   │  │     ELK     │  │    Zipkin   │        │  
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘        │  
│         │                │                │               │  
│         └────────────────┼────────────────┘               │  
│                          │                                │  
│              ┌───────────▼───────────┐                    │  
│              │   Envoy Proxies       │                    │  
│              │  (Generate telemetry) │                    │  
│              └───────────────────────┘                    │  
└───────────────────────────────────────────────────────────┘

1. Metrics

Istio automatically collects:

Request rate: Requests per second
Request latency: P50, P90, P95, P99 percentiles
Error rate: 4xx and 5xx responses
Request size: Bytes sent/received

Key metrics:

istio_requests_total: Total request count
istio_request_duration_milliseconds: Request latency
istio_request_bytes: Request size
istio_response_bytes: Response size

Golden Signals (RED method):

Rate: Requests per second
Errors: Percentage of failed requests
Duration: Request latency distribution

2. Distributed Tracing

Track requests as they flow through multiple services:

User Request → API Gateway → Auth Service → Product Service → DB  
                  20ms          15ms            100ms         50ms  
   │──────────────────────────────────────────────────────────│  
                     Total Latency: 185ms

Trace components:

Trace: End-to-end request journey
Span: Single operation (e.g., one service call)
Tags: Metadata (HTTP method, status code, etc.)

Integration: Jaeger, Zipkin

Visualize request flow
Identify bottlenecks
Debug latency issues

3. Access Logs

Envoy generates detailed access logs:

{  
  "start_time": "2024-01-15T10:30:00.000Z",  
  "method": "GET",  
  "path": "/api/products",  
  "response_code": 200,  
  "duration": 45,  
  "upstream_service": "products.default.svc.cluster.local",  
  "user_agent": "Mozilla/5.0...",  
  "request_id": "abc-123-def-456"  
}

Service Discovery and Endpoints

Istio’s Pilot component provides automatic service discovery:

How it works:

Kubernetes creates Endpoints for each Service
Pilot watches Kubernetes API for changes
Pilot pushes updated endpoint information to all Envoy proxies
Proxies use this info for load balancing

Real-world example:

# Service definition  
$ kubectl -n myapp get svc myapp-test  
NAME           TYPE        CLUSTER-IP       PORT(S)  
myapp-test   ClusterIP   10.100.229.132   8443/TCP  
  
# Endpoints (actual pod IPs)  
$ kubectl -n myapp get endpoints myapp-test  
NAME           ENDPOINTS  
myapp-test   240.48.67.221:8080,240.48.69.154:8080  
  
# Pods backing the service  
$ kubectl -n myapp get pods -o wide  
NAME                            READY   IP  
myapp-test-78ddbd8c64-9bkzb   3/3     240.48.69.154  
myapp-test-78ddbd8c64-tsnfw   3/3     240.48.67.221

Dynamic updates:

Pod scales up → New endpoint added → Pilot updates all proxies
Pod becomes unhealthy → Endpoint removed → Traffic stops routing to it
Zero-downtime deployments

Advanced Concepts

Graceful Termination and Connection Draining

When pods are terminated, ensure graceful shutdown:

Termination Flow:  
  App Container → istio-proxy → ingress-gateway → Load Balancer

Key settings:

terminationGracePeriodSeconds (Pod level): Time Kubernetes waits before killing pod
drainDuration (Istio): Time Envoy waits before closing connections

apiVersion: networking.istio.io/v1beta1  
kind: EnvoyFilter  
metadata:  
  name: drain-duration  
spec:  
  configPatches:  
  - applyTo: CLUSTER  
    patch:  
      operation: MERGE  
      value:  
        drain_connections_on_host_removal: true

Best practices:

Set terminationGracePeriodSeconds: 30 (or higher)
Configure drainDuration to allow connections to complete
Use preStop hooks to delay SIGTERM
Implement health check endpoints

Multi-Cluster Mesh

Connect services across multiple Kubernetes clusters:

┌──────────────┐         ┌──────────────┐  
│  Cluster A   │         │  Cluster B   │  
│              │         │              │  
│ ┌──────────┐ │         │ ┌──────────┐ │  
│ │ Service A│ │◄───────►│ │ Service B│ │  
│ └──────────┘ │         │ └──────────┘ │  
│              │         │              │  
│   istiod-A   │         │   istiod-B   │  
└──────────────┘         └──────────────┘  
       │                        │  
       └────────────────────────┘  
            Shared control plane  
            (or federated)

Deployment models:

Single control plane: One istiod manages multiple clusters
Multi-primary: Each cluster has its own control plane
Primary-remote: One primary, others are remote

Locality-Aware Load Balancing

Route traffic to nearby services first:

trafficPolicy:  
  loadBalancer:  
    localityLbSetting:  
      enabled: true  
      distribute:  
      - from: "us-west/us-west-1/*"  
        to:  
          "us-west/us-west-1/*": 80  
          "us-west/us-west-2/*": 20  
      failover:  
      - from: "us-west/us-west-1/*"  
        to: "us-east/us-east-1/*"

Benefits:

Reduced latency (same region/zone)
Lower data transfer costs
High availability (automatic failover)

WebAssembly (WASM) Extensions

Extend Envoy proxy with custom logic:

apiVersion: extensions.istio.io/v1alpha1  
kind: WasmPlugin  
metadata:  
  name: custom-auth  
spec:  
  selector:  
    matchLabels:  
      app: api-service  
  url: oci://my-registry/custom-auth-plugin:v1.0  
  phase: AUTHN

Use cases:

Custom authentication/authorization
Request/response transformation
Rate limiting
Custom telemetry

Real-World Examples

This section walks through practical examples using actual Kubernetes resources, demonstrating how traffic flows through an Istio service mesh.

Complete Traffic Flow

Understanding the full request path through Istio:

External Client  
      ↓  
   NLB (Network Load Balancer)  
      ↓  
┌─────────────────────────────────────────────────────────────────┐  
│                    Kubernetes Cluster                           │  
│                                                                 │  
│  ┌────────────────────────────────────────────────────────────┐ │  
│  │ Istio Ingress Gateway (app-ingress-gateway)                │ │  
│  │        (Envoy proxy deployment)                            │ │  
│  │    • Listens on configured ports (e.g., 8443)              │ │  
│  │    • Receives configuration from istiod                    │ │  
│  └────────────────────┬───────────────────────────────────────┘ │  
│                       ↓                                         │  
│  ┌────────────────────────────────────────────────────────────┐ │  
│  │ Gateway Resource                                           │ │  
│  │   • Defines ports, protocols, TLS settings                 │ │  
│  │   • Selects ingress gateway pods via label selector        │ │  
│  └────────────────────┬───────────────────────────────────────┘ │  
│                       ↓                                         │  
│  ┌────────────────────────────────────────────────────────────┐ │  
│  │ VirtualService                                             │ │  
│  │   • Matches incoming requests (host, path, headers)        │ │  
│  │   • Defines routing rules and destinations                 │ │  
│  │   • Configures retries, timeouts                           │ │  
│  └────────────────────┬───────────────────────────────────────┘ │  
│                       ↓                                         │  
│  ┌────────────────────────────────────────────────────────────┐ │  
│  │ DestinationRule (Optional)                                 │ │  
│  │   • Defines subsets (versions)                             │ │  
│  │   • Load balancing policies                                │ │  
│  │   • Connection pool settings                               │ │  
│  └────────────────────┬───────────────────────────────────────┘ │  
│                       ↓                                         │  
│  ┌────────────────────────────────────────────────────────────┐ │  
│  │ Kubernetes Service                                         │ │  
│  │   • ClusterIP with stable DNS name                         │ │  
│  │   • Selects pods via label selectors                       │ │  
│  │   • Maps service port to container targetPort              │ │  
│  └────────────────────┬───────────────────────────────────────┘ │  
│                       ↓                                         │  
│  ┌────────────────────────────────────────────────────────────┐ │  
│  │ Endpoints                                                  │ │  
│  │   • Dynamic list of pod IPs and ports                      │ │  
│  │   • Automatically updated as pods scale/fail               │ │  
│  └────────────────────┬───────────────────────────────────────┘ │  
│                       ↓                                         │  
│  ┌────────────────────────────────────────────────────────────┐ │  
│  │ Application Pod                                            │ │  
│  │  ┌──────────────┐     ┌──────────────┐                     │ │  
│  │  │  istio-proxy │     │ Application  │                     │ │  
│  │  │   (Envoy)    │────►│  Container   │                     │ │  
│  │  └──────────────┘     └──────────────┘                     │ │  
│  └────────────────────────────────────────────────────────────┘ │  
└─────────────────────────────────────────────────────────────────┘

Key points:

Istio Ingress Gateway is a pod running Envoy proxy (not a hardcoded config)
Gateway, VirtualService, DestinationRule are configuration objects that tell the proxies how to route
istiod (control plane) pushes all configurations to Envoy proxies at runtime
Configurations are dynamic and can be updated without restarting pods

Example 1: Istio Control Plane Components

Viewing the Istio system components:

$ kubectl -n istio-system get pods -o wide  
NAME                                   READY   STATUS    IP              NODE  
app-ingress-gateway-68945bdbd7-5dxxr   1/1     Running   240.48.71.138   ip-240-48-71-119  
app-ingress-gateway-68945bdbd7-jkbzv   1/1     Running   240.48.68.10    ip-240-48-69-0  
app-ingress-gateway-68945bdbd7-lj5fl   1/1     Running   240.48.67.64    ip-240-48-67-148  
istiod-fd589774b-2cl2l                 1/1     Running   240.48.71.251   ip-240-48-71-119  
istiod-fd589774b-dd2xw                 1/1     Running   240.48.69.34    ip-240-48-69-0  
istiod-fd589774b-k5dgc                 1/1     Running   240.48.67.115   ip-240-48-67-148

Observations:

app-ingress-gateway: Multiple replicas (3) for high availability
Each pod is an Envoy proxy acting as the entry point
Distributed across different nodes for fault tolerance
Target IPs are registered with external load balancer
istiod: Control plane component (3 replicas for HA)
Manages configuration for all proxies
Provides service discovery and certificate management

Example 2: Gateway Configuration

Defining an ingress gateway with mutual TLS:

$ kubectl -n istio-system get gateway myapp-mgt-qa1-usw2 -o yaml

apiVersion: networking.istio.io/v1beta1  
kind: Gateway  
metadata:  
  name: myapp-mgt-qa1-usw2  
  namespace: istio-system  
spec:  
  selector:  
    istio: app-ingress-gateway    # Selects ingress gateway pods with this label  
  servers:  
  - hosts:  
    - myapp-qa1-usw2.mgt-nonprod.myorg.com  # Hostname to handle  
    port:  
      name: https-mutual  
      number: 8443  
      protocol: HTTPS  
    tls:  
      mode: MUTUAL                # Requires client certificate authentication  
      minProtocolVersion: TLSV1_2  
      maxProtocolVersion: TLSV1_3  
      serverCertificate: /etc/istio/tls/tls.crt  
      privateKey: /etc/istio/tls/tls.key  
      caCertificates: /etc/myorg/ca/myorg_corp_auth_ca1.pem

Key observations:

selector: Uses label selector to identify which ingress gateway pods handle this config
hosts: Defines the hostname this gateway will accept traffic for
port: Listens on port 8443 for HTTPS traffic
tls.mode = MUTUAL: Requires both server and client certificates (strong authentication)
Certificates: Mounted from Kubernetes secrets/config maps into the gateway pods
Wildcard support: If hosts = ["*"], gateway accepts any hostname

Example 3: VirtualService Routing

VirtualService defines routing rules to forward requests to backend services:

$ kubectl -n myapp get vs myapp-test-vs -o yaml

apiVersion: networking.istio.io/v1beta1  
kind: VirtualService  
metadata:  
  name: myapp-test-vs  
  namespace: myapp  
spec:  
  gateways:  
  - istio-system/myapp-mgt-qa1-usw2    # References the Gateway (cross-namespace)  
  hosts:  
  - myapp-test.myapp.svc.cluster.local           # Internal DNS name  
  - myapp-qa1-usw2.mgt-nonprod.myorg.com # External hostname (matches Gateway)  
  http:  
  - retries:  
      attempts: 1  
      retryOn: connect-failure,refused-stream  
    route:  
    - destination:  
        host: myapp-test    # Routes to Service named "myapp-test"

Key observations:

gateways: Links to the Gateway resource (can be in different namespace)
hosts: Accepts requests for both internal and external hostnames
myapp-test.myapp.svc.cluster.local: Internal mesh traffic
myapp-qa1-usw2.mgt-nonprod.myorg.com: External traffic (must match Gateway host)
retries: Automatically retry failed requests (resilience)
destination.host: Routes to Kubernetes Service name (not pod directly)

How they work together:

External request arrives at Gateway with hostname myapp-qa1-usw2...
Gateway accepts it (hostname matches its configuration)
VirtualService matches the hostname and applies routing rules
Traffic is routed to the myapp-test Service

Example 4: Kubernetes Service and Endpoints

The Service provides stable networking and service discovery:

$ kubectl -n myapp get svc myapp-test  
NAME           TYPE        CLUSTER-IP       PORT(S)  
myapp-test   ClusterIP   10.100.229.132   8443/TCP

$ kubectl -n myapp get svc myapp-test -o yaml  
apiVersion: v1  
kind: Service  
metadata:  
  name: myapp-test  
  namespace: myapp  
spec:  
  type: ClusterIP  
  clusterIP: 10.100.229.132  
  ports:  
  - name: http  
    port: 8443         # Port exposed by the Service (client-facing)  
    protocol: TCP  
    targetPort: 8080   # Port on the Pod container  
  selector:  
    app.kubernetes.io/instance: myapp-myapp    # Selects pods with these labels  
    app.kubernetes.io/name: myapp-test

Key observations:

ClusterIP: Virtual IP accessible only within the cluster
port vs targetPort:
port: 8443: Clients connect to Service on this port
targetPort: 8080: Service forwards to pod containers on this port
Allows decoupling external API from internal implementation
selector: Labels that identify which pods receive traffic

Endpoints (automatically managed by Kubernetes):

$ kubectl -n myapp get endpoints myapp-test  
NAME           ENDPOINTS  
myapp-test   240.48.67.221:8080,240.48.69.154:8080

Key observations:

Endpoints list shows actual pod IPs and ports
Dynamically updated as pods scale, restart, or fail
Istio’s Pilot watches these endpoints and configures Envoy proxies

Example 5: Application Pods

The actual workload running the application:

$ kubectl -n myapp get pods -o wide  
NAME                            READY   STATUS    IP              NODE  
myapp-test-78ddbd8c64-9bkzb   3/3     Running   240.48.69.154   ip-240-48-69-0  
myapp-test-78ddbd8c64-tsnfw   3/3     Running   240.48.67.221   ip-240-48-67-148

Notice: READY 3/3 indicates 3 containers per pod:

Application container: Your app code
istio-proxy: Envoy sidecar injected by Istio
istio-init: Init container that sets up iptables rules (completed, not counted in READY)

Inspecting a pod:

$ kubectl -n myapp get pod myapp-test-78ddbd8c64-9bkzb -o yaml

Key Istio-specific annotations in pod metadata:

annotations:  
  # Istio sidecar injection  
  sidecar.istio.io/inject: "true"                # Enables automatic sidecar injection  
  istio.io/rev: default                          # Istio control plane revision  
  
  # Proxy configuration  
  proxy.istio.io/config: |  
    holdApplicationUntilProxyStarts: true        # App waits for proxy to be ready  
  
  # Security and identity  
  security.istio.io/tlsMode: istio               # Uses Istio mTLS  
  service.istio.io/canonical-name: myapp-test  
  service.istio.io/canonical-revision: latest  
  
  # Prometheus metrics  
  prometheus.io/scrape: "true"  
  prometheus.io/port: "15020"                    # Envoy metrics port  
  prometheus.io/path: /stats/prometheus  
  
labels:  
  app.kubernetes.io/instance: myapp-myapp        # Matches Service selector  
  app.kubernetes.io/name: myapp-test             # Matches Service selector
  security.istio.io/tlsMode: istio

How pod IPs match endpoints:

# Pod IPs  
240.48.69.154  
240.48.67.221  
  
# Endpoints  
240.48.67.221:8080,240.48.69.154:8080  
  
# Service routes to these endpoints  
# Envoy proxies receive these endpoint IPs from Pilot  
# Load balancing happens across these pod IPs

Example 6: Complete Request Flow

Scenario: External client makes HTTPS request to https://myapp-qa1-usw2.mgt-nonprod.myorg.com/api/data

Step-by-step flow:

External Load Balancer (NLB):
- Client DNS resolves to NLB IP
- NLB forwards to one of the ingress gateway pod IPs (e.g., 240.48.71.138:8443)
Istio Ingress Gateway Pod:
- Envoy proxy receives the request
- Checks Gateway resource: hostname matches myapp-qa1-usw2...
- Performs mTLS termination using configured certificates
- Validates client certificate (mutual TLS)
VirtualService Matching:
- Envoy checks VirtualService resources
- Finds match: hostname myapp-qa1-usw2... → routes to myapp-test
- Applies retry policy: retry on connect-failure,refused-stream
Service Resolution:
- Resolves myapp-test service to ClusterIP 10.100.229.132
- Pilot has pushed endpoint list to Envoy: [240.48.67.221:8080, 240.48.69.154:8080]
Load Balancing:
- No DestinationRule → uses default ROUND_ROBIN
- Selects one pod IP (e.g., 240.48.69.154:8080)
Pod Sidecar (istio-proxy):
- Request arrives at pod’s Envoy sidecar (240.48.69.154:15006)
- Sidecar applies mTLS (encrypts with destination cert)
- Forwards to application container on port 8080
Application Container:
- Receives request on localhost:8080
- Processes request and returns response
Response Path (reverse of request path):
- App → Sidecar → Ingress Gateway → NLB → Client

Key insights:

Every hop involves an Envoy proxy (except NLB)
Configuration is dynamic (no restarts needed for changes)
mTLS is automatic and transparent to the application
Observability data collected at each proxy

Summary and Best Practices

When to Use Istio

Good fit:

Large-scale microservices (50+ services)
Need for advanced traffic management (canary, A/B testing)
Security requirements (zero-trust, mTLS)
Polyglot environments (multiple languages/frameworks)
Complex observability needs

Not a good fit:

Simple applications (few services)
Performance-critical with tight latency budgets
Small teams without operational expertise
Limited infrastructure resources

Best Practices

Start incrementally:
- Begin with sidecar injection for observability
- Gradually add traffic management features
- Consider ambient mode for resource efficiency
Security:
- Enable STRICT mTLS in production
- Use AuthorizationPolicies for fine-grained access control
- Regularly rotate certificates (automated by Citadel)
Traffic management:
- Always define retries and timeouts
- Use circuit breakers to prevent cascading failures
- Test canary deployments with small traffic percentages first
Performance:
- Use Sidecar resources to limit proxy configuration size
- Monitor resource usage (Envoy memory/CPU)
- Consider ambient mode for large-scale deployments
Observability:
- Integrate with Prometheus and Grafana for metrics
- Set up distributed tracing (Jaeger/Zipkin)
- Configure appropriate access log formats
Operations:
- Version control all Istio configurations
- Test configuration changes in non-production first
- Implement graceful termination (drainDuration, terminationGracePeriodSeconds)
- Use revision-based upgrades for control plane

Common Troubleshooting

Service not reachable:

Check sidecar injection: kubectl get pod <name> -o jsonpath='{.spec.containers[*].name}'
Verify VirtualService hosts match Gateway hosts
Ensure Service selector matches pod labels

mTLS errors:

Check PeerAuthentication mode (STRICT vs PERMISSIVE)
Verify certificate expiration
Ensure both sides have Istio proxies

High latency:

Check for unnecessary retries
Review timeout configurations
Monitor Envoy resource usage
Consider connection pool tuning

Configuration not applying:

Validate with istioctl analyze
Check istiod logs for errors
Verify proxy can reach istiod (network policies)

Additional Resources

Official Documentation: https://istio.io/latest/docs/
Istio GitHub: https://github.com/istio/istio

Last updated on November 17, 2025

Service Mesh