Microservice Governance - Routing Patterns

These patterns deal with how client services discover the locations of server-side services and are routed to them. In a cloud-based application, you might have hundreds of microservices, each running up to hundreds of instances. You need to abstract away the physical IP addresses of these services and give each service a single point of entry so that you can consistently enforce connection and security policies for all service calls. You also need the ability to route traffic dynamically based on conditions. This article covers two patterns: Service Discovery and Service Routing.

Routing Pattern-1: Service Discovery

Questions To Solve

How do we make microservices discoverable so that client services can find them without hardcoding their locations? Furthermore, how do we ensure that misbehaving instances of a service are removed from the pool of available instances?

Common Design

In any distributed architecture, we need to find the physical address of the machine where a service runs. This concept has been around since the beginning of distributed computing and is formally known as service discovery. It is critical to cloud-based microservice applications for two reasons:

  • It lets you quickly scale service instances horizontally up and down without service consumers being aware of the scaling.
  • It helps increase application resiliency by removing unhealthy or unavailable instances from the pool of available instances.

A more robust approach to implementing this pattern is to have a Service Discovery Service and use Client-side Load Balancing. Figure RP-1 shows the flow of a typical service discovery process.

It is better than DNS plus a load balancer on the server side for the following reasons:

  • No single point of failure: the service IPs are cached in the client's local memory, so even if the DNS, the ELB, or both are down, the client can still call services whose addresses it has already cached.
  • More choices of load-balancing algorithm. Most notably, the client-side LB/proxy can observe the performance of individual instances and balance the load across them accordingly.
  • Higher availability of the Service Discovery service itself, because it is a distributed storage system made up of a cluster of nodes. The nodes synchronize data with each other quickly using a distributed-system protocol, such as a gossip protocol.
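To make the second point concrete, here is a minimal sketch of a performance-aware client-side balancer. It is illustrative only: the class name `LatencyAwareBalancer`, the smoothing factor `alpha`, and the addresses are assumptions, and real proxies use more elaborate algorithms.

```python
class LatencyAwareBalancer:
    """Picks the instance with the lowest smoothed response time (hypothetical helper)."""

    def __init__(self, addresses, alpha=0.3):
        self.alpha = alpha                            # weight given to the newest sample
        self.latency = {a: None for a in addresses}   # None means "not measured yet"

    def record(self, address, seconds):
        # Keep an exponentially weighted moving average of observed latencies.
        previous = self.latency[address]
        if previous is None:
            self.latency[address] = seconds
        else:
            self.latency[address] = (1 - self.alpha) * previous + self.alpha * seconds

    def pick(self):
        # Unmeasured instances sort first so that every instance gets probed once.
        return min(
            self.latency,
            key=lambda a: (self.latency[a] is not None, self.latency[a] or 0.0),
        )


balancer = LatencyAwareBalancer(["10.0.0.1:8080", "10.0.0.2:8080"])
balancer.record("10.0.0.1:8080", 0.200)
balancer.record("10.0.0.2:8080", 0.050)
print(balancer.pick())  # the faster instance, 10.0.0.2:8080
```

A server-side load balancer cannot easily make this decision per caller, because it does not see the latency each client actually experiences.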

The following diagram shows how client services never need direct knowledge of a service's IP address and instead obtain it from a service discovery service.

  1. When a service instance comes online, it registers its IP address with a service discovery agent.
  2. A service's location can be looked up by its logical name through the service discovery agent.
  3. Service discovery nodes share service instance health information with each other.
  4. Services send heartbeats to the service discovery agent. If an instance dies, the service discovery service removes the IP of the “dead” instance.
  5. When a client service needs to call a service, it checks a local cache for the service instance IPs. Load balancing between service instances happens on the client side.
  6. If the client finds a service IP in the cache, it uses it. Otherwise, it queries the service discovery service and then sends the request directly to the returned instance IP.
  7. Periodically, the client-side cache is refreshed from the service discovery service.
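The steps above can be sketched in a few lines of Python. This is a toy, in-memory model under stated assumptions: `Registry` and `Client` are hypothetical names, registration and heartbeat are merged into one call, the `now` parameter only exists to make the example deterministic, and step 3 (nodes sharing health information) is not modeled.

```python
import itertools
import time


class Registry:
    """Toy service discovery agent: logical name -> {address: last heartbeat time}."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.instances = {}

    def heartbeat(self, name, address, now=None):
        # Steps 1 and 4: the first heartbeat registers, later ones keep the entry alive.
        now = time.monotonic() if now is None else now
        self.instances.setdefault(name, {})[address] = now

    def lookup(self, name, now=None):
        # Step 2: resolve a logical name, dropping instances whose heartbeat expired.
        now = time.monotonic() if now is None else now
        alive = {a: t for a, t in self.instances.get(name, {}).items()
                 if now - t <= self.ttl_seconds}
        self.instances[name] = alive
        return sorted(alive)


class Client:
    """Client with a local cache and round-robin load balancing (steps 5 to 7)."""

    def __init__(self, registry):
        self.registry = registry
        self.cache = {}

    def refresh(self, name, now=None):
        # Step 7: called periodically to replace the cached addresses.
        addresses = self.registry.lookup(name, now)
        self.cache[name] = itertools.cycle(addresses) if addresses else None
        return addresses

    def pick(self, name, now=None):
        # Steps 5 and 6: use the cache, and fall back to the registry on a miss.
        if self.cache.get(name) is None:
            if not self.refresh(name, now):
                raise LookupError(f"no healthy instance for {name}")
        return next(self.cache[name])


registry = Registry(ttl_seconds=10)
registry.heartbeat("sw-foo-service", "10.0.0.1:8080", now=0)
registry.heartbeat("sw-foo-service", "10.0.0.2:8080", now=0)
client = Client(registry)
print([client.pick("sw-foo-service", now=1) for _ in range(3)])
```

Once the cache is populated, the client keeps working even if the registry is unreachable, which is the resiliency argument made earlier.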

Implementation In AWS App Mesh

We will leverage AWS Cloud Map to implement this pattern. To configure outlier detection effectively for a set of instance endpoints, the service discovery method should be AWS Cloud Map. With DNS-based discovery, the Envoy proxy would select only a single IP address for routing to the upstream service, which nullifies outlier detection because there is no set of hosts from which an unhealthy host can be ejected. Refer to the Service Discovery method section for more details on how the Envoy proxy behaves for each service discovery type.
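The idea behind outlier detection can be sketched as follows. This is a simplified model, not Envoy's actual algorithm: `OutlierDetector` and `max_server_errors` are illustrative names, only consecutive 5xx responses are counted, and ejected hosts are never restored.

```python
class OutlierDetector:
    """Ejects a host after too many consecutive server errors (simplified sketch)."""

    def __init__(self, hosts, max_server_errors=5):
        self.max_server_errors = max_server_errors
        self.errors = {h: 0 for h in hosts}
        self.ejected = set()

    def record(self, host, status_code):
        if status_code >= 500:
            self.errors[host] += 1
            if self.errors[host] >= self.max_server_errors:
                self.ejected.add(host)
        else:
            self.errors[host] = 0  # a successful response resets the streak

    def healthy_hosts(self):
        return [h for h in self.errors if h not in self.ejected]


detector = OutlierDetector(["10.0.0.1", "10.0.0.2"], max_server_errors=2)
detector.record("10.0.0.1", 503)
detector.record("10.0.0.1", 503)
print(detector.healthy_hosts())  # only 10.0.0.2 remains
```

The sketch also shows why a single resolved address defeats the mechanism: with only one known host, there is no alternative left to route to after an ejection.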

The prerequisite is that a Kubernetes Deployment and Service have been declared for the service you want to register and discover. This gives the service a cluster-local FQDN of the form $service-name.$namespace.svc.cluster.local, which clients can use as the host name when calling the service.
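As a small illustration of the naming pattern, the helper below builds the FQDN from a service name and namespace. The function name is hypothetical, and it assumes the default cluster domain `cluster.local`.

```python
def cluster_local_fqdn(service_name, namespace):
    """Build $service-name.$namespace.svc.cluster.local (default cluster domain assumed)."""
    return f"{service_name}.{namespace}.svc.cluster.local"


print(cluster_local_fqdn("sw-foo-service", "sw-foo-service"))
# sw-foo-service.sw-foo-service.svc.cluster.local
```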

After that, create one VirtualNode for each set of instances as follows:

  • Declare the App Mesh VirtualNode with the same metadata.name and podSelector as the corresponding Kubernetes Deployment.
  • Configure a Listener for the inbound traffic the VirtualNode expects, specifying its Port and Protocol. Currently, App Mesh only supports configuring one Listener; refer to the Listener API Reference to confirm this limitation.
  • Specify the Health Check policy for the Listener.
  • Declare the Service Discovery attributes in the VirtualNode definition.

It also helps to have a naming convention that describes the mapping between business product type, Cloud Map namespace, and Cloud Map service name. The following table is an example:

Product Type   | Cloud Map Namespace                 | Cloud Map Service
Car Driver     | car-driver.${env}.uber.aws.local    | ${microservice-name}
Car Passenger  | car-passenger.${env}.uber.aws.local | ${microservice-name}
Food Store     | food-store.${env}.uber.aws.local    | ${microservice-name}
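A convention like this is easy to encode so that names are generated rather than typed by hand. The sketch below is one possible encoding: the function name and the `org_domain` parameter are assumptions, and the slug rule (lowercase, spaces to hyphens) is inferred from the example rows.

```python
def cloud_map_names(product_type, env, microservice_name, org_domain="uber.aws.local"):
    """Return (namespace, service) following the example naming convention."""
    slug = "-".join(product_type.lower().split())  # "Car Driver" -> "car-driver"
    return f"{slug}.{env}.{org_domain}", microservice_name


print(cloud_map_names("Car Driver", "prod", "sw-foo-service"))
# ('car-driver.prod.uber.aws.local', 'sw-foo-service')
```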

For example, to isolate API and web requests, we deploy a service, say sw-foo-service, as two separate sets of instances: one serves API traffic, which is expected to have higher throughput, and the other serves web traffic, which has lower throughput but larger queries. Service discovery for this scenario involves the following resources:

  • VirtualNode For Default Traffic Channel

apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: default
  listeners:
    - portMapping:
        port: 8080
        protocol: http
      healthCheck:
        port: 8080
        protocol: http
        path: '/status'
        healthyThreshold: 2
        unhealthyThreshold: 3
        timeoutMillis: 2000
        intervalMillis: 5000
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
      - key: traffic-channel
        value: default

  • VirtualNode For Web Traffic Channel
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service-web
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
  listeners:
  - portMapping:
      port: 8080
      protocol: http
    healthCheck:
      port: 8080
      protocol: http
      path: '/status'
      healthyThreshold: 2
      unhealthyThreshold: 3
      timeoutMillis: 2000
      intervalMillis: 5000
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
      - key: traffic-channel
        value: web
  • Deployment For Default Traffic Channel
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: default
  template:
    metadata:
      labels:
       app: sw-foo-service
       traffic-channel: default
    spec:
      containers:
       - name: sw-foo-service
         image: sw-foo-service-ecr:BUILD-29
         ports:
           - containerPort: 8080
         env:
           - name: "SERVER_PORT"
             value: "8080"
           - name: "COLOR"
             value: "blue"
  • Deployment For Web Traffic Channel
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service-web
  namespace: sw-foo-service
spec:
  replicas: 6
  selector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
  template:
    metadata:
      labels:
       app: sw-foo-service
       traffic-channel: web
    spec:
      containers:
       - name: sw-foo-service
         image: sw-foo-service-ecr:BUILD-29
         ports:
           - containerPort: 8080
         env:
           - name: "SERVER_PORT"
             value: "8080"
           - name: "COLOR"
             value: "blue"

Routing Pattern-2: Service Routing

Questions To Solve

How do we provide a single entry point for all microservices so that security policies and routing rules are applied uniformly? How do we ensure that developers do not have to come up with their own solutions for routing to other services or service instances? Service Routing includes the following four aspects:

  1. Dynamic routing - Route traffic to different sets of instances of a specific service based on data in the incoming request, such as headers. P-6 patterns would rely on this ability heavily.
  2. Static routing - Put all external service calls behind a single URL or URL prefix and map those calls to the actual services.
  3. Admission control - Check callers against the cross-cutting concerns of all services, such as authentication, authorization, anti-scraping, or access restrictions, in a centralized place.
  4. Metrics collection and logging - Collect metrics for all incoming requests and ensure that key pieces of information are present on every user request and response so that logs can be correlated across all downstream services.

Common Design

Dynamic routing may need the help of a proxy such as Nginx, HAProxy, or Envoy, especially for routing between internal microservices. We leverage capabilities these proxies provide, such as weight-based, path-based, or header-based routing, to implement it.

The other three requirements (static routing, admission control, and metrics collection) can be covered by an API Gateway. Please see the Microservice Patterns website for more details; here we only discuss dynamic routing from the traffic governance perspective.

Implementation In AWS App Mesh

We need to use VirtualService, VirtualRouter, and VirtualNode together to implement this pattern:

  • Create VirtualNodes for the multiple sets of instances that serve different traffic.
  • Create a VirtualRouter that declares how to split the traffic based on routing rules, that is, how much traffic goes to which VirtualNode.
  • Create a VirtualService, named after the Kubernetes Service, that hides the routing rules from clients.

The most important App Mesh API for implementing this pattern is RouteSpec, which supports routes for multiple protocols (grpcRoute, http2Route, httpRoute, and tcpRoute) and lets you prioritize route rules. Currently, our implementation relies heavily on httpRoute. An httpRoute uses match objects to specify the criteria for matching requests and action objects to determine what to do with a match.

Let's walk through each kind of traffic splitting with an example that demonstrates how to implement it.

Weight-based Routing

This routing method could be used to implement a Canary Deployment solution.
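Conceptually, weight-based routing sends each request to one target with a probability proportional to its weight. The sketch below simulates a 99/1 split. The target names `production` and `canary` are placeholders, and the random selection only stands in for what the proxy does, it is not the proxy's implementation.

```python
import random
from collections import Counter

WEIGHTED_TARGETS = {"production": 99, "canary": 1}  # hypothetical target names


def pick_target(targets, rng):
    """Choose one target with probability proportional to its weight."""
    names = list(targets)
    weights = [targets[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(42)  # seeded so that the simulation is repeatable
counts = Counter(pick_target(WEIGHTED_TARGETS, rng) for _ in range(10_000))
print(counts)  # roughly 1% of the requests land on the canary
```

Raising the canary weight step by step (for example 1, 10, 50, 100) while watching error rates is the usual way to roll a release forward with this mechanism.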

The resource types involved and their manifests are as follows:

  • Service #FQDN: sw-foo-service.sw-foo-service.svc.cluster.local
apiVersion: v1
kind: Service
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  ports:
    - protocol: TCP
      port: 8080
  • VirtualService
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualService
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  awsName: sw-foo-service.sw-foo-service.svc.cluster.local
  provider:
    virtualRouter:
      virtualRouterRef:
        name: sw-foo-service-router
  • VirtualRouter
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualRouter
metadata:
  name: sw-foo-service-router
  namespace: sw-foo-service
spec:
  listeners:
   - portMapping:
      port: 8080
      protocol: http
  routes:
   - name: canary-route
     httpRoute:
       match:
         prefix: / # which means all of the traffic to this service
       action:
         weightedTargets:
           - virtualNodeRef:
               name: sw-foo-service-web          # must match the production VirtualNode name
             weight: 99
           - virtualNodeRef:
               name: sw-foo-service-web-canary   # must match the canary VirtualNode name
             weight: 1
  • VirtualNode - Production one for the last functional release
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service-web
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
  listeners:
  - portMapping:
      port: 8080
      protocol: http
    healthCheck:
      ...
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
      - key: traffic-channel
        value: web
  • VirtualNode - Canary one for the next release
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service-web-canary   # same name as the canary Deployment
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
      sw-canary-release: "true"     # label values must be strings, so quote the boolean
  listeners:
  - portMapping:
      port: 8080
      protocol: http
    healthCheck:
      ...
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
      - key: traffic-channel
        value: web
  • Deployment - Production one for the last functional release
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service-web
  namespace: sw-foo-service
spec:
  replicas: 6
  selector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
  template:
    metadata:
      labels:
        app: sw-foo-service
        traffic-channel: web
    spec:
      containers:
      - name: sw-foo-service
        image: sw-foo-service-ecr:BUILD-29
        ports:
        - containerPort: 8080
        env:
        - name: "SERVER_PORT"
          value: "8080"
        - name: "COLOR"
          value: "blue"
  • Deployment - Canary one for the next release
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service-web-canary
  namespace: sw-foo-service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
      sw-canary-release: "true"   # label values must be strings
  template:
   metadata:
     labels:
       app: sw-foo-service
       traffic-channel: web
       sw-canary-release: "true"
   spec:
     containers:
     - name: sw-foo-service
       image: sw-foo-service-ecr:BUILD-29
       ports:
       - containerPort: 8080
       env:
       - name: "SERVER_PORT"
         value: "8080"
       - name: "COLOR"
         value: "blue"

Header-based Routing

This routing method can be used to separate different kinds of traffic. We call the separation feature Traffic Channel and each kind of traffic a channel. For example, we have two kinds of traffic: one from the customer-facing website, and the other from everywhere else, such as backend jobs or our public APIs.

The resource types involved and their manifests are as follows (the Service and VirtualService manifests are the same as in weight-based routing):
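The matching logic behind a traffic channel can be sketched as below. This only illustrates the idea of an exact header match with a default fallback: the function names are hypothetical, and the header `X-SW-Traffic-Channel` is the one used in the VirtualRouter manifest of this section.

```python
def header_matches(request_headers, name, exact):
    """True if the request carries header `name` with exactly the value `exact`."""
    # HTTP header names are case-insensitive; the value comparison here is exact.
    normalized = {key.lower(): value for key, value in request_headers.items()}
    return normalized.get(name.lower()) == exact


def choose_channel(request_headers):
    # Evaluate the more specific (header) rule first, then fall back to the default.
    if header_matches(request_headers, "X-SW-Traffic-Channel", "web"):
        return "web"
    return "default"


print(choose_channel({"x-sw-traffic-channel": "web"}))  # web
print(choose_channel({"Accept": "application/json"}))   # default
```

Because both rules would match a request that carries the header, the order in which they are evaluated matters, so the specific rule has to be checked before the catch-all one.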

  • VirtualRouter
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualRouter
metadata:
  name: sw-foo-service-router
  namespace: sw-foo-service
spec:
  listeners:
  - portMapping:
      port: 8080
      protocol: http
  routes:
  - name: web-channel-route
    httpRoute:
      match:
        prefix: / # which means all of the traffic to this service
        headers:  # with the following headers. Maximum number of 10 items.
        - name: X-SW-Traffic-Channel
          match:
            exact: web
      action:
        weightedTargets:
        - virtualNodeRef:
            name: sw-foo-service-web
          weight: 1
  - name: default
    httpRoute:
      match:
        prefix: / # default match with no priority
      action:
        weightedTargets:
        - virtualNodeRef:
            name: sw-foo-service
          weight: 1
  • VirtualNode - Default one for all non-specific traffic
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: default
  listeners:
  - portMapping:
      port: 8080
      protocol: http
    healthCheck:
      ...
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
      - key: traffic-channel
        value: default
  • VirtualNode - Dedicated one for all web traffic
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service-web
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
  listeners:
  - portMapping:
      port: 8080
      protocol: http
    healthCheck:
      ...
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
      - key: traffic-channel
        value: web
  • Deployment - Default one for all non-specific traffic
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: default
  template:
    metadata:
      labels:
        app: sw-foo-service
        traffic-channel: default
    spec:
      containers:
      - name: sw-foo-service
        image: sw-foo-service-ecr:BUILD-29
        ports:
        - containerPort: 8080
        env:
        - name: "SERVER_PORT"
          value: "8080"
        - name: "COLOR"
          value: "blue"
  • Deployment - Dedicated one for all web traffic
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service-web
  namespace: sw-foo-service
spec:
  replicas: 6
  selector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
  template:
    metadata:
      labels:
        app: sw-foo-service
        traffic-channel: web
    spec:
      containers:
      - name: sw-foo-service
        image: sw-foo-service-ecr:BUILD-29
        ports:
        - containerPort: 8080
        env:
        - name: "SERVER_PORT"
          value: "8080"
        - name: "COLOR"
          value: "blue"

Path-based Routing

This method can be used when we want to route requests whose URL path starts with a given prefix to a separate set of instances.

For example, we want to route traffic prefixed with /metrics/csv-export to a set of instances that only serves CSV export requests, which take significantly longer than normal requests.
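A minimal sketch of prefix-based selection follows. It assumes that the longest matching prefix wins, which is a choice made for this example rather than a statement about how App Mesh orders routes, and the target names are placeholders.

```python
ROUTES = {
    "/metrics/csv-export": "csv-export",  # dedicated instances for slow exports
    "/": "default",                       # everything else
}


def route_for(path, routes=ROUTES):
    """Return the target of the longest route prefix that the path starts with."""
    matching = [prefix for prefix in routes if path.startswith(prefix)]
    if not matching:
        raise LookupError(f"no route for {path}")
    return routes[max(matching, key=len)]


print(route_for("/metrics/csv-export/2024-01"))  # csv-export
print(route_for("/metrics/summary"))             # default
```

Note that this is a plain string-prefix comparison, so in this sketch a path such as /metrics/csv-exporter would also be sent to the csv-export target.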

The resource types involved and their manifests are as follows (the Service and VirtualService manifests are the same as in weight-based routing):

  • VirtualRouter
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualRouter
metadata:
  name: sw-foo-service-router
  namespace: sw-foo-service
spec:
  listeners:
  - portMapping:
      port: 8080
      protocol: http
  routes:
  - name: feature-route-csv-export
    httpRoute:
      match:
        prefix: /metrics/csv-export
      action:
        weightedTargets:
        - virtualNodeRef:
            name: sw-foo-service-csv-export
          weight: 1
  - name: default
    httpRoute:
      match:
        prefix: / # default match with no priority
      action:
        weightedTargets:
        - virtualNodeRef:
            name: sw-foo-service
          weight: 1
  • VirtualNode - Default one for all non-specific traffic
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      feature: default
  listeners:
  - portMapping:
      port: 8080
      protocol: http
    healthCheck:
      ...
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
      - key: feature
        value: default
  • VirtualNode - Dedicated one for CSV exportation
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service-csv-export
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      feature: csv-export
  listeners:
  - portMapping:
      port: 8080
      protocol: http
    healthCheck:
      ...
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
      - key: feature
        value: csv-export
  • Deployment - Default one for all non-specific traffic
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sw-foo-service
      feature: default
  template:
    metadata:
      labels:
        app: sw-foo-service
        feature: default
    spec:
      containers:
      - name: sw-foo-service
        image: sw-foo-service-ecr:BUILD-29
        ports:
        - containerPort: 8080
        resources:
          limits:
            memory: 400Mi
          requests:
            memory: 200Mi
        env:
        - name: "SERVER_PORT"
          value: "8080"
        - name: "COLOR"
          value: "blue"
  • Deployment - Dedicated one for CSV exportation
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service-csv-export
  namespace: sw-foo-service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sw-foo-service
      feature: csv-export
  template:
    metadata:
      labels:
        app: sw-foo-service
        feature: csv-export
    spec:
      containers:
      - name: sw-foo-service
        image: sw-foo-service-ecr:BUILD-29
        ports:
        - containerPort: 8080
        resources:
          limits:
            memory: 1000Mi
          requests:
            memory: 500Mi
        env:
        - name: "SERVER_PORT"
          value: "8080"
        - name: "COLOR"
          value: "blue"

Summary

We covered two routing patterns of microservice governance in pattern form: the problem each pattern solves, the common solution to that problem, and an implementation of that solution.

The first, Service Discovery, is for locating an instance of a service that runs multiple replicas to handle large-scale requests and provide high availability. AWS App Mesh, building on AWS Cloud Map, provides a declarative way to specify the service discovery endpoints for different sets of instances of a service.

The second, Service Routing, is for directing requests to the right group of instances of a service that has multiple deployments, whether for serving different kinds of traffic or for releasing different versions in parallel. AWS App Mesh provides a versatile Kubernetes CRD, VirtualRouter, that steers incoming requests through the routing rules you define, which can be based on weights, headers, and paths.

For the implementations of these two patterns, this article provides detailed manifests of the resources involved in each scenario. Please read them carefully and have fun. 🤩