Microservice Governance - Routing Patterns
These patterns deal with how client services discover the locations of server-side services and get routed to them. In a cloud-based application, you might have hundreds of microservices, and each microservice might have up to hundreds of running instances. You need to abstract away the physical IP addresses of these services and have a single point of entry for each service so that you can consistently enforce connection and security policies for all service calls. You also need the ability to route traffic dynamically based on conditions. We will therefore touch on two patterns: Service Discovery and Service Routing.
Routing Pattern-1: Service Discovery
Questions To Solve
How do we make microservices discoverable so that client services can find them without hardcoding their locations? Furthermore, how do we ensure that misbehaving instances of a service are removed from the pool of available instances?
Common Design
In any distributed architecture, we need to find the physical address of the machine where a service is located. This concept has been around since the beginning of distributed computing and is known formally as service discovery. It is critical to cloud-based microservice applications for two reasons:
- It offers the ability to quickly horizontally scale up and down the service instances without service consumers' awareness of the scaling,
- It helps increase application resiliency by removing unhealthy or unavailable instances from the pool of available instances.
A robust approach to implementing this pattern is to run a dedicated service discovery service and use client-side load balancing. Figure RP-1 shows the flow of a typical service discovery process.
This approach is better than DNS plus a server-side load balancer (LB) for the following reasons:
- No single point of failure: the service IPs are cached in the client's local memory, so even if the DNS, the ELB, or both are down, the client can still call services.
- More choices of LB algorithms: most notably, the client-side LB/proxy can observe the performance of instances and balance the load across them accordingly.
- Higher availability of the service discovery service: it is a distributed storage system running as a cluster of multiple nodes, and the data is synchronized rapidly between the nodes using a distributed consistency protocol such as Gossip.
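To make the point about performance-aware load balancing concrete, here is a minimal Python sketch of a client-side balancer that prefers the instance with the lowest smoothed latency. The class name, the smoothing factor, and the IP addresses are illustrative assumptions, not part of any specific proxy's API; real proxies such as Envoy implement their own, more sophisticated algorithms.

```python
class LatencyAwareBalancer:
    """Pick the instance with the lowest smoothed (EWMA) latency."""

    def __init__(self, instances, alpha=0.3):
        self.alpha = alpha  # weight given to the newest latency sample
        # None means "no sample yet"; such instances are tried first.
        self.ewma = {ip: None for ip in instances}

    def record(self, ip, latency_ms):
        """Fold a newly observed latency into the running average."""
        previous = self.ewma[ip]
        if previous is None:
            self.ewma[ip] = float(latency_ms)
        else:
            self.ewma[ip] = self.alpha * latency_ms + (1 - self.alpha) * previous

    def pick(self):
        # Unmeasured instances sort first so every instance gets probed once.
        return min(
            self.ewma,
            key=lambda ip: (self.ewma[ip] is not None, self.ewma[ip] or 0.0),
        )


# Hypothetical instance IPs and latencies, for illustration only.
balancer = LatencyAwareBalancer(["10.0.0.1", "10.0.0.2"])
balancer.record("10.0.0.1", 120)
balancer.record("10.0.0.2", 35)
print(balancer.pick())  # 10.0.0.2
```

A server-side LB sees only aggregate traffic, whereas a sketch like this shows how each client can react to the latency it observes itself.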
The following diagram shows how client services never have direct knowledge of a service's IP address and instead get it from a service discovery service.
- When a service instance comes online, it registers its IP address with a service discovery agent.
- A service location can be looked up by a logical name from the service discovery agent.
- Service discovery nodes share service instance health information.
- Services send a heartbeat to the service discovery agent. If a service dies, the service discovery service removes the IP of the “dead” instance.
- When a client service needs to call a service, it checks a local cache for the service instance IPs. Load balancing between service instances occurs on the client side.
- If the client service finds a service IP in the cache, it uses it. Otherwise, it queries the service discovery service and then sends the request directly to the returned instance IP.
- Periodically, the client-side cache is refreshed from the service discovery service.
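The steps above can be sketched in a few lines of Python. This is a simplified, single-process illustration under stated assumptions: `ServiceRegistry` and `DiscoveryClient` are hypothetical names, the registry is a plain in-memory dictionary rather than a replicated cluster, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
import itertools
import time


class ServiceRegistry:
    """In-memory stand-in for a service discovery service."""

    def __init__(self, heartbeat_timeout=10.0):
        self.heartbeat_timeout = heartbeat_timeout
        self._last_seen = {}  # (service_name, ip) -> time of last heartbeat

    def heartbeat(self, service_name, ip, now=None):
        # Registration and heartbeat are the same operation in this sketch.
        self._last_seen[(service_name, ip)] = time.monotonic() if now is None else now

    def lookup(self, service_name, now=None):
        now = time.monotonic() if now is None else now
        # Drop instances whose heartbeat is overdue ("dead" instances).
        self._last_seen = {
            key: seen for key, seen in self._last_seen.items()
            if now - seen <= self.heartbeat_timeout
        }
        return sorted(ip for (name, ip) in self._last_seen if name == service_name)


class DiscoveryClient:
    """Client-side cache plus round-robin load balancing."""

    def __init__(self, registry):
        self.registry = registry
        self._cache = {}    # service_name -> list of IPs
        self._cursors = {}  # service_name -> round-robin iterator

    def refresh(self, service_name, now=None):
        ips = self.registry.lookup(service_name, now)
        self._cache[service_name] = ips
        self._cursors[service_name] = itertools.cycle(ips)

    def pick(self, service_name, now=None):
        if not self._cache.get(service_name):  # cache miss: ask the registry
            self.refresh(service_name, now)
        if not self._cache[service_name]:
            raise LookupError(f"no healthy instance of {service_name}")
        return next(self._cursors[service_name])


registry = ServiceRegistry(heartbeat_timeout=10.0)
registry.heartbeat("sw-foo-service", "10.0.0.1", now=0.0)
registry.heartbeat("sw-foo-service", "10.0.0.2", now=0.0)

client = DiscoveryClient(registry)
print([client.pick("sw-foo-service", now=1.0) for _ in range(3)])

# Only 10.0.0.1 keeps sending heartbeats; 10.0.0.2 is evicted on the next refresh.
registry.heartbeat("sw-foo-service", "10.0.0.1", now=12.0)
client.refresh("sw-foo-service", now=15.0)
print(client.pick("sw-foo-service", now=15.0))
```

Note that between refreshes the client keeps using its cached IPs, which is exactly why it can survive a short outage of the discovery service, at the cost of briefly routing to instances that have just died.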
Implementation In AWS App Mesh
We will leverage AWS Cloud Map to implement this pattern. To effectively configure outlier detection for a set of instance endpoints, the service discovery method should use AWS Cloud Map. Otherwise, if using DNS, the Envoy proxy would only elect a single IP address for routing to the upstream service, nullifying the outlier detection behavior of ejecting an unhealthy host from a set of hosts. Refer to the Service Discovery method section for more details on the Envoy proxy's behavior in relation to the service discovery type.
The prerequisite is that the Kubernetes Deployment and Service have already been declared for the service you want to register and make discoverable. This gives the service a cluster-local FQDN of the form $service-name.$namespace.svc.cluster.local, which can be used as the host domain when a client wants to call the service.
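As a small illustration of the FQDN pattern, the following Python helper builds the host name a client would use. The function name is hypothetical, and the sketch assumes the cluster uses the default cluster.local domain; the port and path are taken from the example manifests in this article.

```python
def cluster_local_fqdn(service_name: str, namespace: str) -> str:
    """Build the cluster-local DNS name of a Kubernetes Service.

    Assumes the default cluster domain, cluster.local.
    """
    return f"{service_name}.{namespace}.svc.cluster.local"


host = cluster_local_fqdn("sw-foo-service", "sw-foo-service")
print(f"http://{host}:8080/status")
# http://sw-foo-service.sw-foo-service.svc.cluster.local:8080/status
```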
After that, create one VirtualNode for each set of instances as follows:
- Declare the App Mesh VirtualNode with the same metadata.name and podSelector as the corresponding Kubernetes Deployment
- Configure Listeners for the inbound traffic the VirtualNode expects, specifying a Port and Protocol for each Listener. Currently, App Mesh only supports configuring one Listener; please refer to the Listener API Reference to confirm this limitation.
- Specify the Health Check policy for the Listener
- Declare the Service Discovery attributes in the VirtualNode definition
It is also better to have a naming convention that describes the mapping between business product type, Cloud Map namespace, and service name. The following table is an example:
Product Type | Cloud Map Namespace | Cloud Map Service |
---|---|---|
Car Driver | car-driver.${env}.uber.aws.local | ${microservice-name} |
Car Passenger | car-passenger.${env}.uber.aws.local | ${microservice-name} |
Food Store | food-store.${env}.uber.aws.local | ${microservice-name} |
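A naming convention like the one in the table is easy to encode as a helper so that every team derives the same names. The sketch below is only an illustration: the function name is hypothetical, "trip-service" is a made-up microservice name, and the uber.aws.local suffix is copied from the example table.

```python
def cloud_map_names(product_type, env, microservice):
    """Return (Cloud Map namespace, Cloud Map service) for a microservice."""
    # "Car Driver" -> "car-driver"
    slug = "-".join(product_type.lower().split())
    return f"{slug}.{env}.uber.aws.local", microservice


namespace, service = cloud_map_names("Car Driver", "prod", "trip-service")
print(namespace)  # car-driver.prod.uber.aws.local
print(service)    # trip-service
```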
For example, to isolate API and web requests, we need to deploy a service, say sw-foo-service, as two separate sets of instances: one serves API traffic, which is expected to have higher throughput, and the other serves web traffic, which has lower throughput but larger queries. Service Discovery for this scenario involves the following types of resources:
VirtualNode
For Default Traffic Channel
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: default
  listeners:
    - portMapping:
        port: 8080
        protocol: http
      healthCheck:
        port: 8080
        protocol: http
        path: '/status'
        healthyThreshold: 2
        unhealthyThreshold: 3
        timeoutMillis: 2000
        intervalMillis: 5000
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
        - key: traffic-channel
          value: default
VirtualNode
For Web Traffic Channel
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service-web
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
  listeners:
    - portMapping:
        port: 8080
        protocol: http
      healthCheck:
        port: 8080
        protocol: http
        path: '/status'
        healthyThreshold: 2
        unhealthyThreshold: 3
        timeoutMillis: 2000
        intervalMillis: 5000
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
        - key: traffic-channel
          value: web
Deployment
For Default Traffic Channel
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: default
  template:
    metadata:
      labels:
        app: sw-foo-service
        traffic-channel: default
    spec:
      containers:
        - name: sw-foo-service
          image: sw-foo-service-ecr:BUILD-29
          ports:
            - containerPort: 8080
          env:
            - name: "SERVER_PORT"
              value: "8080"
            - name: "COLOR"
              value: "blue"
Deployment
For Web Traffic Channel
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service-web
  namespace: sw-foo-service
spec:
  replicas: 6
  selector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
  template:
    metadata:
      labels:
        app: sw-foo-service
        traffic-channel: web
    spec:
      containers:
        - name: sw-foo-service
          image: sw-foo-service-ecr:BUILD-29
          ports:
            - containerPort: 8080
          env:
            - name: "SERVER_PORT"
              value: "8080"
            - name: "COLOR"
              value: "blue"
Routing Pattern-2: Service Routing
Questions To Solve
How do we provide a single entry point for all microservices so that security policies and routing rules are applied uniformly? How do we ensure that developers do not have to come up with their own solutions for routing to other services or service instances? Service Routing includes the following four aspects:
- Dynamic routing - Route the traffic to different sets of instances of a specific service based on data from the incoming requests, such as headers. P-6 patterns would rely on this ability heavily.
- Static routing - Put all external service calls behind a single URL or URL Prefix and map those calls to the actual services
- Admission controlling - Check the admission for callers regarding the cross-cutting concerns for all services, such as authentication, authorization, anti-scraping, or access restrictions, in a centralized place.
- Metrics collection and logging - Collect the metrics data for all the incoming requests and ensure some key pieces of information are in place on every user request and response for log correlation across all downstream services.
Common Design
Dynamic routing might need the help of a proxy such as Nginx, HAProxy, or Envoy, especially for routing between internal microservices. We need to leverage capabilities these proxies provide, such as weight-based, path-based, or header-based routing, to implement dynamic routing.
The other three requirements, static routing, admission control, and metrics collection, can be covered by an API Gateway. Please go to the Microservice Patterns website for more details about it; we will only talk about dynamic routing from the traffic governance perspective.
Implementation In AWS App Mesh
We need to use VirtualService, VirtualRouter, and VirtualNode together to implement this pattern:
- Create VirtualNodes for the multiple sets of instances that serve different traffic
- Create a VirtualRouter to declare how to split the traffic based on routing rules, that is, how much traffic is routed to which VirtualNode
- Create a VirtualService, named after the Kubernetes Service, to hide the routing rules from the clients
The most important App Mesh API for implementing this pattern is the RouteSpec, which supports routing for multiple protocols and prioritizing their route rules: grpcRoute, http2Route, httpRoute, and tcpRoute. Currently, our implementation relies heavily on httpRoute. HttpRoute uses match objects to specify the criteria for matching requests, and action objects to determine what actions to take on a match.
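The match/action idea can be modeled in a few lines of Python. This is a conceptual sketch, not App Mesh's implementation: the `HttpRoute` dataclass and `select_route` function are hypothetical, routes are simply evaluated in list order (App Mesh itself orders routes by priority and its own rules), header names are compared case-sensitively for brevity, and the /orders path is made up.

```python
from dataclasses import dataclass, field


@dataclass
class HttpRoute:
    name: str
    prefix: str                                  # match.prefix
    headers: dict = field(default_factory=dict)  # match.headers (exact values)
    targets: dict = field(default_factory=dict)  # action.weightedTargets: node -> weight

    def matches(self, path, request_headers):
        """A request matches when the prefix and every listed header match."""
        if not path.startswith(self.prefix):
            return False
        return all(request_headers.get(k) == v for k, v in self.headers.items())


def select_route(routes, path, request_headers):
    """Return the first route whose match object accepts the request."""
    for route in routes:
        if route.matches(path, request_headers):
            return route
    return None


routes = [
    HttpRoute("web-channel-route", "/", {"X-SW-Traffic-Channel": "web"},
              {"sw-foo-service-web": 1}),
    HttpRoute("default", "/", targets={"sw-foo-service": 1}),
]

web = select_route(routes, "/orders", {"X-SW-Traffic-Channel": "web"})
other = select_route(routes, "/orders", {})
print(web.name, web.targets)      # web-channel-route {'sw-foo-service-web': 1}
print(other.name, other.targets)  # default {'sw-foo-service': 1}
```

The match object decides whether a route applies; the action object then says which virtual nodes receive the request and in what proportion.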
So let's walk through each kind of traffic splitting with an example that demonstrates how to implement it.
Weight-based Routing
This routing method could be used to implement a Canary Deployment solution.
The resource types involved and their manifests are as follows.
Service
# FQDN: sw-foo-service.sw-foo-service.svc.cluster.local
apiVersion: v1
kind: Service
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  ports:
    - protocol: TCP
      port: 8080
VirtualService
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualService
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  awsName: sw-foo-service.sw-foo-service.svc.cluster.local
  provider:
    virtualRouter:
      virtualRouterRef:
        name: sw-foo-service-router
VirtualRouter
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualRouter
metadata:
  name: sw-foo-service-router
  namespace: sw-foo-service
spec:
  listeners:
    - portMapping:
        port: 8080
        protocol: http
  routes:
    - name: canary-route
      httpRoute:
        match:
          prefix: / # which means all of the traffic to this service
        action:
          weightedTargets:
            # The names must match the metadata.name of the VirtualNodes
            - virtualNodeRef:
                name: sw-foo-service-web
              weight: 99
            - virtualNodeRef:
                name: sw-foo-service-web-canary
              weight: 1
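To get a feel for what a 99/1 split means in practice, the following Python sketch simulates weighted target selection. It is an illustration only: "production" and "canary" are placeholder labels for the two virtual nodes, and the selection logic is a plain weighted random choice, which is an assumption about the general technique rather than a description of how Envoy implements it.

```python
import random
from collections import Counter


def pick_target(weighted_targets, rng):
    """Choose one virtual node with probability proportional to its weight."""
    names = list(weighted_targets)
    weights = list(weighted_targets.values())
    return rng.choices(names, weights=weights, k=1)[0]


# Placeholder labels for the two virtual nodes behind the router.
targets = {"production": 99, "canary": 1}
rng = random.Random(42)  # fixed seed so the simulation is repeatable
total = 100_000
counts = Counter(pick_target(targets, rng) for _ in range(total))
canary_share = counts["canary"] / total
print(f"canary share: {canary_share:.2%}")  # close to 1%
```

Because each request is routed independently, the canary share only approaches 1% over many requests; with low traffic, the canary may see no requests at all for a while, which is worth keeping in mind when choosing the initial weight.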
VirtualNode
- Production one for the last functional release
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service-web
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
  listeners:
    - portMapping:
        port: 8080
        protocol: http
      healthCheck:
        ...
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
        - key: traffic-channel
          value: web
VirtualNode
- Canary one for the next release
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  # Same name as the canary Deployment; it must differ from the production VirtualNode
  name: sw-foo-service-web-canary
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
      sw-canary-release: "true" # label values must be strings, so quote the boolean
  listeners:
    - portMapping:
        port: 8080
        protocol: http
      healthCheck:
        ...
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
        - key: traffic-channel
          value: web
Deployment
- Production one for the last functional release
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service-web
  namespace: sw-foo-service
spec:
  replicas: 6
  selector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
  template:
    metadata:
      labels:
        app: sw-foo-service
        traffic-channel: web
    spec:
      containers:
        - name: sw-foo-service
          image: sw-foo-service-ecr:BUILD-29
          ports:
            - containerPort: 8080
          env:
            - name: "SERVER_PORT"
              value: "8080"
            - name: "COLOR"
              value: "blue"
Deployment
- Canary one for the next release
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service-web-canary
  namespace: sw-foo-service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
      sw-canary-release: "true" # label values must be strings, so quote the boolean
  template:
    metadata:
      labels:
        app: sw-foo-service
        traffic-channel: web
        sw-canary-release: "true"
    spec:
      containers:
        - name: sw-foo-service
          image: sw-foo-service-ecr:BUILD-29
          ports:
            - containerPort: 8080
          env:
            - name: "SERVER_PORT"
              value: "8080"
            - name: "COLOR"
              value: "blue"
Header-based Routing
This routing method can be used to separate different kinds of traffic. We call this separation feature Traffic Channel and each kind of traffic a channel. For example, we have two kinds of traffic: one comes from the customer-facing website, and the other comes from everywhere else, such as backend jobs or our public APIs.
The resource types involved and their manifests are as follows (the Service and VirtualService manifests are the same as the ones in Weight-based Routing).
VirtualRouter
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualRouter
metadata:
  name: sw-foo-service-router
  namespace: sw-foo-service
spec:
  listeners:
    - portMapping:
        port: 8080
        protocol: http
  routes:
    - name: web-channel-route
      httpRoute:
        match:
          prefix: / # which means all of the traffic to this service
          headers: # with the following headers. Maximum number of 10 items.
            - name: X-SW-Traffic-Channel
              match:
                exact: web
        action:
          weightedTargets:
            - virtualNodeRef:
                name: sw-foo-service-web
              weight: 1
    - name: default
      httpRoute:
        match:
          prefix: / # default match with no priority
        action:
          weightedTargets:
            - virtualNodeRef:
                name: sw-foo-service
              weight: 1
VirtualNode
- Default one for all non-specific traffic
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: default
  listeners:
    - portMapping:
        port: 8080
        protocol: http
      healthCheck:
        ...
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
        - key: traffic-channel
          value: default
VirtualNode
- Dedicated one for all web traffic
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service-web
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
  listeners:
    - portMapping:
        port: 8080
        protocol: http
      healthCheck:
        ...
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
        - key: traffic-channel
          value: web
Deployment
- Default one for all non-specific traffic
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: default
  template:
    metadata:
      labels:
        app: sw-foo-service
        traffic-channel: default
    spec:
      containers:
        - name: sw-foo-service
          image: sw-foo-service-ecr:BUILD-29
          ports:
            - containerPort: 8080
          env:
            - name: "SERVER_PORT"
              value: "8080"
            - name: "COLOR"
              value: "blue"
Deployment
- Dedicated one for all web traffic
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service-web
  namespace: sw-foo-service
spec:
  replicas: 6
  selector:
    matchLabels:
      app: sw-foo-service
      traffic-channel: web
  template:
    metadata:
      labels:
        app: sw-foo-service
        traffic-channel: web
    spec:
      containers:
        - name: sw-foo-service
          image: sw-foo-service-ecr:BUILD-29
          ports:
            - containerPort: 8080
          env:
            - name: "SERVER_PORT"
              value: "8080"
            - name: "COLOR"
              value: "blue"
Path-based Routing
This routing method can be used when we want to route requests whose URL path starts with a given prefix to a separate set of instances.
For example, we want to route the traffic prefixed with /metrics/csv-export to a set of instances that only serves CSV file export requests, which take significantly longer than normal requests.
The resource types involved and their manifests are as follows (the Service and VirtualService manifests are the same as the ones in Weight-based Routing).
VirtualRouter
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualRouter
metadata:
  name: sw-foo-service-router
  namespace: sw-foo-service
spec:
  listeners:
    - portMapping:
        port: 8080
        protocol: http
  routes:
    - name: feature-route-csv-export
      httpRoute:
        match:
          prefix: /metrics/csv-export
        action:
          weightedTargets:
            - virtualNodeRef:
                name: sw-foo-service-csv-export
              weight: 1
    - name: default
      httpRoute:
        match:
          prefix: / # default match with no priority
        action:
          weightedTargets:
            - virtualNodeRef:
                name: sw-foo-service
              weight: 1
VirtualNode
- Default one for all non-specific traffic
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      feature: default
  listeners:
    - portMapping:
        port: 8080
        protocol: http
      healthCheck:
        ...
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
        - key: feature
          value: default
VirtualNode
- Dedicated one for CSV exportation
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: sw-foo-service-csv-export
  namespace: sw-foo-service
spec:
  podSelector:
    matchLabels:
      app: sw-foo-service
      feature: csv-export
  listeners:
    - portMapping:
        port: 8080
        protocol: http
      healthCheck:
        ...
  serviceDiscovery:
    awsCloudMap:
      namespaceName: foo.prod.softwheel.aws.local
      serviceName: sw-foo-service
      attributes:
        - key: feature
          value: csv-export
Deployment
- Default one for all non-specific traffic
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service
  namespace: sw-foo-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sw-foo-service
      feature: default
  template:
    metadata:
      labels:
        app: sw-foo-service
        feature: default
    spec:
      containers:
        - name: sw-foo-service
          image: sw-foo-service-ecr:BUILD-29
          ports:
            - containerPort: 8080
          resources:
            limits:
              memory: 400Mi
            requests:
              memory: 200Mi
          env:
            - name: "SERVER_PORT"
              value: "8080"
            - name: "COLOR"
              value: "blue"
Deployment
- Dedicated one for CSV exportation
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sw-foo-service-csv-export
  namespace: sw-foo-service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sw-foo-service
      feature: csv-export
  template:
    metadata:
      labels:
        app: sw-foo-service
        feature: csv-export
    spec:
      containers:
        - name: sw-foo-service
          image: sw-foo-service-ecr:BUILD-29
          ports:
            - containerPort: 8080
          resources:
            limits:
              memory: 1000Mi
            requests:
              memory: 500Mi
          env:
            - name: "SERVER_PORT"
              value: "8080"
            - name: "COLOR"
              value: "blue"
Summary
We talked about two routing patterns of microservice governance in pattern form: the problem each pattern solves, the common solution to that problem, and an implementation of that solution.
The first one, Service Discovery, is for discovering the location of an instance of a service that has multiple replicas for handling large-scale requests and providing high availability. AWS App Mesh, based on AWS Cloud Map, provides a declarative way to specify the service discovery endpoints for different sets of instances of a service.
The second one, Service Routing, is for finding the path to a group of instances of a service that has multiple deployments for serving various kinds of traffic or releasing different versions in parallel. AWS App Mesh provides a versatile Kubernetes CRD, VirtualRouter, to direct incoming requests through the routing rules you define, which can be based on weights, headers, and paths.
For the implementations of these two patterns, this article gives detailed manifests of the resources involved in different scenarios. Please read them carefully and have fun. 🤩