Sample: WSO2 EI Cache Mediator based Token Caching

This post and the sample code are the results of a particular issue I had to tackle recently. Though the sample code is my own, the idea and the approach have many authors, arising from the collective knowledge on the WSO2 Middleware Stack.

The Typical Case for Caching

Token based authentication is not a new paradigm. The basic story is,

Talk to an Identity Management Service and obtain a token based on some form of authentication
Call a service provider API, providing the token received in step #1
Service provider validates the token and acts on the privileges translated from the token

If you have invoked an API using an access token at any point in your developer life, you are familiar with this scenario.

All tokens have a timeout period, after which the token expires and the client (application/user) is required to request a new one. Until then, though, it makes little sense to call the Identity Management Service for a new token every time an API call has to be made.

So the next sensible thing to do is to cache the retrieved token, until the timeout occurs.

Pseudo-mediation

In this case, the flow of the logic would be as follows.

Receive the request
Check the Token Cache for a valid token
If a token is found, make request with the token
If a token is not found, make a request to the IAM Service and retrieve a new token. Store the newly retrieved token in the Token Cache, then make the request with it.

Now the point of this post is to explain how this flow can be achieved through WSO2 EI/Integrator using Apache Synapse.

Cache Mediator

The main mediator to be used in this scenario is the Cache Mediator. The role of the Cache Mediator is to act as a message Cache Lookup and a message Cache Writer. When looking up values in the Cache, if a value is found, the Synapse Sequence specified in the <onCacheHit /> element is executed. If no value is found, the mediation flow continues to the next mediator after the Cache lookup mediator.
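As a minimal sketch of that lookup/writer pairing, the fragment below shows one Cache Mediator instance in each role. The id exampleCache, the log property, and the timeout and maxSize values are hypothetical and only illustrative. This is a configuration fragment, not a deployable artifact.

```xml
<!-- Lookup role: placed in the request path. On a hit, the mediators inside
     onCacheHit run; on a miss, mediation continues past this element. -->
<cache id="exampleCache" scope="per-host" collector="false"
       hashGenerator="org.wso2.carbon.mediator.cache.digest.DOMHASHGenerator"
       timeout="60">
   <onCacheHit>
      <!-- Hypothetical hit handling; further mediators would go here -->
      <log level="custom">
         <property name="CACHE" value="HIT"/>
      </log>
   </onCacheHit>
   <implementation type="memory" maxSize="100"/>
</cache>

<!-- Writer role: placed in the response path, using the same id and scope
     as the lookup instance so the response is stored in the same cache. -->
<cache id="exampleCache" scope="per-host" collector="true"/>
```

Instead of inline mediators, onCacheHit can also point to a named sequence through its sequence attribute, which is the form used in the rest of this post.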

For our sample setup, we are going to use WSO2 EI/Integrator 6.1.1. For the sake of clarity, we are going to use mock services in place of the IAM Service and the Service Provider API. A Synapse API will be the mediation flow that handles the communication between the client, the IAM Service, and the Service Provider API.

Simple Backend

First let’s write a really simple Synapse API that would act as a dummy backend, AKA the Service Provider API.

<api xmlns="http://ws.apache.org/ns/synapse" name="SimpleBackend" context="/log">
   <resource methods="POST" outSequence="main">
      <inSequence>
        <!-- Let's see the content of the request -->
         <log level="full" separator="--"/>
        
        <!-- Sending a simple response -->
         <payloadFactory media-type="text">
            <format>{"status":"0"}</format>
            <args/>
         </payloadFactory>
         <property name="messageType" value="application/json" scope="axis2" type="STRING"/>
         <property name="ContentType" value="application/json" scope="axis2" type="STRING"/>
         <respond/>
      </inSequence>
      <faultSequence>
         <log level="custom">
            <property name="MSG_FLOW" value="FAULT"/>
         </log>
         <drop/>
      </faultSequence>
   </resource>
</api>

A curl POST request to the above API results in the text message {"status":"0"}.

The Carbon log then shows the POST body we attached to the request.

IAM Provider Mock Service

To mock the IAM Provider, I’m going to use a random integer generator. We could use an IAM middleware like WSO2 Identity Server, and that would probably be the proper choice in a real-world scenario. However, to keep the focus on cache management, a random integer generator is enough. After all, we just need a “token” that we can be sure is random. A GET request to a customized random.org URL generates a random integer per call. We will use this mechanism to generate our “tokens”.

Mediation

The mediation logic, according to what we came up with in the Pseudo-mediation section, is as follows.

As the diagram explains, the <onCacheHit /> sequence will be CallBESequence. When the Cache lookup fails, the API’s inSequence will make a call to retrieve a new token. The result of that call will be handled by the TokenReceiverSequence.

Getting into the code, the following is the API definition for the /calllogging API.

<api xmlns="http://ws.apache.org/ns/synapse" name="AuthAndCallLoggingAPI" context="/calllogging">
   <resource methods="GET">
      <inSequence>
         <log level="custom">
            <property name="MSG_FLOW" value="IN"/>
         </log>
         
         <!-- Do a cache lookup for the Token -->
         <cache id="tokenCache1" scope="per-host" collector="false" hashGenerator="org.wso2.carbon.mediator.cache.digest.DOMHASHGenerator" timeout="120">
            <!-- If a token is found, jump mediation flow to CallBESequence -->
            <onCacheHit sequence="CallBESequence"/>
            <implementation type="memory" maxSize="1000"/>
         </cache>
         
         <!-- If the cache lookup fails, we need to retrieve a new token from IAM Service -->
         <property name="uri.var.num" value="1" scope="default" type="STRING"/>
         <property name="uri.var.min" value="1" scope="default" type="STRING"/>
         <property name="uri.var.max" value="100000" scope="default" type="STRING"/>
         <property name="uri.var.col" value="1" scope="default" type="STRING"/>
         <property name="uri.var.base" value="10" scope="default" type="STRING"/>
         <property name="uri.var.format" value="plain" scope="default" type="STRING"/>
         <property name="uri.var.rnd" value="new" scope="default" type="STRING"/>
         
         <!-- TokenReceiverSequence will handle the response from the random.org endpoint call -->
         <send receive="TokenReceiverSequence">
            <endpoint>
               <http method="GET" uri-template="https://www.random.org/integers/?num={uri.var.num}&amp;min={uri.var.min}&amp;max={uri.var.max}&amp;col={uri.var.col}&amp;base={uri.var.base}&amp;format={uri.var.format}&amp;rnd={uri.var.rnd}"/>
            </endpoint>
         </send>
      </inSequence>
      
      <outSequence>
         <!-- Forward the response to the client -->   
         <send/>
      </outSequence>
      
   </resource>
</api>

A couple of things to notice about the Cache Mediator configuration:

id is a unique identifier that we can use both to look up and to persist values. We will later use the same id to persist the token received from random.org.
collector is set to false. This is because at this stage of the mediation flow, we are looking up values, not collecting them.
timeout is set to a low value of 120 seconds. This will be convenient to test our mediation flow.

Cache Miss Scenario

In the case of a cache miss (which is the initial case), the mediation logic continues with the rest of the inSequence. The token retrieval logic is written there.

A <send /> mediator is used to call the random.org endpoint. The receive attribute defines a sequence that will handle the response from the random.org endpoint. Let’s look at the logic written in the TokenReceiverSequence. Basically, this sequence should do two things:

Persist the value retrieved from the endpoint
Continue to the mediation logic where the backend call happens

<?xml version="1.0" encoding="UTF-8"?>
<sequence name="TokenReceiverSequence" xmlns="http://ws.apache.org/ns/synapse">
    <!-- Persist the response from random.org to Cache -->
    <cache collector="true" id="tokenCache1" scope="per-host"/>
  
    <log level="custom">
        <property name="MSG_FLOW" value="TOKEN_RECV"/>
        <property expression="$body//p1:text" name="TOKEN"
            xmlns:ns="http://org.apache.synapse/xsd" xmlns:p1="http://ws.apache.org/commons/ns/payload"/>
    </log>
  
    <!-- Continue to the backend calling logic -->
    <sequence key="CallBESequence"/>
</sequence>

Again, looking at the Cache mediation settings, we can observe the following.

id has the same value as the one we used to lookup values.
collector is set to true, because we are persisting values here.

Next, the mediation logic continues to CallBESequence. We have not modified the message context during this sequence, which means the message context still holds the value we received from random.org (our mock IAM Service). The backend calling sequence will extract values from the message context and use them to call the backend properly.

<?xml version="1.0" encoding="UTF-8"?>
<sequence name="CallBESequence" xmlns="http://ws.apache.org/ns/synapse">
    <!-- Retrieve token from the message context -->
    <property expression="$body//p1:text" name="authToken"
        xmlns:ns="http://org.apache.synapse/xsd" xmlns:p1="http://ws.apache.org/commons/ns/payload"/>
  
    <log>
        <property name="MSG_FLOW" value="B/E Call"/>
        <property expression="get-property('authToken')" name="BODY" xmlns:ns="http://org.apache.synapse/xsd"/>
    </log>
  
    <!-- Construct the backend calling payload -->
    <payloadFactory media-type="json">
        <format>{"token":"$1"}</format>
        <args>
            <arg evaluator="xml" expression="get-property('authToken')"
                literal="false" xmlns:ns="http://org.apache.synapse/xsd"/>
        </args>
    </payloadFactory>
  
    <send>
        <endpoint>
            <!-- Our mock backend -->
            <http method="POST" uri-template="http://localhost:8280/log"/>
        </endpoint>
    </send>
</sequence>

Cache Hit Scenario

Going back to the logic where the cache lookup is done, you can see that <onCacheHit /> refers to the same sequence CallBESequence. After a cache hit, the message context is changed to the value retrieved from the cache. Therefore, the <property /> mediator expression $body//p1:text results in the token that was retrieved during the last cache miss scenario.
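To make the expression concrete: the XPath targets a text element in the http://ws.apache.org/commons/ns/payload namespace. Assuming the plain-text random.org response is wrapped in such an element (which is what the expression implies), the payload seen by CallBESequence, whether fresh from random.org or restored from the cache, would look roughly like this. The token value here is purely illustrative.

```xml
<!-- Assumed payload shape, inferred from the XPath used in CallBESequence.
     The p1 prefix in $body//p1:text is bound to this namespace. -->
<text xmlns="http://ws.apache.org/commons/ns/payload">42945</text>
```

With a payload of that shape, $body//p1:text evaluates to the token string, which is then stored in the authToken property.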

Piecing it All Together

After deploying the above configuration (2 APIs + 2 Sequences), we can test this story by making a couple of curl requests to the /calllogging API.

curl -v http://localhost:8280/calllogging

The first request results in a cache miss, and the token retrieval mediation logic is invoked. We can see this by the log line TOKEN_RECV printed in the console. Here, the token received is 42945. This is then sent to the backend (evident in the log line request--Payload:).

The next curl request made immediately after the first one results in a cache hit. There is no token retrieval logic to be done (no TOKEN_RECV log line), and the same token is sent to the backend.

The third request is made more than two minutes later, that is, after the 120-second timeout we specified in the Cache mediator has elapsed. This results in a cache miss, so the token retrieval logic is executed again. Hence the appearance of the log line TOKEN_RECV with a new token, 67180.

Conclusion

The code, and the Composite Application Archive (CAR file) for the above can be found at GitHub. You can deploy the CAR file in your WSO2 EI/Integrator deployment.

Token based authentication is in almost everyday use in the current integration landscape. Adding a caching mechanism is the next natural step towards making the logic smoother. The Cache Mediator plays a significant role in this story, and this sample should be a good stepping stone into this standard use case.

Written on November 12, 2017 by chamila de alwis.

Originally published on Medium