Architecting Massively-Scalable Near-Real-Time Risk Analysis Solutions

Recently I held a webinar on architecting scalable, near-real-time risk analysis solutions, based on the experience gathered with our Financial Services customers. In the webinar I also had the honor of hosting Mr. Larry Mitchel, a leading expert in the Financial Services industry, who provided background on the Risk Management domain. Following the general interest in the webinar, I decided to dedicate a post to the subject.

What goes on in the Risk Management domain?

The Finance world is continually changing, driven for the most part by the lessons learned from the 2008 financial crash, in an attempt to prevent such catastrophes from recurring. Regulations such as Dodd-Frank, EMIR, and Basel III have formalized these lessons, imposing tighter control and supervision. We see financial institutions addressing these conformance goals by assigning dedicated projects with dedicated budgets (which means more work for solutions architects, lucky me). One aspect of this conformance is reducing risk by shortening settlement cycles to near-real-time, as seen in initiatives such as Straight-Through Processing.

Traditional architectures, new challenges

Conforming to the new regulations mandates an entirely different approach to risk analysis. The old systems, which relied on overnight batch risk calculations and predefined queries, no longer suffice; a more real-time approach to risk calculation, with on-the-fly queries, is needed.

From a solution architecture point of view, Risk Analysis is both a compute-intensive and a data-intensive process. Looking at our customers’ systems, on the one hand we see ever-increasing volumes (number of calculated positions and assets, number of re-calculations, data affinity, etc.), and on the other hand an ever-increasing demand to reduce response time, whether to conform with the regulations or to gain a competitive edge. That makes it a classic Big Data analytics problem.

From a technology point of view, risk analysis solutions traditionally relied on designated compute grid products for the calculations and on relational databases as the data store. That was fine for overnight batch processing, but with the new real-time demands, databases tend to become bottlenecks under the load because of the disk and network access involved.

Risk Analysis solution architecture revisited

Our experience with such solutions shows that an effective architecture to meet these challenges is a Big Data multi-tiered architecture, in which intraday data is cached in memory for low-latency response, while historical data is kept in a database for more extensive data mining and reporting. Simple caching solutions cannot scale the intraday data under such write-intensive flows (streaming market data, calculation results, and such), which is why the In-Memory Data Grid has become the standard technology in modern solutions for storing intraday data. Intelligent data grids such as GigaSpaces XAP also provide on-the-fly SQL querying capabilities, which overcome the limitation of predefined queries in traditional architectures. As for historical data, we see a clear shift from relational databases to NoSQL databases, which perform much better for mining these volumes of semi-structured data.
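
To make the on-the-fly querying idea a bit more concrete, here is a minimal sketch in Python. It uses an in-memory SQLite database purely as a single-process stand-in for the intraday tier; it is not a data grid and not the XAP API, and the positions table, its columns, and the exposure_by_book helper are all names I made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the intraday tier.
# A real data grid would be distributed and partitioned; this is not.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE positions (book TEXT, instrument TEXT, notional REAL, delta REAL)"
)
conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?, ?)",
    [
        ("rates", "IRS-5Y", 10_000_000.0, 0.5),
        ("rates", "IRS-10Y", 5_000_000.0, 0.75),
        ("fx", "EURUSD-FWD", 2_000_000.0, 0.25),
    ],
)


def exposure_by_book(min_delta):
    """Aggregate delta-weighted notional per book.

    The filter is chosen by the caller at query time, rather than being
    baked into a predefined overnight report.
    """
    rows = conn.execute(
        "SELECT book, SUM(notional * delta) FROM positions "
        "WHERE delta >= ? GROUP BY book ORDER BY book",
        (min_delta,),
    )
    return dict(rows.fetchall())
```

A risk manager could call exposure_by_book(0.0) for the full picture and then exposure_by_book(0.6) to drill into the high-delta positions only, without anyone having to predefine either query.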

A piece of the architecture that is often overlooked in initial architecture discussions is the system orchestration. Surprisingly, many of the customers I visit tend to think of risk analysis solutions as the mere sum of a Compute Grid product (for computation scalability) and a Data Grid product (for data scalability). But they neglect to consider the orchestration logic that handles the intersection between the data grid and the compute grid: avoiding duplicate calculations, handling cancellation of calculations, monitoring the state of ongoing calculations, feeding ticks and updates to the client UI, and more. All this amounts to a significant orchestration layer that is traditionally developed in-house.
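
To give a feel for what even the smallest slice of that orchestration layer involves, here is a rough sketch in Python covering three of the concerns above: de-duplicating calculations, cancelling them, and reporting their state. It assumes a local thread pool in place of a real compute grid, and CalcOrchestrator and its methods are hypothetical names, not part of any product.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class CalcOrchestrator:
    """Tracks one Future per calculation key (hypothetical sketch)."""

    def __init__(self, workers=4):
        # Assumption: a thread pool stands in for the compute grid.
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._lock = Lock()
        self._futures = {}  # calculation key -> Future

    def submit(self, key, fn, *args):
        """Start a calculation, or reuse the one already known for this key."""
        with self._lock:
            fut = self._futures.get(key)
            if fut is None or fut.cancelled():
                fut = self._pool.submit(fn, *args)
                self._futures[key] = fut
            return fut

    def cancel(self, key):
        """Best effort: only a calculation that has not started can be cancelled."""
        with self._lock:
            fut = self._futures.get(key)
        return fut is not None and fut.cancel()

    def status(self, key):
        """Report the state of a calculation for monitoring purposes."""
        with self._lock:
            fut = self._futures.get(key)
        if fut is None:
            return "unknown"
        if fut.cancelled():
            return "cancelled"
        if fut.done():
            return "done"
        return "running" if fut.running() else "queued"

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

Note that this sketch also reuses finished results for a repeated key, which is one possible policy; a real system would need invalidation when the underlying market data changes, plus distribution, failover, and UI feeds, which is exactly why this layer grows so large when developed in-house.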

A much more effective architecture is to embed the orchestration logic together with the data grid within one platform, thereby abstracting the complexities from the clients and removing the need for the clients to interact with anything but the unified platform. GigaSpaces XAP offers the co-location of processing and messaging together with the data, which makes implementing such architectures quite easy. This also enables pre-/post-processing on the data, such as data formatting prior to processing and result aggregation after calculations, which are requirements often seen in such solutions.

Event-Driven Architecture is highly useful for streaming calculation results to the awaiting clients as they arrive, and for streaming ticks and other updates to the UI. With GigaSpaces XAP, implementing such an architecture is made simple by leveraging the Asynchronous API and the messaging layer, which can treat each data mutation as an event.
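
The idea of treating each data mutation as an event can be sketched in a few lines of Python. This is only a conceptual illustration and not the XAP API: EventedStore, subscribe, and the key naming scheme are hypothetical, and the callbacks here run synchronously in the writer's thread, whereas a real platform would dispatch them asynchronously.

```python
class EventedStore:
    """A key-value store in which every write doubles as an event (sketch)."""

    def __init__(self):
        self._data = {}
        self._listeners = []  # list of (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        """Register interest in writes for which predicate(key, value) is true."""
        self._listeners.append((predicate, callback))

    def write(self, key, value):
        """Store the value, then notify every listener whose predicate matches."""
        old = self._data.get(key)
        self._data[key] = value
        for predicate, callback in list(self._listeners):
            if predicate(key, value):
                callback(key, old, value)

    def read(self, key):
        return self._data.get(key)
```

A client UI would subscribe once, for example with a predicate matching keys that start with "result:", and then receive each calculation result as soon as the calculation engine writes it, with no separate message bus between the two.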

Addressing the real-time analytics challenge end to end, across both the intraday data (which resides in memory within the data grid) and the historical data (which resides within a relational/NoSQL database), requires a holistic view of the multi-tier architecture. Intraday data changes at an extremely high rate with frequent event feeds, whereas historical data can be written in a more relaxed manner, using a write-behind (write-back) caching approach. Queries are then consolidated across the data stores, so that they appear as one unified source for query purposes. Such consolidation is traditionally achieved by combining various products, but GigaSpaces offers a Real-Time Analytics solution, enabling you to focus on your business logic and leave the rest to the platform.
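
For readers less familiar with write-behind, here is a minimal single-process sketch in Python, together with a query that consolidates both tiers. It assumes plain dictionaries in place of the data grid and the historical database, and WriteBehindStore and its methods are hypothetical names for illustration only.

```python
import queue
import threading


class WriteBehindStore:
    """Writes hit memory first and reach the history tier asynchronously."""

    def __init__(self):
        self._memory = {}   # intraday tier (stand-in for the data grid)
        self._history = {}  # stand-in for the relational/NoSQL database
        self._pending = queue.Queue()
        self._lock = threading.Lock()
        self._writer = threading.Thread(target=self._drain, daemon=True)
        self._writer.start()

    def put(self, key, value):
        """Update memory immediately and queue the write for later persistence."""
        with self._lock:
            self._memory[key] = value
        self._pending.put((key, value))

    def _drain(self):
        """Background writer: persist queued writes in arrival order."""
        while True:
            item = self._pending.get()
            try:
                if item is None:  # sentinel sent by close()
                    return
                key, value = item
                with self._lock:
                    self._history[key] = value
            finally:
                self._pending.task_done()

    def flush(self):
        """Block until every queued write has reached the history tier."""
        self._pending.join()

    def evict(self, key):
        """Drop a key from memory (e.g. at end of day) once it is persisted."""
        self.flush()
        with self._lock:
            self._memory.pop(key, None)

    def query(self, predicate):
        """Query both tiers as one source; the in-memory value wins on conflict."""
        with self._lock:
            merged = {**self._history, **self._memory}
        return {k: v for k, v in merged.items() if predicate(k, v)}

    def close(self):
        self._pending.put(None)
        self._writer.join()
```

The point of the design is that put never waits on the slower tier, while query hides from the caller which tier a given record currently lives in.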

Future directions

There’s more to discuss in such architectures, such as multi-site deployments over WAN and support for cloud bursting, which should be considered when approaching such solutions. I will not get into these concerns in this post, but you can see coverage of future directions in my webinar.

To get more information on the domain and its challenges, and to hear more on the suggested architecture for Big Data risk analysis solutions, I’d recommend watching the full webinar.


Follow Dotan on Twitter!


Filed under Big Data, Financial Services, Market Analytics, Real Time Analytics, Risk Management

8 responses to “Architecting Massively-Scalable Near-Real-Time Risk Analysis Solutions”

  1. Are there recommendations for an OLTP solution? We are a traditional shop that relies on Oracle RACs. NoSQL will not be allowed by the management.


    • Are you referring to OLTP solutions for trading? In this article I addressed architectures for risk analysis, which has some of the OLTP patterns of trading systems, though these are different use cases that dictate different architectures. It would be interesting to dedicate an article to trading systems architectures.
      Generally speaking, it seems that relational databases find it difficult to meet the scaling requirements of the leading banks, stock exchanges and trading firms. I recently consulted for a major bank in Sweden, which used Oracle DB for their trading system, and even though they spent lots of money on the latest-and-greatest DB, they still couldn’t meet the scaling requirements, and ultimately moved to an in-memory solution. I have found the multi-tier architecture approach useful for trading systems as well. I use GigaSpaces XAP as the platform to host the entire front-office trading process, co-locating the orders, prices and reference data together with the OMS and pricing engines' processing logic, and writing the data behind asynchronously to the database for the non-real-time part.

      • mr

        Actually I mean the high-throughput, low-latency part related to receiving messages from external systems and posting them to internal queues for processing, as part of the trading system. This is a typical requirement, isn’t it? What lines of research do you recommend for an EDA-based approach for this front-end part alone? Products like XAP seem to be a whole end-to-end solution. Is open source recommended?

      • It really depends on the type of processing you do. Such processing can be done with ESB, CEP, BPM or even simple message brokers. XAP can offer EDA where the data itself serves as the event, which makes it easy to construct workflows on top of your application’s inherent domain model, and which I have found useful in certain types of applications.

  3. thabach

    Hi Dotan, nice blog. You made me wonder about how to hook up feeds to the GigaSpaces platform though :). Could you tell me if GigaSpaces does

    – offer market data feed adapters as part of its XAP offering ? for ITCH, Bloomberg or other market data interfaces ?
    – … and in general, are feed receivers deployed as PUs to take advantage of PU failover and other platform features ? Is there any documentation of best practices in connecting high rate event feeds ?

    Thanks for your elaboration or pointers to resources on these topics.

    • The XAP product itself doesn’t ship with specific data feed adapters. Usually we integrate those as part of the customer’s solution architecture, either directly or via finance-specialized ETL engines, depending on the specific system requirements.

      You can deploy a feed receiver as a PU (Processing Unit) to the service grid in order to have it managed for scalability, FO and HA. Furthermore, you can deploy it to the service grid as an *Elastic PU*, to have the grid manager boot up and tear down Grid Service Containers as needed based on the feed load to satisfy your defined SLA, and if you’re running on an on-demand infrastructure (IaaS), it could even boot up and tear down machines for that end.

      If you need further information, feel free to contact me.
