Data aware routing with DataSynapse GridServer and GigaSpaces XAP
Let's consider a very simple data-intensive trading application to illustrate the following discussion. Our trade application works with a large dataset of Trade objects {id, bookId, xmlData}. A unit of work in this application is to evaluate all trades for a given bookId. For this evaluation, the application code fetches all the trades with that bookId from the central data source and then performs some calculations on them. Sounds very simple, right?
But what if we want to scale this application to huge numbers of trades in the dataset and ensure high throughput of evaluation jobs? Well, we can put our application on the computational grid, spreading our units of work among a large number of computational engines. Systems like DataSynapse GridServer allow us to easily scale computation-intensive jobs on the grid, effectively putting the power of hundreds of engines at our disposal. This gives us the raw CPU power to perform calculations over the fetched dataset, but it doesn't solve the problem of remote data access, which will ultimately limit the throughput of our solution.
No matter how many engines we add to the processing, they will all just wait for data to arrive from the database layer. A mainstream database is bound by the need to search its disk to find the relevant data and by the time it takes to move a large result set through the pipes. To get out of this dead end we need to distribute the duties of the central database across the grid and minimize expensive disk I/O. This is where In-Memory Data Grids (IMDG), such as GigaSpaces XAP or Oracle Coherence, come in. They both provide partitioned, replicated, transactional, persistent application memory. With those products we can manage really big datasets in a really timely fashion. Comparing those excellent products is out of the scope of this posting, so let's pick GigaSpaces for the sake of example.
We can load all our trade objects into the partitioned clustered space of GigaSpaces XAP and thus distribute the data layer workload among a number of hosts. Our book evaluation task on the DS Engine node will use the GigaSpaces API to fetch all the trades by bookId from the clustered space. Thus we get a performance gain by removing the data access bottleneck caused by a single-host, disk I/O-based database. It's good, but can we do better? Read on.
If trades are randomly partitioned on the cluster, getting all trades by bookId will require the GigaSpaces API to contact EVERY cluster partition in parallel to fetch all relevant trades. This adds synchronization overhead and eats network bandwidth. What can we do to remove this overhead? We can introduce data affinity by ensuring that GigaSpaces routes trades to partitions according to their bookId. This way, all trades with the same bookId will be collocated within one partition. The GigaSpaces API will exploit data affinity and will contact only ONE node within the cluster to fetch all the relevant trades at once. This significantly improves the scalability of our solution and saves cluster bandwidth. That's cool, but can we do even better? Yes, we can.
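The effect of data affinity can be sketched with a toy model in Python. This is not the GigaSpaces API: the partition count, the `partition_for` helper, and the modulo routing rule are all assumptions made for illustration. The sketch simply counts how many partitions must be contacted to collect one book, first when trades are routed by their id and then when they are routed by bookId.

```python
from collections import defaultdict

NUM_PARTITIONS = 3  # assumed cluster size, for illustration only


def partition_for(routing_key: int) -> int:
    """Toy routing rule (an assumption): pick a partition by modulo."""
    return routing_key % NUM_PARTITIONS


def load(trades, routing_key_of):
    """Distribute trades over partitions using the given routing key."""
    partitions = defaultdict(list)
    for trade in trades:
        partitions[partition_for(routing_key_of(trade))].append(trade)
    return partitions


def partitions_holding(partitions, book_id):
    """Return the set of partitions that hold at least one trade of the book."""
    return {p for p, stored in partitions.items()
            if any(t["bookId"] == book_id for t in stored)}


# 1,000 toy trades spread evenly over 10 books.
trades = [{"id": i, "bookId": i % 10} for i in range(1000)]

# Routing by trade id scatters each book over the partitions.
scattered = load(trades, lambda t: t["id"])
# Routing by bookId collocates each book in a single partition.
collocated = load(trades, lambda t: t["bookId"])

print(len(partitions_holding(scattered, 3)))   # partitions to contact without affinity
print(len(partitions_holding(collocated, 3)))  # partitions to contact with affinity
```

In this toy setup a fetch for one book touches every partition without affinity and exactly one partition with it, which is the saving the paragraph above describes.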
With data affinity, our book evaluation task runs on DS engines and, using the GigaSpaces API, contacts a single node on the network to fetch many trades. This bulk network fetching consumes network bandwidth and may limit our throughput if we do many such fetches at once. However, we can place the IMDG cluster on the same nodes as the computational grid and employ data aware routing, i.e., ensure that tasks requiring access to particular data are scheduled on the node that runs the IMDG partition with the /relevant/ data. In our example we need to make sure that the task to evaluate a particular bookId is scheduled on the node running the GigaSpaces partition that contains all the trades with that bookId. Data aware scheduling makes data fetching operations local in the sense that data is fetched via loopback, without going through the network switches. This is not only faster, but also saves cluster network bandwidth.
Data aware routing demo
We have put together a simple demo that illustrates the value of data aware routing and the performance gains it offers. The demo is based on the trading application example we just considered and runs a constant flow of book evaluation jobs.
100K trade objects are loaded into a partitioned data grid. Those trades belong to 10 different equal-size books, i.e., each book has 10K trades. The job here is to evaluate all 10 books on the DataSynapse GridServer cluster in parallel. This means that a job consists of 10 tasks, and each task fetches 10K trades from the clustered space and performs some calculation over them. For simplicity, we just sum the ids of the trades and return the result.
The demo runs a constant flow of such jobs and constantly measures job completion latency. To illustrate the advantages of data aware routing we introduced 3 different task scheduling modes. In “Data aware” mode we perform data aware routing, ensuring local space access for the engine. In “Neutral” mode we leave scheduling unguided, so GridServer places tasks as it sees fit. In “Anti data aware” mode we deliberately violate data awareness and force space access over the network. The user can change the task scheduling mode on the fly and see its impact on performance.
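The shape of this workload can be sketched in plain Python. The sketch below is only an assumption-laden stand-in: a dictionary plays the role of the clustered space, the tasks run sequentially instead of on grid engines, and the function names (`build_dataset`, `evaluate_book`, `run_job`) are hypothetical.

```python
from collections import defaultdict

NUM_BOOKS = 10
TRADES_PER_BOOK = 10_000


def build_dataset():
    """Create 100K toy trades grouped by bookId (xmlData left empty)."""
    space = defaultdict(list)
    for trade_id in range(NUM_BOOKS * TRADES_PER_BOOK):
        book_id = trade_id % NUM_BOOKS
        space[book_id].append({"id": trade_id, "bookId": book_id, "xmlData": ""})
    return space


def evaluate_book(space, book_id):
    """One task: fetch all trades of a book and sum their ids."""
    return sum(trade["id"] for trade in space[book_id])


def run_job(space):
    """One job: ten tasks, one per book (run sequentially in this sketch)."""
    return {book_id: evaluate_book(space, book_id) for book_id in range(NUM_BOOKS)}


space = build_dataset()
results = run_job(space)
print(len(results), "tasks completed")
```

In the real demo each `evaluate_book` call would be a separate task on a grid engine, and the fetch would go to the clustered space rather than to a local dictionary.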

The chart on the right side of the demo screen shows the latencies of recently submitted jobs. Bars marked with "+" show the latency of jobs submitted in data aware mode, bars marked with "*" show neutral mode, and bars marked with "-" show anti data aware mode. For this kind of job we see a 3x performance gain when using data aware routing over anti data aware routing. Neutral routing performance is unstable because random scheduling sends some tasks to the right node by chance. The probability of that, however, becomes negligible as the number of nodes grows.
This 3x performance gain agrees well with the standard remote vs. local space performance benchmark on the GigaSpaces site.
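To put a rough number on the neutral case, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that unguided scheduling picks a node uniformly at random and independently for each task; real GridServer scheduling may behave differently. Under that assumption, a task lands on the node holding its data with probability 1/N on an N-node cluster.

```python
from fractions import Fraction


def p_task_local(num_nodes: int) -> Fraction:
    """Chance that one randomly placed task lands on the node with its data."""
    return Fraction(1, num_nodes)


def expected_local_tasks(num_nodes: int, tasks_per_job: int = 10) -> Fraction:
    """Expected number of tasks in a job that happen to run locally."""
    return tasks_per_job * p_task_local(num_nodes)


for n in (2, 5, 10, 50, 100):
    print(n, float(p_task_local(n)), float(expected_local_tasks(n)))
```

With 10 tasks per job, the expected number of "lucky" local tasks drops from 5 on a 2-node cluster to 0.1 on a 100-node cluster, so neutral mode converges toward anti data aware behavior as the grid grows.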
What's under the hood?
At the heart of the data aware routing implementation there are 2 components: Monitor and DataAwareService.
Monitor server
Monitor continuously watches the GigaSpaces XAP cluster and gathers information about the clustered spaces deployed on the grid. Monitor “knows” on which host each GS partition is located, i.e., it knows the cluster topology.
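As a rough illustration of the kind of state such a component keeps, here is a minimal sketch. `ToyMonitor`, its methods, the space name, the host IPs, and the modulo mapping from routing key to partition are all hypothetical. The real Monitor discovers the topology from the running cluster, which this sketch does not attempt.

```python
class ToyMonitor:
    """Hypothetical stand-in for Monitor: tracks which host runs each partition."""

    def __init__(self):
        # space name -> list of host IPs, indexed by partition number
        self._topology = {}

    def update(self, space_name, partition_hosts):
        """Record the latest observed placement of a space's partitions."""
        self._topology[space_name] = list(partition_hosts)

    def host_for(self, space_name, routing_key):
        """Return the host IP of the partition that owns the routing key."""
        hosts = self._topology[space_name]
        return hosts[routing_key % len(hosts)]  # assumed modulo routing


monitor = ToyMonitor()
monitor.update("tradeSpace", ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(monitor.host_for("tradeSpace", 7))  # 7 % 3 == 1, so the second host

# If a partition moves to another host, a later update changes the advice.
monitor.update("tradeSpace", ["10.0.0.1", "10.0.0.9", "10.0.0.3"])
print(monitor.host_for("tradeSpace", 7))
```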
DataAwareService
DataAwareService is a thin client-side wrapper over the DataSynapse Service with a similar interface. DataAwareService is responsible for providing the correct GridServer Condition, which routes tasks to the proper Engines. It asks Monitor for the IP address of the host running the relevant space partition. Client code just supplies the space name and routing key on service invocation, and the task gets routed to the proper Engine.
Below is a diagram of the individual task routing workflow:

1. The demo backend invokes DataAwareService, providing the space name and routing key.
2. DataAwareService queries Monitor for advice about where to route a task with the given space name and routing key.
3. Monitor returns its advice as the IP address of the host running the relevant GS partition.
4. DataAwareService creates a Condition for GridServer to run the task on the given node and schedules it.
5. GridServer invokes the task on one of the engines on the proper host.
6. The task connects to the GigaSpaces partition and fetches the data.
7-10. Engines process the data and return the results.
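The steps above can be sketched end to end in a few lines of Python. Everything here is a hypothetical stand-in rather than the GridServer or GigaSpaces API: `ToyGrid` plays the scheduler, a plain dictionary plays Monitor's topology, a Python predicate plays the Condition, and the modulo mapping from routing key to partition is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Engine:
    name: str
    host: str


class ToyGrid:
    """Hypothetical scheduler: runs a task on the first engine meeting a condition."""

    def __init__(self, engines):
        self.engines = engines

    def submit(self, task, condition):
        for engine in self.engines:
            if condition(engine):
                return engine.name, task()
        raise RuntimeError("no engine satisfies the routing condition")


class ToyDataAwareService:
    """Hypothetical wrapper that turns (space, routing key) into a host condition."""

    def __init__(self, grid, topology):
        self.grid = grid
        self.topology = topology  # space name -> host IPs indexed by partition

    def invoke(self, task, space_name, routing_key):
        hosts = self.topology[space_name]          # steps 2-3: ask for routing advice
        target = hosts[routing_key % len(hosts)]   # assumed modulo routing

        def condition(engine):                     # step 4: build the condition
            return engine.host == target

        return self.grid.submit(task, condition)   # step 5: schedule on that host


engines = [Engine("engine-a", "10.0.0.1"), Engine("engine-b", "10.0.0.2")]
topology = {"tradeSpace": ["10.0.0.1", "10.0.0.2"]}
service = ToyDataAwareService(ToyGrid(engines), topology)

# Step 1: the client supplies only the space name and the routing key.
engine_name, result = service.invoke(lambda: sum(range(5)), "tradeSpace", routing_key=3)
print(engine_name, result)
```

The client never names a host. It passes a space name and a routing key, and the wrapper translates them into a placement constraint, which is the property that keeps DataAwareService a thin layer over the ordinary service interface.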
Conclusion
Data aware routing offers a good way to increase performance in a combined computational grid/IMDG environment. The routing scenario described here applies not only to the GigaSpaces + DataSynapse combination, but to basically any IMDG that supports data affinity (like Coherence) and any grid engine that supports conditional routing (like SGE, LSF, etc.).
We are going to put our demo on the web, so it will be available online soon.
Stay tuned!
References:
- Data awareness and Low Latency on The EG by Nati Shalom, CTO at GigaSpaces
- Bridging The Paradigms: Convergence of Compute Grids with In-Memory Data Grids by Victoria Livschitz, CEO at GridDynamics
Labels: convergence, data synapse, gigaspaces, grid computing, ~Eugene Steinberg
