Mathematics of Grid Computing
Today, I came across the recent blog entry by Nikita Ivanov of GridGain: Scale Out on Grid = Data Partitioning + Affinity Map/Reduce. It is an interesting read, and it inspired the comments I posted under that entry, copied here:
So, Nikita, you propose two equations:
(1) Grid Computing = Compute Grid + Data Grid
(2) Scale out on grid = Data Partition + Affinity MapReduce
Both are interesting and important statements that should be carefully analyzed. The first equation basically comes down to data-aware scheduling, which (as you eloquently explained in last week’s blog entry) means that the algorithm that assigns jobs to specific compute resources must take into account the distribution of data over the grid and the affinity between each job and the data partitions.
Reasonable data-aware schedulers are still rare, and I am very glad to see GridGain coming up with a commercial implementation. We recently built a demo that measures the performance of a “typical” job with and without data affinity. Performance changes by a factor of 2 to 3, simply based on data-aware routing being switched on or off. Clearly, this is a very important direction for grid middleware; I am convinced that data-aware, affinity-capable grid middleware will someday become mainstream.
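To make the idea of data-aware scheduling concrete, here is a minimal sketch in Python. It is not GridGain's API or the demo mentioned above; the node names, the partition map, and the functions `schedule` and `remote_reads` are all hypothetical. It only contrasts placing a job on the node that owns its data with placing it round-robin, and counts how many jobs would have to pull their data across the network in each case.

```python
# Hypothetical cluster layout (an assumption for illustration):
# which node holds which data partition.
PARTITION_OWNER = {"p0": "node-a", "p1": "node-b", "p2": "node-c"}
NODES = ["node-a", "node-b", "node-c"]


def schedule(jobs, data_aware=True):
    """Assign each (job_id, partition) pair to a node.

    With data_aware=True a job goes to the node that owns its partition;
    otherwise jobs are spread round-robin, ignoring data placement.
    """
    assignment = {}
    for i, (job_id, partition) in enumerate(jobs):
        if data_aware:
            assignment[job_id] = PARTITION_OWNER[partition]
        else:
            assignment[job_id] = NODES[i % len(NODES)]
    return assignment


def remote_reads(jobs, assignment):
    """Count jobs that must fetch their partition from another node."""
    return sum(
        1
        for job_id, partition in jobs
        if assignment[job_id] != PARTITION_OWNER[partition]
    )


# Each job names the single partition it needs (a simplifying assumption).
jobs = [("j0", "p1"), ("j1", "p1"), ("j2", "p0"), ("j3", "p2"), ("j4", "p0")]

aware = schedule(jobs, data_aware=True)
blind = schedule(jobs, data_aware=False)

print("remote reads, data-aware :", remote_reads(jobs, aware))
print("remote reads, round-robin:", remote_reads(jobs, blind))
```

A real scheduler would also have to weigh node load against locality; this sketch deliberately ignores load to isolate the affinity effect.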
Let us not forget that all this concerns job-centric processing. For throughput computing that operates under a shower of real-time transactions, the equivalent of “data-aware scheduling” of jobs would be “data-aware routing” of these transactions. Mainstream Data Grid middleware, like GigaSpaces and Oracle Coherence, has long been able to handle this scenario. Nowadays, GigaSpaces is moving towards support of “data-aware scheduling” through the concept of the Processing Unit on the Service Grid.
Now, your second equation raises a question. Are you talking about scaling out the data grid in a static or a dynamic way? In other words, is the objective to allow for an arbitrarily large number of partitions fixed “a priori”, with scalable MapReduce, or to be able to adjust the number of partitions dynamically in response to sporadic jumps in the payload across the entire grid fabric? If it’s the former, then data partitioning with affinity is the traditional answer. If it’s the latter, well, then we need to solve a lot of hard problems for stateful, data-aware services that are outside of your equation. I see dynamic scaling of stateful services as an increasingly important area of research and commercial implementation. Sun’s project Hedeby is an interesting step in this direction.
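The static case can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any product's implementation: `NUM_PARTITIONS`, `partition_of`, `load`, and `affinity_map_reduce` are hypothetical names, and the "nodes" are simply entries in a local list. The point is that the partition count is fixed up front, each map step touches only its own partition, and only the small partial results travel to the reducer.

```python
import zlib

# Fixed "a priori" (an assumption of the static case): changing this
# value would move keys between partitions and force a re-partitioning.
NUM_PARTITIONS = 4


def partition_of(key):
    """Map a string key onto a partition with a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def load(records):
    """Distribute (key, value) records across the fixed partitions."""
    partitions = [dict() for _ in range(NUM_PARTITIONS)]
    for key, value in records:
        partitions[partition_of(key)][key] = value
    return partitions


def affinity_map_reduce(partitions, map_fn, reduce_fn):
    """Run map_fn once per partition, then combine the partial results.

    In a real grid each map_fn call would execute on the node that holds
    the partition; here the calls simply run in a local loop.
    """
    partials = [map_fn(partition) for partition in partitions]
    return reduce_fn(partials)


records = [("alice", 10), ("bob", 20), ("carol", 30), ("dave", 40)]
partitions = load(records)

# Sum all values: each partition sums locally, the reducer adds the partials.
total = affinity_map_reduce(partitions, lambda p: sum(p.values()), sum)
print("total:", total)
```

The dynamic case is exactly what this sketch cannot do: because `partition_of` depends on `NUM_PARTITIONS`, changing the partition count at runtime relocates existing state, which is where the hard problems for stateful services begin.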
Labels: convergence, data synapse, gigaspaces, grid computing, ~Victoria Livschitz
