Semantic grids
Howdy, I'm Stan, with a few quick words about the present and future of grid management.
First, why do we need grid management? Well, the answer is simple: we need it for essentially the same reason we need any kind of management. When the grid is big, you can manage it as a whole. When the grid is too big, you can delegate the management to a small set of nodes. When you have several different grids, each of them too big, you're in trouble.
Well, you're not in big trouble if all of your grids are homogeneous and are not partitioned vertically, so that the management issues mostly involve scaling. But what if each of your grids serves a different purpose, is managed by its own subsystem and, what's most painful, is provided by a different vendor with few or no integration points at all?
One possible solution is to build a layer on top of the service subsystems of each grid. (Yes, every problem in software engineering can be solved by introducing another level of indirection :) ) What's important is that this level can be totally different from everything that lies beneath it. It can be built to manipulate terms like “business processes”, “patterns”, “policies”, “configuration” and that sort of high-level stuff.
But that's not new. It's natural for the level sitting on top to operate with higher-level metaphors than the levels below it. How could one integrate this level with what's underneath?
The idea we're working on at present is to augment the service layers of the different grids with a common interface. It's not clear, however, how common the interface should be. Isn't it just another “perfect protocol” that no one bothers to support?
Well, yes and no. It's a “perfect meta-protocol”, actually. It comes from the W3C and is called RDF. I'm not going to dive into details here; there's enough information about it on the net. Those who are not familiar with its concepts can think of it as a super-XML that lets us embed semantics in the document, not only the markup. (Yeah, I know that's not correct… technically.)
So what do we need from a grid's service layer to include it in our “semantic layer”? Hardly anything: it only has to format the data it can provide the RDF way. Relational databases, REST and SOAP services, configuration files: all of that can be augmented to present RDF in almost no time.
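To make "the RDF way" a bit more concrete, here is a minimal sketch of how one row from a relational table could be presented as RDF triples and written out in N-Triples syntax. The namespace, the `nodes` table and its columns are made up for illustration, and every value is emitted as a plain string literal to keep things short.

```python
# Hypothetical namespace; any URI prefix you control would do.
BASE = "http://example.org/grid/"


def row_to_triples(table, key, row):
    """Turn one relational row (a dict) into (subject, predicate, object) triples.

    The subject URI is built from the table name and the primary key;
    every other non-null column becomes one predicate/object pair.
    """
    subject = f"{BASE}{table}/{row[key]}"
    triples = []
    for column, value in row.items():
        if column == key or value is None:
            continue
        triples.append((subject, f"{BASE}{table}#{column}", str(value)))
    return triples


def escape_literal(text):
    """Escape the characters that N-Triples string literals require escaping."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_ntriples(triples):
    """Serialize triples as N-Triples lines, all objects as string literals."""
    return "\n".join(
        f'<{s}> <{p}> "{escape_literal(o)}" .' for s, p, o in triples
    )


if __name__ == "__main__":
    # An invented row, as it might come back from a grid's node inventory table.
    row = {"id": 7, "host": "node7", "state": "running"}
    print(to_ntriples(row_to_triples("nodes", "id", row)))
```

The same shape works for a parsed configuration file or a REST response: pick a subject URI for the thing being described and emit one triple per property.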
That semantic meta-layer becomes a grid itself. It has access to all of the information required to operate a grid, but has no idea what to do with it.
This is where higher-level protocols come into play. The semantic cloud is about to learn what “restart” is, how a given service is “started”, and what it means for a process to be “frozen”. The level of detail is yours to choose. The more detailed it is, the more powerful your semantic cloud is. Just don't allow it to attempt world domination :).
Well, that's a large set of data to input. We want our semantic cloud to become our knowledge repository, to keep formalized knowledge of everything that is implicit in the heads of operators. Fortunately, we don't need to put every bit of that knowledge into the cloud. We can provide it with explicit knowledge and let it infer the implicit knowledge itself, correcting it if it's wrong.
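As a rough illustration of inferring implicit knowledge from explicit facts, the sketch below forward-chains two RDFS-style rules (subclass transitivity and type inheritance) over a handful of triples. It is nowhere near a real description-logic reasoner, and the `grid:` names are invented for the example.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def infer(triples):
    """Forward-chain two RDFS-style rules until no new triples appear.

    Rule 1: A subClassOf B, B subClassOf C  =>  A subClassOf C
    Rule 2: x type B,       B subClassOf C  =>  x type C
    Returns the explicit facts together with everything derived from them.
    """
    facts = set(triples)
    while True:
        new = set()
        for (a, p1, b) in facts:
            for (c, p2, d) in facts:
                if b != c or p2 != SUBCLASS:
                    continue
                if p1 == SUBCLASS:
                    new.add((a, SUBCLASS, d))
                elif p1 == TYPE:
                    new.add((a, TYPE, d))
        if new <= facts:  # nothing new was derived: fixed point reached
            return facts
        facts |= new


if __name__ == "__main__":
    # Explicit knowledge, as an operator might enter it (hypothetical vocabulary).
    explicit = [
        ("grid:node7", TYPE, "grid:ComputeNode"),
        ("grid:ComputeNode", SUBCLASS, "grid:Node"),
        ("grid:Node", SUBCLASS, "grid:ManagedResource"),
    ]
    for triple in sorted(infer(explicit) - set(explicit)):
        print(triple)
```

Nobody told the system that node7 is a managed resource; it follows from the three facts that were entered.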
(It may sound like neural networks or a genetic approach, but it's not. It's about description logic, actually.)
Now your semantic cloud knows about every bit of your system. It can trace every job coming into and out of your cluster, can optimize load balancing based on the data it receives in real time, can bring up its perfect copy in no time if a failure is detected, answers your questions and poses its own :). What's next?
Well… there are two ways to go here. One way is down the stack, replacing some of the high-level service layers of the grid with its own parts. The other is upwards, sharing some of the data with peer clouds and some day forming The Grid, which could redefine grid computing the same way The Internet redefined bulletin boards.
Hope you've enjoyed this essay. I'll continue later, focusing on the approaches and technologies we've selected for our semantic affairs.
Labels: management, semantic web, ~Stan Klimoff