October 23, 2007

Fura, open source grid computing middleware – One-day impression

Recently I came across one particularly interesting grid technology: Fura. This is an open source product of GridSystems, a company based in Spain…

I have had a chance to play with it for a day.

Conceptually, Fura is a collection of collaborating web services built on top of a WSDL/SOAP technology stack, and its entire architecture follows the Web Services/SOA paradigm. This approach offers clear benefits, such as easy standards-based integration: SOAP is well known, and a C++/Java SDK ships with the product. The only obvious disadvantage of using SOAP for all communications is that the TCP+HTTP+XML+SOAP stack introduces an unavoidable communication overhead of about 40 ms, which makes the system unsuitable for extreme transaction processing scenarios. However, if the system is used for processing long jobs (from seconds to hours), SOAP communication costs become negligible.
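
To put the overhead in perspective, here is a back-of-the-envelope sketch in Python. It assumes one SOAP round trip per job and takes the ~40 ms figure above at face value; the function name is mine and not part of Fura.

```python
def overhead_fraction(job_seconds, overhead_seconds=0.040):
    """Share of total wall-clock time spent on communication overhead.

    Assumes a single round trip per job (a simplification) and the
    ~40 ms overhead figure quoted in the post.
    """
    return overhead_seconds / (job_seconds + overhead_seconds)


# Compare a transaction-sized job with progressively longer batch jobs.
for job_seconds in (0.010, 1.0, 60.0, 3600.0):
    share = overhead_fraction(job_seconds)
    print(f"{job_seconds:>9.3f} s job: {share:.4%} of total time is overhead")
```

For a 10 ms transaction the overhead dominates the total time, while for an hour-long job it is lost in the noise.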

Installation of Fura went smoothly using its friendly text-mode installer. A GUI installer is available on platforms that support it. The only deployment disadvantage I noticed is that all Fura services currently use static IP addresses to communicate with each other, which is inconvenient in many environments. The Fura developers plan to support DNS names in the next minor release.

Fura features a very nice, modern-looking web-based GUI, which allows you to inspect and change all needed system parameters, submit jobs, and access the Virtual File System (VFS), another useful Fura feature. The VFS is basically a folder hierarchy on the master machine in the Fura cluster that is exposed via web services to the other cluster machines. It has full ACL support and can be used as a file sharing service by job processing agents, with performance similar to FTP.

Each slave host in the Fura grid runs one or several lightweight agents; each agent can do one job at a time. The number of agents defaults to the number of CPUs. Agents report a predefined set of attributes, such as CPU utilization and memory consumption, to the master scheduler service, which uses this information to schedule tasks fairly. Custom attributes are supported by assigning keywords to hosts; e.g., some hosts can be tagged as “fast”, or as “excel” to indicate an Excel installation. Resource requirements can then be added to the job description, and the scheduler will use substring matching to find an appropriate host. Unfortunately, this model doesn't support custom attributes that require numeric matching, such as the size of the temporary space or the number of logged-in users.
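
The keyword model can be illustrated with a minimal Python sketch. This is not Fura's scheduler code or API; the host names, the keyword strings and the `matching_hosts` function are hypothetical, and the only thing taken from Fura is the idea of substring matching on host keywords.

```python
def matching_hosts(hosts, required_keywords):
    """Return the hosts whose keyword string contains every required keyword.

    `hosts` maps a host name to its free-form keyword string; matching is
    plain substring search, as described in the post.
    """
    return [
        name
        for name, keywords in hosts.items()
        if all(required in keywords for required in required_keywords)
    ]


# Hypothetical grid: two hosts have Excel, one advertises temp space as text.
hosts = {
    "node-a": "fast excel",
    "node-b": "fast",
    "node-c": "excel tmp=500",
}

print(matching_hosts(hosts, ["excel"]))          # hosts with Excel installed
print(matching_hosts(hosts, ["fast", "excel"]))  # fast hosts with Excel
print(matching_hosts(hosts, ["tmp=200"]))        # no numeric comparison possible
```

The last call shows the limitation: a requirement such as "at least 200 MB of temporary space" cannot be expressed, because "tmp=500" is only ever compared as text.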

Fura has a well-defined model for creating batch jobs, which offers several kinds of iterators that can iterate over indexes and filesets, thus providing an excellent integration framework for grid-enabling legacy applications. Also, with application packages stored on the VFS, Fura is able to provision the required software to slave hosts before running the job. The execution subsystem offers convenient support for grabbing task output and error streams, as well as other result files, and moving them to the VFS, where they are accessible to the job submitter.
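
The iterator idea can be sketched in a few lines of Python. This is only an illustration of how one job description fans out into many tasks; it is not Fura's job format, and the function names, the `{value}` placeholder and the command lines are all made up for the example.

```python
from fnmatch import fnmatchcase


def index_iterator(start, stop):
    """Yield one value per integer index in [start, stop)."""
    yield from range(start, stop)


def fileset_iterator(file_names, pattern):
    """Yield the file names that match a glob-style pattern (case-sensitive)."""
    return (name for name in file_names if fnmatchcase(name, pattern))


def expand_job(command_template, values):
    """Produce one task command line per iterated value."""
    return [command_template.format(value=value) for value in values]


# One task per frame index for a hypothetical legacy renderer.
print(expand_job("render --frame {value}", index_iterator(0, 3)))

# One task per matching input file from a hypothetical file listing.
files = ["jan.csv", "feb.csv", "notes.txt"]
print(expand_job("report {value}", fileset_iterator(files, "*.csv")))
```

The legacy program itself stays untouched; only the command line varies from task to task, which is what makes this style of batch job convenient for grid-enabling existing applications.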

Overall, Fura makes a good first impression. It has a well-defined architecture and API, and a nice web GUI. The main advantage of this system seems to be its integration capabilities, thanks to its SOAP-based architecture. Fura makes it very easy to grid-enable legacy applications and looks capable of supporting the computational grid needs of small-to-medium enterprises. For high-end job and transaction processing, however, the benefits of its web services architecture may quickly turn into performance and scalability problems.

Anyway, good job, GridSystems!
