July 13, 2009

OpenLDAP performance testing

Introduction

Our team is working on the metadata repository for a large-scale grid file storage project. This repository should make grid storage searchable and browsable. Since grid storage is expected to hold huge amounts of data, the metadata repository must be able to store large amounts of metadata as well.

In our solution the metadata is typed, heterogeneous and hierarchical; the metadata storage should be searchable, replicated and scalable. We also have a strict requirement that new entry types be definable at runtime. Taking these considerations into account, we chose an LDAP-based solution. Since we also had a requirement not to be tied to any particular vendor, OpenLDAP was a natural choice.

We ran a number of performance tests to examine how OpenLDAP behaves with large numbers of entries, and we share some of the results below. We do not go into the details of the test machine's configuration, because absolute timing values say little about the scalability of a solution. The trends in time consumed are what matter here.

The test bench was a single-core Linux box running OpenLDAP 2.4.16 (the latest version at the time of testing). OpenLDAP was built from source, as recommended by the OpenLDAP documentation. The native HDB backend (based on Berkeley DB, with a hierarchical database layout that supports subtree renames) was used as the database backend. All tests connected to the OpenLDAP server through the loopback interface (127.0.0.1) to rule out network bandwidth saturation.

For read performance measurements we used three LDAP entry attributes:

  • dn (distinguished name) - distinguished name of the LDAP entry, identifying an entry and its location.
  • cn (common name) - general string attribute of the LDAP entry.
  • entryUUID - built-in, server-maintained unique entry identifier. Unlike the dn, it remains unchanged when an entry is moved from one subtree to another.

Directory reading

Generally speaking, OpenLDAP finds an entry using an index much faster than it delivers that entry to the client over the network. A more interesting question is how OpenLDAP handles concurrent reads and writes as the number of clients grows.
Test I. Concurrent reading
The case: the OpenLDAP directory consists of a flat list of about 100K entries indexed by cn. Each client sequentially queries 1000 random existing entries by dn, cn and entryUUID. Between 1 and 20 clients run simultaneously. Three curves on the graph show total time against the number of concurrent clients in three different scenarios:
  • search by dn;
  • search by cn;
  • search by entryUUID.
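The three scenarios above can be sketched as three sets of search parameters. This is an illustration only, not the tool used for the measurements: the `SearchSpec` type and the function names are hypothetical, and the choice of a one-level search under the parent for the cn and entryUUID lookups is an assumption based on the flat layout of the test directory.

```python
from typing import NamedTuple


class SearchSpec(NamedTuple):
    """Parameters of one LDAP search request (hypothetical helper type)."""
    base: str    # entry the search starts from
    scope: str   # "base" = that entry only, "one" = its direct children
    filter: str  # search filter in RFC 4515 string form


def escape_filter_value(value: str) -> str:
    """Escape the characters that RFC 4515 reserves inside a filter value."""
    return "".join(
        "\\%02x" % ord(ch) if ch in "\\*()\0" else ch
        for ch in value
    )


def by_dn(dn: str) -> SearchSpec:
    # Direct read: the DN itself is the search base, so no filter lookup is needed.
    return SearchSpec(dn, "base", "(objectClass=*)")


def by_cn(parent_dn: str, cn: str) -> SearchSpec:
    # Search among the children of the parent by the indexed cn attribute.
    return SearchSpec(parent_dn, "one", "(cn=%s)" % escape_filter_value(cn))


def by_entry_uuid(parent_dn: str, entry_uuid: str) -> SearchSpec:
    # Assumption: same one-level search, filtering on entryUUID instead of cn.
    return SearchSpec(parent_dn, "one",
                      "(entryUUID=%s)" % escape_filter_value(entry_uuid))
```

Each `SearchSpec` maps onto the base, scope and filter arguments that any LDAP client library expects, so the same three functions could drive whichever client is used for the load.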
The results are shown in two charts: one with absolute timing values, and one normalized by dividing each timing value by the number of concurrent clients.
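A minimal sketch of how such a measurement could be taken, assuming a thread per client and a caller-supplied `do_request` callable that performs one LDAP operation. The function names are hypothetical and this is not the utility used for the tests; it only shows the shape of the measurement and the normalization.

```python
import threading
import time
from typing import Callable, List, Tuple


def run_clients(n_clients: int, requests_per_client: int,
                do_request: Callable[[int, int], None]) -> float:
    """Run n_clients threads, each issuing its requests sequentially.

    Returns the total wall-clock time in seconds until all clients finish.
    """
    def client(client_id: int) -> None:
        for i in range(requests_per_client):
            do_request(client_id, i)

    threads = [threading.Thread(target=client, args=(c,))
               for c in range(n_clients)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def sweep(max_clients: int, requests_per_client: int,
          do_request: Callable[[int, int], None]
          ) -> List[Tuple[int, float, float]]:
    """Measure 1..max_clients concurrent clients.

    Each row is (clients, total_seconds, total_seconds / clients), i.e. the
    absolute value and the value normalized by the number of clients.
    """
    rows = []
    for n in range(1, max_clients + 1):
        total = run_clients(n, requests_per_client, do_request)
        rows.append((n, total, total / n))
    return rows
```

Calling `sweep(20, 1000, do_request)` would produce one row per point on the two charts. A flat normalized curve means total time grows in proportion to the number of clients; a falling one means the server benefits from concurrency.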



Searching without indices was not tested because it is too slow on big data sets. As we can see, searching by distinguished name takes much less time than searching among children of the parent by indexed attribute. Another conclusion is that it is better to avoid searching by entryUUID despite the convenience of this attribute.

Directory reading combined with writing

Test II. Concurrent reading & writing
The case: the OpenLDAP directory consists of a flat list of about 100K entries. Each client sequentially performs 1000 requests. Each request is either a read of a random existing entry by dn or the creation of a new entry. Three mixes were tested: 99.9%, 99.0% and 90.0% reads, with writes making up the remainder. Between 1 and 20 clients were run simultaneously.

So, we have three curves showing total time with regards to the number of concurrent clients in three different scenarios:
  • 99.9% reads; 0.1% writes;
  • 99.0% reads; 1% writes;
  • 90.0% reads; 10% writes.
The results are shown in two charts: one with absolute timing values, and one normalized by dividing each timing value by the number of concurrent clients.
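The read/write mixes can be sketched as a per-client request plan. The function below is hypothetical, and it assumes the write share is fixed exactly per client and then shuffled; the original tool may just as well have chosen each operation at random with the given probability.

```python
import random
from typing import List


def make_workload(n_requests: int, write_fraction: float,
                  seed: int = 0) -> List[str]:
    """Build a shuffled sequence of "read"/"write" operations.

    write_fraction is the share of writes, e.g. 0.001 for the
    99.9%-read scenario. The seed makes the plan reproducible.
    """
    n_writes = round(n_requests * write_fraction)
    ops = ["write"] * n_writes + ["read"] * (n_requests - n_writes)
    random.Random(seed).shuffle(ops)
    return ops
```

With 1000 requests per client, the three scenarios correspond to 1, 10 and 100 writes per client respectively.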






These data contradict the OpenLDAP 2.1.21 measurements of Andreas Krennmair and Rainer Lischka (http://www.lizenzfrei.at/downloads/ldap-paper.pdf), in which time consumption grows linearly with the number of concurrent clients. Moreover, there is no big difference between the 90%-read and 99.9%-read curves. We can therefore say that OpenLDAP performance has improved significantly since version 2.1.21.

Directory writing

Test III. Directory writing with indices
The case: the OpenLDAP directory is empty at the beginning of the test. It is indexed by cn and objectClass. A single client uploads batches of 100 entries at a time under one particular parent and measures the upload time of each batch. After 50 000 entries have been uploaded, the client uploads another 50 000 entries under a second parent in the same manner. Then the directory contents are erased and 50 000 entries are uploaded again in batches of 100, with each batch stored under a different, randomly named parent. So, we have three curves:
  • Upload under the first parent, depending on the number of entries already existing under this parent.
  • Upload under the second parent, depending on the number of entries already present under this parent (not counting the 50K entries that already exist under the first parent).
  • Upload under different parents, depending on the total number of entries in the directory tree.
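The upload procedure can be sketched as a generator of LDIF batches plus a timing loop. All names here are hypothetical, the `organizationalRole` object class is only an assumed stand-in for whatever entry type the tests used, and `upload` is a caller-supplied callable that sends one batch to the server.

```python
import time
from typing import Callable, Iterator, List, Tuple


def make_entry_ldif(parent_dn: str, cn: str) -> str:
    """One LDIF record. cn must not contain characters that need DN escaping."""
    return ("dn: cn=%s,%s\n" % (cn, parent_dn)
            + "objectClass: organizationalRole\n"  # assumed entry type
            + "cn: %s\n" % cn)


def batches(parent_dn: str, total: int,
            batch_size: int = 100) -> Iterator[str]:
    """Yield LDIF text for successive batches of entries under one parent."""
    for start in range(0, total, batch_size):
        stop = min(start + batch_size, total)
        # Records already end with a newline, so joining with "\n"
        # leaves the blank line that LDIF requires between records.
        yield "\n".join(make_entry_ldif(parent_dn, "entry%06d" % i)
                        for i in range(start, stop))


def time_batches(parent_dn: str, total: int,
                 upload: Callable[[str], None],
                 batch_size: int = 100) -> List[Tuple[int, float]]:
    """Upload batch by batch; return (entries_already_present, seconds)."""
    timings = []
    for n, ldif in enumerate(batches(parent_dn, total, batch_size)):
        start = time.perf_counter()
        upload(ldif)
        timings.append((n * batch_size, time.perf_counter() - start))
    return timings
```

The first element of each pair is the x-coordinate of the curves described above: how many entries already exist under the parent when the batch is uploaded.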

Storing data as a tree is clearly preferable to storing it as a flat list: in this test it was 2-3 times faster.
Test IV. Directory writing without indices
The case: the OpenLDAP directory is empty at the beginning of the test. It is not indexed at all. A single client uploads batches of 100 entries at a time under one particular parent and measures the upload time of each batch. After 50 000 entries have been uploaded, the client uploads another 50 000 entries under a second parent in the same manner. So, we have two curves:
  • Upload under first parent.
  • Upload under second parent.

As we can see, the number of children of one parent barely affects the creation time of objects under another parent. When objects are created under one particular parent, however, creation time grows with the number of children that parent already has. This supports the conclusion drawn in Test III that storing data as a tree is preferable to storing it as a flat list.
Test V. Lots of objects created with and without indices
The case: the OpenLDAP directory is empty at the beginning of the test. It is not indexed at all. A single client uploads batches of 1000 entries at a time under different parents and measures the upload time of each batch. After 500 000 entries have been uploaded, the directory contents are erased, indices on cn and objectClass are added, and 500 000 entries are uploaded again in the same manner. So, we have two curves:
  • Upload without indices.
  • Upload with indices.

As we can see, indexed and non-indexed write performance become similar at large directory sizes.

Dynamic schema extension

Test VI. LDAP schema writing
The case: the OpenLDAP directory is empty. The OpenLDAP schema is the default core schema. A single client creates object classes using an ldapmodify operation. Each operation adds 100 object classes, and the execution time of each operation is measured.
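One way to produce the input for such an operation is to generate a single LDIF modify record that adds 100 objectClass definitions. The sketch below assumes the schema is managed through the dynamic cn=config backend, where definitions are stored as `olcObjectClasses` values on a schema entry. The schema entry DN, the OID arc and the class names are placeholders, not values from the tests.

```python
def objectclass_definition(oid_arc: str, index: int) -> str:
    """Definition of one minimal structural class that only requires cn."""
    return ("( %s.%d NAME 'testClass%d' SUP top STRUCTURAL MUST cn )"
            % (oid_arc, index, index))


def schema_modify_ldif(schema_dn: str, oid_arc: str,
                       first: int, count: int = 100) -> str:
    """LDIF modify record adding `count` object classes in one operation.

    schema_dn is the schema entry to extend (placeholder in the usage
    example); oid_arc is the OID prefix under which classes are numbered.
    """
    lines = [
        "dn: %s" % schema_dn,
        "changetype: modify",
        "add: olcObjectClasses",
    ]
    lines += ["olcObjectClasses: %s" % objectclass_definition(oid_arc, i)
              for i in range(first, first + count)]
    lines.append("-")  # terminates the "add" modification
    return "\n".join(lines) + "\n"
```

For example, `schema_modify_ldif("cn={1}test,cn=schema,cn=config", "1.1.2.2", 1)` yields the first record, and calling it again with `first=101`, `first=201` and so on yields the following ones; timing each ldapmodify run gives one point on the curve.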

As we can see, the time to extend the schema grows linearly with the total number of objectClass entries. This may be a bottleneck for LDAP system scalability: the directory tree can be replicated or clustered in some way, but the LDAP schema cannot be split into parts. There is some good news, though: objectClass creation time does not depend on directory data size. (This test gives exactly the same results on an empty directory and on a directory filled with 100 000 objects.)

Conclusion

In our performance tests OpenLDAP showed pretty good results in most test cases. We were glad to see a dramatic improvement in concurrent write performance since version 2.1. There are still a couple of issues with the current version, however.

First, the developer of an OpenLDAP client always faces a dilemma: either identify entries by DN and risk losing track of renamed or moved entries, or identify them by entryUUID and accept much lower performance. This problem looks very hard to solve because of the distributed structure of the directory tree.

The second, and most important, issue is the poor performance of dynamic LDAP schema extension. In addition, the slapd daemon starts very slowly when the directory schema contains 10K objectClasses: on our test machine, OpenLDAP startup took several minutes even with slaptest disabled.

We are eagerly awaiting schema management performance improvements in future releases.

2 Comments:

Anonymous said...

what tool did you use then for the testing?

December 22, 2009 8:45 AM

Dmitry Korotkov said...

I used trivial load generation utility written in C.

March 9, 2010 2:55 AM  
