Java tricks, reducing memory consumption
In this blog post, I want to discuss optimization of java memory usage. The Sun JDK has two simple-but-powerful tools for memory profiling -- jmap and jhat.
jmap has two important capabilities for memory profiling. It can:
• create a heap dump file for any live java process
• show a heap distribution histogram
Neither of these capabilities requires any special parameters for the Java virtual machine (JVM). Below is a heap distribution histogram produced by jmap.
3: 21461 13197200 [S
4: 148131 11862064
5: 231304 11562840
6: 448533 10764792 java.lang.String
7: 14801 9520800
8: 14801 6706536
9: 20178 6060872 [B
10: 250951 6022824 java.util.HashMap$Entry
11: 12530 5532096
12: 26538 4437384 [Ljava.util.HashMap$Entry;
13: 54179 4381328 [I
14: 73294 3939656 [Ljava.lang.Object;
15: 34872 3905664 org.eclipse.swt.graphics.TextLayout$StyleItem
16: 27360 1551928 [Ljava.lang.String;
17: 16056 1541376 java.lang.Class
18: 20222 1294208 org.eclipse.core.internal.resources.ResourceInfo
19: 47006 1128144 java.util.ArrayList
20: 25235 1009400 java.util.HashMap
21: 22911 983784 [[I
22: 13398 857472 org.eclipse.swt.custom.StyleRange
23: 14362 831832 [[C
24: 16264 780672 org.eclipse.ui.internal.handlers.HandlerActivation
25: 36992 591872 org.eclipse.ui.internal.handlers.LegacyHandlerListenerWrapper
26: 423 591304 [Lorg.eclipse.swt.custom.StyleRange;
27: 13427 537080 org.eclipse.jface.text.TreeLineTracker$Node
28: 20254 486096 org.eclipse.core.internal.dtree.DataTreeNode
29: 19700 472800 org.eclipse.swt.internal.win32.SCRIPT_ANALYSIS
30: 19700 472800 org.eclipse.swt.internal.win32.SCRIPT_STATE
31: 17108 410592 org.eclipse.jface.text.Line
32: 1255 401600
33: 16650 399600 java.util.Hashtable$Entry
34: 10952 350464 org.eclipse.ui.internal.services.EvaluationReference
35: 8665 346600 org.eclipse.ui.commands.HandlerSubmission
36: 10624 339968 java.util.TreeMap$Entry
37: 6598 307376 [Lorg.eclipse.swt.graphics.TextLayout$StyleItem;
38: 18520 296320 java.lang.Integer
39: 8567 274144 org.eclipse.ui.commands.ActionHandler
40: 11268 270432 org.eclipse.ui.internal.help.WorkbenchHelpSystem$6
...
Total 2894736 198472504
For each class we can see the class name, the number of instances of the class in the heap, and the number of bytes used in the heap by all instances of the class. The table is sorted by consumed heap space. Although it is very simple, it is extremely useful. This information is enough to diagnose 60% of heap capacity problems.
If a heap distribution histogram is too high-level, you can try jhat. jhat can read and explore a heap dump. It has a web interface and you can click though your dumped object graph. Of course, clicking through a few million objects is not fun. Fortunately, jhat supports query language. Let's give it a try.
>jmap -dump:file=eclipse.heap 16608
Dumping heap to C:\Program Files (x86)\Java\jdk1.6.0_11\bin\eclipse.heap ...
Heap dump file created
>jhat -J-Xmx1G eclipse.heap
Reading from eclipse.heap...
Dump file created Wed Oct 14 08:41:07 MSD 2009
Snapshot read, resolving...
Resolving 2300585 objects...
Chasing references, expect 460 dots.............................................
................................................................................
................................................................................
.......................................
Eliminating duplicate references................................................
................................................................................
................................................................................
....................................
Snapshot resolved.
Started HTTP server on port 7000
Server is ready.
Now open http://localhost:7000 and you can see a summary of all classes. You can use standard queries via links or go straight to the “Execute Object Query Language (OQL) query” link at the bottom and type your own query. Query language is quite awkward (it is based on the java script engine) and may not work well for large numbers of objects, but it is a very powerful tool.
Enterprise applications are 80% strings and maps
Let’s look at the jmap histogram again. In the top row, we can see "[C" class consuming most of the heap space. Actually, these char arrays are part of String objects, and we can see that String instances are also consuming considerable space. From my experience, 60-80% of heaps in an enterprise application are consumed by strings and hash maps.
Strings
Let's look how the JVM is storing strings. String objects are semantically immutable. Each instance has four fields (all except hash are marked final):
• Reference to char array
• Integer offset
• Integer count of character
• Integer string hash (lazily evaluated, and once evaluated never changes)
How much memory does one String instance consume?
Here and below are size calculations for the Sun JVM (32bit). This should be similar for other vendors.
Object header (8 bytes) + 3 refs (12 bytes) + int (4 bytes) = 24 bytes. But the String instance is only a header. Actual text is stored in char array (2 bytes each character + 12 bytes array header).

String instances can share char arrays with each other. When you call substring(…) on a String instance, no new char array allocation happens. Instead, a new String referencing subrange of existing char arrays is created. Sometimes this can become a problem. Imagine you are loading a text file (e.g., CSV). First you are loading the entre line as string, then you seek the position of the field and call substring(…). Now your small field value string object has a reference to an entry line of text. Sometime later, a string header for the text line object is collected, but characters are still in memory because they are referenced via other string object.

In the illustration above, useful data is marked by yellow. We can see that some characters cannot be accessed any more, but still occupy space in the heap. How can you avoid such problems?
If you are creating a String object with a constructor, a new char array is always allocated. To copy content of a substring you can use the following construct (it looks a bit strange, but it works):
new String(a.substring(…))
String.intern() – think twice
String class has a method intern() that can guarantee the following:
a.equals(b) => a.intern() == b.intern()
There is a table inside of the JVM used to store normal forms of strings. If some text is found in a table then value from a table will be returned from intern(), else the string will be added to table. So a reference returned by intern() is always an object from JVM intern table.
String intern table keep weak references to its objects, so unused strings can be collected as garbage when no other references exist except from intern table itself. It looks like a great idea to use intern(), and eliminate all duplicated strings in an application. Many have tried … and many have regretted such decision. I cannot say this is true for every JVM vendor, but if you are using Sun’s JVM, you should never do this. Why?
String intern tables in JVMs work perfectly for the JVM’s needs (new entries are added only while loading new classes so insertion time is not a big issue) and it is very compact. But it is completely unsuitable for storing millions of application strings. It just was not designed for such a use case.
Removing of duplicates
The JVM’s string intern table is not an option but the idea of eliminating string duplicates is very attractive. What can we do?
public
class InternTable {
private
final Map table = new HashMap();
public X intern(X val) {
X in = table.get(val);
if (in == null) {
table.put(val, val);
in = val;
}
return in;
}
}
Above is a simple snippet of a custom intern table. It can be used for strings or any other immutable objects. Looks good, but it has a problem. Such an implementation will prevent objects from being collected by GC. What can we do?
We need to use weak references. There are WeakHashMap classes in the JDK. It is a hash table that uses weak references for keys. Let's try to use it.
public
class InterTable {
private
final Map table = new WeakHashMap ();
public X intern(X val) {
WeakReference ref = table.get(val);
X in = ref == null ? null : ref.get();
if (in == null) {
table.put(val, new WeakReference(val));
in = val;
}
return in;
}
}
Did you notice we also need a weak reference for the wrap value? This will work, but what is the cost?
• Reference from hash table – 1 ref * ratio of unused slots (~6 bytes)
• WeakHashMap$Entry object – 40 bytes
• value WeakReference – 24 bytes
The cost per entry in the table is about 70 bytes. That’s expensive. Can we reduce it?
Right now we have to have two weak references per entry. If we rewrite WeakHashMap to return an entry from the get(...) method (instead of the value), we can drop the second weak reference and save 24 bytes. But the cost of such an intern table is still high. You should analyze/experiment with your data to see if such a trick will bring greater benefits in your case.
UTF8 strings
Java strings are UTF and encoded using UTF16 in memory. If they are to be converted to UTF8, the actual text is likely to consume half the memory. But using UTF8 will break compatibility with the String class. You have to create your own class and convert it to standard String for a majority of operations. This will increase CPU usage, but in some edge cases this approach can be useful for overcoming heap size limitations (e.g., storing large text indexes in memory).
Maps and sets
Good old java.util.HashMap is used everywhere. Standard implementation in JDKs is to use the open hash table data structure.

References to key and value are stored in the Entry object (which also keeps the hash for the key for faster access). If several keys are mapped to same hash slot, they are stored as a list of interlinked entries. The size of each entry structure: object header + 3 references + int = 24 bytes. java.util.HashSet is using java.util.HashMap under the hood so your overhead will be the same. Can we store map/set in more compact form? Sure.
Sorted array map
If the keys are comparable, we can store the map or set as sorted arrays of interleaved key/value pairs (array of keys in the case of a set). Binary search can be used for fast lookups in an array, but insertion and deletion of entries will have to shift elements. Due to the high cost of updates in such a structure, it will be effective only for smaller collection sizes, and for operation patterns that are mostly read. Fortunately, this is usually what we have -- map/set of a few dozen objects that are read more often than modified.
Closed hash table
If we have a collection of a larger size, we should use a hashtable. Can we make the hashtable more compact? Again, the answer is yes. There are data structures called closed hashtables that do not require entry objects.

In closed hashtables, references from the hashtable point directly to an object (key). What if we want to put in a reference to a key but the hash slot is occupied already? In such a case, we should find another slot (e.g., next one). How do you lookup a key? Search through all adjacent slots until key or null reference is found. As you can see from algorithm, it is very important to have enough empty slots in your table. Density of closed hash tables should be kept below 0.5 to avoid performance degradation.
Maps structures summary
The JDK is using an open hashtable structure because in general case it is a better structure. But when you are desperate to save memory, the other two options are worth considering.
Conclusion
Optimization is an art. There are no magical data structures capable of solving every problem. As you can see, you have to fight for every byte. Memory optimization is a complex process. Remember that you should design your data so each object can be referenced from different collections (instead of having to copy data). It is usually better to use semantically immutable objects because you can easily share them instead of copying them. And from my experience, in a well-designed application, optimization and tuning can reduce memory usage by 30-50%. If you have a very large amount of data, you have to be ready to handle it. At Grid Dynamics, that’s our day-to-day job! So, if you are building system capable of handling enormous amounts of data, don’t hesitate to ask us for assistance :)
jmap has two important capabilities for memory profiling. It can:
• create a heap dump file for any live java process
• show a heap distribution histogram
Neither of these capabilities requires any special parameters for the Java virtual machine (JVM). Below is a heap distribution histogram produced by jmap.
>jmap -histo:live 16608
num #instances #bytes class name
----------------------------------------------
1: 464773 45411544 [C
2: 148131 22404200 3: 21461 13197200 [S
4: 148131 11862064
5: 231304 11562840
6: 448533 10764792 java.lang.String
7: 14801 9520800
8: 14801 6706536
9: 20178 6060872 [B
10: 250951 6022824 java.util.HashMap$Entry
11: 12530 5532096
12: 26538 4437384 [Ljava.util.HashMap$Entry;
13: 54179 4381328 [I
14: 73294 3939656 [Ljava.lang.Object;
15: 34872 3905664 org.eclipse.swt.graphics.TextLayout$StyleItem
16: 27360 1551928 [Ljava.lang.String;
17: 16056 1541376 java.lang.Class
18: 20222 1294208 org.eclipse.core.internal.resources.ResourceInfo
19: 47006 1128144 java.util.ArrayList
20: 25235 1009400 java.util.HashMap
21: 22911 983784 [[I
22: 13398 857472 org.eclipse.swt.custom.StyleRange
23: 14362 831832 [[C
24: 16264 780672 org.eclipse.ui.internal.handlers.HandlerActivation
25: 36992 591872 org.eclipse.ui.internal.handlers.LegacyHandlerListenerWrapper
26: 423 591304 [Lorg.eclipse.swt.custom.StyleRange;
27: 13427 537080 org.eclipse.jface.text.TreeLineTracker$Node
28: 20254 486096 org.eclipse.core.internal.dtree.DataTreeNode
29: 19700 472800 org.eclipse.swt.internal.win32.SCRIPT_ANALYSIS
30: 19700 472800 org.eclipse.swt.internal.win32.SCRIPT_STATE
31: 17108 410592 org.eclipse.jface.text.Line
32: 1255 401600
33: 16650 399600 java.util.Hashtable$Entry
34: 10952 350464 org.eclipse.ui.internal.services.EvaluationReference
35: 8665 346600 org.eclipse.ui.commands.HandlerSubmission
36: 10624 339968 java.util.TreeMap$Entry
37: 6598 307376 [Lorg.eclipse.swt.graphics.TextLayout$StyleItem;
38: 18520 296320 java.lang.Integer
39: 8567 274144 org.eclipse.ui.commands.ActionHandler
40: 11268 270432 org.eclipse.ui.internal.help.WorkbenchHelpSystem$6
...
Total 2894736 198472504
For each class we can see the class name, the number of instances of the class in the heap, and the number of bytes used in the heap by all instances of the class. The table is sorted by consumed heap space. Although it is very simple, it is extremely useful. This information is enough to diagnose 60% of heap capacity problems.
If a heap distribution histogram is too high-level, you can try jhat. jhat can read and explore a heap dump. It has a web interface and you can click though your dumped object graph. Of course, clicking through a few million objects is not fun. Fortunately, jhat supports query language. Let's give it a try.
>jmap -dump:file=eclipse.heap 16608
Dumping heap to C:\Program Files (x86)\Java\jdk1.6.0_11\bin\eclipse.heap ...
Heap dump file created
>jhat -J-Xmx1G eclipse.heap
Reading from eclipse.heap...
Dump file created Wed Oct 14 08:41:07 MSD 2009
Snapshot read, resolving...
Resolving 2300585 objects...
Chasing references, expect 460 dots.............................................
................................................................................
................................................................................
.......................................
Eliminating duplicate references................................................
................................................................................
................................................................................
....................................
Snapshot resolved.
Started HTTP server on port 7000
Server is ready.
Now open http://localhost:7000 and you can see a summary of all classes. You can use standard queries via links or go straight to the “Execute Object Query Language (OQL) query” link at the bottom and type your own query. Query language is quite awkward (it is based on the java script engine) and may not work well for large numbers of objects, but it is a very powerful tool.
Let’s look at the jmap histogram again. In the top row, we can see "[C" class consuming most of the heap space. Actually, these char arrays are part of String objects, and we can see that String instances are also consuming considerable space. From my experience, 60-80% of heaps in an enterprise application are consumed by strings and hash maps.
Strings
Let's look how the JVM is storing strings. String objects are semantically immutable. Each instance has four fields (all except hash are marked final):
• Reference to char array
• Integer offset
• Integer count of character
• Integer string hash (lazily evaluated, and once evaluated never changes)
How much memory does one String instance consume?
Here and below are size calculations for the Sun JVM (32bit). This should be similar for other vendors.
Object header (8 bytes) + 3 refs (12 bytes) + int (4 bytes) = 24 bytes. But the String instance is only a header. Actual text is stored in char array (2 bytes each character + 12 bytes array header).


If you are creating a String object with a constructor, a new char array is always allocated. To copy content of a substring you can use the following construct (it looks a bit strange, but it works):
new String(a.substring(…))
String.intern() – think twice
String class has a method intern() that can guarantee the following:
a.equals(b) => a.intern() == b.intern()
There is a table inside of the JVM used to store normal forms of strings. If some text is found in a table then value from a table will be returned from intern(), else the string will be added to table. So a reference returned by intern() is always an object from JVM intern table.
String intern table keep weak references to its objects, so unused strings can be collected as garbage when no other references exist except from intern table itself. It looks like a great idea to use intern(), and eliminate all duplicated strings in an application. Many have tried … and many have regretted such decision. I cannot say this is true for every JVM vendor, but if you are using Sun’s JVM, you should never do this. Why?
JVM string intern tables are stored in PermGen -- a separate region of the heap (not included in -Xmx size) used for the JVM’s internal needs. Garbage collection in this area is expensive and size is limited (though configurable). You would have to insert a new string into the table which has O(n) complexity, where n is table size.
Removing of duplicates
The JVM’s string intern table is not an option but the idea of eliminating string duplicates is very attractive. What can we do?
class InternTable
private
final Map
public X intern(X val) {
X in = table.get(val);
if (in == null) {
table.put(val, val);
in = val;
}
return in;
}
}
Above is a simple snippet of a custom intern table. It can be used for strings or any other immutable objects. Looks good, but it has a problem. Such an implementation will prevent objects from being collected by GC. What can we do?
We need to use weak references. There are WeakHashMap classes in the JDK. It is a hash table that uses weak references for keys. Let's try to use it.
public
class InterTable
private
final Map
public X intern(X val) {
WeakReference
X in = ref == null ? null : ref.get();
if (in == null) {
table.put(val, new WeakReference
in = val;
}
return in;
}
}
Did you notice we also need a weak reference for the wrap value? This will work, but what is the cost?
• Reference from hash table – 1 ref * ratio of unused slots (~6 bytes)
• WeakHashMap$Entry object – 40 bytes
• value WeakReference – 24 bytes
The cost per entry in the table is about 70 bytes. That’s expensive. Can we reduce it?
Right now we have to have two weak references per entry. If we rewrite WeakHashMap to return an entry from the get(...) method (instead of the value), we can drop the second weak reference and save 24 bytes. But the cost of such an intern table is still high. You should analyze/experiment with your data to see if such a trick will bring greater benefits in your case.
UTF8 strings
Java strings are UTF and encoded using UTF16 in memory. If they are to be converted to UTF8, the actual text is likely to consume half the memory. But using UTF8 will break compatibility with the String class. You have to create your own class and convert it to standard String for a majority of operations. This will increase CPU usage, but in some edge cases this approach can be useful for overcoming heap size limitations (e.g., storing large text indexes in memory).
Maps and sets
Good old java.util.HashMap is used everywhere. Standard implementation in JDKs is to use the open hash table data structure.

Sorted array map
If the keys are comparable, we can store the map or set as sorted arrays of interleaved key/value pairs (array of keys in the case of a set). Binary search can be used for fast lookups in an array, but insertion and deletion of entries will have to shift elements. Due to the high cost of updates in such a structure, it will be effective only for smaller collection sizes, and for operation patterns that are mostly read. Fortunately, this is usually what we have -- map/set of a few dozen objects that are read more often than modified.
Closed hash table
If we have a collection of a larger size, we should use a hashtable. Can we make the hashtable more compact? Again, the answer is yes. There are data structures called closed hashtables that do not require entry objects.

In closed hashtables, references from the hashtable point directly to an object (key). What if we want to put in a reference to a key but the hash slot is occupied already? In such a case, we should find another slot (e.g., next one). How do you lookup a key? Search through all adjacent slots until key or null reference is found. As you can see from algorithm, it is very important to have enough empty slots in your table. Density of closed hash tables should be kept below 0.5 to avoid performance degradation.
Maps structures summary
| Open hash map/set Cost per entry:
+ Fast access to hash code. | Closed hash map/set Cost per entry (map):
| Sorted array map/ser Cost per entry (map):
- Expensive update/delete. |
The JDK is using an open hashtable structure because in general case it is a better structure. But when you are desperate to save memory, the other two options are worth considering.
Optimization is an art. There are no magical data structures capable of solving every problem. As you can see, you have to fight for every byte. Memory optimization is a complex process. Remember that you should design your data so each object can be referenced from different collections (instead of having to copy data). It is usually better to use semantically immutable objects because you can easily share them instead of copying them. And from my experience, in a well-designed application, optimization and tuning can reduce memory usage by 30-50%. If you have a very large amount of data, you have to be ready to handle it. At Grid Dynamics, that’s our day-to-day job! So, if you are building system capable of handling enormous amounts of data, don’t hesitate to ask us for assistance :)
Labels: capacity, Java, memory, ~Alexey Ragozin

1 Comments:
That was a great explanation. I am having many troubles with memory consumption and I didn't have any idea about these tools.
Post a Comment
Subscribe to Post Comments [Atom]
Links to this post:
Create a Link
<< Home