With some future prospects of writing a high scale LOB Silverlight application, I thought performance of the application would be crucial. I decided to look into writing optimized code and before that I decided to find out what is the performance impact of the collections that I use most commonly. To be specific, in this post, I only look at the performance of creating various collections of type int.
Large Collections
I wrote a simple performance test class which returns me collections of various types. Each method create a list of pretty large size.
public static int MAX = 19999999;
The results of the test program is shown below.
Name | Collection Type Explained |
Ar | Array (int[] array) |
Ar_Enble | array.AsEnumerable() |
Ar_En_Cast | IEnumerable<int> e = array; |
List | List<int> |
IniList | List<int> l = new List<int>(MAX); |
ArList | ArrayList |
Range | new List<int>(GetArray()); |
LL | LinkedList<int> |
From the results, it is obvious that an array is the fastest collection that you can use and on an average it is 28 times faster than ArrayList and 3 times faster than the List<T>.
Note that for this discussion, I only am using entries with green color. This is based on the assumption that GetArray() should be faster than first getting the array and then IEnumerable out of the array. Some of the key points I made out of this experiment:
I would further like to study the List<T> class in detail since it is one collection that I use the most. In the mean time, if you are further interested in application performance, then look at the link http://msdn.microsoft.com/en-us/library/ms998574.aspx#scalenetchapt13_topic5
Update
I just came across the following points in MSDN.
How to choose between arrays and collections
Arrays are the fastest of all collection types, so unless you need special functionalities like dynamic extension of the collection, sorting, and searching, you should use arrays. If you need a collection type, choose the most appropriate type based on your functionality requirements to avoid performance penalties.
The link : http://msdn.microsoft.com/en-us/library/ms998530.aspx
Why do I find Hashtable slow?
I re-conducted my experiment using Hashtable as a list (key and value) would be the same! It is dead slow, I repeat very-very slow - 1000 times slower than using an array and 80 times slower than the slowest list (un-initialized, added items witout using AddRange). But the guidelines states that Hashtable should be used to store large number of records. Is it really?
Relatively HashSet<int> was much faster. It was atleast 10 times faster to say the least. Note that you cannot use HashSet<int> instead of List<int> (since List<int> is obviously faster) because HashSet does not allow duplicate elements and is designed for high performance set operations. May not be a great test that one should really trust, but I feel simple collections are what used most of the times, so next time I would retest using objects instead of data types. My test results.
Ht - Hashtable and HS - HashSet<int>
From now on, I make it a point to use initialized List<int> since its performance is close to int[] and yet it is dynamic in nature. I hope not to use HashSet unless I have extremely complicated Set operations. I would use HashSet if (time taken to create HashSet + time taken to perform set operations) is less than (time taken to create a List + time taken to perform set operations using LINQ).
Performance is an interesting topic and the more you dig into it, the more knowledgeable you become about a system.