The .NET Base Class Library (BCL) has a wide array of collection classes at your disposal which make it easy to manage collections of objects. While it's great to have so many classes available, it can be daunting to choose the right collection to use for any given situation. As hard as it may be, choosing the right collection can be absolutely key to the performance and maintainability of your application!
This post will look at breaking down any confusion between each collection and the situations in which they excel. We will be spending most of our time looking at the System.Collections.Generic namespace, which is the recommended set of collections.
The Generic Collections: System.Collections.Generic namespace
The generic collections were introduced in .NET 2.0 in the System.Collections.Generic namespace. This is the main body of collections you should tend to focus on first, as they will tend to suit 99% of your needs right up front.
It is important to note that the generic collections are unsynchronized. This decision was made for performance reasons because depending on how you are using the collections its completely possible that synchronization may not be required or may be needed on a higher level than simple method-level synchronization. Furthermore, concurrent read access (all writes done at beginning and never again) is always safe, but for concurrent mixed access you should either synchronize the collection or use one of the concurrent collections.
So let's look at each of the collections in turn and its various pros and cons, at the end we'll summarize with a table to help make it easier to compare and contrast the different collections.
The Associative Collection Classes
Associative collections store a value in the collection by providing a key that is used to add/remove/lookup the item. Hence, the container associates the value with the key. These collections are most useful when you need to lookup/manipulate a collection using a key value. For example, if you wanted to look up an order in a collection of orders by an order id, you might have an associative collection where they key is the order id and the value is the order.
The Dictionary<TKey,TVale> is probably the most used associative container class. The Dictionary<TKey,TValue> is the fastest class for associative lookups/inserts/deletes because it uses a hash table under the covers. Because the keys are hashed, the key type should correctly implement GetHashCode() and Equals() appropriately or you should provide an external IEqualityComparer to the dictionary on construction. The insert/delete/lookup time of items in the dictionary is amortized constant time - O(1) - which means no matter how big the dictionary gets, the time it takes to find something remains relatively constant. This is highly desirable for high-speed lookups. The only downside is that the dictionary, by nature of using a hash table, is unordered, so you cannot easily traverse the items in a Dictionary in order.
The SortedDictionary<TKey,TValue> is similar to the Dictionary<TKey,TValue> in usage but very different in implementation. The SortedDictionary<TKey,TValye> uses a binary tree under the covers to maintain the items in order by the key. As a consequence of sorting, the type used for the key must correctly implement IComparable<TKey> so that the keys can be correctly sorted. The sorted dictionary trades a little bit of lookup time for the ability to maintain the items in order, thus insert/delete/lookup times in a sorted dictionary are logarithmic - O(log n). Generally speaking, with logarithmic time, you can double the size of the collection and it only has to perform one extra comparison to find the item. Use the SortedDictionary<TKey,TValue> when you want fast lookups but also want to be able to maintain the collection in order by the key.
Read more: James Michael Hare
0 comments:
Post a Comment