Two levels of performance optimization

I have to match [entries against a list]. I have used a foreach loop to do so but it is taking almost 40 seconds for only 350 records. I want to speed it up. I want my foreach loop to convert into for loop [...]

Original question/answer @ Stack Overflow

At such a point it is a good idea to step back and think about what we are doing here in general. Performance optimization usually happens on two levels: algorithms and workflow.

Essentially, the problem to be solved here is to find an entry in a potentially large set of records. There are two likely reasons why this is slow:

The list is very large

If we recall our knowledge of the so-called Big-O notation, we quickly find that a linear array search takes O(n), while a hash set lookup, for example, takes only O(1) in the normal case; only in the worst case does it degrade to O(n) again. Not bad!

By some lucky coincidence (and with the help of the cheat sheet linked above) we find that a Dictionary<K,V> or, alternatively, a database index is more or less what we want: a dictionary is basically a "pimped" hash set, and a database index is typically a B-tree or something similar, which performs at O(log(n)). The final decision whether to use a dictionary or a database table mostly depends on the amount of data we are talking about.
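
To make the difference concrete, here is a minimal sketch. The Customer type, its properties and the sample data are invented for illustration and are not taken from the original question; only the contrast between a linear scan and a dictionary lookup is the point.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type; the original poster's data looked different.
public record Customer(int Id, string Name);

public static class LookupDemo
{
    // O(n) per lookup: scans the list until a match is found
    // (this is essentially what the slow foreach loop was doing).
    public static Customer FindLinear(List<Customer> customers, int id) =>
        customers.FirstOrDefault(c => c.Id == id);

    // Build the index once in O(n)...
    public static Dictionary<int, Customer> BuildIndex(List<Customer> customers) =>
        customers.ToDictionary(c => c.Id);

    public static void Main()
    {
        var customers = new List<Customer> { new(1, "Alice"), new(2, "Bob") };
        var byId = BuildIndex(customers);

        // ...then every lookup is O(1) on average instead of a linear scan.
        if (byId.TryGetValue(2, out var match))
            Console.WriteLine(match.Name);   // prints "Bob"
    }
}
```

Note that the dictionary pays off as soon as the same list is searched more than a handful of times; for a single lookup, building the index costs as much as one linear scan.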

How can I make it faster?

As a practical example, I recently had a piece of code on my desk that iterated through a list in the same linear manner. The code did this inside two other, nested loops. The algorithm took 4 hours to complete. After introducing two dictionaries at strategic places, I had it down to under a minute. I leave it to the amused reader to calculate the percentage gain.

Hence, the real question to ask is not "Is for faster than foreach?" (no), but rather: "How can I reorganize my data structures and optimize the algorithms to make the code perform?"
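
Here is a hedged sketch of the kind of refactoring described in the anecdote above. The Order and Invoice types are invented; what matters is the pattern of pre-building a dictionary index once, outside the loop, instead of scanning a list inside it.

```csharp
using System.Collections.Generic;
using System.Linq;

// Invented data shapes, purely for illustration.
public record Order(int Id, int CustomerId);
public record Invoice(int OrderId, decimal Amount);

public static class NestedLoopDemo
{
    // Before: the invoice list is scanned linearly for every order -> O(n * m) overall.
    public static decimal TotalSlow(List<Order> orders, List<Invoice> invoices) =>
        orders.Sum(o => invoices.FirstOrDefault(i => i.OrderId == o.Id)?.Amount ?? 0m);

    // After: build the index once, then every lookup inside the loop is O(1) -> O(n + m) overall.
    public static decimal TotalFast(List<Order> orders, List<Invoice> invoices)
    {
        var invoiceByOrder = invoices.ToDictionary(i => i.OrderId);

        decimal total = 0m;
        foreach (var order in orders)
        {
            if (invoiceByOrder.TryGetValue(order.Id, out var invoice))
                total += invoice.Amount;
        }
        return total;
    }
}
```
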

The code is called quite often

That's another, related problem. In some cases the data structures can't really be optimized, or doing so would require far too many changes in the code. But when we look closely at what the profiler is telling us, we may find that the costly routine is called 800,000 times in 15 seconds, and that these calls alone contribute a fair amount to the total time.

Why am I doing this at all?

If we look even closer, we may find that we call the routine with a very limited set of input data, so that a large portion of the calls can simply be omitted by caching the results of the costly operation. I had such a case just last week, where I was able to reduce the number of database calls to 5% of the original amount. One can imagine what that did to overall performance.

In this second case we should therefore ask ourselves a slightly different question: "Why are we doing this at all? Can we change the logical workflow to avoid most of these calls? Is there perhaps a completely different way to achieve the same results?"
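
To make the caching idea concrete, here is a minimal sketch, not the actual production code: the costly call is passed in as a delegate (the name loadFromDatabase is a placeholder for whatever the profiler points at), and repeated calls with the same input are answered from memory.

```csharp
using System;
using System.Collections.Generic;

// Sketch of a simple result cache in front of a costly operation.
public class CachedLookup
{
    private readonly Dictionary<int, string> _cache = new();
    private readonly Func<int, string> _loadFromDatabase;

    public CachedLookup(Func<int, string> loadFromDatabase) =>
        _loadFromDatabase = loadFromDatabase;

    public string Get(int key)
    {
        // Only hit the database for keys we have not seen yet;
        // repeated calls with the same key are served from memory.
        if (!_cache.TryGetValue(key, out var value))
        {
            value = _loadFromDatabase(key);
            _cache[key] = value;
        }
        return value;
    }
}

// Usage: wrap the costly call once, then call Get() as often as the workflow demands.
// var lookup = new CachedLookup(id => QueryDatabase(id));   // QueryDatabase is hypothetical
```

This only works, of course, if the cached results do not go stale during the run; whether that holds is exactly the kind of workflow question this section is about.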

Summary (TL;DR)

There are two basic approaches to every performance optimization:

  1. Algorithmic or "low level" (ask how): Quicksort or bubble sort? Tree, array, or hash table?
  2. Workflow and logic (ask why): Why do we have to call that particular costly routine 5 million times?