Signum Framework Logo
"Open framework that encourages convention over configuration, using C# code,
not XML files, to model at the right level of abstraction and achieve deadlines.
...but also has a full Linq provider, and syncs the schema for you!"
Login
RSS

Search

»



Main
Index
Map
Videos
Download
Source code
Tutorials
Forum
FAQ



Image



PoweredBy

Introduction

A Thread is a new stack, so all the local variables and parameters are completely isolated from one thread to another. The problems appear when trying to modify global state, usually through static fields. Like updating a shared data structure concurrently.

Locks and Interlocked class

The normal solution is to surround all the potentially problematic code (critical region) with a lock. Locks are cool, they do the job and are nicely integrated in C# language... but they are SLOW:

  • If the critical region is big enough, all the threads are waiting. It's better to have one thread than five, if four of them are waiting all the time.
  • There's an important overhead in acquiring and releasing a lock.
  • In the case of an exception you have to roolback your action, that's something really hard to do in an imperative language currently, so nobody does.

Interlocked class, on the other hand, provides some lock-free operation written straight in the CPU for very simple scenarios like Add, Increment, Decrement, Read and Exchange simple values. Also, there's a odd operation, CompareExchange:

  public static T CompareExchange<T>( ref T location1, T value, T comparand) where T : class

"Compares two instances of the specified reference type T for equality and, if they are equal, replaces one of them."

This is what it actually does, but atomically in one Processor instruction

public static T CompareExchange<T>( ref T location1, T value, T comparand)
{
   T result = location1;
   if(location1 == comparand)
      location = value;
   
   return result;      
}

It takes 3 parameters, all of the same type, the first by reference and also returns another value. Too hard for a human being.

Sync.SafeUpdate

And here is where Sync. SafeUpdate comes to help. SafeUpdate gives you a easier model to reason about lock-free code:

This method was name SafeSet in the previous version of the Framework

public static void SafeUpdate <T>(ref T variable, Func<T, T> repFunc) where T: class
{
    T oldValue, newValue;
    do
    {
        oldValue = variable;
        newValue = repFunc(oldValue);
 
        if (newValue == null)
            break;
   
    } while (Interlocked.CompareExchange<T>(ref variable, newValue, oldValue) != oldValue);
}

It takes a variable, and a function that, given a value of the variable, generates a new, updated, value.

These functions have to be prepared to be called repeatedly in an nondeterministically way, but on average it will run just once.

Also, if null is returned by the function, the updating operation is aborted.

So, in order to create lock free code, we just need to represent a whole state change as a change in just one reference.

Sync and Immutability

This is where Immutable Data Structures comes to the rescue! Since they have the property of never changing but instead creating new objects, and since they are reference objects, at the end of the day adding an element to a global and concurrent collection is just changing the old collection for a new copy with the new element included.



Sweeeeettt!!!

Example 0: Single Threaded

Let's think we want to calculate the prime factors of the first 10,000 numbers. We write a code like this:

static Dictionary<int, int[]> dictionary = new Dictionary<int, int[]>(); 

(...) for (int i = 0; i < 10000; i++) { int[] primes = Factors(i).ToArray(); dictionary.Add(i, primes); }

Since it takes some time to run, and since this code will be deployed in a 16-core server, you split the work in 10 threads:

Example 1: Multi-Threaded with Lock

static Dictionary<int, int[]> dictionary = new Dictionary<int, int[]>(); 

(...) Thread[] thread = 0.To(10).Select(tn => new Thread(() => { for (int i = 0; i < 1000; i++) { int number = 10 * i + tn; int[] primes = Factors(dictionary).ToArray();

lock (dictionary) dictionary.Add(number, primes); } }) { Name = "Thread " + tn }).ToArray();

thread.ForEach(t => t.Start()); thread.ForEach(t => t.Join());

You know what, very often this code is even more expensive because the overhead of the lock. Let's try with an ImmutableAVLTree and SafeUpdate.

By the way, you shouldn't be controlling the number of threads explicitly. We do here just for pedagogic reasons. Use something as Parallel Extensions instead.


Example 2: Multi-Threaded with SafeUpdate

static ImmutableAVLTree<int, int[]> tree= ImmutableAVLTree<int, int[]>.Empty;

(...) Thread[] thread = 0.To(10).Select(tn => new Thread(() => { for (int i = 0; i < 1000; i++) { int number = 10 * i + tn; int[] primes = Factors(number).ToArray();

Sync.SafeUpdate(ref tree, t=> t.Add(number, primes)); } }) { Name = "Thread " + tn }).ToArray();

thread.ForEach(t => t.Start()); thread.ForEach(t => t.Join());

We have changed factors Dictionary to an InmutableAVLTree, and we are using non-locking Sync.SafeUpdate to make the changes. These solutions, even having to create many tree nodes is faster than the previous one.

But there's something far more important! In the previous example anyone who tried to READ the dictionary must do that locking it first, with immutable data structures this is not necessary since nobody is changing it. Just catch the current reference and you will have a consistent view of the structure.

Let's use this new possibility to improve the algorithm.

Example 3: Multi-Threaded with SafeSet and ImmutableStack

Factors, the function that does the calculations in the previous version, was like this:

static IEnumerable<int> Factors(int value)
{
   int i = 2;
   int n = value; 
   do
   {
       if ((n % i) == 0)
       {
           n = n / i;
           yield return i;
       }
       else i = NextPrime(i);
   }
   while (i <= n);
} 

The problem here is that, when calculating Factors(224), the result is just Factors(112).And(2). This symmetry, however, couldn't be exploited in the previous example because consulting the dictionary would need to lock int first (just in case some thread is modifying the dictionary) making it even slower.

But now it's a different story... we can read the collection in a consistent fashion since nobody is going to modify it.

Also, in order to avoid creating many Arrays with almost the same data, we will use ImmutableStack instead. This way we can concatenate solutions to make bigger solutions in a very cheap way:

Factors(10) = [2]->[5]->[]
Factors(20) = [2]->[2]->[5]->[] = [2]->Factors(19)


This is how the new Factors function looks:

static ImmutableStack<int> Factors(int value)
{
    ImmutableStack<int> result = ImmutableStack<int>.Empty;
    int i = 2;
    int n = value;
    do
    {
        ImmutableStack<int> tail;
        if (tree.TryGetValue(n, out tail))
        {
            return result.Reverse(tail);
        }
        else if ((n % i) == 0)
        {
            n = n / i;
            result = result.Push(i);
        }
        else i = NextPrime(i);
    }
    while (i <= n);

return result.Reverse(); }

Unfortunately, this new code is not very fast:
  • Lots of time goes to make the Reverse operations.
  • There is far too much access to the tree with no success.

The solution for both things comes from:
  • Making the function recursive
  • Adding just one element at a time and avoiding concatenating ImmutableStack using Reverse
  • Catching the results for every single number

The last point, catching the results, will need a bigger sword than SafeSet to do it. Ladies and Gentlemen, here we have SafeGetOrCreate:

Sync.SafeGetOrCreate

SafeGetOrCreate is a mixture of Sync.SafeSet and DictionaryExtension's GetOrGreate.

If works over a ImmutableAVLTree, and takes a key, and a function to generate the value.

public static V SafeGetOrCreate<K, V>(ref ImmutableAVLTree<K, V> tree, K key, Func<V> createValue)
    where K : IComparable<K>
{
    V result;
    if (tree.TryGetValue(key, out result))
        return result;

V value = createValue();

SafeSet(ref tree, t => { if (t.TryGetValue(key, out result)) return null; else { result = value; return t.Add(key, value); } });

return result; }

  • If the tree already has the key, then the value associated is returned, and no change is made.
  • Otherwise createValue is evaluated (just once!) and SafeSet tries to add in to the tree. In every attempt, however, someone could have added the same value that we are trying to add:
    • If the value have been added, this value is returned and no change is made (return null)
    • Otherwise the attempts to add the value continue, and the calculated value is set as the result.

Let's continue our example with this new powerful thing:

Example 4: Multi-Threaded with SafeGetOrCreate

The new recursive version of our algorithm is like this:

static ImmutableStack<int> Factors(int value, int minPrime)
{
    return Sync.SafeGetOrCreate(ref tree, value, () =>
    {
        if (value == 1)
            return ImmutableStack<int>.Empty; //1 is not prime
        else if (value <= minPrime)
            return ImmutableStack<int>.Empty.Push(value);  //the number is prime

int i = minPrime; do { if ((value % i) == 0) return Factors(value / i, i).Push(i); else i = NextPrime(i); } while (i <= value);

return null; //never happens }); }

It uses minPrime to keep the information of the minimum prime value that has been tried to divide value, this information was kept by i variable in previous versions.

It contains two base cases, for one and for any other prime. If none are satisfied the previous algorithm is used, but when the first solution is found a recursive call is made, pushing just one element more to the head to build our solution.

Cool. But the important thing here is the first line, just by calling SafeGetOrCreate we are cathing the value in a global and immutable variable: tree.

Since every single element is being cached, we can remove the line that does just that in the main for, ending up:

Thread[] thread = 0.To(10).Select(tn => new Thread(() =>
{
    for (int i = 0; i < 1000; i++)
    {
        int number = 10 * i + tn;
        Factors(number, 2); //no need to store it
    }
}) { Name = "Thread " + tn }).ToArray();

thread.ForEach(t => t.Start()); thread.ForEach(t => t.Join());

Also note that we use 2 as the initial value for minPrime, since is was the initial value of i in the previous examples (it's the smallest prime value).

Conclusion

Cool, you are lost right? Your brain hurts and you have no idea what's going on at runtime.

Same for me.

Mixing lambdas, recursion and threading is not something our primate brain is designed for.

In fact, this version is just slightly faster than the first, naive, version of the code in my dual-core laptop. Is it worth it? Probably not in this case.

But the important thing is that: By using Immutable Data Structures and Sync primitives properly we can play with many threads in a lock-free-optimistic-transaction-fashion and things just won't go wrong.

And this is something worth remembering.
Creative Commons License Signum Framework Site by Signum Software is licensed under a Creative Commons Attribution 3.0 License.
Powered by ScrewTurn Wiki version 3.0.5.600.