Pondělí, 17. březen 2008

Implement your own Parallel.For in C#

One of the constructs of future Parallel FX library is Parallel.For which works like the ordinary for loop but executes steps in parallel using more cores of your machine.

I was thinking about how Parallel.For could be implemented. I wrote my own, I am using it in my own project and it scales very well.

Here it is:

public delegate void ForDelegate(int i);

public delegate void ThreadDelegate();

 

public class Parallel

{

/// <summary>

/// Parallel for loop. Invokes given action, passing arguments

/// fromInclusive - toExclusive on multiple threads.

/// Returns when loop finished.

/// </summary>

public static void For(int fromInclusive, int toExclusive, ForDelegate action)

{

    // ChunkSize = 1 makes items to be processed in order.

    // Bigger chunk size should reduce lock waiting time and thus

    // increase paralelism.

    int chunkSize = 4;

 

    // number of process() threads

    int threadCount = Environment.ProcessorCount;

    int cnt = fromInclusive - chunkSize;

 

    // processing function

    // takes next chunk and processes it using action

    ThreadDelegate process = delegate()

    {

        while (true)

        {

            int cntMem = 0;

            lock (typeof(Parallel))

            {

                // take next chunk

                cnt += chunkSize;

                cntMem = cnt;

            }

            // process chunk

            // here items can come out of order if chunkSize > 1

            for (int i = cntMem;  i < cntMem + chunkSize; ++i)

            {

                if (i >= toExclusive) return;

                action(i);

            }

        }

    };

 

    // launch process() threads

    IAsyncResult[] asyncResults = new IAsyncResult[threadCount];

    for (int i = 0; i < threadCount; ++i)

    {

        asyncResults[i] = process.BeginInvoke(null, null);

    }

    // wait for all threads to complete

    for (int i = 0; i < threadCount; ++i)

    {

        process.EndInvoke(asyncResults[i]);

    }

}



As noted in the code, by setting chunkSize to 1 we can make the items be processed in order. With bigger chunk size, items can be processed in mixed non-deterministic order, which is ok in some application, like when you want to modify all items of a collection or render lines of a picture.
Bigger chunkSize should reduce lock waiting time and thus increase overall speed. But too big chunkSize is bad too, because if the work is split into only a few big parts, there is not enoug paralellism exposed - we could found ourselves waiting for the last single thread to finish its huge chunk.

Using Parallel.For is simple:

Parallel.For(0, 1000, delegate(int i)

{

    // your parallel code

    Thread.Sleep(100);

    Console.WriteLine(i);

});



kick it on DotNetKicks.com

0 komentářů: