Perils of Async: Locking Out Performance

In a previous post, Perils of Async: Data Corruption, we saw the consequences of inadequate concurrency control in asynchronous code.  The first implementation using Parallel.ForEach did not protect shared data, and its results were wrong.  The corrected implementation used C#’s lock for the necessary protection from concurrent access.

Parallel.ForEach(input, kvp =>
{
    if (0 == kvp.Value % 2)
    {
        lock (mrr)
        {
            ++mrr.Evens;
        }
    }
    else
    {
        lock (mrr)
        {
            ++mrr.Odds;
        }
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        lock (mrr)
        {
            ++mrr.Primes;
        }
    }
});

Some may ask, “Why lock so many times? Can’t the code just lock once inside the loop?”

Parallel.ForEach(input, kvp =>
{
    lock (mrr)
    {
        if (0 == kvp.Value % 2)
        {
            ++mrr.Evens;
        }
        else
        {
            ++mrr.Odds;
        }
        if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
        {
            ++mrr.Primes;
        }
    }
});

Moving the lock just above the first if clause seems to have some benefits – it simplifies the code, and shared data access is still synchronized.  But it also kills performance – making this version slower than even the non-parallel SerialMapReduceWorker.

9999999 of 9999999 input values are unique
[SerialMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:51.6998025
[ParallelMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:30.6871152
[ParallelMapReduceWorker_SingleLock] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:01:35.0778434

This situation highlights the common rule of thumb, “lock late.”  Locking late (or “low” in the code) means locking just before accessing shared data and unlocking just afterwards.  This approach minimizes the amount of code that executes while the lock is held, giving the contending threads more opportunities to acquire the lock.
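Even locking late leaves three short critical sections per iteration.  For simple counters like these, an alternative worth knowing is to skip locks entirely and use atomic increments.  The sketch below is not from the original posts; it assumes MapReduceResult exposes Evens, Odds and Primes as int fields rather than properties (Interlocked requires a ref to a field):

Parallel.ForEach(input, kvp =>
{
    if (0 == kvp.Value % 2)
    {
        Interlocked.Increment(ref mrr.Evens);   // atomic read-modify-write, no lock
    }
    else
    {
        Interlocked.Increment(ref mrr.Odds);
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        Interlocked.Increment(ref mrr.Primes);
    }
});

Interlocked.Increment (in System.Threading) performs the whole read-modify-write as a single atomic operation, so the threads never block each other on the counters.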


Perils of Async: Data Corruption

One of the most common bugs occurring in any multi-threaded or multi-process code is corrupting shared data due to poor (or lack of) concurrency control.  Concurrency is one term used to describe code interactions that are not sequential in nature (equivalent or companion terms include parallel, multi-threaded and multi-process).  Within this context, concurrency control indicates the tactics used to ensure the integrity of shared data.

To demonstrate this problem, we’ll use a fairly simple example: Counting even, odd and prime numbers in a large set.  We’ll use different strategies over the same data set for serial and parallel processing.  The serial implementation, SerialMapReduceWorker, is straight-forward since it involves no concurrency issues.

foreach (var kvp in input)
{
    if (0 == kvp.Value % 2)
    {
        ++mrr.Evens;
    }
    else
    {
        ++mrr.Odds;
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        ++mrr.Primes;
    }
}

SerialMapReduceWorker iterates over the input set, determines whether each value (kvp.Value) is even, odd or prime, and increments the appropriate counter in the MapReduceResult instance, mrr.  (Although no map-reduce is involved, SerialMapReduceWorker is named for consistency with the concurrent workers.)

.NET’s Task Parallel Library (TPL) makes it very easy (too easy?) to convert this code to run concurrently.  All a developer has to do is change foreach to Parallel.ForEach, include some lambda syntax, and voilà! – the code magically runs much faster!

Parallel.ForEach(input, kvp =>
{
    if (0 == kvp.Value % 2)
    {
        ++mrr.Evens;
    }
    else
    {
        ++mrr.Odds;
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        ++mrr.Primes;
    }
});

Just look at these results – the parallel version executed almost twice as fast!

9999999 of 9999999 input values are unique
[SerialMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:51.6998025
[ParallelMapReduceWorker_Unprotected] Evens: 4,996,020; Odds: 4,994,704; Primes: 244,662; Elapsed: 00:00:30.5742845

Unfortunately, this conversion is also a dangerously naive implementation of concurrent code.  Did you notice the problems?  The parallel code found a different number of even, odd and prime numbers within the same set of integers.  How is that possible?  Answer: data corruption due to a lack of concurrency control.

The implementation in ParallelMapReduceWorker_Unprotected does nothing to protect the MapReduceResult instance, mrr.  Every thread increments mrr.Evens, mrr.Odds and mrr.Primes directly.  In effect, two threads might both increment mrr.Evens from 4 to 5 when the expectation is that one will increment it from 4 to 5 and another from 5 to 6.  As the results above show, this data corruption leaves ParallelMapReduceWorker_Unprotected’s count of even integers wrong by roughly 4,500.
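The root cause is that ++mrr.Evens is not one operation.  Conceptually (this expansion is illustrative, not the exact JIT output), each increment is three steps:

// What ++mrr.Evens actually does:
int temp = mrr.Evens;   // 1. read the current value (say, 4)
temp = temp + 1;        // 2. compute the new value (5)
mrr.Evens = temp;       // 3. write it back
// If two threads both read 4 before either writes, both write 5 –
// one increment is lost.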

In this case, correcting the error is fairly simple.  The code just needs to protect access to MapReduceResult to ensure that only one thread can increment at a time. The corrected ParallelMapReduceWorker:

Parallel.ForEach(input, kvp =>
{
    if (0 == kvp.Value % 2)
    {
        lock (mrr)
        {
            ++mrr.Evens;
        }
    }
    else
    {
        lock (mrr)
        {
            ++mrr.Odds;
        }
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        lock (mrr)
        {
            ++mrr.Primes;
        }
    }
});

Each time this code determines it needs to increment one of the counters, it uses concurrency control by:

  1. Locking the MapReduceResult instance
  2. Incrementing the appropriate counter
  3. Unlocking the MapReduceResult instance

Since only one thread at a time can hold the lock on mrr, other threads must wait until it is unlocked to proceed.  This locking now guarantees that, continuing our previous case, mrr.Evens is correctly incremented from 4 to 5 only once.  ParallelMapReduceWorker calculates the same counts as SerialMapReduceWorker, and does so with almost the same performance as the unprotected version.

9999999 of 9999999 input values are unique
[SerialMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:51.6998025
[ParallelMapReduceWorker_Unprotected] Evens: 4,996,020; Odds: 4,994,704; Primes: 244,662; Elapsed: 00:00:30.5742845
[ParallelMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:30.6871152


NOTE: The count of primes appears to be incorrect.  The IsPrime.TrialDivisionMethod implementation is intentionally slow to ensure multiple threads contend for access to the same data.  Unfortunately, and unintentionally, it also appears to be incorrect (cf. Count of Primes).
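For comparison, a minimal trial-division primality test looks like the following.  This is a generic reference sketch, not the AMT.Math.IsPrime.TrialDivisionMethod source, so it says nothing about where that implementation goes wrong:

// Trial division: test 2, then odd divisors up to sqrt(n).
public static bool TrialDivision(int n)
{
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;
    for (int d = 3; (long)d * d <= n; d += 2)
    {
        if (n % d == 0) return false;
    }
    return true;
}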

Avoid ACTK’s ModalPopupExtender Inside UpdatePanel

We were having some really strange behavior with the AjaxControlToolkit’s ModalPopupExtender (MPE) – everything was fine on one page, but not on another.  One major purpose of using the MPE was to wrap a common control, so, as you can imagine, we got pretty frustrated.  If two pages can’t achieve consistent results with shared components, development and maintenance cost projections become less predictable (and “lower cost” goes right out the window).

After several painstakingly slow walk-throughs of the .aspx pages, we finally discovered that one of the pages had the MPE declaration within an UpdatePanel.  Since this was the misbehaving page, we simply moved the MPE outside the UpdatePanel and voilà! – it misbehaved no more.

We didn’t really dig into why our MPE inside an UpdatePanel behaved as it did – we had already burned A LOT of time on the problem and needed to push on to our deadline.  One of the devs found that this post seemed to be on track, so we’ll re-visit it if we find ourselves needing to make an MPE work within an UpdatePanel.

It may be worthwhile to mention that using the debugger didn’t help a bit in this case.  We spent untold hours making a little change here, checking the behavior, adding some extra debug code there, checking the behavior, etc.  Painfully frustrating.  But in the end it came down to hawkish eyeballing of the .aspx files.


What Happens When an Azure Role Starts?


Cory Fowler (SyntaxC4) has a good post on Windows Azure Role Startup Life Cycle. Notable aspects of the post include:

  • Synchronous and Asynchronous Startup TaskType and problems to watch for
  • A good, step-by-step diagram of how the Azure Fabric Controller turns the Cloud Service Package (.cspkg) and Cloud Service Configuration (.cscfg) files into a running role instance
  • Suggestions on avoiding Role Startup race conditions (see the RoleEntryPoint sketch below)
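For context, the role-side code those startup steps lead into lives in a RoleEntryPoint subclass.  A minimal sketch (assuming the standard Microsoft.WindowsAzure.ServiceRuntime types; the initialization and work-loop bodies are placeholders):

using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        // Called after startup tasks finish; the instance isn't marked
        // Ready until this returns.  Returning false recycles the role.
        return base.OnStart();
    }

    public override void Run()
    {
        // Main work loop – this method is expected not to return.
        while (true)
        {
            System.Threading.Thread.Sleep(10000);
        }
    }
}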

Caveat Developer: CSharp-CloudFiles

Rackspace has developed a C# SDK for CloudFiles, which in turn is based on OpenStack Object Storage.  As we’ve used this SDK, csharp-cloudfiles, we’ve encountered some unexpected “issues.”  While we see these issues as flaws, some may consider them to be unintended consequences – we’re open to correction.  Regardless, other developers may benefit from our comments here.

The first item has to do with UserCredentials’ constructors.  com.mosso.cloudfiles.UserCredentials has four constructors:

public UserCredentials(string username, string api_access_key) // AVOID!

public UserCredentials(Uri authUrl, string username, string api_access_key)

public UserCredentials(Uri authUrl, string username, string api_access_key, string cloudversion, string accountname)

public UserCredentials(Uri authUrl, string username, string api_access_key, string cloudversion, string accountname, ProxyCredentials proxyCredentials)

Note that all except the first constructor take the authorization URL as the first parameter (authUrl).  It turns out that the authorization URL is hard-coded in the first constructor, rendering it useless: it points at https://api.mosso.com/auth (via Constants.MOSSO_AUTH_URL), so it only works if you have valid authorization credentials for that endpoint.  Since (it seems) only Rackspace’s internal development team has valid credentials there, this constructor should simply be removed.
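The practical takeaway: always use a constructor that takes authUrl explicitly.  A quick example (the endpoint URL below is illustrative only – use whichever authorization endpoint your account is provisioned against):

using com.mosso.cloudfiles;

// Pass the authorization endpoint explicitly rather than relying on
// the constructor with the hard-coded Mosso URL.
var authUrl = new Uri("https://auth.api.rackspacecloud.com/v1.0");  // illustrative
var credentials = new UserCredentials(authUrl, "username", "api_access_key");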

While on the topic, it’s important to point out that the authorization URL passed in actually becomes part of the UserCredentials instance’s member data.  Code can retrieve the value from the AuthUrl property, but cannot set it.  As a matter of fact, all of UserCredentials’ properties are get-only (and their backing fields are readonly)!  The only way to change any of the values is to construct a new UserCredentials.

Ok, so UserCredentials is no memory hog, and creating multiple instances is unlikely to matter in most applications.  The issue is that this class seems to ignore the value of separating concerns.  Why, for example, is proxyCredentials a member of this class?  Are a user, account and authorization URL always expected to share a single proxyCredentials?  It seems to us that the endpoint, user credentials and proxy credentials need to be combined when calling down to the REST API, but should be kept separate for flexibility above that layer.  We think several of these members should be independent, or at least more flexible by being settable.

What are your thoughts?  We’d like to know; comment below.

BIG Release of Azure Components This Week!

Windows Azure SDK 1.3 (a.k.a. the November release) has just been released.  You can download just the SDK, but if you’re using Visual Studio, use the Windows Azure Tools for Visual Studio instead – this package, VSCloudService.exe, includes the SDK.

The major features / benefits of this release include:

  • Management Portal – the new Silverlight-based portal may be the most significant improvement of this release.  Managing Roles, Storage, Service Bus, Access Control, etc. are so much easier to access, and the portal’s performance improvements make a substantial impact on management tasks.
  • Full IIS – Finally!  Each Web Role can host multiple sites – web apps, services.  Additionally, developers can now install IIS modules as well (some apps hadn’t been migrated because they depend on 3rd-party or custom modules)
  • Remote Desktop – I’ve been looking forward to this for a while!  Being able to connect to Azure Roles and VMs via RDP is going to make a huge difference in so many ways – configuration, deployment, debugging, etc.
  • Windows Server 2008 R2 – Azure Roles and VMs can now be based on R2 which brings in IIS 7.5, ability to restrict what apps can run via AppLocker, PowerShell 2.0 for better administration and automation.
  • Elevated Role Privileges – I’m not so sure this is a really good idea, but it’s in now.  Azure Roles can run with administrator privileges (sounds like “running with scissors”).  I can imagine some scenarios in which a Worker Role does a bit of admin-level work, or a Web Role hosts a custom administrative portal.  But, in general, devs need to be very careful with this “feature.”
  • Multiple Admins – Multiple Live IDs can be assigned admin privileges in an Azure account.  This provides better traceability when you’re doing around-the-clock administration, but it may also introduce the risk of admins “stepping on each other’s toes.”

Also in this round of updates are a couple of betas and a CTP.

  • Extra Small Instance – in BETA – at just 5 cents per compute hour, the Extra Small Instance is less than half the cost of the Small Instance (12 cents per compute hour).  At the time of this writing, the Extra Small Instance comprises a 1.0 GHz CPU, 768 MB RAM, 20 GB local storage and “low” I/O performance.
  • Virtual Machine Role – in BETA – now you can define and manage your own virtual machine.  Based on the (very) little info I have right now, the VM is built as a differencing disk over a Windows Server 2008 R2 VM, which limits the options of what to run in it.  IMO, this is the last check-box for Azure to qualify as Infrastructure as a Service (IaaS).
  • Azure Connect – in CTP – Connect provides the ability to create a virtual network between multiple machines.  For example, if companies A & B want two of their systems to communicate, those systems connect to Azure, establish the private network, and then communicate directly with each other.  I really want to test this one out!

Good Explanation of Publishing Metadata for WCF

The blog post, Quick WCF Metadata Publication Walkthrough, is not new, but it gives a good explanation of how metadata publication works in WCF.  In particular, it provides a good understanding of the interaction and dependencies among <baseAddresses />, <endpoint /> and <behavior />.  If you follow its guidance and tinker around a bit, you’ll also get a good grasp of how to do IMetadataExchange over HTTP, named pipes or TCP.
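For those who prefer code over config, the same publication can be wired up imperatively.  A hedged sketch (MyService and its endpoint address are placeholders; ServiceMetadataBehavior and MetadataExchangeBindings are standard System.ServiceModel.Description types):

using System;
using System.ServiceModel;
using System.ServiceModel.Description;

var host = new ServiceHost(typeof(MyService), new Uri("http://localhost:8000/MyService"));

// Publish metadata over HTTP GET (.../MyService?wsdl).
host.Description.Behaviors.Add(new ServiceMetadataBehavior { HttpGetEnabled = true });

// Expose IMetadataExchange; swap CreateMexHttpBinding for
// CreateMexNamedPipeBinding or CreateMexTcpBinding as needed.
host.AddServiceEndpoint(typeof(IMetadataExchange),
                        MetadataExchangeBindings.CreateMexHttpBinding(),
                        "mex");

host.Open();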