.NET Core: No Sophisticated Unit Testing, Please!

In my previous post, I wrote about .NET Core’s limitation regarding directory depth.  I explained that I’m trying to create several related Domain-Driven Design packages for J3DI.  One of .NET Core’s strengths is the ability to use exactly what’s needed: apps don’t need the entire .NET Framework; they can specify only the packages / assemblies necessary to run.  Since I want J3DI to give developers this same option of using only what is needed, I broke the code down into several packages.

I’ve enjoyed using Microsoft’s lightweight, cross-platform IDE, Visual Studio Code (VSCode), with this project. It has a nice command palette, good Git integration, etc. But, unfortunately, it appears that only a single test project can be executed from VSCode.

For context, here’s my tasks.json from the .vscode directory:

{
   "version": "0.1.0",
   "command": "dotnet",
   "isShellCommand": true,
   "args": [],
   "tasks": [
      {
         "taskName": "build",
         "args": [ 
            "./J3DI.Domain", 
            "./J3DI.Infrastructure.EntityFactoryFx",
            "./Test.J3DI.Common", 
            "./Test.J3DI.Domain", 
            "./Test.J3DI.Infrastructure.EntityFactoryFx" 
         ],
         "isBuildCommand": true,
         "showOutput": "always",
         "problemMatcher": "$msCompile",
         "echoCommand": true
      },
     {
         "taskName": "test",
         "args": [
            "./Test.J3DI.Domain", 
            "./Test.J3DI.Infrastructure.EntityFactoryFx"
         ],
         "isBuildCommand": false,
         "showOutput": "always",
         "problemMatcher": "$msCompile",
         "echoCommand": true
      }
   ]
}

Notice how args for the build task includes 5 sub-directories. When I invoke this build task from VSCode’s command palette, it builds all 5 sub-directories in order.

Now look at the test task, which has 2 sub-directories specified. I thought specifying both would execute the tests in each. Maybe you thought so, too. Makes sense, right? Well, that’s not what happens. When the test task is invoked from VSCode, the actual command invoked is:

running command> dotnet test ./Test.J3DI.Domain ./Test.J3DI.Infrastructure.EntityFactoryFx
...
error: unknown command line option: ./Test.J3DI.Infrastructure.EntityFactoryFx

(BTW, set echoCommand to true in the appropriate task section to capture the actual command.)

Hmmmm, maybe the build task works differently? Nope. Here’s its output:

running command> dotnet build ./J3DI.Domain ./J3DI.Infrastructure.EntityFactoryFx ./Test.J3DI.Common ./Test.J3DI.Domain ./Test.J3DI.Infrastructure.EntityFactoryFx

Ok, so it seems that dotnet build will process multiple directories, but dotnet test will only process one. To be clear, this is not a bug in VSCode; it’s just spawning the commands as specified in tasks.json. So I thought maybe multiple test tasks could work. I copied the test task into a new section of tasks.json, removed the first directory from the new section, and removed the second directory from the original section. Finally, I set isTestCommand to true in both sections.

{
   "taskName": "test",
   "args": [ "./Test.J3DI.Domain" ],
...
   "isTestCommand": true
}
,
{
   "taskName": "test",
   "args": [ "./Test.J3DI.Infrastructure.EntityFactoryFx" ],
...
   "isTestCommand": true
}

I hoped this was the magic incantation, but I was once again disappointed. Hopefully Microsoft will change dotnet test to behave like dotnet build. Until then, we’re stuck chaining commands in the shell, like the one shown in this Stack Overflow question.
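
For example (a sketch, assuming a shell that supports &&; adjust the paths to your projects):

dotnet test ./Test.J3DI.Domain && dotnet test ./Test.J3DI.Infrastructure.EntityFactoryFx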

Try .NET Core, but keep it shallow

I’ve been building a Domain-Driven Design (DDD) framework for .NET Core.  The intent is to allow developers to use only what they need, rather than requiring an entire framework.  The project, J3DI, is available on GitHub (get it? Jedi for DDD?).

The initial layout had 3 projects under src, and 4 under test:

..\J3DI
|   global.json
|   LICENSE
|   NuGet.config
+---src
|   +---J3DI.Domain
|   +---J3DI.Infrastructure.EntityFactoryFx
|   \---J3DI.Infrastructure.RepositoryFactoryFx
\---test
    +---Test.J3DI.Common
    +---Test.J3DI.Domain
    +---Test.J3DI.Infrastructure.EntityFactoryFx
    \---Test.J3DI.Infrastructure.RepositoryFactoryFx

The global.json in J3DI included these projects:

{
   "projects": [
      "src/J3DI.Domain",
      "src/J3DI.Infrastructure.EntityFactoryFx",
      "src/J3DI.Infrastructure.RepositoryFactoryFx",
      "test/Test.J3DI.Common",
      "test/Test.J3DI.Domain",
      "test/Test.J3DI.Infrastructure.EntityFactoryFx"
      "test/Test.J3DI.Infrastructure.RepositoryFactoryFx"
   ]
}

Well, that was a mistake.  After building the src projects, the test projects were not able to find the necessary dependencies from within src.

error: Unable to resolve 'J3DI.Domain (>= 0.1.0)' for '.NETStandard,Version=v1.3'.

Assuming I had something wrong, I tinkered around in global.json, but couldn’t find any path string format that worked.  Finally it dawned on me that dotnet might not be treating the path as having depth.

So, it turns out, .NET Core only lets you go one level down from global.json (as of versions 1.0.0 and 1.0.1).  After pulling each project up a level, effectively removing the src and test levels, I updated the global.json file.

{
   "projects": [
      "J3DI.Domain",
      "J3DI.Infrastructure.EntityFactoryFx",
      "J3DI.Infrastructure.RepositoryFactoryFx",
      "Test.J3DI.Common",
      "Test.J3DI.Domain",
      "Test.J3DI.Infrastructure.EntityFactoryFx"
      "Test.J3DI.Infrastructure.RepositoryFactoryFx"
   ]
}

After that, dotnet got happy. Magic incantation found!

Must Have Tooling for .NET Core Development

Here’s a great set of tools for smoothing your transition to developing in .NET Core.

IDE

  • VSCode – cross-platform IDE; great for coding .NET Core

Portability

Porting

Does SLA Impact DSA?

When potential customers are considering your company’s products, naturally everyone wants to put their best foot forward.  When they ask about Service Level Agreements (SLA), it can be easy to promise a little too much.  “Our competitor claims four nines (99.99%) up-time; we’d better say the same thing.”  No big deal, right?  Isn’t it just a matter of more hardware?

Not so fast.  Many people are surprised to learn that increasing nines is much more complicated than “throwing hardware at the problem.” Appropriately designed Distributed System Architecture (DSA) takes availability and other SLA elements into account, so going from three nines to four often has architectural impacts which may require substantial code changes, multiple testing cycles, etc.

Unfortunately, SLAs are often defined reactively after a system is in production.  Sometimes an existing or a potential customer requires it, sometimes a system outage raises attention to it, and so on.

For example, consider a website or web services hosted by one web server and one database server.  Although this system lacks any supporting architecture, it can probably maintain two nines on a monthly basis.  Since two nines allows for over 7 hours of downtime per month, engineers can apply application updates and security patches, and even reboot the systems.


Three nines allows for just 43.8 minutes per month.  If either server goes down for any reason, even for a reboot after patches, the risk of missing the SLA is very high.  If the original application architecture planned for multiple web servers, adding more may help reduce this risk since updating in rotation becomes possible.  But updating the database server still requires tight coordination with very little room for error, and the SLA will probably be missed if an unplanned database server outage occurs.
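
For reference, the downtime budgets behind these numbers fall directly out of the availability percentage. Here’s a quick C# sketch using an average month (365.25 days / 12):

static double MonthlyDowntimeMinutes(double availability)
{
    // Average month: 365.25 days / 12 = 30.4375 days = 43,830 minutes
    const double minutesPerMonth = 365.25 * 24.0 * 60.0 / 12.0;
    return (1.0 - availability) * minutesPerMonth;
}

// MonthlyDowntimeMinutes(0.99)   => ~438.3 minutes (~7.3 hours)  -- two nines
// MonthlyDowntimeMinutes(0.999)  => ~43.8 minutes                -- three nines
// MonthlyDowntimeMinutes(0.9999) => ~4.4 minutes                 -- four nines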

This scenario hardly scratches the surface of the difficulties involved in increasing just one aspect (availability) of an SLA.  Yet it also highlights the necessity of defining SLAs early and architecting the system accordingly.  Product Managers/Planners: Take time in the beginning to document system expectations for SLA.  System Architects: Regardless of SLA, use DSA to accommodate likely expectation increases in the future.

Perils of Async: Locking Out Performance

In a previous post, Perils of Async: Data Corruption, we saw the consequences of inadequate concurrency control in asynchronous code.  The first implementation using Parallel.ForEach did not protect shared data, and its results were wrong.  The corrected implementation used C#’s lock for the necessary protection from concurrent access.

Parallel.ForEach(input, kvp =>
{
    if (0 == kvp.Value % 2)
    {
        lock (mrr)
        {
            ++mrr.Evens;
        }
    }
    else
    {
        lock (mrr)
        {
            ++mrr.Odds;
        }
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        lock (mrr)
        {
            ++mrr.Primes;
        }
    }
});

Some may ask, “Why lock so many times? Can’t the code just lock once inside the loop?”

Parallel.ForEach(input, kvp =>
{
    lock (mrr)
    {
        if (0 == kvp.Value % 2)
        {
            ++mrr.Evens;
        }
        else
        {
            ++mrr.Odds;
        }
        if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
        {
            ++mrr.Primes;
        }
    }
});

Moving the lock just above the first if clause seems to have some benefits – it simplifies the code, and shared data access is still synchronized.  But it also kills performance – making this version slower than even the non-parallel SerialMapReduceWorker.

9999999 of 9999999 input values are unique
[SerialMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:51.6998025
[ParallelMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:30.6871152
[ParallelMapReduceWorker_SingleLock] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:01:35.0778434

This situation highlights the common rule of thumb, “lock late.”  Locking late (or “low” in the code) means that code should lock just before accessing shared data and unlock just afterwards.  This approach reduces the amount of code which executes while the lock is held, so it provides contenders (the other threads) with more opportunities to acquire the lock.
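
Taken to its logical end, locking late for these counters means not locking at all.  Since each protected operation is just an integer increment, .NET’s Interlocked class can perform it atomically without a lock.  Here’s a sketch, assuming the counters on mrr are plain int fields (Interlocked.Increment requires a ref to a field, not a property):

// Requires System.Threading (Interlocked) and System.Threading.Tasks (Parallel)
Parallel.ForEach(input, kvp =>
{
    if (0 == kvp.Value % 2)
    {
        Interlocked.Increment(ref mrr.Evens);    // atomic read-modify-write
    }
    else
    {
        Interlocked.Increment(ref mrr.Odds);
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        Interlocked.Increment(ref mrr.Primes);
    }
});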


Perils of Async: Data Corruption

One of the most common bugs occurring in any multi-threaded or multi-process code is corrupting shared data due to poor (or lack of) concurrency control.  Concurrency is one term used to describe code interactions that are not sequential in nature (equivalent or companion terms include parallel, multi-threaded and multi-process).  Within this context, concurrency control indicates the tactics used to ensure the integrity of shared data.

To demonstrate this problem, we’ll use a fairly simple example: counting even, odd and prime numbers in a large set.  We’ll use different strategies over the same data set for serial and parallel processing.  The serial implementation, SerialMapReduceWorker, is straightforward since it involves no concurrency issues.

foreach (var kvp in input)
{
    if (0 == kvp.Value % 2)
    {
        ++mrr.Evens;
    }
    else
    {
        ++mrr.Odds;
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        ++mrr.Primes;
    }
}

SerialMapReduceWorker iterates over the input set, determines whether each integer is even or odd and whether it is prime, and increments the appropriate counters in the MapReduceResult instance, mrr.  (Although no map-reduce is involved, SerialMapReduceWorker is named for consistency with the concurrent workers.)
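
MapReduceResult is just a holder for the three counters; a minimal sketch of the shape these examples assume (the actual implementation may differ):

public class MapReduceResult
{
    // Plain fields so the workers can increment them directly
    public int Evens;
    public int Odds;
    public int Primes;
}
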
.NET’s Task Parallel Library (TPL) makes it very easy (too easy?) to convert this code to run concurrently.  All a developer has to do is change foreach to Parallel.ForEach, add some lambda syntax, and voilà!  The code magically runs much faster!

Parallel.ForEach(input, kvp =>
{
    if (0 == kvp.Value % 2)
    {
        ++mrr.Evens;
    }
    else
    {
        ++mrr.Odds;
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        ++mrr.Primes;
    }
});

Just look at these results – the parallel version executed almost twice as fast!

9999999 of 9999999 input values are unique
[SerialMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:51.6998025
[ParallelMapReduceWorker_Unprotected] Evens: 4,996,020; Odds: 4,994,704; Primes: 244,662; Elapsed: 00:00:30.5742845

Unfortunately, this conversion is also a dangerously naive implementation of concurrent code.  Did you notice the problems?  The parallel code found a different number of even, odd and prime numbers within the same set of integers.  How is that possible?  Answer: data corruption due to a lack of concurrency control.

The implementation in ParallelMapReduceWorker_Unprotected does nothing to protect the MapReduceResult instance, mrr.  Each thread involved increments mrr.Evens, mrr.Odds and mrr.Primes.  In effect, two threads might both increment mrr.Evens from 4 to 5 simultaneously when the expectation is that one will increment it from 4 to 5 and the other from 5 to 6.  As you can see in the results above, this data corruption causes ParallelMapReduceWorker_Unprotected’s count of even integers to be wrong by about 4,500.
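
The root cause is that ++ is not atomic.  Each increment is really a three-step read-modify-write sequence, and steps from different threads can interleave:

// What '++mrr.Evens' actually does:
int temp = mrr.Evens;   // 1. read the current value (two threads may both read 4)
temp = temp + 1;        // 2. increment the local copy (both compute 5)
mrr.Evens = temp;       // 3. write it back (both write 5; one increment is lost)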

In this case, correcting the error is fairly simple.  The code just needs to protect access to MapReduceResult to ensure that only one thread can increment at a time. The corrected ParallelMapReduceWorker:

Parallel.ForEach(input, kvp =>
{
    if (0 == kvp.Value % 2)
    {
        lock (mrr)
        {
            ++mrr.Evens;
        }
    }
    else
    {
        lock (mrr)
        {
            ++mrr.Odds;
        }
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        lock (mrr)
        {
            ++mrr.Primes;
        }
    }
});

Each time this code determines it needs to increment one of the counters, it uses concurrency control by:

  1. Locking the MapReduceResult instance
  2. Incrementing the appropriate counter
  3. Unlocking the MapReduceResult instance

Since only one thread can hold the lock on mrr at a time, other threads must wait until it is released to proceed.  This locking now guarantees that, continuing our previous case, mrr.Evens is correctly incremented from 4 to 5 only once.  ParallelMapReduceWorker correctly calculates the counts (as compared to SerialMapReduceWorker), and does so with almost the same performance as the unprotected version.

9999999 of 9999999 input values are unique
[SerialMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:51.6998025
[ParallelMapReduceWorker_Unprotected] Evens: 4,996,020; Odds: 4,994,704; Primes: 244,662; Elapsed: 00:00:30.5742845
[ParallelMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:30.6871152


NOTE: The count of primes appears to be incorrect in all of the workers.  The IsPrime.TrialDivisionMethod implementation is intentionally slow to ensure multiple threads contend for access to the same data; unfortunately, and unintentionally, it also appears to be incorrect. (cf. Count of Primes)

Perils of Async: Introduction

As application communications over lossy networks and “in the cloud” have grown, the necessity of performing these communications asynchronously has risen with them. Why this change has been occurring may be an interesting topic for another post, but a few simple cases demonstrate the point:

  • Web browsers make multiple, asynchronous HTTP calls per page requested. Procuring a page’s images, for example, has been an asynchronous (“out-of-band”) operation for at least a decade.
  • Many dynamic websites depend on various technologies’ (AJAX, JavaScript, jQuery, etc.) asynchronous capabilities – that’s what makes the site “dynamic.”
  • Similarly, most desktop and mobile applications use technologies to communicate asynchronously.

Previously, developing asynchronous software – whether inter-process, multi-threaded, etc. – required very talented software developers. (As you’ll see soon enough, it still does.) Many companies and other groups have put forward tools, languages, methodologies, etc. to make asynchronous development more approachable (i.e., easier for less sophisticated developers).

Everyone involved in software development – developers, managers, business leaders, quality assurance, and so on – needs to be aware, however, that these “tools” have a down-side. Keep this maxim in mind: Things that make asynchronous software development easier also make bad results (bugs!) easier. For example, all software involving some form of asynchronicity:

  • Not only has bugs (as all software does), but the bugs are much, much more difficult to track down and fix
  • Exhibits higher degrees of hardware-based flux. Consider, for example, a new mobile app that is stable and runs well on a device using a Qualcomm Snapdragon S1 or S2 (single-core) processor. Will the same app run just as well on a similar device using (dual-core) Snapdragon S3 or above? Don’t count on it – certainly don’t bet your business on it!

This series of posts, Perils of Async, aims to discuss many of the powerful .NET capabilities for asynchronous and parallel programming, and to help you avoid their perilous side!

Windows Azure Management Portal in Firefox, Moonlight on Linux


We have Arch Linux running on a 7-year-old Dell desktop. It’s an oldie, but a goodie.  The combination of Arch with LXDE makes for a good administrative machine – email, web browsing, bittorrents, etc.  We had tried using the Silverlight-based Windows Azure Management Portal on this machine – using Mono’s Moonlight as the Silverlight for Linux – but found enough hiccups that we stopped wasting our time.  When Windows Azure began offering its HTML5-based management portal, our interest in managing our Azure systems from Linux was renewed.  Here’s a brief review of our experience:

Using Firefox 13.0.1 on Arch Linux, we opened http://windows.azure.com/. After signing in, we were left on what appeared to be a blank page.  On right-clicking the page, we learned that it was actually trying to use Silverlight, and the Moonlight implementation didn’t seem to be rendering correctly.  We wondered why we hadn’t been given a choice between Silverlight and HTML5 – we seem to remember being offered that choice in IE on Windows 7 and later.

We uninstalled Moonlight in hopes that the portal’s page code would opt for HTML5 when no Silverlight support was detected. Unfortunately, the portal’s entry page just showed the familiar “To view this content, please install Silverlight….”

Disappointed, again.  The management portal doesn’t detect the lack of Silverlight support and redirect to the HTML5 version.  The user is not presented a choice of which to use.  And either the Moonlight implementation or the portal’s Silverlight implementation doesn’t work correctly.

UPDATE: After tweeting that the portal wasn’t working in our config, we quickly received a response from @ScottGu saying that we need to use http://manage.windowsazure.com/ for the HTML5 portal. (Whether the tweet came from the real Scott Guthrie or a ghost tweeter, we don’t know). We were immediately pleased to find that the HTML5 portal worked very well in our non-Microsoft config! Kudos to Microsoft and the Windows Azure team for delivering cross-platform, cross-browser management tools – well done!

UPDATE 2: The portal link/button on WindowsAzure.com navigates to Windows.Azure.com (which requires Silverlight).  If you want to use the HTML5-based management portal, be sure to open http://manage.windowsazure.com/ directly.

A PowerShell Script to Assist WCF Service Hosting / Testing

When developing for WCF, I find situations in which I need to manually host the WCF services.  (“Manually” == “not from within Visual Studio”.)  Sometimes I have to go back and forth between different services, etc.  What I really wanted was a command to “just host from here.”  So, I created a simple PowerShell script that does just that.

WCF-HostMe starts in the current directory and looks for a service to host.  It simply looks for a config file (matching *.[exe|dll].config) and assumes that the config file’s name-matching exe or dll is the service.  After formatting and building the parameter values, it launches the service using WcfSvcHost.exe.

Again, this is a simplistic approach that really only works for development and testing purposes; it is not appropriate for production environments.


###### WCF-HostMe.ps1
#    Hosts a WCF service using WCF Service Host (WcfSvcHost.exe) for testing purposes.
#    How it works:
#        Beginning in current dir, recursively searches for *.exe.config or *.dll.config
#        Assuming the .config file is associated with the WCF service, launches WcfSvcHost
#            using the service's and config's paths
#
#    History:
#        3/16/2012, J Burnett    Created
#####

# Find .config file
# TODO: handle multiple search results
# TODO: detect assembly is not hostable? (not WCF, WF)
$configPath = (gci . -include *.exe.config, *.dll.config -recurse).VersionInfo.FileName

# Build WCFSvcHost param inputs - the full paths of service & config
$serviceArg = (" /service:""" + $configPath -replace '.config','') + """ "
$configArg = " /config:""$configPath"" "

# Launch WCFSvcHost to host the service
echo ''
echo "Attempting to host $serviceArg as WCF Service..."
start-process WcfSvcHost.exe -ArgumentList ($serviceArg + $configArg)
echo ''


### Copyright & Disclaimer
#####
# This software is provided "as is"; there are no warranties of any kind.  This software 
# may not work correctly and/or reliably in some environments. In no event shall
# AltaModa Technologies, LLC, its personnel, associates or contributors be liable for 
# ANY damages resulting from the use of this software.
#####

Debugging .NET-based Windows Services

We haven’t needed to implement a Windows Service in a while, but we have certainly experienced the pain of debugging one when run by the Service Control Manager (SCM).  Here are a couple of links to good tools and discussion on the topic:

Run Windows Service as a Console Program by Einar Egilsson provides a good example of how to debug your service easily as a console app.  It also includes some good discussion of hosting single and multiple services, various ways to end or break out of the service process when running as a console app, etc.
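
The core idea is a dual-mode entry point: run as a console app when launched interactively, and under the SCM otherwise.  A minimal sketch of the pattern (StartUp and CleanUp are hypothetical placeholders for your service’s real logic):

using System;
using System.ServiceProcess;

public class MyService : ServiceBase
{
    static void Main(string[] args)
    {
        var service = new MyService();
        if (Environment.UserInteractive)
        {
            // Launched from a console (e.g., F5 in Visual Studio):
            // run the service logic directly so the debugger behaves normally.
            service.StartUp(args);
            Console.WriteLine("Service running; press any key to stop...");
            Console.ReadKey(true);
            service.CleanUp();
        }
        else
        {
            // Launched by the Service Control Manager
            ServiceBase.Run(service);
        }
    }

    protected override void OnStart(string[] args) { StartUp(args); }
    protected override void OnStop() { CleanUp(); }

    // Hypothetical placeholders for the actual service work
    void StartUp(string[] args) { /* start worker threads, timers, listeners, ... */ }
    void CleanUp() { /* stop work and release resources */ }
}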

Windows Service Helper on CodePlex is an SCM replacement which provides F5 debugging from Visual Studio.  Its UI gives you SCM-esque start, stop and pause capabilities to facilitate debugging the associated functionality in your service.

Hopefully these may be useful to you, but if nothing else, this will be a good reminder for the next time we do a Windows Service.