Is JSON API a REST Anti-Pattern?

JSON API is an anti-pattern of REST (at least partially).  JSON API’s core problem is that it restricts one of the three fundamental concepts of REST – the representation of resources.  In the Content Negotiation section of the JSON API spec we learn:

  • Clients must pass Content-Type: application/vnd.api+json in all request headers
  • Clients are not allowed to use any media type parameters
  • Servers must pass Content-Type: application/vnd.api+json in all response headers
  • Servers must reject requests containing media type parameters in Content-Type (return error code 415 – Unsupported Media Type)
  • Servers must reject requests lacking an unadorned Accept header for application/vnd.api+json (return error code 406 – Not Acceptable)

In other words, application/vnd.api+json is the only representation allowed.  This restriction may be temporary – v1 spec indicates these requirements “exist to allow future versions of this specification to use media type parameters for extension negotiation and versioning.”  Will the restrictions be lifted in v1.1, v2.0, v3.0?
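To make the restriction concrete, here is a minimal sketch (ours, not the spec’s) of what a compliant request looks like from a .NET client.  Only the two header values come from the spec; the endpoint URL, the request body and the use of HttpClient are illustrative assumptions.

using System.Net.Http;
using System.Net.Http.Headers;

var client = new HttpClient();

// Accept must name the JSON API media type with no media type parameters.
client.DefaultRequestHeaders.Accept.Clear();
client.DefaultRequestHeaders.Accept.Add(
    new MediaTypeWithQualityHeaderValue("application/vnd.api+json"));

// Content-Type must also be application/vnd.api+json, again without parameters.
var content = new StringContent("{ \"data\": { \"type\": \"articles\" } }");
content.Headers.ContentType = new MediaTypeHeaderValue("application/vnd.api+json");

// Hypothetical endpoint; any other representation (XML, plain JSON, etc.) is off the table.
var response = await client.PostAsync("https://api.example.com/articles", content);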

So What?

“Ok, so JSON API is overly restrictive on representations.  Big deal.  Why should I care?”  As always “it depends” (typical, right?).  Teams meeting the following criteria may not need to care about this issue:

  • Simple / Single Application – the application is single-purpose; the service is not expected to serve multiple clients or client types
  • JSON Only – the application is never expected to provide media formats other than JSON
  • Simple Representations – the application is never expected to provide different representations; in other words, media type parameters will always be sufficient

Does SLA Impact DSA?

When potential customers are considering your company’s products, naturally everyone wants to put their best foot forward.  When they ask about Service Level Agreements (SLA), it can be easy to promise a little too much.  “Our competitor claims four nines (99.99%) up-time; we’d better say the same thing.”  No big deal, right?  Isn’t it just a matter of more hardware?

Not so fast.  Many people are surprised to learn that increasing nines is much more complicated than “throwing hardware at the problem.” Appropriately designed Distributed System Architecture (DSA) takes availability and other SLA elements into account, so going from three nines to four often has architectural impacts which may require substantial code changes, multiple testing cycles, etc.

Unfortunately, SLAs are often defined reactively after a system is in production.  Sometimes an existing or a potential customer requires it, sometimes a system outage raises attention to it, and so on.

For example, consider a website or web service hosted by one web server and one database server.  Although this system lacks any supporting architecture, it can probably maintain two nines (99%) on a monthly basis.  Since two nines allows for about 7 hours of downtime per month, engineers can apply application updates, security patches and even reboot the systems.

 

Three nines allows for just 43.8 minutes per month.  If either server goes down for any reason, even for a reboot after patches, the risk of missing the SLA is very high.  If the original application architecture planned for multiple web servers, adding more may help reduce this risk since updating in rotation becomes possible.  But updating the database server still requires tight coordination with very little room for error, and the SLA will probably be missed if an unplanned database server outage occurs.
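For a quick sanity check on those numbers, the downtime budget is simply (1 – availability) × the minutes in a month.  The sketch below assumes an average month of 365.25 / 12 ≈ 30.44 days; exact figures shift slightly depending on the month length you use.

using System;

// Downtime allowed per month at each availability level,
// assuming an average month of about 43,830 minutes.
double minutesPerMonth = 365.25 / 12 * 24 * 60;

foreach (var availability in new[] { 0.99, 0.999, 0.9999 })
{
    double budget = minutesPerMonth * (1 - availability);
    Console.WriteLine($"{availability:P2} -> {budget:N1} minutes/month");
}

// 99.00 % -> 438.3 minutes (about 7.3 hours)
// 99.90 % -> 43.8 minutes
// 99.99 % -> 4.4 minutes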

This scenario hardly scratches the surface of the difficulties involved in increasing just one aspect (availability) of an SLA.  Yet it also highlights the necessity of defining SLAs early and architecting the system accordingly.  Product Managers/Planners: Take time in the beginning to document system expectations for the SLA.  System Architects: Regardless of the SLA, use DSA to accommodate likely expectation increases in the future.

Perils of Async: Locking Out Performance

In a previous post, Perils of Async: Data Corruption, we saw the consequences of inadequate concurrency control in asynchronous code.  The first implementation using Parallel.ForEach did not protect shared data, and its results were wrong.  The corrected implementation used C#’s lock for the necessary protection from concurrent access.

Parallel.ForEach(input, kvp =>
{
    if (0 == kvp.Value % 2)
    {
        lock (mrr)
        {
            ++mrr.Evens;
        }
    }
    else
    {
        lock (mrr)
        {
            ++mrr.Odds;
        }
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        lock (mrr)
        {
            ++mrr.Primes;
        }
    }
});

Some may ask, “Why lock so many times? Can’t the code just lock once inside the loop?”

Parallel.ForEach(input, kvp =>
{
    lock (mrr)
    {
        if (0 == kvp.Value % 2)
        {
            ++mrr.Evens;
        }
        else
        {
            ++mrr.Odds;
        }
        if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
        {
            ++mrr.Primes;
        }
    }
});

Moving the lock just above the first if clause does have some benefits – it simplifies the code, and shared data access is still synchronized.  But it also kills performance – making this version slower than even the non-parallel SerialMapReduceWorker.

9999999 of 9999999 input values are unique
[SerialMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:51.6998025
[ParallelMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:30.6871152
[ParallelMapReduceWorker_SingleLock] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:01:35.0778434

This situation highlights the common rule of thumb, “lock late.”  Locking late (or “low” in the code) means code should lock just before accessing shared data and unlock just afterwards.  This approach reduces the amount of code that executes while the lock is held, so it gives the contenders (the threads) more opportunities to acquire the lock.
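A middle ground the original posts don’t cover – offered here as a sketch, not as the project’s implementation – is to let each thread tally into its own private counters and take the lock only once per thread, when the partial results are merged.  Parallel.ForEach supports this directly through its thread-local overload; the sketch assumes the same input, mrr and IsPrime helper used in the examples above.

Parallel.ForEach(
    input,
    // localInit: each worker thread gets a private tally (evens, odds, primes)
    () => new int[3],
    (kvp, loopState, counts) =>
    {
        if (0 == kvp.Value % 2)
        {
            ++counts[0];
        }
        else
        {
            ++counts[1];
        }
        if (AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
        {
            ++counts[2];
        }
        return counts;
    },
    // localFinally: lock once per thread to merge the partial results
    counts =>
    {
        lock (mrr)
        {
            mrr.Evens += counts[0];
            mrr.Odds += counts[1];
            mrr.Primes += counts[2];
        }
    });

This keeps the loop body lock-free while still protecting the shared MapReduceResult, so contention drops to a handful of lock acquisitions regardless of input size.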

 

Perils of Async: Data Corruption

One of the most common bugs in any multi-threaded or multi-process code is corruption of shared data due to poor (or absent) concurrency control.  Concurrency is one term used to describe code interactions that are not sequential in nature (equivalent or companion terms include parallel, multi-threaded and multi-process).  Within this context, concurrency control refers to the tactics used to ensure the integrity of shared data.

To demonstrate this problem, we’ll use a fairly simple example: counting even, odd and prime numbers in a large set.  We’ll use different strategies over the same data set for serial and parallel processing.  The serial implementation, SerialMapReduceWorker, is straightforward since it involves no concurrency issues.

foreach (var kvp in input)
{
    if (0 == kvp.Value % 2)
    {
        ++mrr.Evens;
    }
    else
    {
        ++mrr.Odds;
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        ++mrr.Primes;
    }
}

SerialMapReduceWorker iterates over the input set, determines whether each integer (kvp.Value) is even, odd or prime, and increments the appropriate counter in the MapReduceResult instance, mrr.  (Although no map-reduce is involved, SerialMapReduceWorker is named for consistency with the concurrent workers.)

.NET’s Task Parallel Library (TPL) makes it very easy (too easy?) to convert this code to run concurrently.  All a developer has to do is change foreach to Parallel.ForEach, include some lambda syntax, and voila! – the code magically runs much faster!

Parallel.ForEach(input, kvp =>
{
    if (0 == kvp.Value % 2)
    {
        ++mrr.Evens;
    }
    else
    {
        ++mrr.Odds;
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        ++mrr.Primes;
    }
});

Just look at these results – the parallel version executed almost twice as fast!

9999999 of 9999999 input values are unique
[SerialMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:51.6998025
[ParallelMapReduceWorker_Unprotected] Evens: 4,996,020; Odds: 4,994,704; Primes: 244,662; Elapsed: 00:00:30.5742845

Unfortunately, this conversion is also a dangerously naive implementation of concurrent code.  Did you notice the problems?  The parallel code found a different number of even, odd and prime numbers within the same set of integers.  How is that possible?  Answer: data corruption due to a lack of concurrency control.

The implementation in ParallelMapReduceWorker_Unprotected does nothing to protect the MapReduceResult instance, mrr.  Every thread involved increments mrr.Evens, mrr.Odds and mrr.Primes.  In effect, two threads might both increment mrr.Evens from 4 to 5 simultaneously when the expectation is that one will increment from 4 to 5 and another from 5 to 6.  As you can see in the results above, this data corruption causes ParallelMapReduceWorker_Unprotected’s count of even integers to be wrong by roughly 4,500.
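To see why, consider what ++mrr.Evens actually does.  It is a read-modify-write sequence, not a single atomic step; sketched below with an assumed plain int counter, two threads can interleave the steps and lose an update.

int temp = mrr.Evens;   // 1. read  – two threads can both read 4
temp = temp + 1;        // 2. add   – both compute 5
mrr.Evens = temp;       // 3. write – both write 5, so one increment is lost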

In this case, correcting the error is fairly simple.  The code just needs to protect access to MapReduceResult to ensure that only one thread can increment at a time. The corrected ParallelMapReduceWorker:

Parallel.ForEach(input, kvp =>
{
    if (0 == kvp.Value % 2)
    {
        lock (mrr)
        {
            ++mrr.Evens;
        }
    }
    else
    {
        lock (mrr)
        {
            ++mrr.Odds;
        }
    }
    if (true == AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
    {
        lock (mrr)
        {
            ++mrr.Primes;
        }
    }
});

Each time this code determines it needs to increment one of the counters, it uses concurrency control by:

  1. Locking the MapReduceResult instance
  2. Incrementing the appropriate counter
  3. Unlocking the MapReduceResult instance

Since only one thread can hold the lock on mrr at a time, other threads must wait until it is released to proceed.  This locking now guarantees that, continuing our previous case, mrr.Evens is incremented from 4 to 5 only once.  ParallelMapReduceWorker correctly calculates the counts (as compared to SerialMapReduceWorker), and does so with almost the same performance as the unprotected version.

9999999 of 9999999 input values are unique
[SerialMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:51.6998025
[ParallelMapReduceWorker_Unprotected] Evens: 4,996,020; Odds: 4,994,704; Primes: 244,662; Elapsed: 00:00:30.5742845
[ParallelMapReduceWorker] Evens: 5,000,533; Odds: 4,999,466; Primes: 244,703; Elapsed: 00:00:30.6871152
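As an aside – a sketch under an assumption, not part of the original implementation – if MapReduceResult exposed its counters as public fields rather than properties, Interlocked.Increment (from System.Threading) could protect these simple counters without an explicit lock.  The Parallel.ForEach body would become:

// Assumes Evens, Odds and Primes are fields; ref arguments cannot target properties.
if (0 == kvp.Value % 2)
{
    Interlocked.Increment(ref mrr.Evens);
}
else
{
    Interlocked.Increment(ref mrr.Odds);
}
if (AMT.Math.IsPrime.TrialDivisionMethod(kvp.Value))
{
    Interlocked.Increment(ref mrr.Primes);
}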

 

NOTE: The count of primes appears to be incorrect.  The IsPrime.TrialDivisionMethod implementation is intentionally slow to ensure multiple threads contend for access to the same data.  Unfortunately, and unintentionally, it also appears to be incorrect. (cf. Count of Primes)

Perils of Async: Introduction

As application communications over lossy networks and “in the cloud” have grown, the necessity of performing these communications asynchronously has risen with them. Why this change has been occurring may be an interesting topic for another post, but a few simple cases demonstrate the point:

  • Web browsers make multiple, asynchronous HTTP calls per page requested. Fetching a page’s images, for example, has been an asynchronous (“out-of-band”) operation for at least a decade.
  • Many dynamic websites depend on various technologies’ (AJAX, JavaScript, jQuery, etc.) asynchronous capabilities – that’s what makes the site “dynamic.”
  • Similarly, most desktop and mobile applications use technologies to communicate asynchronously.

Previously, developing asynchronous software – whether inter-process, multi-threaded, etc. – required very talented software developers. (As you’ll see soon enough, it still does.) Many companies and other groups have put forward tools, languages, methodologies, etc. to make asynchronous development more approachable (i.e., easier for less sophisticated developers).

Everyone involved in software development – developers, managers, business leaders, quality assurance, and so on – needs to be aware, however, that these “tools” have a downside. Keep this maxim in mind: things that make asynchronous software development easier also make bad results (bugs!) easier. For example, all software involving some form of asynchronicity:

  • Not only has bugs (as all software does), but the bugs are much, much more difficult to track down and fix
  • Exhibits higher degrees of hardware-based flux. Consider, for example, a new mobile app that is stable and runs well on a device using a Qualcomm Snapdragon S1 or S2 (single-core) processor. Will the same app run just as well on a similar device using (dual-core) Snapdragon S3 or above? Don’t count on it – certainly don’t bet your business on it!

This series of posts, Perils of Async, aims to discuss many of the powerful .NET capabilities for asynchronous and parallel programming, and to help you avoid their perilous side!

Best Practice for Endorsing on LinkedIn

“And now for something completely different!” Yes, this is a bit off-beat for us, but we think you’ll be glad to learn a better way to endorse people on LinkedIn.

Recently LinkedIn has been more aggressively eliciting your endorsement of people in your network.  You are presented with four people from your network along with just one skill per person.

LinkedIn endorsement suggestions grid

You have the option of endorsing all of them at once, or one person at a time. Regardless of which path you take, you are only able to endorse one skill per person.  We want endorsements on LinkedIn to be meaningful, so we prefer to endorse multiple skills for one person at a time.  Here’s what we do…

First, go to the person’s profile page.  From the four-person endorsement grid, you can right-click their picture and open their profile in a new browser tab or window.  Alternatively, you can search for them or find them in your network some other way.

Once you are on the person’s profile page, simply use your mouse to hover over the drop-down indicator to the right of the Send a message button.

The Send a message drop-down on a LinkedIn profile page

Hovering will cause the drop-down menu to appear; select Endorse skills & expertise from it.  LinkedIn then adds an endorsement area to the top of the person’s profile page.

Endorsement area at the top of the profile page

Within the endorsement area, you can add skills you want to endorse, remove skills you do not want to endorse, etc.  After completing the set of skills you want to endorse for that person, click the Endorse button. 

By the way, the person you have endorsed can remove your endorsement if they disagree with it for any reason. So we think it’s worthwhile to add skills you believe the person demonstrates.

azureQuery vs. Azure SDK for Node

 

One of our recent projects involved using JavaScript to access Windows Azure data and features.  When considering the overall design, we discussed client- and server-side execution models (where the “meat” of the code will execute).  In this post we hope to share what we learned in the process.  Although quite a few JavaScript libraries exist for accessing parts of Azure, the two we’ll analyze here are azureQuery and Azure SDK for Node.

First, a little context about each library:

                   azureQuery                  Azure SDK
Publisher          David Pallman – Neudesic    Windows Azure – Microsoft
URL                azureQuery                  Windows Azure Node.js Developer Center
Code URL           azureQuery on CodePlex      azure-sdk-for-node on GitHub
Initial Release    July, 2012                  September, 2011

 

Next, some characteristics of the libraries:

                                      azureQuery              Azure SDK
Execution Locale                      Client-side (browser)   Server-side (node)
Fluent (chaining) language support?   Yes                     No
Storage Support? – Blob               Yes                     Yes
Storage Support? – Queue              *Not Yet                Yes
Storage Support? – Table              *Not Yet                Yes
Service Bus Support?                  ^No                     Yes
Identity & Access Control?            No                      No

* As of 9/12/12, azureQuery only provides access to Windows Azure Blob Storage.
^ We are not clear whether azureQuery plans to support Service Bus integration.

 

The table above highlights that, in its current state, azureQuery is very limited in its support of Azure features.  Actually, that’s to be expected: azureQuery was first published in late July, 2012; Azure SDK for Node was 10 months old at that point. We expect azureQuery will deliver support for more areas of Azure, especially as the level of developer contribution improves (David Pallman has a full-time job, after all!).

 

Which should you use?

So, which of these libraries should you use for projects now?  If you’re thinking, “That’s not even the right question!” you are right!  Decisions regarding which code runs client-side or server-side have a great deal more to do with application requirements, scale expectations, data change rates, etc.

However, it is pretty clear at this point that azureQuery is still in its infancy.  If your goal is to rapidly deliver a solution using Windows Azure (beyond Blobs), then you should use Azure SDK for Node.  This decision will change as azureQuery fulfills its (assumed) mission. If your solution demands client-side execution (e.g., rich visualization of changing data), then we encourage you to invest in azureQuery and contribute to its advancement.

How To Use Mocha for Node Testing in Windows

Even with some advice from other sites, we had trouble getting Mocha testing to work well on Windows.  Well, we could get tests to run, but the process was inefficient and error-prone from a developer’s perspective.  Our solutions are cross-platform, so we wanted a consistent mechanism for Windows and Linux (and assume Linux covers us well on Mac OS X).

Most of the suggestions from other sites worked, so we’ll just walk through them quickly.  But getting test execution via makefiles was a headache.  Aside: the “Unix Way” uses makefiles for almost everything, so the Unix Way for a developer to execute Mocha tests is to simply type ‘make test’.  Without getting into any philosophical or pragmatic rationales, this is the best way we’ve found for cross-platform Mocha testing.

Setting Up Your Environment

We’ll assume that you already have Node configured and working.  The first step is to get Mocha installed and configured. As with most Node packages, installing Mocha is as easy as

npm install mocha

Once installed, Mocha shows up in the list of packages

Mocha in Node Packages List

Now you’re ready to create tests.  Just follow the instructions, 1. 2. 3. Mocha!, on Mocha’s GitHub site.  Their instructions are written for a Unix-like environment (e.g., “$EDITOR test/test.js”), but the translation is simple:

  1. Create a test directory
  2. Use your favorite editor to copy the provided JavaScript into a test\test.js file
  3. Run Mocha

Oops! Step 3 doesn’t work on Windows like it does on Linux.  Yes, you could add Mocha to PATH, but that’s not the best solution – especially since we aim to use make to run the tests.  So we’ll just skip step 3 and get help from Alex Young’s excellent post, Testing With Mocha.  You can skip the first bit about creating a package.json file and using npm to install Mocha.  (Eventually you’re going to want to leverage the power of the Node Package Manager!)

Just copy Alex’s simple, 3-line makefile to your directory (the parent of the test directory where you stored test.js).   If you were using a “typical” unix developer environment, you could simply run the tests in test.js by using make:

make test

In most Windows developer environments, however, make either doesn’t exist or isn’t going to work correctly.  Others have written, for example, that Visual Studio’s nmake.exe isn’t a good stand-in for make in this case.  As you search for how to remedy this situation, you’re likely to come across Mocha requires make. Can’t find a make.exe that works on Windows on StackOverflow.com. The elements of this post that proved helpful are:

  1. Install Cygwin
  2. Alias make to Cygwin’s make.exe ([install location]\cygwin\bin\make.exe)
  3. (Optional) Use the makefile template from Richard Turner’s response

Now you should be able to make test and see the tests run, right?  Well, not so fast, professor.  Cygwin’s make.exe conforms to some White Space Persnicketiness Specification.  We thought surely we were ready to execute our tests, but we kept suffering make errors about a “missing separator.”

Make missing separator errors

Well, sir, that’s what we call a helpless error.  It’s an error all right, but it sure isn’t helpful.  It turns out, however, that had we been steeped in unix-land makefiles, we might have known that white space matters.  That is to say, make treats a series of spaces differently than tabs.  If your favorite editor is converting tabs to spaces, then make complains of a “missing separator.”  [UPDATE: some have suggested that GNU’s Make for Windows handles tabs or spaces.]

Ok, we’re almost there.  You’ll need to configure your editor to keep tabs rather than convert them to spaces.  Ideally, your editor will also support writing files with unix-style line endings, LF (\n) rather than CRLF (\r\n) typically used on Windows.  Below is an example of how to configure Notepad++.

Notepad++ Configuration to Keep Make Happy

Since make doesn’t like tabs to be converted to spaces, we change our Notepad++ settings to save the tabs rather than convert them.  Visual Studio and other developer-oriented text editors have similar settings.  In Notepad++, select Preferences from the Settings menu, and then the dialog tab for Language Menu/Tab Settings.  Select Makefile in the Tab Settings listbox on the right; turn Use default value and Replace by space off.  After you save these settings, you’ll also need to replace the spaces in your makefile with tabs.  For example, we replaced the beginning spaces on Line 2 below with a beginning tab.  Just save your changes, and you should be able to make test successfully.

Notepad++ Makefile Tab Settings

If you’re also interested in configuring Notepad++ to use unix-style line endings, select Unix Format from the EOL Conversion menu item of the Edit menu.

Notepad++ End of Line Configuration

 

Whew! That wraps it up.  Now your developers should be able to use make test in Windows, Linux and Mac environments.  One less difference to remember between platforms means one more boost to developer productivity.

We hope this is helpful to you.  Feel free to leave comments if this works for you, if it doesn’t, etc.

 

Outlook.com / Live.com Enabling Spammers?

Microsoft’s Outlook.com email site is all the rage this week.  With a clean, responsive interface, many are hailing it as a symbol of “the new Microsoft.”  Hope springs eternal.

I was disappointed, however, to see this error message after incorrectly typing a password:

Outlook.com's Wrong Password Error Message

Hmmm. Did I miss a change in the security world regarding email address privacy?  Confirming that an account exists tells anyone probing the login form – spammers included – which addresses are live.  Hopefully Microsoft will remedy this situation quickly and use the typical (and more private) approach – “That email address and password do not match our records.”

 

Windows 8 Installation Fails in VirtualBox

We wanted to tinker with the Windows 8 Release Preview, so we tried to install it on our tinkering box.  Having two tinkering boxes, we started with the trashable one — a Dell Dimension from mid-2005.  Admittedly, we attempted this installation to see if “it will set the drive on fire.”  We tend to throw crazy things at this Dimension, but it keeps on tickin’!  Its specs are:

  • Hardware:
    • Dell Dimension 4700
    • CPU: Pentium 4 @ 3.0 GHz
    • RAM: 4 GB (physical)
    • Disk: 1 TB; > 750 GB free space
  • Software:
    • Arch Linux; Kernel: 3.4.6-1-ARCH #1 SMP PREEMPT
    • VirtualBox 4.1.18_OSE r78361

We created a new VM and hooked it up to the 32-bit ISO for the Windows 8 Release Preview.  After starting the VM, VirtualBox presents an error dialog stating:

VT-x/AMD-V hardware acceleration is not available on your system. Certain guests (e.g. OS/2 and QNX) require this feature and will fail to boot without it.

This dialog gives the user a choice of closing the VM or continuing.  It’s the tinker box, so we chose to continue!  Next the Windows logo appeared, so we got excited that this just might work.  But, after just a couple of minutes, the installer died and gave this message:

Your PC needs to restart.
Please hold down the power button.
Error Code: 0x0000005D
Parameters:
0x030F0304
0x756E6547
0x49656E69
0x6C65746E

Well, that’s clearly a bad result.  We had no interest in even attempting to push past this kind of problem.  Next up: installing Windows 8 on our tinkering server (Windows 2008 R2, Hyper-V).

After some discussion, we decided that this configuration probably should not work.  (Error code 0x0000005D indicates an unsupported processor; Windows 8 requires CPU features, such as NX/XD, that this aging Pentium 4 — at least as exposed through VirtualBox without VT-x — does not provide.)  So, chalk this up to “Yep, we have demonstrated that what should not work actually does not work.”  Hopefully no one else will be tempted to try Win8 on this kind of config, but maybe this will save them some time if they do.