Patterns for Pushing Data to the Cloud

Let's explore and evaluate some mechanistic patterns for pushing data to the Cloud. Cloud providers such as Azure offer scalable solutions for storing massive amounts of structured and unstructured data, and successful companies are centralizing their data for enterprise-wide processing and analysis. One common scenario is the manufacturing industry, where companies use this data to benchmark performance and even predict when equipment will fail. Technologies such as Hadoop are gaining popularity as companies see the value in data aggregation and analysis.

The Scenario


In this scenario, we have hundreds or even thousands of clients, located behind corporate firewalls, and they all need to push data frequently to a centralized location. This data is fairly generic, and includes data gathered from equipment, various meters, and even data from business/ERP systems.

I'm using Windows Azure as the cloud provider, and I'm using a Windows Service as the proxy for getting data to Azure. These principles remain unchanged for other cloud providers and other operating systems. You could just as easily use Node.js and Amazon. This information is less relevant for systems that are inherently updatable such as mobile apps that use an online store as a distribution mechanism (Windows 8 apps for example).

Solution #1: Direct Push


Our first option is to push data directly into the Azure services. This could be a storage mechanism such as SQL Azure, table storage, or blob storage. This simple approach has a number of advantages:

  • Fast and easy to write the code
  • Azure endpoints are already designed to automatically scale to limits that you're not likely to reach

However, there are a number of significant disadvantages to this approach as well:

  • In the rare case where an Azure API changes, the clients will most likely break
  • There are limited options for compressing the data
  • The Azure API is optimized to be flexible and generic, not business domain driven
  • Portability to other cloud providers and/or other architectures is limited
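To make the direct approach concrete, here's a minimal sketch of a client writing a meter reading straight into table storage with the classic .NET storage library. The entity shape, table name, and key scheme are my own illustrative choices, not a prescribed design:

    using System;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;

    // Illustrative entity: partition by meter, order rows by timestamp.
    public class MeterReading : TableEntity
    {
        public MeterReading() { }
        public MeterReading(string meterId, DateTime timestampUtc)
        {
            PartitionKey = meterId;
            RowKey = timestampUtc.Ticks.ToString("D19");
        }
        public double Value { get; set; }
    }

    public static void PushReading(string connectionString)
    {
        var account = CloudStorageAccount.Parse(connectionString);
        var table = account.CreateCloudTableClient().GetTableReference("readings");
        table.CreateIfNotExists(); // idempotent; fine to call on every run
        table.Execute(TableOperation.Insert(
            new MeterReading("meter-42", DateTime.UtcNow) { Value = 18.6 }));
    }

Notice how the client is coupled directly to the storage SDK and schema; that coupling is exactly the portability problem listed above.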

Solution #2: Push to Business API


The easiest way to overcome the disadvantages of the direct push approach is to create an endpoint in Azure to receive the data and relay it to the backend services. This gives us 100% control over the communication channel, at least during development.

Our API methods can be domain driven. Examples:

[PUT] /api/meter_reading/{reading}/?timestamp={timestamp}
[PUT] /api/equipment/{machine}/?name={machine_name}&type={machine_type}&class={machine_class}
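Here's a rough sketch of what the receiving side of the first endpoint might look like as an ASP.NET Web API controller; the controller name, route mapping, and relay logic are illustrative assumptions:

    using System.Web.Http;

    public class MeterReadingController : ApiController
    {
        // Handles [PUT] /api/meter_reading/{reading}/?timestamp={timestamp}
        // (assumes a custom route maps {reading} to this parameter).
        public void Put(double reading, string timestamp)
        {
            // Validate, unbatch/decompress if needed, then relay the data
            // to the backend service of choice (table storage, SQL Azure, etc.).
        }
    }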

In addition to solving all of the limitations of the direct push scenario, this gives us some additional flexibility:

  • You can add optimization features that the Azure APIs don't support, such as custom batching/streaming and proprietary compression algorithms
  • You can control API version changes, including supporting multiple versions simultaneously
  • As Azure adds or changes features, you can continue to optimize the backend architecture
  • You can actually remain cloud provider agnostic by using a custom domain that you own

You do lose the inherent auto-scaling that the Azure services provide, but you can auto-scale your API nodes (web sites, web roles) as needed. You can even scale across subscriptions using the Azure Traffic Manager.

Solution #3: Push to a Queue


There is a solution that combines the advantages of pushing to a business API with the advantages of pushing data directly. Instead of talking to APIs, you can use a queue service whose interface is much less likely to change. In fact, there are queuing standards available, such as AMQP (Advanced Message Queuing Protocol), that allow you to send messages using a standard protocol. Later, if you decide to switch providers, you just need to point your clients at a new queue address (which would require trivial code to automate). The Azure Service Bus supports the AMQP protocol today.

While this solution drastically minimizes the risk of change, there is still some risk in relying on continued support for a particular protocol.

The biggest advantage to this approach is the scalability you get without any management requirements. The current scalability target is 2,000 messages per second per queue, but you can scale further just by increasing the number of queues. You simply push messages to the cloud, and build a message processor that interprets the messages and sends them to the appropriate cloud services. Multiple versions of messages can be supported, and the backend processing can be updated in one centralized location.
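For illustration, sending a message to an Azure Service Bus queue with the .NET client looks roughly like this; the queue name, payload format, and version property are choices I'm assuming, not requirements:

    using Microsoft.ServiceBus.Messaging;

    // Connection string comes from the Azure portal; queue name is illustrative.
    var client = QueueClient.CreateFromConnectionString(connectionString, "telemetry");
    var message = new BrokeredMessage("{\"meter\":\"42\",\"value\":18.6}");
    message.Properties["Version"] = 1; // lets the processor support multiple message versions
    client.Send(message);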

Solution #4: Updatable Upload Code

In all of the previous strategies, the weakest link has been the process running behind the firewall. Without an automated update system, we end up supporting legacy systems indefinitely. Combining legacy systems with 3rd party services outside of our control is going to become an issue at some point.

There are far too many auto-update mechanisms available to cover here, but I want to share one design that is fairly easy to implement if you use dependency injection to decouple interfaces from implementations. In my case, I'm using Unity with automatic, convention-based wire-up.

This is the typical wire-up code that you would use in Unity:

    Container.RegisterTypes(
        AllClasses.FromAssembliesInBasePath(),
        WithMappings.FromMatchingInterface,
        WithName.Default,
        WithLifetime.ContainerControlled);

Notice that we're able to specify the assemblies to examine, and in this case it's all of the assemblies in the base path.

If we call RegisterTypes again, but this time specify a downloaded assembly (or multiple assemblies), we can overwrite the existing implementations. (The strategy for scheduling these updates is outside the scope of this post.)

    var newStuff = System.Reflection.Assembly.LoadFrom("NewStuff.dll");

    Container.RegisterTypes(
        AllClasses.FromAssemblies(newStuff),
        WithMappings.FromMatchingInterface,
        WithName.Default,
        WithLifetime.ContainerControlled,
        overwriteExistingMappings: true);

This instructs Unity to load all of the types that it can from the new assembly. Any implementations in the new assembly will override the existing implementations; the update can be as small as a single class, or it can replace every existing class. Keep in mind that to use a new implementation, the IoC container must be queried again to get the new version. Ideally, the process should restart to ensure that all of the components have up-to-date implementations.
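For example, assuming a hypothetical IDataUploader interface that was mapped by convention at startup, querying the container again picks up the downloaded implementation:

    // IDataUploader is a hypothetical application interface, used only
    // to illustrate re-resolving after the overwrite.
    var uploader = Container.Resolve<IDataUploader>();
    uploader.Upload(pendingData); // now executes the code from NewStuff.dll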


The Sad State of Mobile Password Security

Just when password security began to improve, mobile computing became ubiquitous and redefined the problem we thought we had solved. This post is long overdue, and I would like to explore the issues facing passwords and security today. I'm certainly not a security expert, but I believe these are issues that face everyone.


Mobile passwords are terrible, but first we need to talk about how we got here.

Poor Passwords on the Desktop

First, let's examine where we came from. Passwords on the desktop were terrible in the early days of computing. People would often use the name of one of their children, the name of their dog, an object in the room, or whatever happened to pop into their mind. Thankfully, the rise of the Internet and the fear of hackers caused most people to reevaluate their passwords. A number or an extra word made everyone feel better about their password.

When you log on to your Windows computer, it runs your password through a one-way formula to generate a result known as a hash, and compares the result to the data stored in a file called the "SAM" file. This allows the computer to verify your password without actually storing it. It's a simple and elegant solution.

In the early 2000's, I was setting up PC computer images, and found myself in a situation where I needed to recover the administrator password on a Windows machine. I downloaded a brute force password cracking tool and let it get to work on the SAM file from the computer. The first step for these programs is to hash every word in the dictionary and compare the results to the values in the SAM file (a "dictionary attack"). Surprisingly, it found multiple password matches within seconds.
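To see why this is so fast, here's a simplified sketch of the idea; SHA-256 and Base64-encoded files stand in for the actual formats (the SAM stores unsalted NTLM hashes):

    using System;
    using System.IO;
    using System.Linq;
    using System.Security.Cryptography;
    using System.Text;

    // Hash every dictionary word and compare against the stolen hash list.
    // With no salt, one pass over the dictionary tests every account at once.
    var stolenHashes = File.ReadAllLines("hashes.txt")
        .Select(Convert.FromBase64String)
        .ToList();

    using (var sha = SHA256.Create())
    {
        foreach (var word in File.ReadLines("dictionary.txt"))
        {
            var candidate = sha.ComputeHash(Encoding.UTF8.GetBytes(word));
            if (stolenHashes.Any(h => h.SequenceEqual(candidate)))
                Console.WriteLine("Cracked: " + word);
        }
    }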

I was the network administrator at the time, and I decided to run that same software against our central password directory. I was astonished when a third of the passwords were cracked within seconds; they were words like "digital" and "crayon". Other passwords that were dictionary words with numbers or symbols as a prefix or suffix took only minutes to crack. My view on passwords changed completely.

Password Patterns

In reaction to bad passwords, the industry decided to start defining password complexity requirements. Here are the default complexity requirements for strong passwords for a Windows Active Directory system:

Passwords must contain characters from three of the following five categories:

  • Uppercase characters of European languages (A through Z, with diacritic marks, Greek and Cyrillic characters)
  • Lowercase characters of European languages (a through z, sharp-s, with diacritic marks, Greek and Cyrillic characters)
  • Base 10 digits (0 through 9)
  • Nonalphanumeric characters: ~!@#$%^&*_-+=`|\(){}[]:;"'<>,.?/
  • Any Unicode character that is categorized as an alphabetic character but is not uppercase or lowercase. This includes Unicode characters from Asian languages.
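Translated into code, the rule is just a count across categories. This is a simplified sketch; real Active Directory enforcement also rejects passwords that contain the account name:

    using System;
    using System.Linq;

    // Returns true if the password hits at least three of the five categories.
    static bool MeetsComplexityRule(string password)
    {
        var categories = new Func<char, bool>[]
        {
            char.IsUpper,                                   // uppercase letters
            char.IsLower,                                   // lowercase letters
            char.IsDigit,                                   // base-10 digits
            c => char.IsPunctuation(c) || char.IsSymbol(c), // nonalphanumeric
            c => char.IsLetter(c) && !char.IsUpper(c)
                 && !char.IsLower(c)                        // other alphabetic scripts
        };
        return categories.Count(category => password.Any(category)) >= 3;
    }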

Where most see good passwords, I see a pattern: users will pick the simplest password that meets the given requirements. What we end up with is a guidebook to cracking the very passwords we're trying to secure! We end up with uncreative passwords that look random, but are seriously flawed.

Forced Password Changes

This security hole is a favorite of many. Every public company in the US is governed by Sarbanes-Oxley (SOX) rules, which were a reaction to the bad business practices of Enron. During SOX audits, you will be scolded in your SOX report if you allow users to keep the same password for more than 90 days. Not only do we have absurd password complexity rules that create passwords that are already hard enough to remember, but now we have to memorize a new password every 3 months! Everyone is familiar with the behavior this brings along with it: passwords on Post-It notes.

Password Managers

A few of us use password managers. There are several variants with different technology choices such as hash-based, encrypted, web-based, etc. The idea is that we use a single, strong password that allows us to open our password vault, and once inside, we have complete access to strong passwords. These passwords are all different, and randomly generated. The idea is that if one site is compromised, there is no concern that a hacker can use the same password on another site. We're able to compensate for poor site security, and make our password too costly to crack.
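The generation side is conceptually simple. Here's a minimal sketch of the kind of generator these tools use; the alphabet and length are arbitrary choices, and the slight modulo bias is ignored for brevity:

    using System.Security.Cryptography;

    // Produce a random password from a cryptographically secure source.
    static string GeneratePassword(int length)
    {
        const string alphabet =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*";
        var bytes = new byte[length];
        using (var rng = new RNGCryptoServiceProvider())
            rng.GetBytes(bytes);
        var chars = new char[length];
        for (int i = 0; i < length; i++)
            chars[i] = alphabet[bytes[i] % alphabet.Length]; // small bias, fine for a sketch
        return new string(chars);
    }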

The best part of using a password manager is the integration. Once you install the plug-in to your browser of choice, most of them will actually automatically enter your user credentials for the sites you visit. We end up with the best of both worlds, a more streamlined experience with greater security.

Mobile Passwords

How secure is your phone? In many ways, it's the most secure device you have. It's typically always in your possession, and it's likely to have a passcode. It's also likely that you, and possibly your IT department have a way to remotely wipe the device in the event that it's stolen. Phone OS vendors across the board have made this assumption, and I think that we all generally agree.

Once you unlock the phone in your possession, you "own" the device. This is similar to multi-factor authentication systems, which enforce security by having something and knowing something. When you unlock your phone, you have full access to email, settings, messages, calling, etc. Someone that has access to your email has the skeleton key to all of your online accounts.

Most third-party app vendors have taken the worst possible approach to passwords. The major banking and finance apps such as Mint, Citibank, and Fidelity do not have the option to save your password. If you can't save your password, and users want to use your app, what will they do?

  1. Generate a secure password and type in 20 characters each time they open your app? NOPE.
  2. Think of a unique secure password and memorize it? NOPE.
  3. Chuck Testa? NOPE.
  4. Delete your app? YEP.
  5. Use an insecure password? YEP.

This app design has to stop. This pattern promotes poor passwords. The biggest irony is that someone with your phone can reset your password via email and get into your account anyway!

Better Mobile Options

As a simple solution, allow me, your customer, your user, to save my password if I wish (securely, of course). Please don't make me choose between security and your app.

If you're a mobile OS developer, please give us developers a better way. Here are some options that I'll offer for free:

  • Provide a mobile password manager.
  • Implement multiple layers of security. For example, after I unlock my phone, I should be able to designate certain applications as "high-security". Give me the option to re-enter my PIN code, or use a separate PIN code for high-security apps.
  • Provide an API to allow password managers to integrate with apps and your browser. It would rock my world if I could sign in to LastPass, and it would automatically authenticate all of my apps with secure passwords.
  • Provide an OS setting that allows saving of passwords.


Dead Simple Web Deployment Process

Setting up even a basic website with any server-side logic used to be difficult and time consuming. You needed a place to store the code (hopefully versioned), a build process, a deployment mechanism, and a hosting provider. Not to mention all of the setup that needed to occur before that.

I'm going to show you a dead simple process for getting started with a simple workflow that is free to start, and nearly free to sustain.

Free PRIVATE Git Hosting

If your project is public, you can use a service like GitHub. There are, however, many who want to keep their source code private. In this case, we turn to Fog Creek and their Kiln product.

Note: TFS is a more comprehensive solution, and has a free tier for 5 users. I'm using Kiln in this example for simplicity, and to show third party interoperability.

Most people are not aware that they offer free bug tracking and unlimited source control, and they have a unique system called Kiln Harmony that allows you to seamlessly transition between Mercurial and Git.


Take a look at their pricing page, which confirms that as long as you are a startup or student and have 1 or 2 users, their service is completely free. If you have a company that only has 1 or 2 employees, I think it's safe to assume you're a startup.

Go ahead and sign-up, log in, and move on to the next step.

Push Your Code


Now, while logged in, click on the Kiln button to access the hosted source control product. Then, create a new repository (I'm not going to include a lot of detail here).


Once you have set up your repository, clone it locally using the provided URL, and add your source code:

`git clone http://URL_SUPPLIED_BY_KILN`
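From there, add and push your code with the standard Git cycle (the commit message is just an example):

    git add .
    git commit -m "Initial import"
    git push origin master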

I like to use the GitHub for Windows client, because it provides a great history viewer, an easy way to see pending changes, and a simple commit interface. You can use it without using the GitHub website. However, if you DO want to use GitHub as well, it can automatically pull down your GitHub project list.

Adding local projects to GitHub for Windows is a breeze. Simply drag the project folder you cloned into the window and you'll see it activate.


Once you have a repository added to GitHub for Windows, the context menu provides an easy way to view the folder in Explorer, or even open a shell to use command-line functionality. Double-clicking on the project will show revision history, pending changes with a commit dialog, and a sync button for the online repo (which works for any Git remote, not just GitHub).


Set up Azure

Set up an Azure account. If you have an MSDN account, you'll get free Azure time for dev/test. For production use, go ahead and get a free trial. Don't worry, your website will be inexpensive, possibly even free to host. Azure offers a free hosting tier for websites, and a shared tier for only $9.68 as of the time of this writing.

Once you log in, familiarize yourself with the interface. To add an additional hosted service, use the NEW button at the bottom of the portal to bring up the service creation dialog and choose Compute -> Web Site -> Custom Create.


Creating a web site only requires a name. We'll also be checking the "Publish from source control" checkbox to create an Azure Git repository. We'll configure Kiln to push changes to this repository later.


To enable automated publishing, we need to tell our Azure website where our source code is located. There are options for TFS, GitHub, even Dropbox. Select the "Local Git repository" option.


On the right side, you'll want to choose the "Reset your deployment credentials" option. This will enable you to configure the username and password for your Git repository.


Now head over to the "Deployments" tab. This screen contains two key elements:

  • The Git URL that, when pushed to, will publish/deploy our project for us.
  • A nearly real-time list of deployments and any success/error messages.



Now, head back over to your Kiln repository, and click "Settings", and then "Add, remove..." under the Web Hooks section.


Now, fill in the publishing information:

  1. Friendly Rule Name (whatever helps you identify it).
  2. Choose Azure.
  3. The Git URL copied from the website in the Azure portal. Notice that it contains your username from your deployment credentials.
  4. The password you supplied for the deployment credentials in an earlier step.
  5. Select the repository that will be deployed.


Wasn't That Easy?

From this point on, every time you push your commits from your local repository, Kiln will automatically push your commits to the Azure repository and deploy your site. If you have problems, Kiln now supplies a web hook log on the hook configuration page, which will confirm if a call to Azure was made. The Azure deployments dialog will show the deployments and the corresponding log files.

Additional Options

The goal of this post was to demonstrate only one of the many ways that you can quickly and easily go from code to website with minimal time and effort. There are some additional variations that you will likely want to investigate:

  • If you want more fine-grained control over deployments, you can configure Kiln to push a branch. You can then selectively push to that branch when you want a new deployment.
  • Instead of using a Kiln hook, you can simply set up an alternative Git endpoint in Azure, and push directly from the Git command line when you want to deploy, as shown in the sketch after this list.
  • For additional deployment configuration, such as specifying the project to deploy, Project Kudu is the deployment mechanism; its GitHub site has additional deployment customization options.
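For that direct-push option, the commands would look something like this (the remote URL is an illustrative placeholder; use the Git URL from the Azure portal's Deployments tab):

    git remote add azure https://username@yoursite.scm.azurewebsites.net/yoursite.git
    git push azure master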

