.NET Development Perf Testing in a Cloud VM (EC2)

If you haven't heard, Amazon's EC2 service provides cloud-hosted virtual machines. Initially it supported only Linux machine images, but it recently added support for Windows images as well. This means you can create on-demand hosted virtual machines that are accessible from anywhere.


I decided to do some simple, informal performance testing. To gauge development performance, I like to run a build process and time it, since compiling is typically the bottleneck on a development machine (aside from the IDE and the developer).

I downloaded the source code for SharpDevelop, since I knew it had a fairly large, fully automated build. The only thing I needed to install was .NET 3.5 SP1. As a baseline, I ran the build on my personal laptop, which has these specs: 2.0GHz Core 2 Duo, 3GB RAM, 250GB 5400RPM hard drive. To test build performance, I ran the build once, ran a "clean" operation, then ran the build a second time, timing only the second run.
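
If you want to reproduce this kind of test, a timed rebuild can be scripted in just a few lines. Here's a minimal sketch, assuming MSBuild is on the PATH; the solution path is only a placeholder, not the actual SharpDevelop build script:

```csharp
using System;
using System.Diagnostics;

class TimedBuild
{
    // Runs a command, waits for it to finish, and returns the elapsed wall-clock time.
    static TimeSpan Run(string fileName, string arguments)
    {
        var stopwatch = Stopwatch.StartNew();
        using (var process = Process.Start(new ProcessStartInfo
        {
            FileName = fileName,
            Arguments = arguments,
            UseShellExecute = false
        }))
        {
            process.WaitForExit();
        }
        return stopwatch.Elapsed;
    }

    static void Main()
    {
        // Placeholder path - point this at whatever solution you want to benchmark.
        const string solution = @"C:\src\SharpDevelop\SharpDevelop.sln";

        Run("msbuild", solution);                 // first build (ignored, warms the caches)
        Run("msbuild", solution + " /t:Clean");   // clean
        var elapsed = Run("msbuild", solution);   // the timed rebuild

        Console.WriteLine("Rebuild took {0}", elapsed);
    }
}
```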

  • My laptop: 1 minute, 37 seconds
  • EC2 Small Instance: 2 minutes
  • EC2 Medium Instance: 41 seconds

As you can see, the EC2 "medium" instance was over twice as fast as my laptop.

To continue my testing, I installed Visual Studio 2008 Professional, ReSharper, TortoiseSVN, and the Silverlight toolkit. My initial impression was very positive, and I could certainly see myself using it on a regular basis. From a professional standpoint, I would probably prefer a dedicated development machine. However, for an occasional hobby development environment, this might be a viable alternative.

EC2 has many advantages over running VMware or Virtual PC on your own computer:

  • Can take snapshots of drives
  • Doesn't use resources from your computer
  • CPU can be upgraded/downgraded as needed
  • Theoretically ultra-stable host
  • Very fast Internet connection (I downloaded 800MB in less than 30 seconds!)
  • Theoretically updated virtual hardware as time goes on
  • Potentially faster (especially if you use a laptop)

However, there are a few obvious disadvantages:

  • Pay-per-hour can get expensive if you use it full-time
  • Can't drag and drop files in and out of the VM like you can with desktop virtualization
  • You have to connect with something like Remote Desktop, so graphics performance isn't the best
  • Only available when you have access to the Internet
  • Not necessarily great multi-monitor support
  • Virtual machines take a while to start and snapshot


Right now, Windows-based machines are priced starting at $0.15/hr ($0.30/hr for a medium instance). For a machine that runs 24/7, this can get expensive compared to dedicated hosting. However, for a machine that's used for only a couple of hours each day, the pricing is very reasonable.

As an example, if you run a medium instance machine for 8 hours/day, 20 business days/month, you'll end up paying $48/month.
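
Spelled out, that estimate is just simple multiplication. Here it is as a quick snippet, using the same assumed rate and usage as above:

```csharp
// Rough monthly cost estimate for an on-demand Windows instance.
const decimal hourlyRate = 0.30m;   // medium instance rate
const int hoursPerDay = 8;          // a typical workday
const int daysPerMonth = 20;        // business days only

decimal monthlyCost = hourlyRate * hoursPerDay * daysPerMonth;
System.Console.WriteLine("Estimated cost: ${0}/month", monthlyCost);   // $48.00/month
```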


Having virtual, dedicated computers available on-demand for pennies per hour is very exciting. This is half of the cloud computing equation, and I believe it's going to be an important part of the future of the web.


Cloud Computing (and Azure) - Right for your site?

Everyone seems to be getting excited lately about the prospect of cloud computing. Just like many others, I get excited by the idea that I wouldn't have to worry about adding servers to scale up. Theoretically, someone could make the next YouTube, in their basement, for free. However, there is one huge advantage that most people ignore: cloud computing is also perfect for a small-scale website.


I've tried or considered many different ways of hosting my content:

  • Shared hosting - Cheaply host your sites, but be at the mercy of the provider's IT guy messing with your server and rebooting it for automatic updates. Also, in my experience, the performance is terrible if your traffic spikes. They typically have hundreds of users on the same server as you, and you all get to compete for resources.
  • Dedicated hosting - This is what I use now, because it ensures that I get the full performance of a machine. The disadvantage is that I have a single point of failure, and I have to manage the machine myself.
  • Hosting from home - Yes, people actually do this. If you have a high enough upload speed it shouldn't be too bad. The problem is that your connection typically won't be able to handle traffic spikes. You'll also potentially be a victim of power or Internet outages, where professional hosts would have redundant systems in place (in theory).

Now, let's talk about cloud computing. That magical cloud that many don't understand. There are two potentially viable cloud computing methods available right now:

  • Cloud virtual machines - Amazon's EC2 solution is probably the most popular in this category. Basically, you can create, start, and stop virtual machines remotely. You just pay an hourly rate while the computer is running. You can even upgrade and downgrade the hardware as needed. The advantage is that you can treat the computer like a physical machine and configure and use it however you like. The disadvantage is that maintaining individual machines can be time consuming and is not necessarily part of your core business.
  • Cloud application server - Instead of creating virtual machines, a cloud application server runs your application directly. You no longer worry about the constraints of a physical machine. Your application could potentially be run on dozens or hundreds of servers simultaneously. The major advantage is that there is little to no maintenance, because that is the job of the provider.

I see the cloud application server as having the greatest advantages. You're free to write your application at a higher level of abstraction, which lets you focus on the problems you really want to solve.

One of the best-known cloud application services is the Google App Engine, which currently supports Python applications. Microsoft recently joined the game with Azure for ASP.NET.

As I mentioned, not only do application servers let your applications scale up, they also let you pay only for what you use. This is great for the small to medium websites that are stuck with bad shared hosting or difficult-to-manage dedicated hosting. The fact is that most sites get a few hundred visitors daily or less. If you start to think about how often a page is actually requested, you'll realize that it's not very much. Even 500 users requesting 5 pages each over a 12-hour period works out to only about 200 requests per hour (roughly one every 17 seconds), which a very low-end server from years ago could easily handle.

The reason that application servers are so much more efficient than shared hosting is that they're built from the ground up to spread the load around. This results in higher overall utilization, yet more headroom for any single application. Shared hosting providers can move users between servers, but it's usually a manual, and often difficult, process. You're bound to a specific physical machine (unless it's VPS hosting), and if it goes down, so does your site.

Cloud computing is also a great way to handle traffic spikes such as the Digg effect. Let's say that you only have 500 visitors today, but might get 10, 100, or 1000 times more in a single day. It happened to FaceStat. They went from 10,000 page views per day to almost a million because of a story on the front page of Yahoo. They had to scramble to add application servers and develop a scaling strategy immediately.

Conclusion - Cloud Application Server Benefits

Cloud computing has tremendous benefits. You no longer have to worry about scaling the underlying hardware, you simply pay as you go, and you can handle traffic spikes with ease. Once cloud computing becomes mainstream and absolutely reliable, there will be few reasons not to use it.


RIP - Lessons Learned

In my last post, I started talking about two sites that I'm shutting down. Today I'm talking about the history, and the lessons I learned.

RankTrend was an idea and a vision that I had while we were working on another project. It was designed to help people who were trying to optimize their sites for search engines, but didn't really have any insight into what was actually working. Here were the goals:

  • Track metrics about your site including your current rank in search engines, PageRank, AdSense income, advertising costs, visitors, etc. This data would be tracked daily, or even more often depending on the type of data.
  • Track metrics using a client application the user installs, so that the requests come from their computer and not our server.
  • Import any other metrics from other services such as AWStats.
  • Track events so that changes in the tracked metrics could possibly be attributed to those events.
  • Data-mining would allow us to transform the data into meaningful information using charts, statistical analysis reports, and correlation diagrams. You could actually answer a question like "Which search engine can turn my advertising dollars into the most profit?".
  • Provide notifications when certain thresholds were met. For example, I want to know when my PageRank changes, or when my site drops in the search engines.

We were successful in building a generic system that met most of the goals. From the beginning, the system was set up to be extremely generic so that any type of data could be tracked and stored.
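
To give a rough idea of what "generic" means here, the core of the data model boiled down to something like the following. This is purely an illustrative sketch, not the actual schema:

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: a metric is just a named series of timestamped values,
// so search rank, PageRank, AdSense income, visitor counts, etc. all share one shape.
public class Metric
{
    public string SiteUrl { get; set; }     // the site being tracked
    public string Name { get; set; }        // e.g. "Google rank for 'widgets'"
    public string Unit { get; set; }        // e.g. "position", "USD", "visits"
    public List<DataPoint> Points { get; } = new List<DataPoint>();
}

public class DataPoint
{
    public DateTime Timestamp { get; set; }
    public double Value { get; set; }
}

// Events (a blog post, a site redesign) are stored alongside the metrics
// so that changes in the data can be attributed to them.
public class TrackedEvent
{
    public DateTime Timestamp { get; set; }
    public string Description { get; set; }
}
```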

Here is a screenshot of the correlation diagram. This is kind of an extreme example, but it shows which items were correlated with other items. The wider the line, the stronger the correlation.

This is a neat report because of the algorithms it uses. The standard correlation formula is used to determine the correlation coefficient between each pair of metrics. For laying out the diagram, it uses a force-based algorithm that simulates the edges as springs (Hooke's law) and the nodes as electrically charged particles (Coulomb's law). It was a very fun project.
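
In code, those two pieces look roughly like the following. This is a simplified sketch of the idea, not the production implementation:

```csharp
using System;
using System.Linq;

static class CorrelationDiagram
{
    // Standard (Pearson) correlation coefficient between two equally sized series.
    public static double Correlation(double[] x, double[] y)
    {
        double meanX = x.Average(), meanY = y.Average();
        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < x.Length; i++)
        {
            cov += (x[i] - meanX) * (y[i] - meanY);
            varX += (x[i] - meanX) * (x[i] - meanX);
            varY += (y[i] - meanY) * (y[i] - meanY);
        }
        return cov / Math.Sqrt(varX * varY);
    }

    // One iteration of a force-directed layout: every pair of nodes repels
    // (Coulomb), and every pair is also joined by a spring (Hooke) whose rest
    // length comes from the correlation (stronger correlation = shorter spring).
    public static void Step(double[,] pos, double[,] restLength,
                            double repulsion = 1000, double stiffness = 0.05,
                            double damping = 0.1)
    {
        int n = pos.GetLength(0);
        for (int i = 0; i < n; i++)
        {
            double fx = 0, fy = 0;
            for (int j = 0; j < n; j++)
            {
                if (i == j) continue;
                double dx = pos[i, 0] - pos[j, 0];
                double dy = pos[i, 1] - pos[j, 1];
                double dist = Math.Max(Math.Sqrt(dx * dx + dy * dy), 0.01);

                // Coulomb-style repulsion pushes node i away from node j.
                fx += repulsion * dx / (dist * dist * dist);
                fy += repulsion * dy / (dist * dist * dist);

                // Hooke-style spring pulls the pair toward its rest length.
                double stretch = dist - restLength[i, j];
                fx -= stiffness * stretch * dx / dist;
                fy -= stiffness * stretch * dy / dist;
            }
            pos[i, 0] += damping * fx;
            pos[i, 1] += damping * fy;
        }
    }
}
```

Run the Step method a few hundred times and the nodes settle into a layout where strongly correlated metrics cluster together.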


Here is a screenshot of the main report. As you can see, there are vertical lines that represent events (in this case they were actually blog posts). From this data, you're able to see the effect of a blog post on your visitors and other metrics. The smooth, wavy lines even out the day-to-day changes in the data.
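
That kind of smoothing can be as simple as a moving average over the daily values; here's a minimal sketch:

```csharp
using System;
using System.Linq;

static class Smoothing
{
    // Centered moving average: each point becomes the mean of the values
    // within `window` days around it, which evens out daily noise.
    public static double[] MovingAverage(double[] daily, int window)
    {
        var smoothed = new double[daily.Length];
        for (int i = 0; i < daily.Length; i++)
        {
            int start = Math.Max(0, i - window / 2);
            int end = Math.Min(daily.Length - 1, i + window / 2);
            smoothed[i] = daily.Skip(start).Take(end - start + 1).Average();
        }
        return smoothed;
    }
}
```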


Here is a screenshot of the thumbnail dashboard. This report provided a quick way to glance over every metric you were tracking for a site. The background of each thumbnail was color-coded according to the trend of the data: if a metric was improving, the background was a darker green based on the amount of improvement, while red indicated that the value was getting worse.

The charts were generated with the Google Chart API, which provided a great way to produce a lot of charts very quickly without adding any load to our server.
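
The Google Chart API works by encoding the chart type, size, and data into a URL that returns a chart image, so you can simply drop it into an img tag. Here's a rough sketch of building one of those URLs; the parameter values are just examples:

```csharp
using System;
using System.Linq;

static class ChartUrl
{
    // Builds an image URL for the Google Chart API.
    // Values are scaled into the 0-100 range that the simple text encoding expects.
    public static string LineChart(double[] values, string title)
    {
        double max = values.Max();
        var scaled = values.Select(v => Math.Round(v / max * 100, 1));

        return "http://chart.apis.google.com/chart" +
               "?cht=lc" +                               // line chart
               "&chs=300x120" +                          // width x height in pixels
               "&chtt=" + Uri.EscapeDataString(title) +  // chart title
               "&chd=t:" + string.Join(",", scaled);     // text-encoded data
    }
}
```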


Once you got past the initial setup and started getting data, the service was admittedly very awesome. Many of the features should be adopted by Google Analytics, and some already have been rolled out in their latest update. The biggest problem is that Google doesn't integrate the position of your keywords with the other data.

Lessons Learned

This is the important part, because it's all that remains from the site. I learned an incredible amount while building this site, so I'll share it with you.

The main reason we had another failed site (depending on your definition of "failure") was that we were not able to reach critical mass. We simply didn't have an army of followers that we could use as our initial beta testers. If you're starting a new website, this should be your top concern! The best site will go nowhere if nobody knows about it.

  • A simple UI usually means more code. When designing how the user would configure their data sources, I designed it so that it made the most sense from a technical standpoint. However, it didn't make sense from a user standpoint. If I had to design it again, I would make it much simpler for the user, even though it would take a lot more code. Nobody said good design was easy.
  • Simplicity wins. The site had a competitor that offered only a fraction of the features, and even had users pleading for new ones. Even though we seemingly filled the need, the users didn't come over in a mass exodus as expected.
  • Don't do more than you need; focus on the core design. The first iteration of the site had over a dozen options on the control panel for logged-in users. In hindsight, they must have been very confused. This is part of keeping your UI simple, but it's also a matter of making the common parts easy to find and keeping the advanced features hidden. The power users will usually find them.
  • Don't compete with the big guys. Our site came out at a time when Google Analytics was still in beta. Guess who you saw in the headlines? I'm still frustrated to this day about some of the features Google Analytics is blatantly missing, but they have nothing to worry about. When you're giving away a "good enough" product for free, it takes a miracle to compete with that. If we had been able to find a hidden niche, maybe we could have gotten the ball rolling, but that just wasn't the case.


This was another fun project that I will never regret working on. It was my first project using NHibernate, so it was a great learning experience. It also gave me a chance to try out the Google Chart API, as well as a third-party charting control called ChartDirector. This is a tough one to let go of, but I want to make sure I do a better job of focusing my time instead of diluting it across many projects.

