
Sending Real-Time Sensor Data to Clients Using SignalR

You don't have to go very far into the past to find a time when pushing data from a server to a client was a huge pain. Sure, you could use tricks like polling or long-polling, but they were difficult and error-prone. Fortunately, we now have technologies that make it drop-dead simple, without the complexity or the performance penalty.

If you're using a technology like Node.js, then something like Socket.io or Faye is for you. They're not the focus of this post, but they are both extremely easy to use.

Instead, I'll be focusing on an ASP.NET MVC application that will send data in near real-time to a web client.

SignalR

SignalR is based on the concept of hubs. You create hubs in your .NET application (ASP.NET/MVC/etc). Here is what a simple hub can look like:

public class ChatHub : Hub
{
    // Invoked by clients; relays the message by calling the client-side
    // broadcastMessage function on every connected client.
    public void Send(string name, string message)
    {
        Clients.All.broadcastMessage(name, message);
    }
}

If you haven't used SignalR, be sure to familiarize yourself with the example above before moving on. A client calls the Send method on the server, and the server then calls broadcastMessage on all of the connected clients.

SignalR is awesome for a few reasons:

  • It makes our lives easier by hiding the complex process of negotiating how to connect to the server (WebSockets, Server-Sent Events, forever frame, or long polling).
  • SignalR is also extremely fast. It's hard to find benchmarks, but I've read reports of getting 10,000+ calls/sec with a single server.
  • It is possible to scale out to multiple servers.

The Scenario

In my manufacturing projects, I wanted an easy way for web clients to subscribe to real-time data streams.

Real-Time Data Feeds

On the left is a list of streams that the user can subscribe to, and the chart on the right plots those values as they arrive from the server.

The Server Implementation

The easiest way to wire up SignalR for our scenario is to use the Microsoft ASP.NET SignalR OWIN NuGet package. OWIN gives us an easy way to inject functionality into our server. To enable OWIN, we'll also need to pull in the Microsoft.Owin.Host.SystemWeb NuGet package.

Once those packages are installed, SignalR can be wired up simply by adding a file with the following contents:

using Microsoft.Owin;
using Owin;

[assembly: OwinStartup(typeof(MyNamespace.Startup))]

namespace MyNamespace
{
    public class Startup
    {
        public void Configuration(IAppBuilder app)
        {
            app.MapSignalR();
        }
    }
}

That's how easy it is.

Let's talk hubs. A hub is just a regular class that inherits from Hub. Keep in mind that hubs are transient: anything you store in an instance member will be gone on the next call. That's actually a good thing, since requests may span multiple servers. Any state should be stored outside of the hub.
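To make that concrete, here's a minimal sketch of my own (not code from this project, and assuming SignalR 2.x) that keeps connection-tracking state in a static collection outside the transient hub instances; on a web farm you'd move this into a shared store like a cache or database:

using System.Collections.Concurrent;
using System.Threading.Tasks;
using Microsoft.AspNet.SignalR;

public class ConnectionCountHub : Hub
{
    // Static, so it lives outside the transient hub instances. Fine on a
    // single server; a web farm would keep this in a shared store instead.
    private static readonly ConcurrentDictionary<string, byte> Connections =
        new ConcurrentDictionary<string, byte>();

    public override Task OnConnected()
    {
        Connections.TryAdd(Context.ConnectionId, 0);
        return base.OnConnected();
    }

    public override Task OnDisconnected(bool stopCalled)
    {
        byte unused;
        Connections.TryRemove(Context.ConnectionId, out unused);
        return base.OnDisconnected(stopCalled);
    }

    // Any hub instance (or other server code) can read the shared state.
    public int GetConnectionCount()
    {
        return Connections.Count;
    }
}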

Now, let's create our first Hub:

using Microsoft.AspNet.SignalR;
using Microsoft.AspNet.SignalR.Hubs;

namespace Manufacturing.Api.Hubs
{
    public class DatasourceRecord : Hub
    {
    }
}

It doesn't do anything yet, but it gives us a place to put our methods.

To allow clients to subscribe to the streams they're interested in, I'm going to use a feature in SignalR called Groups. Groups give us a way to fan out data to clients that belong to that group. Remember how I said that hubs can't store state? Groups do store the mapping between clients and group names.

Let's provide some methods for the client to register for streams they're interested in:

(Note: I call them Datasources)

    public const string GroupLabelPrefix = "Datasource_";

    public void Register(int datasourceId)
    {
        Groups.Add(Context.ConnectionId, GroupLabelPrefix + datasourceId);
    }

    public void Unregister(int datasourceId)
    {
        Groups.Remove(Context.ConnectionId, GroupLabelPrefix + datasourceId);
    }

Context is an inherited member that lets us look up the unique connection ID of the client making the request. By adding themselves to and removing themselves from groups, clients explicitly subscribe to just the streams they're interested in. Security could easily be layered into these methods as needed.
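For example, here's a hedged sketch of what an authorization check inside Register might look like; IsAuthorizedForDatasource is a hypothetical, application-specific check and not part of SignalR:

    public void Register(int datasourceId)
    {
        // Hypothetical check: only let authenticated, authorized users join
        // the group for this datasource.
        var user = Context.User;
        if (user == null || !user.Identity.IsAuthenticated ||
            !IsAuthorizedForDatasource(user.Identity.Name, datasourceId))
        {
            throw new HubException("Not authorized for datasource " + datasourceId);
        }

        Groups.Add(Context.ConnectionId, GroupLabelPrefix + datasourceId);
    }

Throwing a HubException sends the error back to the calling client instead of failing silently.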

Here is the method that pushes data to the subscribed clients:

public static void NewDataRecord(DataRecord record)
{
    var context = GlobalHost.ConnectionManager.GetHubContext<DatasourceRecord>();

    var groupName = GroupLabelPrefix + record.DatasourceId;
    var group = context.Clients.Group(groupName);
    if (group != null)
    {
        group.newRecord(record);
    }
}

Notice that this method is static; that was intentional. It lets us easily call into the hub from code elsewhere. In my case, I'm receiving data through Azure Event Hubs.
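For illustration only (this isn't the project's actual Event Hubs code), here's a hedged sketch of how any server-side component that receives readings could hand them to the hub; DataFeed, OnReadingReceived, and the Value/Timestamp properties on DataRecord are my own assumptions:

using System;

public class DataFeed
{
    // Hypothetical entry point: called by whatever receives new readings
    // (an Event Hubs processor, a timer, a message handler, etc.).
    public void OnReadingReceived(int datasourceId, double value)
    {
        var record = new DataRecord
        {
            DatasourceId = datasourceId,   // used by the hub to pick the group
            Value = value,                 // assumed property on DataRecord
            Timestamp = DateTime.UtcNow    // assumed property on DataRecord
        };

        // Fan the record out to every client registered for this datasource.
        DatasourceRecord.NewDataRecord(record);
    }
}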

The Client Implementation

For the web client, we need to pull in the SignalR client libraries. We can get these with the Microsoft.AspNet.SignalR.JS NuGet package.

Now add a reference to the client library in your HTML. We also need to reference a special script at signalr/hubs. This second reference is a dynamically generated client proxy based on the hub methods we defined on the server.

<script src="Scripts/jquery.signalR-2.2.0.min.js"></script>
<script src="signalr/hubs"></script>

Now we can reference our hub and handle server events. Note that the generated proxy camel-cases the hub and method names:

var dataHub = $.connection.datasourceRecord;

dataHub.client.newRecord = function (record) {
    console.log('Record from server: ' + JSON.stringify(record));
};

SignalR is handling all of the serialization/deserialization of the record, so we actually get back a JavaScript object, not a string.

It's important to subscribe to at least one client event before starting the connection; otherwise SignalR won't subscribe the client to the hub, and the handlers will never fire. Now, let's initiate the connection:

$.connection.hub.start().done(function () {
    console.log('Connected to SignalR hub')
}).fail(function (err) {
    console.log('Failed to connect to SignalR hub ' + err);
});

And when we want to call the server to subscribe to a stream:

dataHub.server.register(1);

CORS

CORS is a horrible, horrible pain that masquerades as security. If you want your SignalR hub hosted on a different server than the one serving your front end, you'll need to include the Microsoft.Owin.Cors NuGet package and add the following to your OWIN startup:

// Requires: using Microsoft.Owin.Cors;
app.Map("/signalr", map =>
{
    // Allow cross-origin requests on the SignalR endpoint only.
    map.UseCors(CorsOptions.AllowAll);
    var hubConfiguration = new HubConfiguration();
    map.RunSignalR(hubConfiguration);
});

More Information


How We Produce the MS Dev Show Podcast

We've gotten a lot of questions about how we get great sound on the MS Dev Show. From the start, Carl and I knew that great sound wouldn't make us successful, but bad sound could definitely hurt us.

A lot of podcasts have just... ended. It seems to happen somewhere between episode 20 and 100. I knew how important it was to keep the podcast's time commitment to an absolute minimum.

Today, I'm sharing our entire process.

Many podcasts won't talk about this. I'm not sure if it's too much inside baseball, or if they consider their process a trade secret. Well, I'm all about sharing.

At a high level, we find guests, prepare, record, edit, and publish.

Guests

Guests are a big part of our show. We've always wanted to have interesting guests. We occasionally have some names you've heard of, but we also love to have guests that haven't been on a podcast before.

I believe that everyone has a story, and we want to hear it.

Since guests are taking time out of their busy schedules for us, our goal is to make it as easy as possible, with a minimal time commitment. Once a guest accepts and we work out a time slot, we send them a template invite. Templates are absolutely key to our communications and allow us to be clear and concise. As we improve our process, we evolve our templates.

Email Template Example

Around 24 hours before the episode is scheduled to record, we send another email template that contains a rough idea of the questions Carl and I want to ask, and everything the guest needs to know to get set up. More on that later.

Preparation

OneNote is what powers the MS Dev Show. All of our processes, templates, and episode details are in a OneNote notebook that Carl and I share.

OneNote Screenshot

As Carl and I come across stories we think would be worth discussing, we put links into a OneNote page for the associated episode.

We also use OneNote while we're recording; more on that in a bit.

Hardware

You might think this is the most important part, and you'd be partially right. For our mics, we copied TWiT, and use the Heil PR-40. At ~$330, it's pricey relative to other mics, but cheap compared to the computer you're plugging it into.

Heil PR-40

Lesson time. The Heil is a dynamic microphone. Some people use condenser mics, but condenser mics do a horrible job of cutting out background sound. Dynamic mics do a good job of only picking up the sound right in front of them. This is key for Carl and me, since we're recording from our home offices and have kids and pets.

If you want to hear the difference between a USB headset, and the mics Carl and I use, check out this track that I recorded shortly after getting the Heil:

Our mics are connected to the Alesis IO2 Express. This is what converts the signal from our mics to USB to connect to our computers.

Alesis

Using an insert cable, we route the mic signal through the Behringer MDX1600 compressor/gate/limiter. This primarily serves to "gate" our audio, essentially turning it off when we're not speaking. It's our first line of defense against barking dogs, screaming kids, and loud keyboards. I now believe this equipment is optional thanks to improved software processing that I'll describe later.

Behringer Compressor/Gate/Limiter

Recording

Around 24 hours before we record, we send the guest an email reminder with additional details.

Pre-Show Email

We use Skype to talk to our guests. One lesson learned with Skype is that it auto-adjusts your microphone volume by default. It has a tendency to boost the gain if you don't talk for a while; when you start talking again, it blows out the audio and sounds terrible. I set the level manually by speaking at a normal volume, letting it auto-adjust, and then unchecking the box.

Skype Auto-Adjust Mic

Carl and I both use Callburner to record all sides of the conversation. Since we both have complete copies of the call, we can be fairly confident that even if we have a technical failure, we'll still be able to fall back to a second copy.

Callburner

We always record our tracks in raw WAV format:

Callburner Settings

Additionally, we ask our guests to record their own microphone input. We include simple instructions to make it as painless as possible. When everything goes according to plan, we have a separate track for every person on the call. One track for me, one track for Carl, and one track for the guest.

As we go through the episode, we use OneNote as a guide. We use it to make sure we don't forget any big, important questions, and we use it to mark off questions that were already asked. Since OneNote updates on both sides in near real-time, it allows us to run the show as we go, without stopping or IM'ing.

Editing

Short version: trim, Auphonic processing, truncate silence, finishing touches.

First, I use Audacity to trim the tracks. Audacity is free software that works amazingly well. There is always some pre-show and post-show chat, so I cut them down to the meat, and make them all the same length.

Next is noise reduction. This is an area that can eat up a lot of time if you let it. In the early episodes, I did minimal manual edits. As I started to want higher quality, I found myself spending more and more time on editing. We're not talking about major edits; it was more about removing breaths, clicks, etc. I was getting desperate to cut this down and willing to try anything. I even tried Adobe Audition, but it was obvious that it wasn't really designed for editing a podcast. Don't get me wrong, it's fully capable; it's just not optimized for a podcast workflow.

Then, a miracle. I found a service called Auphonic. It's unbelievable at processing audio. It has amazing noise reduction, which is key for guests since they don't have gates. It also intelligently focuses on the track of the person who is speaking and attenuates the other voices. This is amazingly effective. Other than our gate, this is the only audio processing we do, and it's good enough that you could skip the compressor/gate/limiter completely. It works fine if you have one track, but even better if you have separate tracks for each speaker. All of the settings we use for the show are saved as a preset, so it takes less than 60 seconds to submit a processing job.

Auphonic Output

After Auphonic works its magic, I bring the audio back into Audacity. Then I use a feature called Truncate Silence (under the "Effect" menu). This is one of our best-kept secrets. It finds pauses in the audio that are longer than a certain duration and shortens them. The end result is that even if someone takes a moment to answer a question, it sounds like they answered without pausing. In a typical hour-long episode, this takes out around four full minutes.

Truncate Silence Menu Item

Adding the Intro/Outro

I record the intro text directly into Audacity: "Welcome to the MS Dev Show, episode number....". I place this track below the intro music track. Then I use the "Auto Duck" option under the "Effect" menu, which automatically turns down the music volume while I'm talking. If I left a long pause in the intro voice, the theme music would actually come back up and fill it in. Lastly, I use the Envelope Tool to fade in the intro, since the riff at the start can be a bit glaring.

Ducking the Audio

The outro is pre-recorded for convenience, and I just put it at the end.

Publishing

The easiest way to publish your podcast is to use Libsyn. It's fairly inexpensive, and you pay monthly for new episodes. The great part is that you don't pay for old episodes. They handle everything for you from hosting the files to providing the feeds that you can submit to aggregators like iTunes and Stitcher. Make sure you check on those services to ensure the feed is set up the way you want.

Carl handles the show notes. These are created by exporting from OneNote to an MHT document and then using Pandoc to convert to Markdown for publishing to our website.

Our website is completely open source. You can see all of the code on GitHub. You can even fork the site, create your own, or send pull requests. The website itself is hosted in Azure and automatically redeploys when we push a change to GitHub.

Feel free to watch the commit log. You might even get a sneak peek at an episode before it's published!

GitHub Changelog

Thanks!

I made a quick video showing the editing process:

Credits


IoT Data in Manufacturing - My Thoughts

I've been spending a lot of time recently thinking and literally dreaming about IoT (Internet of Things) applications. I wanted to share some of my current thinking on where we're at, what is happening, and what things might look like in the future.

Manufacturing Data Today (the boring part)

Within a manufacturing plant today, we can categorize the software into three high-level layers, roughly ERP, MES, and SCADA/controls, which are not always easy to delineate.

These systems are all extremely complex, and I'll never fully understand them. I'm primarily concerned with the MES/SCADA portion: the raw plant sensor data, how it flows, and how we turn it into meaningful information.

Optical sensors, pressure sensors, temperature sensors, and just about any other sensor you can imagine probably already exists and is in use within manufacturing today. I'd argue that manufacturing generates far more data than any other vertical, and I suspect it has been one of the key drivers behind the dropping prices of sensors over the past few decades.

The communication network within your typical plant is based on standards that were defined over 4 decades ago. These networks exist to multiplex and centralize all of the data in the plant. Of course you'll find siloed subsystems that work independently, and aggregate data is sent to the central location. You'll find pockets of newer TCP/IP networks, but you'll also find a lot of low-speed serial communications.

Centralized Network Pattern

At the center of this system is a high-performance, time-series database known as a Historian. This is the center of the universe; all data is stored here. Security is handled largely by virtue of the system being accessible only from inside the plant.

For corporate-wide reporting, data needs to be aggregated from this historian, either through additional software, or through the ERP system and processes. This tends to be expensive, difficult, and incomplete due to the delta between the vast amount of data collected from the source, and the aggregated enterprise data.

The IoT / Cloud Transition (the fun part)

We have all of this data at the source, great. Now what? The real power is in unlocking the data.

There is some very low-hanging fruit driving change today. Thanks to falling storage and compute costs in cloud environments, there is a big incentive to centralize our data. Having all of our data aggregated in the cloud means that we can run massive, scalable jobs and generate reports at a scale that used to be difficult and costly. We can not only benchmark multiple facilities against each other, but also drill down to any level. Slicing and dicing the data moves from being the job of the report writer to that of the report viewer.

The cloud is where we aggregate storage and compute

Machine learning is the new frontier, and has far reaching implications. Previously, we had to know exactly what questions to ask, and having enough compute power to explore the data was expensive. Today, the cloud provides the massive horsepower we need to not just explore the data quickly, but to also glean insights that we never thought of.

Throughout history, we've spent a significant amount of time analyzing data, looking for reasons why and when things fail, trying to predict order volumes, trying to figure out how to maximize employee productivity, and the list goes on. These are questions that can be explored, and potentially optimized by data scientists and machine learning. Machine learning as a service makes it a commodity, available to any sized business, on-demand.

The Future (the exciting part)

I hope that once the dust settles, we'll have standards that allow devices from various companies to inter-operate in a reliable, secure manner.

Plummeting device costs are a given, so it's safe to assume we'll have more computing power available to us almost universally. To really get value from the data, we first need to allow devices to share it. If device A knows what device B is up to, device A can operate more intelligently. This is a collective intelligence. That collective intelligence will also require a management hierarchy, where higher levels understand more about what should be accomplished and less about how it gets accomplished.

Device Collaboration & Supervision

Does this sound familiar? This is how employees are traditionally organized within an organization. As you go up the management chain, the goals become more focused on the overall organizational goals. As you go down the management chain, you get to where the real work happens, and there may be far less context in the larger organization.

Organizations are starting to evolve into a more networked design, and so will devices. Devices will have a roughly hierarchical organization, but will also realize advantages from direct communication. Features like high availability can exist at the lower levels.

In other words, we'll have redundancy and inter-device communication, but we'll also have a logical model that defines how the system operates. As a simple example, imagine we have three temperature sensors measuring the same thing, and one of them fails or starts to report irregular values. A logical model that overlays the physical model lets us define how the system should behave while keeping the low-level details as a separate concern.

Virtual Sensor
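To make that concrete, here's a toy sketch of the virtual-sensor idea; it's purely illustrative (names and thresholds are mine, not from any real control system): combine the readings that agree and ignore a sensor that has failed or drifted.

using System;
using System.Collections.Generic;
using System.Linq;

// Toy "virtual sensor": several physical temperature sensors feed one
// logical reading.
public class VirtualTemperatureSensor
{
    // Degrees from the median before a sensor is considered suspect (illustrative).
    private const double MaxDeviation = 5.0;

    public double? Read(IEnumerable<double?> physicalReadings)
    {
        // Ignore sensors that failed to report at all.
        var values = physicalReadings
            .Where(r => r.HasValue)
            .Select(r => r.Value)
            .OrderBy(v => v)
            .ToList();

        if (values.Count == 0)
        {
            return null; // every physical sensor is down
        }

        // The median resists a single wildly irregular sensor.
        var median = values[values.Count / 2];

        // Average only the sensors that agree with the median.
        return values
            .Where(v => Math.Abs(v - median) <= MaxDeviation)
            .Average();
    }
}

A higher level in the hierarchy would then consume the logical reading without caring which physical sensor produced it.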

Now, we want to get data from any point in the hierarchy to where it is needed. A machine operator needs to know what is happening in the machine in real-time. The supervisor needs to know how multiple lines are operating in real-time. The plant manager needs to know how the overall plant is running, again, in real-time. We'll also need to store historical data, operational reporting data, and so on.

Further Reading

Also check out my manufacturing projects on GitHub.

