Value type comparison pitfall with == vs Equals

I recently ran into a situation that momentarily confused me because it seemed non-intuitive at first. I'm working on a class that tracks changes made in UI controls in Silverlight, and I wrote code similar to the following:

private void checkChanges(UIElement control)
{
    object oldValue = getOldValue(control);
    object newValue = getNewValue(control);

    if(oldValue == newValue)
        return;

    Debug.WriteLine("The value has changed");
}

The data type I was working with in this case was a DateTime, which is a struct, and therefore a value type. I know that this code works as expected:

DateTime time1 = DateTime.Parse("1-1-09");
DateTime time2 = DateTime.Parse("1-1-09");

//This is true
Assert.IsTrue(time1 == time2);

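The code above works because I'm comparing two DateTime structures directly, so the compiler uses DateTime's overloaded "==" operator. The original code does not work because the structures are being boxed; in other words, they're wrapped in objects. Operators are chosen at compile time from the declared types of the variables, so when you use "==" on two variables typed as object, it compares the memory references, determining whether they're the same instance. In this case, the two DateTime values are each boxed into separate boxes, so the references never match.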
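Here's a minimal, self-contained sketch of the difference. The class and variable names are made up for illustration; only DateTime, "==" and Equals come from the discussion above.

```csharp
using System;

class BoxedComparisonDemo
{
    static void Main()
    {
        DateTime time1 = new DateTime(2009, 1, 1);
        DateTime time2 = new DateTime(2009, 1, 1);

        // Assigning a struct to an object variable boxes it: each
        // assignment allocates its own separate object on the heap.
        object boxed1 = time1;
        object boxed2 = time2;

        // Both operands are DateTime, so DateTime's overloaded == runs.
        Console.WriteLine(time1 == time2);         // True

        // Both operands are typed as object, so == compares references.
        Console.WriteLine(boxed1 == boxed2);       // False

        // Equals is virtual, so DateTime's override runs on the boxed value.
        Console.WriteLine(boxed1.Equals(boxed2));  // True
    }
}
```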

The fix is to use the "Equals" method, which exists on all objects and is overridden by most of the framework types you'll commonly use. For example, DateTime overrides Equals to determine whether the date/times are equivalent.

So if I wanted to fix my original code, it would look like this:

private void checkChanges(UIElement control)
{
    object oldValue = getOldValue(control);
    object newValue = getNewValue(control);

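    // The static object.Equals checks for nulls first, then calls
    // oldValue.Equals(newValue), so a null old value won't throw.
    if(object.Equals(oldValue, newValue))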
        return;

    Debug.WriteLine("The value has changed");
}

Of course, this problem applies to all value types, such as int, double, and any custom structs you may have created.
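As a quick sketch of the same behavior with other value types: the Point struct below is hypothetical and exists only for this example. It doesn't override Equals, so the default ValueType.Equals compares its fields.

```csharp
using System;

// Hypothetical struct for illustration only.
struct Point
{
    public int X;
    public int Y;
}

class OtherValueTypesDemo
{
    static void Main()
    {
        // Each assignment boxes the int into its own object.
        object int1 = 42;
        object int2 = 42;
        Console.WriteLine(int1 == int2);       // False (different boxes)
        Console.WriteLine(int1.Equals(int2));  // True  (same value)

        // The same applies to a custom struct.
        object p1 = new Point { X = 1, Y = 2 };
        object p2 = new Point { X = 1, Y = 2 };
        Console.WriteLine(p1 == p2);           // False (different boxes)
        Console.WriteLine(p1.Equals(p2));      // True  (same field values)
    }
}
```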

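Boxing itself is not an issue when you're working with reference types, because they have no need to be boxed. Keep in mind, though, that "==" on variables typed as object still compares references, so a reference type that defines its own value equality (string, for example) can run into the same kind of surprise. Equals is the safer choice there too.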

Is Quality Important?

Joel Spolsky and Jeff Atwood stirred up some debate when they said "Quality just doesn't matter that much". At first, I was a little outraged. My entire development process is built around quality. Without it, airplanes would fall from the sky and your car wouldn't start in the morning.

So can we definitively put the quality question to rest? Unfortunately, "No".

First of all, we need to understand that quality isn't a Boolean. It's not "yes", you have quality, or "no", you don't have quality. Quality is a gradient, but it's even worse than that. Everyone sees it differently, and everyone experiences a different aspect of it. In short, quality is a multidimensional gradient!

I used to work at a small development company where I worked very closely with the President of that company. He was concerned with quality, but that took a backseat to the features that went into the product. The features themselves sold the product, and wowed the people writing the checks. Once they purchased our software, the integration efforts were large enough that the customer was essentially locked in. Throw an expensive support contract into the mix, and it was a money-making machine.

The company ended up being very successful, and was eventually assimilated by a huge company. The owners ended up walking away with a few million each. Try to explain to them that quality is more important than features!

Now fast-forward a few years, and we can examine what eventually happened. The product did work, and honestly it was the best in its class simply due to the scope of the problems it was trying to solve, and the high barrier to entry for competitors. However, the quality issues eventually caught up with the product. It became difficult to maintain and add extra features. The only solution was to slowly rewrite sections of it.

I think a great analogy is the tortoise and the hare. If you're in it for the long haul, you want to be the steady tortoise. If you're in it for the short term, you want to be as quick as possible, even if that means stopping for a nap along the way. The problem is, you're making others suffer for your negligence.

If you want the best of both worlds, build quality into your development process. I'll be covering this in a series of articles that discuss unit testing (and testing in general) in exhaustive detail. They should be coming out by the middle of March. Stay tuned!

Using C# Yield for Readability and Performance

I must have read about "yield" a dozen times. Only recently have I begun to understand what it does, and the real power that comes along with it. I'm going to show you some examples of where it can make your code more readable, and potentially more efficient.

To give you a very quick overview of how the yield functionality works, I first want to show you an example without it. The following code is simple, yet it's a common pattern in the latest project I'm working on.

IList<string> FindBobs(IEnumerable<string> names)
{
    var bobs = new List<string>();

    foreach(var currName in names)
    {
        if(currName == "Bob")
            bobs.Add(currName);
    }

    return bobs;
}

Notice that I take in an IEnumerable, and return an IList. My general rule of thumb has been to be as lenient as possible with my input, and as strict as possible with my output. For the input, it clearly makes sense to use IEnumerable if you're just going to be looping through it with a foreach. For the output, I try to use an interface so that the implementation can be changed. However, I chose to return the list because the caller may be able to take advantage of the fact that I already went through the work of making it a list.

The problem is, my design isn't chainable, and it's creating lists all over the place. In reality, this probably doesn't add up to much, but it's there nonetheless.

Now, let's take a look at the "yield" way of doing it, and then I'll explain how and why it works:

IEnumerable<string> FindBobs(IEnumerable<string> names)
{
    foreach(var currName in names)
    {
        if(currName == "Bob")
            yield return currName;
    }
}

In this version, we have changed the return type to IEnumerable, and we're using "yield return". Notice that I'm no longer creating a list. What's happening is a little confusing, but I promise it's actually incredibly simple once you understand it.

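When you use the "yield return" statement, the C# compiler generates a whole bunch of plumbing code for you (an iterator object that acts as a state machine), but for now you can pretend it's magic. Calling the function doesn't run its body; it just hands back that iterator. When the calling code (not listed here) starts to loop, each request for the next item resumes the function's body where it left off, runs it until the next "yield return", and then pauses it again.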

Typical Implementation

  1. Caller calls function
  2. Function executes and returns list
  3. Caller uses list

Yield Implementation

  1. Caller calls function
  2. Caller requests item
  3. Next item returned
  4. Goto step #2

Although the execution of the yield implementation is a little more complicated, what we end up with is an implementation that "pulls" items one at a time instead of having to build an entire list before returning to the client.
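To watch the "pull" behavior happen, here's a small sketch that adds trace output to the yield version of FindBobs. The class name, the sample names, and the Console.WriteLine calls are my own additions for illustration.

```csharp
using System;
using System.Collections.Generic;

class YieldPullDemo
{
    public static IEnumerable<string> FindBobs(IEnumerable<string> names)
    {
        // Nothing in this body runs until the caller asks for the first item.
        Console.WriteLine("FindBobs: started");
        foreach (var currName in names)
        {
            if (currName == "Bob")
            {
                Console.WriteLine("FindBobs: yielding");
                yield return currName;
            }
        }
        Console.WriteLine("FindBobs: finished");
    }

    static void Main()
    {
        var names = new[] { "Alice", "Bob", "Carol", "Bob" };

        // This call only creates the iterator; the body has not started yet.
        var bobs = FindBobs(names);
        Console.WriteLine("Main: FindBobs called");

        // Each pass through the loop resumes FindBobs until its next yield.
        foreach (var bob in bobs)
            Console.WriteLine("Main: got " + bob);
    }
}

// Output:
// Main: FindBobs called
// FindBobs: started
// FindBobs: yielding
// Main: got Bob
// FindBobs: yielding
// Main: got Bob
// FindBobs: finished
```

Notice that "Main: FindBobs called" prints before "FindBobs: started", and that the two sides take turns after that.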

As for the syntax, I personally think the yield version is simpler, and does a better job of conveying what the method is actually doing. Even the fact that I'm returning IEnumerable tells the caller that its only concern should be that it can "foreach" over the return data. The caller can now make their own decision if they want to put it in a list, possibly at the expense of performance.

In the simple example I provided, you might not see much of an advantage. However, you'll avoid unnecessary work when the caller can "short-circuit" or cancel looping through all of the items that the function will provide. When you start chaining methods using this technique together, this becomes more likely, and the amount of work saved can possibly multiply.
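Here's a rough sketch of that short-circuiting with two chained methods. The Shout method and the Examined counter are hypothetical, added only to make the saved work visible; First() is the standard LINQ method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class ShortCircuitDemo
{
    // Counts how many input names FindBobs actually looked at.
    public static int Examined = 0;

    public static IEnumerable<string> FindBobs(IEnumerable<string> names)
    {
        foreach (var currName in names)
        {
            Examined++;
            if (currName == "Bob")
                yield return currName;
        }
    }

    // Hypothetical second stage, just to show chaining.
    public static IEnumerable<string> Shout(IEnumerable<string> names)
    {
        foreach (var currName in names)
            yield return currName.ToUpperInvariant();
    }

    static void Main()
    {
        var names = new[] { "Alice", "Bob", "Carol", "Bob", "Dave" };

        // First() pulls a single item through the chain and then stops.
        string first = Shout(FindBobs(names)).First();

        Console.WriteLine(first);     // BOB
        Console.WriteLine(Examined);  // 2 (only "Alice" and "Bob" were examined)
    }
}
```

With the list-returning version, all five names would have been examined before the caller saw anything.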

Ayende has a great example of using yield for a slick pipes & filters implementation. He even has a version that is multi-threaded which I find very intriguing.

One of my first reservations with using yield was that there is a potential performance implication. Since C# is keeping track of what is going on in what is essentially a state machine, there is a bit of overhead. Unfortunately, I can't find any information that demonstrates the performance impact. I do think that the potential advantages I mentioned should outweigh the overhead concerns.
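If you want to get a feel for the overhead on your own machine, a rough timing sketch like the one below is a starting point. Everything here apart from the two FindBobs versions is an assumption of mine: the data set size, the helper name, and the use of Stopwatch. It's a crude measurement, not a rigorous benchmark, so treat the numbers as a hint at best.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class YieldTimingSketch
{
    public static IList<string> FindBobsList(IEnumerable<string> names)
    {
        var bobs = new List<string>();
        foreach (var currName in names)
        {
            if (currName == "Bob")
                bobs.Add(currName);
        }
        return bobs;
    }

    public static IEnumerable<string> FindBobsYield(IEnumerable<string> names)
    {
        foreach (var currName in names)
        {
            if (currName == "Bob")
                yield return currName;
        }
    }

    // Hypothetical helper: times producing and fully enumerating a sequence.
    // The producer is passed as a delegate so that building the list is
    // included in the measurement.
    public static int TimeFullEnumeration(string label, Func<IEnumerable<string>> produce)
    {
        var watch = Stopwatch.StartNew();
        int count = 0;
        foreach (var item in produce())
            count++;
        watch.Stop();
        Console.WriteLine(label + ": " + count + " items in " + watch.ElapsedMilliseconds + " ms");
        return count;
    }

    static void Main()
    {
        // Arbitrary test data: one million names, half of them "Bob".
        var names = Enumerable.Range(0, 1000000)
                              .Select(i => i % 2 == 0 ? "Bob" : "Alice")
                              .ToArray();

        // Run each version once first so JIT compilation isn't timed.
        TimeFullEnumeration("warm-up list", () => FindBobsList(names));
        TimeFullEnumeration("warm-up yield", () => FindBobsYield(names));

        TimeFullEnumeration("list", () => FindBobsList(names));
        TimeFullEnumeration("yield", () => FindBobsYield(names));
    }
}
```

This only measures the case where the caller consumes every item, which is the least favorable case for yield. A dedicated tool such as BenchmarkDotNet would give more trustworthy numbers.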

Conclusion

Yield can make your code more efficient and more readable. It's been around since .NET 2.0, so there's not much reason to avoid understanding and using it.

You can find detailed information about how the yield keyword works under the hood here.

Have you been using yield in interesting ways? Have you ever been bitten by using it? Leave a comment and let me know!

I'm Jason Young, software engineer. This blog contains my opinions, which my employer, Microsoft, may not share.