<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>YTechie.com &#187; sql</title>
	<atom:link href="http://www.ytechie.com/category/sql/feed" rel="self" type="application/rss+xml" />
	<link>http://www.ytechie.com</link>
	<description>Productive software development using ASP.NET, C#, Adobe Flex, and other technologies and tools.</description>
	<lastBuildDate>Fri, 06 Nov 2009 21:16:21 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Speeding up data access by using Linq to SQL or EF</title>
		<link>http://www.ytechie.com/2009/10/speeding-up-data-access-by-using-linq-to-sql-or-ef.html</link>
		<comments>http://www.ytechie.com/2009/10/speeding-up-data-access-by-using-linq-to-sql-or-ef.html#comments</comments>
		<pubDate>Mon, 26 Oct 2009 20:53:53 +0000</pubDate>
		<dc:creator>superjason</dc:creator>
				<category><![CDATA[LINQ]]></category>
		<category><![CDATA[c#]]></category>
		<category><![CDATA[entity framwork]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://www.ytechie.com/2009/10/speeding-up-data-access-by-using-linq-to-sql-or-ef.html</guid>
		<description><![CDATA[Recall that LINQ-based object relational mappers (ORMs) use expression trees to translate your C# (or other language) LINQ code into SQL. Many DBAs and developers who don’t fully understand this technology are quick to discredit it. I’m going to show how significant performance, simplicity, and clarity can be gained by using Linq [...]]]></description>
			<content:encoded><![CDATA[<p>Recall that LINQ-based object relational mappers (ORMs) use expression trees to translate your C# (or other language) LINQ code into SQL. Many DBAs and developers who don’t fully understand this technology are quick to discredit it. I’m going to show how significant performance, simplicity, and clarity can be gained by using LINQ to SQL.</p>
<p>A DBA recently asked me, “I thought inline SQL was bad, so why are we using it again?” LINQ may *smell* like inline SQL, but it is not. Let’s first take a look at some simple LINQ that is easy to read: </p>
<div style="padding-bottom: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; float: none; padding-top: 0px" id="scid:812469c5-0cb0-4c63-8c15-c81123a09de7:0511d0cf-0c6f-4895-961f-7ab27960207d" class="wlWriterEditableSmartContent">
<pre name="code" class="c#">from d in Devices
where d.CZone == 4
&amp;&amp; d.Type == "X"
select d.Id</pre>
</div>
<p>So how is this different from inline SQL? To be technical, you’re writing a query against a data model, with full IntelliSense. You’re also writing a provider-agnostic query. This same query can be performed against SQL Server, Oracle, or even the Facebook API if there were a supporting framework in place. We now have a truly unified query architecture.</p>
<p>Let’s keep talking about this simple LINQ query, and see how you would write it if you didn’t want to use LINQ. Before the days of LINQ, most developers would probably have used a stored procedure. Stored procedures are great. They’re efficient, reusable, and easily updatable. Here is what it may look like:</p>
<div style="padding-bottom: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; float: none; padding-top: 0px" id="scid:812469c5-0cb0-4c63-8c15-c81123a09de7:4e3748fa-09f3-42ae-9b35-05ea2bc23f29" class="wlWriterEditableSmartContent">
<pre name="code" class="sql">Create Procedure GetMyStuff
As
Select Id
From Devices
Where CZone = 4
And Type = 'X'
GO</pre>
</div>
<p>A nice, simple SQL query. There are a few disadvantages that may not be immediately apparent:</p>
<ul>
<li>If you need a second, similar query, you have to create and maintain two stored procedures. As an alternative, you could modify the stored procedure to operate differently based on a parameter. Neither option is ideal, but LINQ gives us an alternative that I’ll discuss in a bit. </li>
<li>You don’t get IntelliSense when you’re writing your code. </li>
<li>You have to be concerned with two different “programming” paradigms, and also have to manually manage the translation in both directions. </li>
</ul>
<p>Now, let’s take our example to the other end of the spectrum, which will help show where LINQ really shines and straight SQL does not. This example is a query for a search page. I set up a simple ASPX page to demonstrate. Here is a sample of the user interface:</p>
<p><a href="http://www.ytechie.com/post-images/2009/10/image.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="http://www.ytechie.com/post-images/2009/10/image_thumb.png" width="477" height="211" /></a> </p>
<p>The user enters a number of search criteria, and the results are displayed. I literally coded this in under 5 minutes. If you’re used to using stored procedures to retrieve this type of data, think about how you would go about creating this. You have a couple of options that I’m aware of:</p>
<ul>
<li>Write a separate stored procedure for every combination of parameters. In this case that would be 7 stored procedures. This would certainly not be ideal. </li>
<li>Write a single stored procedure that takes each value as a nullable parameter, and use “Where @Param Is Null Or Param = @Param”. This option is easy, but has some potential performance implications. </li>
<li>Write a single stored procedure that takes each value as a nullable parameter, and use “If” statements to handle each scenario. This would be time-consuming and error-prone. </li>
</ul>
<p>In LINQ, we’re able to <strong>dynamically build up a query</strong>. For the search example, my LINQ looks like this:</p>
<div style="padding-bottom: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; float: none; padding-top: 0px" id="scid:812469c5-0cb0-4c63-8c15-c81123a09de7:3f959401-5933-40c2-9b78-fdfdf1f457d9" class="wlWriterEditableSmartContent">
<pre name="code" class="c#">var dataContext = new DataClassesDataContext();

var query = (IQueryable&lt;Device&gt;) dataContext.Devices;

if (txtCZone.Text.Length &gt; 0)
    query = query.Where(device =&gt; device.CZone == int.Parse(txtCZone.Text));
if (txtUCZone.Text.Length &gt; 0)
    query = query.Where(device =&gt; device.UCZone == int.Parse(txtUCZone.Text));
if (txtLZone.Text.Length &gt; 0)
    query = query.Where(device =&gt; device.LZone == int.Parse(txtLZone.Text));

dgResults.DataSource = query.ToList();
dgResults.DataBind();</pre>
</div>
<p>And of course we can use the query syntax instead (replacing lines 5-10 above):</p>
<div style="padding-bottom: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; float: none; padding-top: 0px" id="scid:812469c5-0cb0-4c63-8c15-c81123a09de7:bcfec0c3-d2bc-49b0-b45a-9878b8717f6f" class="wlWriterEditableSmartContent">
<pre name="code" class="c#">if (txtCZone.Text.Length &gt; 0)
	query = from device in query where device.CZone == int.Parse(txtCZone.Text) select device;
if (txtUCZone.Text.Length &gt; 0)
	query = from device in query where device.UCZone == int.Parse(txtUCZone.Text) select device;
if (txtLZone.Text.Length &gt; 0)
	query = from device in query where device.LZone == int.Parse(txtLZone.Text) select device;</pre>
</div>
<p>The result is that the generated SQL contains only the parameters that the user has entered. No extra SQL, and no specific SQL to maintain. Remember that LINQ operators can be chained together without querying the underlying data. The data is only queried when the results are enumerated, using an operation like “ToList()”.</p>
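<p>To make the delayed execution concrete, here is a minimal sketch that runs against an in-memory list instead of a database. The <em>Device</em> class, the sample rows, and the <em>BuildQuery</em> helper are made up for illustration and are not the code from the search page. With LINQ to SQL, the same chain of <em>Where</em> calls would be turned into SQL only at the point where the query is enumerated.</p>
<pre name="code" class="c#">using System;
using System.Collections.Generic;
using System.Linq;

public class Device
{
    public int Id { get; set; }
    public int CZone { get; set; }
    public int LZone { get; set; }
}

public static class SearchDemo
{
    // Hypothetical helper: adds one Where clause per filter the caller supplied.
    public static IQueryable&lt;Device&gt; BuildQuery(IQueryable&lt;Device&gt; source, int? czone, int? lzone)
    {
        var query = source;
        if (czone.HasValue)
            query = query.Where(d =&gt; d.CZone == czone.Value);
        if (lzone.HasValue)
            query = query.Where(d =&gt; d.LZone == lzone.Value);
        return query; // nothing has been enumerated yet
    }

    public static void Main()
    {
        var devices = new List&lt;Device&gt;
        {
            new Device { Id = 1, CZone = 4, LZone = 1 },
            new Device { Id = 2, CZone = 4, LZone = 2 },
            new Device { Id = 3, CZone = 5, LZone = 1 }
        };

        var query = BuildQuery(devices.AsQueryable(), 4, null);

        // Added after the query was built, but before it is enumerated.
        devices.Add(new Device { Id = 4, CZone = 4, LZone = 3 });

        // The query runs here, so it sees the row added above.
        Console.WriteLine(query.Count()); // 3
    }
}</pre>
<p>Because only the CZone filter was supplied, only one <em>Where</em> clause ends up in the query. The LZone filter never becomes part of it.</p>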
<p>To support paging we need to run 3 different types of underlying queries. Here is where LINQ is really going to shine. We can use the same base query for all of these operations, and not have to worry about the drastically different underlying SQL statements.</p>
<ol>
<li><strong>Result count</strong> &#8211; Simply by calling the “.Count()” method, we can retrieve the number of rows the query will return in total. The underlying SQL will be a simple and efficient <em>Count</em> operation. </li>
<li><strong>Page n query</strong> – By utilizing <em>Skip</em> and <em>Take</em>, any page within the results can be queried. The work of generating the paging SQL (for LINQ to SQL on SQL Server 2005 and later, a <em>ROW_NUMBER()</em> subquery) is handled for you. </li>
<li><strong>First page query</strong> – If the underlying provider has an optimization for using the SQL <em>TOP</em> command, the first page of data you query will be able to avoid the row-numbering subquery. This has the advantage of being more efficient when the first (and often most requested) page of results is displayed. </li>
</ol>
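<p>The three operations can be sketched against one base query as shown below. This is only an illustration: <em>GetPage</em> is a hypothetical helper, and the numbers stand in for rows so that the sample runs without a database. Note that the query is ordered before <em>Skip</em> and <em>Take</em> are applied, since paging without a defined order is not meaningful.</p>
<pre name="code" class="c#">using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingDemo
{
    // Hypothetical helper: returns one page of an already ordered query.
    public static List&lt;T&gt; GetPage&lt;T&gt;(IOrderedQueryable&lt;T&gt; query, int pageIndex, int pageSize)
    {
        return query.Skip(pageIndex * pageSize).Take(pageSize).ToList();
    }

    public static void Main()
    {
        // Stand-in for a table; a real query would start from the data context.
        IQueryable&lt;int&gt; ids = Enumerable.Range(1, 25).AsQueryable();
        var ordered = ids.Where(id =&gt; id % 2 == 1).OrderBy(id =&gt; id);

        int total = ordered.Count();                 // result count
        List&lt;int&gt; first = GetPage(ordered, 0, 5);    // first page
        List&lt;int&gt; third = GetPage(ordered, 2, 5);    // page n

        Console.WriteLine(total);                    // 13
        Console.WriteLine(string.Join(",", first));  // 1,3,5,7,9
        Console.WriteLine(string.Join(",", third));  // 21,23,25
    }
}</pre>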
<p><strong>Real-world Results</strong></p>
<p>I initially ran into this in a real application that was primarily used to search through a large table of records. It had originally used the stored procedure approach, and was causing the entire system to slow down to the point of being unusable. Thanks to LINQ, we were able to make the search usable. In fact, the results were dramatic:</p>
<table border="1" cellspacing="0" cellpadding="2" width="309" align="center">
<tbody>
<tr>
<td valign="top" width="66">&#160;</td>
<td valign="top" width="136"><strong>Stored Procedure</strong> </td>
<td valign="top" width="105"><strong>LINQ to SQL</strong> </td>
</tr>
<tr>
<td valign="top" width="66"><strong>Reads</strong></td>
<td valign="top" width="136">Over 4,000,000 </td>
<td valign="top" width="105">8948 </td>
</tr>
<tr>
<td valign="top" width="66"><strong>Duration</strong></td>
<td valign="top" width="136">3249ms </td>
<td valign="top" width="105">189ms </td>
</tr>
</tbody>
</table>
<p>In addition to the improved performance, the code was easier to maintain. The stored procedure was extremely cluttered, had large <em>where</em> clauses, and even contained two nearly identical copies of the query: one for calculating the count, and one for paging support.</p>
<p><strong>Conclusion</strong></p>
<p>LINQ gives us much more than “inline SQL”. It gives us a unified query syntax, delayed execution, query expression building, and dynamically created SQL output. Additionally, the generated queries are optimized based on the exact query being performed instead of making generic SQL that is optimized for multiple scenarios. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.ytechie.com/2009/10/speeding-up-data-access-by-using-linq-to-sql-or-ef.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>LINQ to SQL &amp; Entity Framework Pitfalls</title>
		<link>http://www.ytechie.com/2009/09/linq-to-sql-entity-framework-pitfalls.html</link>
		<comments>http://www.ytechie.com/2009/09/linq-to-sql-entity-framework-pitfalls.html#comments</comments>
		<pubDate>Thu, 01 Oct 2009 02:03:06 +0000</pubDate>
		<dc:creator>superjason</dc:creator>
				<category><![CDATA[LINQ]]></category>
		<category><![CDATA[c#]]></category>
		<category><![CDATA[entity framwork]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://www.ytechie.com/2009/09/linq-to-sql-entity-framework-pitfalls.html</guid>
		<description><![CDATA[In my last post describing the differences between LINQ to objects and LINQ to SQL, I mentioned how LINQ to SQL and Entity Framework “interpret” your LINQ code, and create the corresponding SQL. Forgetting this fact is extremely dangerous, because LINQ to SQL and other object relational mappers are extremely leaky abstractions. LINQ is obviously [...]]]></description>
			<content:encoded><![CDATA[<p>In my last post describing the <a href="http://www.ytechie.com/2009/09/understanding-linq-and-linq-to-sql-and-ef.html">differences between LINQ to objects and LINQ to SQL</a>, I mentioned how LINQ to SQL and Entity Framework “interpret” your LINQ code, and create the corresponding SQL. Forgetting this fact is extremely dangerous, because LINQ to SQL and other object relational mappers are extremely leaky abstractions. LINQ is obviously a wonderful technology, but this post will be talking about some potential pitfalls you may run into.</p>
<p><strong>SQL Query Complexity Disproportionate to LINQ Complexity</strong></p>
<p>Recall the example from my last post:</p>
<div style="padding-bottom: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; float: none; padding-top: 0px" id="scid:812469c5-0cb0-4c63-8c15-c81123a09de7:5641d3eb-6590-4d9a-8cac-c0db294cd564" class="wlWriterEditableSmartContent">
<pre name="code" class="c#">//Query Syntax:
from device in Devices
where device.Type != null
select device.DeviceId

//SQL Syntax:
SELECT [t0].[DeviceId]
FROM [Devices] AS [t0]
WHERE [t0].[Type] IS NOT NULL</pre>
</div>
<p>In this case, LINQ to SQL has done something wonderful. It’s saved us from having to understand or worry about the translation of syntax between C# and SQL. Now, what happens when we write something a little more advanced, such as a nested group by?</p>
<div style="padding-bottom: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; float: none; padding-top: 0px" id="scid:812469c5-0cb0-4c63-8c15-c81123a09de7:62df480a-95e2-4910-95b5-9ec5eb45f0be" class="wlWriterEditableSmartContent">
<pre name="code" class="c#">from d in Devices
group d by d.CZone into czoneGroup
select new { Key = czoneGroup.Key, val = from d2 in czoneGroup
	group d2 by d2.LZone into lzoneGroup
	select lzoneGroup.Key }</pre>
</div>
<p>And the corresponding SQL:</p>
<div style="padding-bottom: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; float: none; padding-top: 0px" id="scid:812469c5-0cb0-4c63-8c15-c81123a09de7:f50bc355-9467-4d6a-9aaa-63ff31cecff9" class="wlWriterEditableSmartContent">
<pre name="code" class="c#">SELECT [t0].[CZone] AS [Key]
FROM [Devices] AS [t0]
GROUP BY [t0].[CZone]
GO

DECLARE @x1 Int = 3
SELECT [t0].[LZone]
FROM [Devices] AS [t0]
WHERE ((@x1 IS NULL) AND ([t0].[CZone] IS NULL)) OR ((@x1 IS NOT NULL) AND ([t0].[CZone] IS NOT NULL) AND (@x1 = [t0].[CZone]))
GROUP BY [t0].[LZone]
GO

DECLARE @x1 Int = 1
SELECT [t0].[LZone]
FROM [Devices] AS [t0]
WHERE ((@x1 IS NULL) AND ([t0].[CZone] IS NULL)) OR ((@x1 IS NOT NULL) AND ([t0].[CZone] IS NOT NULL) AND (@x1 = [t0].[CZone]))
GROUP BY [t0].[LZone]
GO

//Remaining SQL removed....</pre>
</div>
<p>What just happened? Our innocent nested group-by has turned into a monster! This is an example of a query that is simple to do in LINQ, but has no translation to a simple SQL statement. Instead of just bombing, the LINQ to SQL engine comes up with a solution that a developer would probably not have written by hand. A typical SQL developer may have looked for a different approach.</p>
<p><em>Side note: In the nested group-by, notice that LINQ to SQL uses multiple queries. This differs from the Entity Framework approach, which uses outer joins to achieve the same effect.</em></p>
<p>Does it matter? The answer isn’t so simple. In this simplified example, the performance impact is minimal. Unfortunately, with a large amount of data in this type of query, you could start to experience terrible performance. I personally saw a nested query that was only a few lines of code turn into a 27-page SQL statement. The SQL statement was technically correct, but took seconds to execute, when it should have taken a fraction of a second.</p>
<p>One simple solution that I have found to be very effective, yet not intuitive, is breaking apart the initial query and forcing it to execute using the ToList() method. You’ll have to have a decent “where” clause to avoid excessive amounts of data being returned. Once we have the raw data, LINQ to objects will provide us the same set of tools to further manipulate our data. For instance, here is a modified version of the example presented earlier:</p>
<div style="padding-bottom: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; float: none; padding-top: 0px" id="scid:812469c5-0cb0-4c63-8c15-c81123a09de7:fe1b1c69-41cf-42f9-9ab2-feed64283c70" class="wlWriterEditableSmartContent">
<pre name="code" class="c#">//Simple &amp; fast initial query from the database
var rawData = (from d in Devices
where d.Location == "B3"
select d).ToList();

//This operation happens "disconnected"
var results = from d in rawData
group d by d.CZone into czoneGroup
select new { Key = czoneGroup.Key, val = from d2 in czoneGroup
	group d2 by d2.LZone into lzoneGroup
	select lzoneGroup.Key };</pre>
</div>
<p>The reason this works well is that it’s taking advantage of the strength of SQL Server, which is to query data, and the strength of .NET, which is to process and manipulate data.</p>
<p><strong>LINQ Abstracting Away Problems it Can’t Solve</strong></p>
<p>Here is a simplified version of a query I saw recently:</p>
<div style="padding-bottom: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; float: none; padding-top: 0px" id="scid:812469c5-0cb0-4c63-8c15-c81123a09de7:6b58a736-2368-485e-ad30-4726c9267ae1" class="wlWriterEditableSmartContent">
<pre name="code" class="c#">int sum = (from d in Devices
where 1 == 2 &amp;&amp; d.CZone != null
select d.CZone.Value).Sum()</pre>
</div>
<p>To make it extremely clear what I’m trying to accomplish, I put “1 == 2” in the “where” clause, so that no rows match the condition. The “Sum()” method returns the type that it’s acting on. For example, if you’re summing integers, the result is an integer. If you’re summing nullable integers, the result is a nullable integer. This is perfectly valid LINQ. This is effectively the SQL that is generated (I simplified it for clarity):</p>
<div style="padding-bottom: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; float: none; padding-top: 0px" id="scid:812469c5-0cb0-4c63-8c15-c81123a09de7:8e0404af-4082-4891-80e6-4e96545bd024" class="wlWriterEditableSmartContent">
<pre name="code" class="sql">Select SUM(CZone)
From Devices
Where 1 = 2</pre>
</div>
<p>Since the result of this SQL statement is NULL, it can’t be converted back to an integer. The exception is “<em>InvalidOperationException: The null value cannot be assigned to a member with type System.Int32 which is a non-nullable value type.</em>”</p>
<p>When the LINQ is translated to SQL, there is no such operation as converting a nullable value to a non-nullable, so the “.Value” operation is ignored. This would be fine if the sum function still expected a nullable return type, but it’s now expecting an integer. When it can’t find any rows to return, it tries to return NULL. Since it’s trying to package up a NULL value into a standard integer type, it has no choice but to throw an exception.</p>
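<p>One common way around this is to keep the sum nullable and supply the default yourself. The sketch below shows the shape of that workaround. The <em>Device</em> class and the <em>SumCZone</em> helper are made up for illustration, and the sample runs against in-memory lists, where the original query would not throw. The assumption is that against LINQ to SQL the nullable <em>Sum</em> comes back as null for an empty result, which the “?? 0” then turns into zero.</p>
<pre name="code" class="c#">using System;
using System.Collections.Generic;
using System.Linq;

public class Device
{
    public int? CZone { get; set; }
}

public static class SumDemo
{
    // Hypothetical helper: sums CZone, treating "no matching rows" as zero.
    public static int SumCZone(IQueryable&lt;Device&gt; devices)
    {
        // Sum over int? yields int?, so a NULL coming back from SQL has somewhere to go.
        int? sum = devices.Where(d =&gt; d.CZone != null).Sum(d =&gt; d.CZone);
        return sum ?? 0;
    }

    public static void Main()
    {
        var none = new List&lt;Device&gt;().AsQueryable();
        var some = new List&lt;Device&gt;
        {
            new Device { CZone = 3 },
            new Device { CZone = null },
            new Device { CZone = 4 }
        }.AsQueryable();

        Console.WriteLine(SumCZone(none)); // 0
        Console.WriteLine(SumCZone(some)); // 7
    }
}</pre>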
<p><strong>Conclusion</strong></p>
<p>Getting started with LINQ is fairly straightforward, but you can’t forget the fact that whatever query you’re writing must be converted into a SQL statement, and the results must be converted back to data that is understandable to .NET. Every LINQ query you write should be checked with a tool such as <a href="http://www.linqpad.net/" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/www.linqpad.net/?referer=');">LINQPad</a> to ensure that the SQL is efficient, and matches what you expect.</p>
<p>Also keep in mind that when you upgrade your data provider, your queries can change. For example, converting a statement from LINQ to SQL to Entity Framework can generate different SQL queries, just as updating to a newer version of the same ORM can.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ytechie.com/2009/09/linq-to-sql-entity-framework-pitfalls.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Stored procedure reporting &amp; scalability</title>
		<link>http://www.ytechie.com/2008/05/stored-procedure-reporting-scalability.html</link>
		<comments>http://www.ytechie.com/2008/05/stored-procedure-reporting-scalability.html#comments</comments>
		<pubDate>Tue, 27 May 2008 21:40:26 +0000</pubDate>
		<dc:creator>superjason</dc:creator>
				<category><![CDATA[software development]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://www.ytechie.com/2008/05/stored-procedure-reporting-scalability.html</guid>
		<description><![CDATA[Today&#8217;s post is a case study of sorts, about my former employer, who had an interesting architecture. Its roots were VB6 and SQL Server (version 6 I believe). They decided to put as much logic in their stored procedures as possible. The arguments being:

Easy to update (fix, improve) on-the-fly. 
Hard to work with data (multiple [...]]]></description>
			<content:encoded><![CDATA[<p>Today&#8217;s post is a case study of sorts, about my former employer, who had an interesting architecture. Its roots were VB6 and SQL Server (version 6 I believe). They decided to put as much logic in their stored procedures as possible. The arguments being:</p>
<ul>
<li>Easy to update (fix, improve) on-the-fly. </li>
<li>Hard to work with data (multiple tables, arrays, etc) in VB6 and ASP, at least compared to .NET. </li>
</ul>
<p>Given the circumstances, I don&#8217;t think that I would go back and change history if given the chance.</p>
<p>The problem is that the world has changed, and their architecture just doesn&#8217;t scale well. The biggest problem is in the area of reporting (which is now primarily .NET). Since all the data manipulation and logic is in the stored procedures, the database is forced to do all the work. In a small application with not a lot of data, this probably isn&#8217;t a big deal. The report runs for a few seconds and no one really experiences a slow down.</p>
<p>The problem arises when there is a lot of data, easily millions of rows in dozens of tables. The current fix is to scale vertically, meaning that they throw more hardware (AKA money) at the problem.</p>
<p>The real solution to the problem is to scale horizontally. There are two main ways of doing this. The first is to add more database servers. The problem is that this isn&#8217;t really all that easy. The software to do this is getting better, but the fact remains that two different systems now have to stay synchronized. I&#8217;ve read articles about companies that have scaled their systems, and a common theme is that <strong>databases are one of the hardest parts of your architecture to scale</strong>.</p>
<p>The second aspect of scaling is to reallocate where your work is being done. You need to start thinking of the database as a&#8230;well&#8230;.a database! The first and foremost purpose of a database is to store massive amounts of data, and allow you to quickly retrieve that data.</p>
<p>Databases are amazingly fast when you use them simply as a place to store data. If you design your database correctly, and set up indexes that are optimized for the ways you want to retrieve your data, there should be no reason to wait for your data. SQL Server can easily handle millions or even billions of rows, and query any of them almost instantaneously, even with multiple queries being executed concurrently or in succession.</p>
<p>Consider the following diagrams:</p>
<p><img height="382" alt="Architecture" src="http://www.ytechie.com/post-images/2008/05/architecture.gif" width="445" border="0" /> </p>
<p>The existing architecture is on the left, and the proposed structure is on the right. It&#8217;s not a big change, I&#8217;m simply suggesting that the business logic be moved from the stored procedures, into the code.</p>
<p>The first reaction that I usually receive from this suggestion is that the code is going to be ugly compared to the corresponding SQL. The reality is that the SQL is ridiculously complicated (at least in my experience). SQL is great for set based logic, but really starts to break down when trying to do object oriented or procedural operations.</p>
<p>The fact is that .NET is great at <strong>processing</strong> massive amounts of data. First of all, it&#8217;s incredibly fast. If you&#8217;re writing your code correctly, you&#8217;ll be amazed at how fast it can process data.</p>
<p>More importantly, if you&#8217;re using a programming language to manipulate your data instead of T-SQL, you can really start to break down the problem at hand. Databases are traditionally very bad at breaking a large problem into smaller problems. Sure, you can call other functions and stored procedures, but you can tell that it&#8217;s not the strongest feature of the database. A .NET language such as C# lets you pass data around in any structure that you can conceive.</p>
<p>The other major advantage of processing the data in your code is that you can now easily build testable code. Any code that you can easily test will have fewer bugs, and should be easier to maintain in the long run.</p>
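<p>As a rough sketch of what that can look like, the report logic below is a plain function over rows that have already been retrieved. The <em>SaleRow</em> class and the <em>TotalsByRegion</em> function are invented for this example and are not taken from the system described above. Because the function never touches the database, a test can hand it a few hand-built rows and check the result.</p>
<pre name="code" class="c#">using System;
using System.Collections.Generic;
using System.Linq;

public class SaleRow
{
    public string Region { get; set; }
    public int Amount { get; set; }
}

public static class ReportLogic
{
    // Pure function: no database access, so a test can feed it hand-built rows.
    public static Dictionary&lt;string, int&gt; TotalsByRegion(IEnumerable&lt;SaleRow&gt; rows)
    {
        return rows
            .GroupBy(r =&gt; r.Region)
            .ToDictionary(g =&gt; g.Key, g =&gt; g.Sum(r =&gt; r.Amount));
    }

    public static void Main()
    {
        var rows = new List&lt;SaleRow&gt;
        {
            new SaleRow { Region = "East", Amount = 10 },
            new SaleRow { Region = "West", Amount = 5 },
            new SaleRow { Region = "East", Amount = 2 }
        };

        var totals = TotalsByRegion(rows);
        Console.WriteLine(totals["East"]); // 12
        Console.WriteLine(totals["West"]); // 5
    }
}</pre>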
<p>If your data is now being processed and organized in your code and not the database, you are probably ready to scale horizontally. It&#8217;s relatively easy to add more front end servers. Since they all hit a common database, there aren&#8217;t really any synchronization issues.</p>
<p>So does this really work? Of course. I&#8217;m not the first to think of it by any means. Take any of the largest websites on the Internet, and look at how they have <a href="http://highscalability.com/" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/highscalability.com/?referer=');">designed a scalable</a> architecture. Digg.com has always had <a href="http://highscalability.com/digg-architecture" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/highscalability.com/digg-architecture?referer=');">database scalability issues</a> with their MySQL servers, so they try to minimize them as much as possible. Twitter uses <strong>ONE</strong> database server to handle <a href="http://pragmati.st/2007/5/20/scaling-twitter" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/pragmati.st/2007/5/20/scaling-twitter?referer=');">thousands of requests per second</a>. eBay.com doesn&#8217;t even do joins in their database. They would never be able to scale if they put that burden on a database that handles <a href="http://highscalability.com/ebay-architecture" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/highscalability.com/ebay-architecture?referer=');">26 billion SQL queries each day</a>.</p>
<p>Microsoft has made matters worse by integrating the .NET CLR into SQL Server. You can now write .NET code that gets executed on your database server. This is a great tool, but it must be used with great care. This isn&#8217;t an excuse to write more code in your database.</p>
<p>In summary, databases are an expensive commodity, so don&#8217;t abuse them! Be careful how much you do in the database, and ask yourself if you can move some processing into code, where it is more easily scalable.</p>
<p>Just recently, Scott Hanselman came out with a <a href="http://www.hanselman.com/blog/HanselminutesPodcast114WebsiteScalingWarStoriesWithRichardCampbell.aspx" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/www.hanselman.com/blog/HanselminutesPodcast114WebsiteScalingWarStoriesWithRichardCampbell.aspx?referer=');">podcast about website scaling</a>. If you&#8217;re looking for more site scaling concepts, it&#8217;s worth a listen.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ytechie.com/2008/05/stored-procedure-reporting-scalability.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>SQL Server NULL values and &quot;Order By&quot; order</title>
		<link>http://www.ytechie.com/2008/05/sql-server-null-values-and-order-by-order.html</link>
		<comments>http://www.ytechie.com/2008/05/sql-server-null-values-and-order-by-order.html#comments</comments>
		<pubDate>Mon, 19 May 2008 18:21:25 +0000</pubDate>
		<dc:creator>superjason</dc:creator>
				<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://www.ytechie.com/2008/05/sql-server-null-values-and-order-by-order.html</guid>
		<description><![CDATA[I have a few tables that contain a column called &#34;Order&#34;, which is used to sort by when retrieving the data. The purpose is to keep the data in a certain order when displayed to the end user.



Black Linen: NULL
Navy Blue Linen: NULL
Dark Green Linen: NULL
Burgundy Linen: NULL
Ivory Vellum: NULL
Grey Felt: NULL
Natural Linen: NULL
White Coated Two Sides: 1
White Cast Coated One Side: 2
White SemiGloss [...]]]></description>
			<content:encoded><![CDATA[<p>I have a few tables that contain a column called &quot;Order&quot;, which is used for sorting when the data is retrieved. The purpose is to keep the data in a certain order when displayed to the end user.</p>
<table style="width: 261px; border-collapse: collapse" cellspacing="0" cellpadding="0" width="261" border="2">
<tbody>
<tr style="height: 15pt" height="20">
<td style="width: 167pt; height: 15pt" width="216" height="20">Black Linen</td>
<td style="width: 29pt" width="41">NULL</td>
</tr>
<tr style="height: 15pt" height="20">
<td style="height: 15pt" width="216" height="20">Navy Blue Linen</td>
<td width="41">NULL</td>
</tr>
<tr style="height: 15pt" height="20">
<td style="height: 15pt" width="216" height="20">Dark Green Linen</td>
<td width="41">NULL</td>
</tr>
<tr style="height: 15pt" height="20">
<td style="height: 15pt" width="216" height="20">Burgundy Linen</td>
<td width="41">NULL</td>
</tr>
<tr style="height: 15pt" height="20">
<td style="height: 15pt" width="216" height="20">Ivory Vellum</td>
<td width="41">NULL</td>
</tr>
<tr style="height: 15pt" height="20">
<td style="height: 15pt" width="216" height="20">Grey Felt</td>
<td width="41">NULL</td>
</tr>
<tr style="height: 15pt" height="20">
<td style="height: 15pt" width="216" height="20">Natural Linen</td>
<td width="41">NULL</td>
</tr>
<tr style="height: 15pt" height="20">
<td style="height: 15pt" width="216" height="20">White Coated Two Sides</td>
<td align="right" width="41">1</td>
</tr>
<tr style="height: 15pt" height="20">
<td style="height: 15pt" width="216" height="20">White Cast Coated One Side</td>
<td align="right" width="41">2</td>
</tr>
<tr style="height: 15pt" height="20">
<td style="height: 15pt" width="216" height="20">White SemiGloss Coated One Side</td>
<td align="right" width="41">3</td>
</tr>
<tr style="height: 15pt" height="20">
<td style="height: 15pt" width="216" height="20">White Smooth</td>
<td align="right" width="41">4</td>
</tr>
<tr style="height: 15pt" height="20">
<td style="height: 15pt" width="216" height="20">White Linen</td>
<td align="right" width="41">5</td>
</tr>
</tbody>
</table>
<p>The problem is that SQL Server puts null values above non-null values when doing an ascending &quot;order by&quot;. To reverse this behavior, this was the most elegant and efficient solution that I found:</p>
<pre class="sql" name="code">Select FooValue
From foos
Order by (Case When [Order] Is Null Then 1 Else 0 End), [Order]</pre>
<p>I found information about the <a href="http://cfsilence.com/blog/client/index.cfm/2006/2/21/TSQL-Dealing-with-Null-Values-in-an-Order-By-Clause" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/cfsilence.com/blog/client/index.cfm/2006/2/21/TSQL-Dealing-with-Null-Values-in-an-Order-By-Clause?referer=');">original problem here</a>, and the solution was from <a href="http://www.zeroesque.com/" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/www.zeroesque.com/?referer=');">Tim</a> in the comments. Thanks!</p>
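<p>If the sorting has to happen in C# instead, the same idea carries over to LINQ. The following is only a sketch: the <em>Foo</em> class and the <em>SortNullsLast</em> helper are hypothetical names, and the rows are a small sample of the table above. The first sort key plays the role of the Case expression, and the second is the Order value itself.</p>
<pre name="code" class="c#">using System;
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public string FooValue { get; set; }
    public int? Order { get; set; }
}

public static class NullOrderDemo
{
    // Mirrors the Case expression: rows without an Order sort after rows with one.
    public static List&lt;string&gt; SortNullsLast(IEnumerable&lt;Foo&gt; foos)
    {
        return foos
            .OrderBy(f =&gt; f.Order == null ? 1 : 0)
            .ThenBy(f =&gt; f.Order)
            .Select(f =&gt; f.FooValue)
            .ToList();
    }

    public static void Main()
    {
        var foos = new List&lt;Foo&gt;
        {
            new Foo { FooValue = "Black Linen", Order = null },
            new Foo { FooValue = "White Smooth", Order = 4 },
            new Foo { FooValue = "White Coated Two Sides", Order = 1 }
        };

        Console.WriteLine(string.Join(", ", SortNullsLast(foos)));
        // White Coated Two Sides, White Smooth, Black Linen
    }
}</pre>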
]]></content:encoded>
			<wfw:commentRss>http://www.ytechie.com/2008/05/sql-server-null-values-and-order-by-order.html/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>When should you use database constraints?</title>
		<link>http://www.ytechie.com/2008/05/when-should-you-use-database-constraints.html</link>
		<comments>http://www.ytechie.com/2008/05/when-should-you-use-database-constraints.html#comments</comments>
		<pubDate>Tue, 13 May 2008 11:56:48 +0000</pubDate>
		<dc:creator>superjason</dc:creator>
				<category><![CDATA[productivity]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://www.ytechie.com/2008/05/when-should-you-use-database-constraints.html</guid>
		<description><![CDATA[A discussion came up at work recently about the extent of constraint usage in your databases. There were basically 2 camps:

Constrain everything humanly possible. If it&#8217;s an integer that wouldn&#8217;t normally be negative, add a &#34;&#62;= 0&#34; constraint. 
Constrain primarily where it&#8217;s necessary to maintain referential integrity. 

Consider the following diagram. It&#8217;s a map of [...]]]></description>
			<content:encoded><![CDATA[<p>A discussion came up at work recently about the extent of constraint usage in your databases. There were basically 2 camps:</p>
<ol>
<li>Constrain everything humanly possible. If it&#8217;s an integer that wouldn&#8217;t normally be negative, add a &quot;&gt;= 0&quot; constraint. </li>
<li>Constrain primarily where it&#8217;s necessary to maintain referential integrity. </li>
</ol>
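<p>As a rough sketch of the difference between the two camps, the first statement below is the kind of check the first camp would add, and the second is the kind of foreign key both camps would keep. The table, column, and constraint names are hypothetical, made up for illustration:</p>
<pre class="sql" name="code">-- Camp 1: reject negative values at the database level (hypothetical names)
Alter Table OrderItems
Add Constraint CK_OrderItems_Quantity_NonNegative
Check (Quantity &gt;= 0)

-- Camp 2: constrain only what protects referential integrity (hypothetical names)
Alter Table OrderItems
Add Constraint FK_OrderItems_Orders
Foreign Key (OrderId) References Orders (OrderId)</pre>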
<p align="left">Consider the following diagram. It&#8217;s a map of the flow of data from your user, which eventually makes its way into the database.</p>
<p align="center"><img height="76" alt="Validation Layers" src="http://www.ytechie.com/post-images/2008/05/validation-layers.gif" width="388" border="0" />&#160; </p>
<p>Since we&#8217;re getting input from a user, and they&#8217;re the one that can fix invalid data, we validate data at the top layer. There&#8217;s usually no getting around this. In fact, for the best user experience on the web, you&#8217;re going to perform some JavaScript validation. Then you&#8217;ll probably validate it again on the server, in case they have JavaScript disabled.</p>
<p>At this point, unless there is a bug in your code, you&#8217;re sure that the data is valid. You may not know if it&#8217;s referentially valid. Validating the input a third time in the database is <em>probably</em> overkill. It&#8217;s also a potential performance bottleneck.</p>
<p>Yes, there are many times when this doesn&#8217;t apply. For example, when <strong>multiple systems are interacting with the same database</strong>, and one counts on the data being in a certain format. The only way to guarantee you get data in a format you expect is to constrain it at the database level.</p>
<p>In general, I avoid strict constraints at the database level. The biggest reason is that it requires you to synchronize all of your validators. They all have to agree on the same set of restrictions, or the code will fail. That goes against the LEAN and Agile philosophies. When I want to allow negative numbers in my integer field, it&#8217;s much easier to simply change it in my application. This is amplified if you have to talk to a DBA to make changes.</p>
<p>Another reason to avoid constraints is that they can&#8217;t always understand the data like the application can. For example, should a constraint attempt to ensure that valid email addresses are entered? If you&#8217;re storing a person&#8217;s age, do you constrain it so that it can&#8217;t go above 500? 200? 100?</p>
<p align="center"><img height="210" alt="iStock_000005716223XSmall" src="http://www.ytechie.com/post-images/2008/05/istock-000005716223xsmall.jpg" width="279" border="0" /> </p>
<p>Now let&#8217;s assume that there is a bug in your application code, and you didn&#8217;t have a trusty constraint to stop it. You now have invalid data in your database. The good news is that you now have the potential to clean it up, or adapt your code to deal with it. The bad news is that if that value is used in a calculation that could have bad side effects, you could have big problems.</p>
<p>As with anything, there is no hard and fast rule for every situation, but we can at least make some general guidelines.</p>
<p>Think LEAN: anything that doesn&#8217;t provide value to the customer is waste. Think Agile: requirements change, so be adaptable.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ytechie.com/2008/05/when-should-you-use-database-constraints.html/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>LINQ, I&#8217;m not ready for you just yet</title>
		<link>http://www.ytechie.com/2008/04/linq-im-not-ready-for-you-just-yet.html</link>
		<comments>http://www.ytechie.com/2008/04/linq-im-not-ready-for-you-just-yet.html#comments</comments>
		<pubDate>Mon, 28 Apr 2008 21:45:24 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[c#]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://www.ytechie.com/2008/04/linq-im-not-ready-for-you-just-yet.html</guid>
		<description><![CDATA[Today I was between features on the current project I&#8217;m working on, so I had some free time to start researching some technologies I&#8217;ve been meaning to learn and start using. The topics at the top of my learning list are LINQ and MVC. I gave LINQ a few months to mature, so I figured it [...]]]></description>
			<content:encoded><![CDATA[<p>Today I was between features on the current project I&#8217;m working on, so I had some free time to start researching some technologies I&#8217;ve been meaning to learn and start using. The topics at the top of my learning list are <a href="http://msdn2.microsoft.com/en-us/netframework/aa904594.aspx" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/msdn2.microsoft.com/en-us/netframework/aa904594.aspx?referer=');">LINQ</a> and <a href="http://weblogs.asp.net/scottgu/archive/2007/10/14/asp-net-mvc-framework.aspx" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/weblogs.asp.net/scottgu/archive/2007/10/14/asp-net-mvc-framework.aspx?referer=');">MVC</a>. I gave LINQ a few months to mature, so I figured it was a good time to investigate.</p>
<p align="center"><img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="212" alt="Boy Crying" src="http://www.ytechie.com/post-images/2008/04/boy-crying.jpg" width="265" border="0" /> </p>
<p align="center">(there is no emoticon to express my anger!)</p>
<p>The picture above shows how I felt when I started writing my first LINQ expression. The biggest problem was the fact that the latest version of <a href="http://www.jetbrains.com/resharper/download/new-VS-support.html" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/www.jetbrains.com/resharper/download/new-VS-support.html?referer=');">ReSharper</a> doesn&#8217;t support any .NET 3.0+ language features. Not only does it not support LINQ, its IntelliSense severely interrupts you while writing it, so much so that it makes it unusable.</p>
<p>I went ahead and downloaded the latest development build (Build 783). On their <a href="http://www.jetbrains.net/confluence/display/ReSharper/ReSharper+4.0+Nightly+Builds" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/www.jetbrains.net/confluence/display/ReSharper/ReSharper+4.0+Nightly+Builds?referer=');">download page</a>, it&#8217;s listed as &quot;Works here&quot;. That wasn&#8217;t encouraging. It does work a little better with LINQ, but it&#8217;s still a steaming pile of you know what (dog poop for the not-so-smart). This further reinforces my love/hate relationship with ReSharper.</p>
<p>Anyway, I was eventually able to write some LINQ code. A great tool to get started is <a href="http://www.linqpad.net/" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/www.linqpad.net/?referer=');">LinqPad</a>, which is basically a query analyzer but with LINQ expressions. Writing LINQ is very difficult with a SQL background, because everything is backwards. You think you know what you&#8217;re doing, but you don&#8217;t.</p>
<p>Right now, we&#8217;re using <a href="http://www.hibernate.org/343.html" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/www.hibernate.org/343.html?referer=');">NHibernate</a> in the main project that I&#8217;ve been working on for the past couple of months. It&#8217;s amazing, but there are a couple of things that would be nice:</p>
<ul>
<li>Better optimization of queries &#8211; It looks like LINQ does an amazing job with this.</li>
<li>Batched reads &amp; writes &#8211; LINQ does batched writes, but lazy loading by default. Maybe not as big a deal as I think.</li>
<li>Cross session saving &#8211; I spent hours battling with some code that loaded a complex object with relationships in one session, and then saved them in another. It appears that LINQ solves this, but I&#8217;ll have to run some tests to be sure.</li>
<li>Less work generating mapping files and relationships.</li>
</ul>
<p>One thing that is nice about LINQ to SQL is the fact that it will generate all of the model classes, plus the glue that connects the model to the database. You can either use Visual Studio and drop the tables into a mapping file, or you can use <a href="http://msdn2.microsoft.com/en-us/library/bb386987.aspx" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/msdn2.microsoft.com/en-us/library/bb386987.aspx?referer=');">SqlMetal</a> to script the class generation.</p>
<p>One of the biggest questions I&#8217;m trying to answer right now is how unit testing fits in with LINQ. We&#8217;re currently testing our data access layer by using an in-memory <a href="http://www.sqlite.org/" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/www.sqlite.org/?referer=');">SQLite</a> database, which lets us perform <a href="http://www.ayende.com/Blog/archive/2006/10/14/7183.aspx" rel="nofollow" onclick="pageTracker._trackPageview('/outgoing/www.ayende.com/Blog/archive/2006/10/14/7183.aspx?referer=');">close to real world saves and loads</a>. We also use interfaces for our data access methods, which makes it easy to create testable classes that can simply be supplied a database interface.</p>
<p>I&#8217;m also not sure if it even makes sense to put my LINQ queries in a data access layer. The code would almost seem trivial, and would just create a lack of flexibility. Ironically, it <em>almost</em> feels like you should use LINQ to query <em>against</em> your data access layer.</p>
<p>For now, there are more questions than answers. I don&#8217;t plan on retrofitting my last project with LINQ, but I&#8217;m going to investigate whether it will be a good foundation for the data access logic in my next project. Of course if I go that route, you&#8217;ll be sure to hear about it!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ytechie.com/2008/04/linq-im-not-ready-for-you-just-yet.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Disable constraints in &quot;After Insert&quot; trigger</title>
		<link>http://www.ytechie.com/2008/04/disable-constraints-in-after-insert-trigger.html</link>
		<comments>http://www.ytechie.com/2008/04/disable-constraints-in-after-insert-trigger.html#comments</comments>
		<pubDate>Wed, 23 Apr 2008 15:20:00 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://207.36.235.13/2008/04/disable-constraints-in-after-insert-trigger.html</guid>
		<description><![CDATA[I have a table that stores extra information (Users) that gets associated with the &#34;aspnet_Membership&#34; table in my application. Since my table references the membership table, I have a foreign key for referential integrity.
I added a trigger to the membership table so that rows automatically get inserted into my table. The problem is, the trigger [...]]]></description>
			<content:encoded><![CDATA[<p>I have a table that stores extra information (Users) that gets associated with the &quot;aspnet_Membership&quot; table in my application. Since my table references the membership table, I have a foreign key for referential integrity.</p>
<p>I added a trigger to the membership table so that rows automatically get inserted into my table. The problem is, the trigger violates the foreign key constraint! Here is the trigger code:</p>
<pre class="sql" name="code">Create Trigger dbo.Trigger_CreateExtraUserRecord ON aspnet_Users
After Insert
As
Begin
    Set Nocount On
    Insert Into tfs_Users
    (MembershipUserId)
    Select UserId From inserted
End</pre>
<p>As you can see, it&#8217;s an &quot;After Insert&quot; trigger, so the first insert will be done at this point (I have verified that).</p>
<p>It must be using some kind of transaction, and the foreign key is violated because it&#8217;s not committed.</p>
<p>The solution (not perfect, but it works) is to run this before the insert to disable the foreign key check:</p>
<pre class="sql" name="code">Alter Table tfs_Users Nocheck Constraint All</pre>
<p>And use this after:</p>
<pre class="sql" name="code">Alter Table tfs_Users Check Constraint All</pre>
<p>Instead of disabling all constraints, you could also specify the constraint name, which would obviously be better in most cases.</p>
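<p>A minimal sketch of that narrower version follows. The constraint name FK_tfs_Users_aspnet_Membership is an assumption, so the first query looks up the real name. It also shows which table each key references, which is worth checking here because the trigger above is defined on aspnet_Users. The sys.foreign_keys catalog view requires SQL Server 2005 or later:</p>
<pre class="sql" name="code">-- List the foreign keys on tfs_Users and the tables they reference
Select name, Object_Name(referenced_object_id) As ReferencedTable, is_disabled
From sys.foreign_keys
Where parent_object_id = Object_Id('tfs_Users')

-- Disable only the one constraint (the name here is hypothetical)
Alter Table tfs_Users Nocheck Constraint FK_tfs_Users_aspnet_Membership

-- ... perform the insert ...

-- Re-enable it; With Check re-validates existing rows so the constraint is trusted again
Alter Table tfs_Users With Check Check Constraint FK_tfs_Users_aspnet_Membership</pre>
<p>Note that the last statement will fail if any existing row still violates the foreign key, which is probably what you want to find out anyway.</p>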
<p>Does anyone have a better explanation for this behavior?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ytechie.com/2008/04/disable-constraints-in-after-insert-trigger.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
