Linq-to-Sql - what are the implications? - Ernst Kuschke

   Ernst Kuschke

     Arbitrary thoughts and musings on life, the universe and everything else

Syndication

News

    ernst kuschke (v1.0)

    My Photos

    Microsoft Most Valuable Professional

    Member in good standing

    View Ernst Kuschke's profile on LinkedIn

    Add to Technorati Favorites

Blogs I read

Books I recommend

General Links

Linq-to-Sql - what are the implications?

Craig has been posting on the topic a bit lately, and innitially I wanted to respond to his latest post with a comment; it got so long that it warranted a post on itself.

For those uninitiated ones, Linq-to-Sql is what used to be called DLinq - this will replace the ADO.NET code we currently use to access data in a relational database. It's important to note that Linq-to-Sql only supports SQL Server as a database. There are many reasons for this, one being that SQL code is generated by the framework, and to optimise data-access, Linq will generate T-SQL. (Other vendors, Oracle for example, can create a Linq-to-Oracle library if they wish - here's an example of an open-source implementation for MySql).

Just as Craig, I was also quite sceptical about LINQ when I heard about it at first - the intermingling of queries with my code reminded me a lot of Foxpro, which (no offense!) really isn't my cup of tea. I was pleasantly surprised when I investigated LINQ and realised how brilliant its OO design is. Craig said:

I am a firm believer in the current database development best practice of providing a stored procedure interface to a database to provide a layer of abstraction, increase maintainability and allow for performance improvement through database query optimisation and tweaking. Sometimes this includes blurring the ideals of having the business logic implemented completely outside of the database in the hypothetical business logic layer or object workspace.

I completely agree with him. Actually, the "blurring" of boundaries betweeb data-layers and business-layers is a major problem - we all know, when using stored procedures (sprocs) (as great as the set-based advantage is), how difficult it is to keep business logic out of SQL code. Before you know it, your sprocs are littered with complex logic that gets difficult to maintain, and completely invisible to calling code. I know it doesn't have to happen, but it inevitably does in reality. To sum up my feelings: Linq will change the way we programmatically interact with data for (I am tempted to say for EVER EVER) at least the next few years, and I think other programming domains (java?!) will catch onto this idea eventually.

The use of stored procs for CRUD operations has been advocated by Microsoft for many years, and, they were in fact wrong. It will be difficult (and interesting to see) for them to try and correct this perception from now onwards!

Designing Queries:

Linq overcomes these problems completely. First of all, if you *really* prefer to use stored procedures, you can easily do so. Here's a screenshot of the Dlinq designer - note that you can specify stored procedures, or even your own command-text, for each CRUD operation.

These entities are created from the old Northwind database, simply by dragging tables from the Server Explorer panel onto the designer surface.

Of course, this designer diagram represents auto-generated code, which is updated in the background as you tweak this visual representation. Furthermore, the generated classes are partial, so if you wanted to specify your own behaviour (like a GetMostValuedCustomers method) on any of these classes, you can do so in your own seperate partial class, so as not to have it overwritten by the designer. The designer generates a DataContext, as well as the Entity classes seen here. It also generates the relationship depicted.

Methods can also be auto-generated by dragging sprocs onto the designer surface - this gives you the immediate advantage of strong-typed calling without having to go do the plumbing of converting SqlDbTypes to .net types.

So I can query my Linq object model as follows:

   1:  Northwind northwind = new Northwind();
   2:  {
   3:      var customerProjections = from c in northwind.Customers
   4:            where c.City == "London"   
   5:            orderby c.Orders.Count descending   
   6:            select new { Name = c.ContactName, City = c.City, NumberOfOrders = c.Orders.Count };   
   7:   
   8:      if(someBooleanExpression)  
   9:      {
  10:          foreach(var customerProjection in customerProjections)
  11:          {
  12:               //Do something with customerProjection
  13:          }
  14:      }
  15:  }

The actual query that gets generated and sent to SQL Server by this code looks as follows:

   1:  SELECT [t0].[ContactName], [t0].[City], (
   2:      SELECT COUNT(*)
   3:      FROM [Orders] AS [t2]
   4:      WHERE [t2].[CustomerID] = [t0].[CustomerID]
   5:      ) AS [NumberOfOrders]
   6:  FROM [Customers] AS [t0]
   7:  WHERE [t0].[City] = 'London'
   8:  ORDER BY (
   9:      SELECT COUNT(*)
  10:      FROM [Orders] AS [t1]
  11:      WHERE [t1].[CustomerID] = [t0].[CustomerID]
  12:      ) DESC

Performance

Firstly, a Linq data query only executes as soon as its members are accessed. This means that only in line 11 in the C# code above, will a SQL Connection be opened and the listed TSQL be executed upon the database. The SQL Connection is closed immediately when leaving the scope of the statement that caused it to be opened - this means line 15 in our C# listing. (This behaviour is known as deferred execution).

Deferred execution allows you to reuse your queries - each time you access members of the query, the query will actually be refreshed from the database, which means that you're never working with stale instances, decreasing the possiblity of concurrency issues. (Of course you have control over this, and the roundtrip doesn't have to happen every time).

Identity

The DataContext object has a tracking service that tracks identity for each object. This means that, when your LINQ query retrieves an object that has the same conceptual identity as an object that is already in memory, a pointer to this object is returned. For example: if I retrieve all customers in London, I might end up with a list that includes a customer with ID '4'. If I now issue a query for all customers with ID '4', I will get a list with an object that points to the same instance in memory as customer '4' in my first result.

Linq will throw a OptimisticConcurrencyException in case of a concurrency violation, without losing any of the in-memory changes to the object model that you tried to persist. This enables you to react appropriately:

   1:  while (retries < maxRetries)
   2:  {
   3:      // fetch objects and make changes here
   4:      try
   5:      {
   6:          northwind.SubmitChanges();
   7:          break;
   8:      }
   9:      catch (OptimisticConcurrencyException e)
  10:      {
  11:          retries++;
  12:      }
  13:  }

More observations

Linq also supports SQL User-defined Functions (UDFs), stored procedures, database creation (great at installation-time), and more. No valuable performance figures are available at this stage as the product is still changing constantly.

Just to end off with, I thought I'd post a preview of what will be possible with Linq; data queries are purely conceptual, and completely removed from the data's representation. The following statement selects data from a database, and transforms it into an XML document (note how easy it is to visualize the resulting XML document from the Linq-to-XML syntax):

   1:  var e = new XElement(“results”,
   2:      from c in db.Customers
   3:      where c.City == “London”
   4:      select new XElement(“customer”,
   5:          new XElement(“name”, c.ContactName),
   6:          new XElement(“phone”, c.Phone)
   7:      )
   8:  );

I'll be posting more on the Linq Project shortly, as well as do some talks in Cape Town for sadeveloper.net - keep an eye on their site for more details.

Published Monday, March 26, 2007 1:46 PM by Ernst Kuschke

Comments

# re: Linq-to-Sql - what are the implications?@ Monday, March 26, 2007 5:26 PM

PLEASE I beg of you, if/when you do a talk in CT, please upload your notes here for all to see.

Thanks a million.

StevenMcD

# re: Linq-to-Sql - what are the implications?@ Monday, March 26, 2007 9:13 PM

Thanks for the response, I look forward to getting more in-depth experience with LINQ to SQL and continuing my quest.

# re: Linq-to-Sql - what are the implications?@ Wednesday, March 28, 2007 1:02 AM

Requerying the same query will not retrieve fresh data.  It will retreive a new sequence of entities whose in-database values currently match the query, but the values in the those in-memory entities will be the same stale values you saw before. This is actually how optimistic concurrency is performed. Because you did not refresh the state on a requery, you can compare that original state against the database current state during an update. You can, however, specifically request to refresh the state of an entity using the 'Refresh' method on DataContext.

# re: Linq-to-Sql - what are the implications?@ Wednesday, March 28, 2007 10:39 AM

Matt,

Requerying the same query will indeed refresh the data:

Northwind northwind = new Northwind();

{

   var customerProjections = from c in northwind.Customers

       where c.City == "London"  

       orderby c.Orders.Count descending  

       select new { Name = c.ContactName, City = c.City, NumberOfOrders = c.Orders.Count };  

   Console.WriteLine("Before changing names:");

   foreach(var customerProjection in customerProjections)

   {

       Console.WriteLine(customerProjection.Name);

       customerProjection.Name = "bang";

   }

   Console.WriteLine("\nAfter changing names:");

   foreach(var customerProjection in customerProjections)

   {

       Console.WriteLine(customerProjection.Name);

   }

}

will produce:

Before changing names:

Thomas Hardy

Victoria Ashworth

Hari Kumar

Ann Devon

Simon Crowther

Elizabeth Brown

After changing names:

Thomas Hardy

Victoria Ashworth

Hari Kumar

Ann Devon

Simon Crowther

Elizabeth Brown

See that the anonymous types' names have been re-retrieved from the database ;o)

# re: Linq-to-Sql - what are the implications?@ Thursday, March 29, 2007 7:05 PM

I'm not sure I understand why DataContext cannot accept an IDBConnection which has an underlying "System.Data.OracleClient.OracleConnection". Its easy to obtain an "System.Data.OracleClient.OracleConnection" using MS Enterprise library,Data Access Application block, which in turn can be passeed into DataContext. This compiles perfectly but throws the following runtime exception

"{"connection must be string, SqlConnection or SqlTransaction"}

by Abhijeet

# re: Linq-to-Sql - what are the implications?@ Wednesday, April 04, 2007 2:47 PM

The DataContext, which forms part of Linq-to-SQL, has specifically been coupled to the SQL-specific data access platform.

DataContext generates SQL statements to execute against the database you connect to, and Oracle and SQL Server use two distinctly different flavours of SQL. (PLSQL and T-SQL)

At this stage, Linq-to-SQL generates T-SQL that is optimised for SQL Server.

Leave a Comment

(required) 
(required) 
(optional)
(required) 

Enter the numbers above: