LINQ (Language-Integrated Query)

"Language Integrated Query (LINQ) is Microsoft's general-purpose query language which is directly embedded in C#. LINQ has a declarative syntax, called Query Syntax, which resembles SQL. LINQ also has an imperative syntax, called Method Syntax. LINQ queries can be applied to various types of data stores (e.g. in-memory collections, databases, XML documents, flat files, etc). LINQ makes the query a first-class language construct with the ability to be executed immediately or assigned to a query variable for deferred execution."

.NET Language-Integrated Query

LINQ was introduced in C# 3.0 (.NET 3.5) and required significant enhancements to the platform. The enhancements not only supported the implementation of LINQ, but they can also be used in other contexts. For example, lambda expressions also simplified working with delegates and superseded the Anonymous Methods construct which was released in C# 2.0. The LINQ enhancements were deeply embedded into the grammar of C# allowing LINQ queries to have IntelliSense support and type checking at compile time. The enhancements included the addition of the following constructs to C#:

  1. Extension Methods - allow you to add new functionality to existing classes without needing to subclass. This is accomplished with a static method which qualifies the first parameter with the this keyword. Extension methods can be called as if they were instance methods, however instance methods have priority. LINQ functionality was added as a set of extension methods for IEnumerable<T>. More information about extension methods can be found in my Extension Methods article.

  2. Lambda Expressions - use the lambda operator => to provide a concise syntax for creating anonymous functions. Lambda Expressions can be used with LINQ extension methods, such as the LINQ standard query operators, as shown in the following example. Additionally, lambda expressions simplified the syntax for creating inline logic for delegates and superseded the anonymous method construct introduced in C# 2.0.

    var names = collection.Where(item => item.Name == "Kevin")
                          .OrderBy(item => item.Age)
                          .Select(item => item.Name);

    More information about lambda expressions can be found in my Lambda Expressions article.

  3. Anonymous Types - use the new operator with an object initializer to assign read-only data to an object without explicitly defining a type. This feature allows the compiler to generate a data class based on the supplied set of name/value pairs. A typical use is in the select clause of a query expression. The following example uses an anonymous type to associate an id with each element of an array:

    Example 1: LINQ Query


    using System;
    using System.Linq;

    class Program
    {
        static int id = 1;

        static void Main()
        {       
            string[] myArray = { "one", "two", "three", "four"};

            // Use new operator to create anonymous type
            var myQuery = from element in myArray
                          select new { Value = element, Id = id++ };

            foreach (var myType in myQuery)
                Console.WriteLine(myType);
        }
    }

  4. Implicitly Typed Variables - use the var keyword to define a local variable without explicitly specifying a data type. Instead, the compiler infers the data type. This is frequently used with LINQ queries because the type of their result sets may not be obvious, or even directly accessible. For example, a query variable must be implicitly typed when the query contains an anonymous type in its select clause.

  5. Object Initialization Syntax - is an optional, compact, yet readable, syntax for setting an object's data inside its declaration. The syntax is useful when using an anonymous type in a LINQ select clause to project new data, as shown in example #1 above. The syntax can be used to initialize a collection of objects (e.g. List, Dictionary, etc).
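As a rough illustration of how an extension method is declared and called, the sketch below adds a hypothetical LongerThan method to IEnumerable<string>. The class name StringSequenceExtensions and the method itself are made up for this example and are not part of LINQ. The list is filled with the collection initializer syntax described in item 5.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; the name and the method are illustrative only.
public static class StringSequenceExtensions
{
    // The "this" modifier on the first parameter makes LongerThan callable
    // as if it were an instance method of any IEnumerable<string>.
    public static IEnumerable<string> LongerThan(this IEnumerable<string> source, int length)
    {
        foreach (string item in source)
        {
            if (item.Length > length)
                yield return item;
        }
    }
}

class Program
{
    static void Main()
    {
        // Collection initializer: the list is filled inside its declaration.
        var words = new List<string> { "one", "two", "three", "four" };

        // Called with instance-method syntax on List<string>.
        foreach (string word in words.LongerThan(3))
            Console.WriteLine(word); // Prints: three, four
    }
}
```

The standard query operators such as Where and Select are declared in the same way, which is why they appear on every IEnumerable<T> once System.Linq is imported.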


Components of LINQ

The LINQ API is designed to provide a consistent manner for accessing various forms of data. However LINQ is divided into components which are classified by the type of supported data stores. These components include:

  1. LINQ to Objects - the use of LINQ queries with any enumerable collection (List, Dictionary, custom, etc). This is also useful for working with data that has been consolidated from one or more data sources.

  2. LINQ to XML - provides in-memory XML document modification (similar to the capabilities of the DOM). It enables creating, loading, querying, validating, modifying, and serializing XML.

  3. LINQ to SQL - the use of LINQ queries with relational databases.

  4. LINQ to Entities - the use of LINQ queries with the Entity Framework conceptual model.

Note: Additional LINQ libraries are available in .NET for supporting different types of data, such as LINQ to DataSet (ADO.NET) and PLINQ (Parallel LINQ). Third-party libraries also exist that extend LINQ to additional data types, such as the LINQtoCSV library, which adds support for CSV and tab-delimited files.

LINQ Namespaces
  1. System.Linq - contains classes and interfaces that support LINQ queries.
  2. System.Linq.Expressions - supports advanced LINQ functionality with expression trees.
  3. System.Data.Linq - supports interaction with relational databases (LINQ to SQL).
  4. System.Xml.Linq - supports working with XML documents (LINQ to XML).
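
To give a feel for one of these components, here is a minimal LINQ to XML sketch that uses the System.Xml.Linq namespace. The XML content and the element and attribute names (bikes, bike, make, cc) are invented for this example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Hypothetical XML content, made up for this sketch.
        XDocument doc = XDocument.Parse(
            "<bikes>" +
            "<bike make='BMW' cc='1250' />" +
            "<bike make='Ducati' cc='937' />" +
            "<bike make='Indian' cc='1890' />" +
            "</bikes>");

        // XAttribute supports explicit conversions to int and string.
        var bigBikes = from bike in doc.Root.Elements("bike")
                       where (int)bike.Attribute("cc") > 1000
                       orderby (string)bike.Attribute("make")
                       select (string)bike.Attribute("make");

        foreach (string make in bigBikes)
            Console.WriteLine(make); // Prints: BMW, Indian
    }
}
```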

LINQ Syntax

The two kinds of syntax for coding LINQ queries are known as Query and Method. Choosing which syntax to use is, to a large extent, a matter of preference. However, sometimes it may depend upon the nature of the query. There are some operators which are supported in Method syntax, but not in Query syntax. Query and Method syntax can be combined to create a "Mixed Query", but some people dislike mixed queries. Regardless of the syntax used for coding, the compiler will translate all the queries into Method syntax. An example of Query, Method, and Mixed syntax is shown below.

While LINQ Query Syntax may resemble SQL, its underlying processing is very different. LINQ is actually based on the concept of list comprehensions from functional languages, despite its use of SQL-like keywords. This underlying difference makes the process of coding optimal LINQ queries a little different from coding SQL queries. LINQPad is an interactive application for running LINQ queries. It can also show a query in Method and Query syntax, SQL, and Intermediate Language. LINQPad contains sample queries and is a good tool for learning and exploring LINQ and newer technologies. However, LINQPad is not a replacement for SQL Server Management Studio, just as LINQ is not a replacement for SQL.

Example 2: LINQ Syntax (Query, Method, Mixed)

using System;
using System.Linq;

class Program
{
    static void Main()
    {       
        string[] myArray = { "one", "two", "three", "four"};

        // Query Syntax
        var querySyntax = from element in myArray
                          where element.Contains("o")
                          orderby element.Length
                          select element;

        // Method Syntax
        var methodSyntax = myArray.Where(item => item.Contains("o"))
                                  .OrderBy(item => item.Length)
                                  .Select(item => item);

        // Mixed Syntax
        var mixedSyntax = (from element in myArray
                           where element.Contains("o")
                           orderby element.Length
                           select element).Take(1);
    }
}

Syntax Comparison
  1. Query - a.k.a. Query Expression, Query Comprehension
    • Declarative syntax. Resembles SQL. Does not have as many operators as Method Syntax.
    • Simpler for queries that involve multiple range variables (e.g. SelectMany, Join, GroupJoin) or a let clause.
    • Any operator outside the following list must be written, at least in part, with Method syntax: (Where, Select, SelectMany, OrderBy, ThenBy, OrderByDescending, ThenByDescending, GroupBy, Join, GroupJoin, Cast).
  2. Method - a.k.a. Fluent, Lambda Syntax, Method Chaining
    • Imperative syntax. Looks more like C# code. Has more operators than Query Syntax.
    • Shorter and simpler for queries that comprise a single operator.
    • Custom extension methods can be created for unique situations.
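
The point about multiple range variables and the let clause can be sketched as follows. The sample data is invented, and the Method syntax version is one equivalent way to write the query, not necessarily the exact code the compiler generates.

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        string[] phrases = { "one two", "three", "four five six" };

        // Query syntax: the second from clause flattens each phrase into words,
        // and let stores an intermediate value for reuse.
        var querySyntax = from phrase in phrases
                          from word in phrase.Split(' ')
                          let length = word.Length
                          where length > 3
                          select word.ToUpper();

        // Method syntax: an equivalent query written with SelectMany.
        var methodSyntax = phrases.SelectMany(phrase => phrase.Split(' '))
                                  .Where(word => word.Length > 3)
                                  .Select(word => word.ToUpper());

        Console.WriteLine(string.Join(", ", querySyntax));  // Prints: THREE, FOUR, FIVE
        Console.WriteLine(string.Join(", ", methodSyntax)); // Prints: THREE, FOUR, FIVE
    }
}
```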

Immediate vs Deferred Execution

LINQ queries are executed only when they are iterated over (e.g. with a foreach loop). The logic of some query operators, such as Sum() or Count(), includes iteration. When one of these operators is used in a query, it forces immediate execution. If none of these operators is used in a query, then additional code is required to iterate over the query to cause it to be executed. This requirement for additional iteration code to execute the query is known as deferred execution. Deferred execution allows queries to be built in stages, and it also makes database queries possible.

The only query operators that include iteration logic (i.e. force immediate execution) are operators which fall into one of these two categories:

  1. Operators that return a single value (scalar), such as Sum(), Count(), First().
  2. Operators that convert the results into a collection, such as ToArray(), ToList(), ToDictionary(), ToLookup().

The following example code demonstrates coding queries for both immediate and deferred execution. The second query is coded inline while the other three queries are stored in a query variable.

Example 3: Immediate vs Deferred Execution

using System;
using System.Collections.Generic;
using System.Linq;

namespace ImmediateVsDeferred
{
    class Program
    {
        static void Main()
        {
            List<string> myList = new List<string>()
            { "BMW", "Indian", "Ducati", "Harley" };

            var deferredQuery = from x in myList
                                where (x.Contains("a"))
                                select (x);

            var immediateQuery = (from x in myList
                                 where (x.Contains("a"))
                                 select (x)).Count();
                                
            // 1. Execute Immediately - (immediateQuery) Scalar (single value)
            int theCount = immediateQuery;
            Console.WriteLine("theCount: {0}", theCount); // Prints: theCount: 3

            // 2. Execute Immediately - Count (inline Query) Scalar
            int theCount2 = myList.Where(x => x.Contains("a")).Count();
            Console.WriteLine("theCount2: {0}", theCount2); // Prints: theCount2: 3

            // 3. Execute on enumeration - (deferredQuery) Non-Scalar
            foreach (var item in deferredQuery) // Prints: Indian, Ducati, Harley
                Console.WriteLine(item);

            // 4. Execute Immediately - (deferredQuery) ToList Conversion Operator
            List<string> newList = deferredQuery.ToList();

            foreach (var item in newList)
                Console.WriteLine("newList: {0}", item); // Prints: Indian, Ducati, Harley
        }
    }
}
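
One practical consequence of deferred execution is that a deferred query sees changes made to its source after the query was defined, while the result of a conversion operator is a fixed snapshot. The following sketch reuses the motorcycle list from the example above. The added "Yamaha" entry is made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        var bikes = new List<string> { "BMW", "Indian", "Ducati", "Harley" };

        // Defining the query does not run it.
        IEnumerable<string> deferred = bikes.Where(x => x.Contains("a"));

        // ToList runs the query now and stores a snapshot of the results.
        List<string> snapshot = deferred.ToList();

        // Change the source after the query was defined.
        bikes.Add("Yamaha");

        Console.WriteLine(snapshot.Count);   // Prints: 3
        Console.WriteLine(deferred.Count()); // Prints: 4 (the query runs again here)
    }
}
```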

Standard Query Operators

The extension methods in the Enumerable and Queryable classes make up LINQ's standard query operators. Enumerable objects execute methods containing the logic of the query operator. Queryable objects build an expression tree that represents the query to be performed. The methods that make up the standard query operators will either execute immediately, or have their execution deferred to a later time. Methods that return a singleton value (e.g. Average, Sum) execute immediately. Methods that return a sequence defer the query execution and return an enumerable object.

Sorting Data
Method Name | Query Syntax | Description
OrderBy | orderby | Sorts values in ascending order.
OrderByDescending | orderby … descending | Sorts values in descending order.
ThenBy | orderby …, … | Performs a secondary sort in ascending order.
ThenByDescending | orderby …, … descending | Performs a secondary sort in descending order.
Reverse | Not Available | Reverses the order of the elements in a collection.


Set Operations
Method Name | Query Syntax | Description
Distinct | Not Available | Removes duplicate values from a collection.
Except | Not Available | Returns the set difference, which means the elements of one collection that do not appear in a second collection.
Intersect | Not Available | Returns the set intersection, which means elements that appear in each of two collections.
Union | Not Available | Returns the set union, which means unique elements that appear in either of two collections.
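
A small sketch of the four set operators, using two made-up integer arrays. Note that the results never contain duplicates, even when the inputs do.

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        int[] first = { 1, 2, 2, 3, 4 };
        int[] second = { 3, 4, 4, 5 };

        Console.WriteLine(string.Join(", ", first.Distinct()));        // Prints: 1, 2, 3, 4
        Console.WriteLine(string.Join(", ", first.Except(second)));    // Prints: 1, 2
        Console.WriteLine(string.Join(", ", first.Intersect(second))); // Prints: 3, 4
        Console.WriteLine(string.Join(", ", first.Union(second)));     // Prints: 1, 2, 3, 4, 5
    }
}
```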


Filtering Data
Method Name | Query Syntax | Description
OfType | Not Available | Selects values, depending on their ability to be cast to a specified type.
Where | where | Selects values that are based on a predicate function.


Quantifier Operations
Method Name | Query Syntax | Description
All | Not Available | Determines whether all the elements in a sequence satisfy a condition.
Any | Not Available | Determines whether any elements in a sequence satisfy a condition.
Contains | Not Available | Determines whether a sequence contains a specified element.


Projection Operations
Method Name | Query Syntax | Description
Select | select | Projects values that are based on a transform function.
SelectMany | Use multiple from clauses | Projects sequences of values that are based on a transform function and then flattens them into one sequence.


Partitioning Data
Method Name | Query Syntax | Description
Skip | Not Available | Skips elements up to a specified position in a sequence.
SkipWhile | Not Available | Skips elements based on a predicate function until an element does not satisfy the condition.
Take | Not Available | Takes elements up to a specified position in a sequence.
TakeWhile | Not Available | Takes elements based on a predicate function until an element does not satisfy the condition.
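
A common use of the partitioning operators is paging. The sketch below assumes a zero-based page index and a page size of 3. Both variable names are made up for the example.

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        var numbers = Enumerable.Range(1, 10); // 1 through 10

        int pageSize = 3;  // hypothetical page size
        int pageIndex = 1; // zero-based, so this is the second page

        // Skip the earlier pages, then take one page of elements.
        var page = numbers.Skip(pageIndex * pageSize).Take(pageSize);
        Console.WriteLine(string.Join(", ", page)); // Prints: 4, 5, 6

        // The While variants stop at the first element that fails the predicate.
        Console.WriteLine(string.Join(", ", numbers.TakeWhile(n => n < 4))); // Prints: 1, 2, 3
        Console.WriteLine(string.Join(", ", numbers.SkipWhile(n => n < 8))); // Prints: 8, 9, 10
    }
}
```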


Join Operations
Method Name | Query Syntax | Description
Join | join … in … on … equals … | Joins two sequences based on key selector functions and extracts pairs of values.
GroupJoin | join … in … on … equals … into … | Joins two sequences based on key selector functions and groups the resulting matches for each element.
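
The difference between Join and GroupJoin is easiest to see side by side. In this sketch the owners and bikes data, along with the property names Id, Name, OwnerId, and Make, are invented for illustration.

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        // Hypothetical sample data built with anonymous types.
        var owners = new[] { new { Id = 1, Name = "Kevin" }, new { Id = 2, Name = "Ann" } };
        var bikes = new[] { new { OwnerId = 1, Make = "BMW" }, new { OwnerId = 1, Make = "Ducati" } };

        // Join: one result per matching pair. Owners without a bike are dropped.
        var pairs = from o in owners
                    join b in bikes on o.Id equals b.OwnerId
                    select o.Name + " owns " + b.Make;

        foreach (string line in pairs)
            Console.WriteLine(line); // Prints: Kevin owns BMW, Kevin owns Ducati

        // GroupJoin (join ... into): one result per owner, with the matches grouped.
        var counts = from o in owners
                     join b in bikes on o.Id equals b.OwnerId into owned
                     select o.Name + ": " + owned.Count();

        foreach (string line in counts)
            Console.WriteLine(line); // Prints: Kevin: 2, Ann: 0
    }
}
```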


Grouping Data
Method Name | Query Syntax | Description
GroupBy | group … by -or- group … by … into … | Groups elements that share a common attribute. Each group is represented by an IGrouping object.
ToLookup | Not Available | Inserts elements into a Lookup (a one-to-many dictionary) based on a key selector function.
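
As a brief sketch of the two grouping operators, the following code groups a made-up array of words by length. GroupBy is deferred, while ToLookup executes immediately and returns an indexable structure.

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        string[] words = { "one", "two", "three", "four", "five" };

        // Each group is an IGrouping<int, string>: a Key plus the matching elements.
        var byLength = from w in words
                       group w by w.Length into g
                       orderby g.Key
                       select new { Length = g.Key, Words = string.Join(", ", g) };

        foreach (var g in byLength)
            Console.WriteLine("{0}: {1}", g.Length, g.Words);
        // Prints:
        // 3: one, two
        // 4: four, five
        // 5: three

        // ToLookup builds the groups right away and allows access by key.
        ILookup<int, string> lookup = words.ToLookup(w => w.Length);
        Console.WriteLine(lookup[4].Count()); // Prints: 2
        Console.WriteLine(lookup[7].Count()); // Prints: 0 (a missing key yields an empty sequence)
    }
}
```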


Generation Operations
Method Name | Query Syntax | Description
DefaultIfEmpty | Not Available | Replaces an empty collection with a default valued singleton collection.
Empty | Not Available | Returns an empty collection.
Range | Not Available | Generates a collection that contains a sequence of numbers.
Repeat | Not Available | Generates a collection that contains one repeated value.


Equality Operations
Method Name | Query Syntax | Description
SequenceEqual | Not Available | Determines whether two sequences are equal by comparing elements in a pair-wise manner.


Element Operations
Method Name | Query Syntax | Description
ElementAt | Not Available | Returns the element at a specified index in a collection.
ElementAtOrDefault | Not Available | Returns the element at a specified index in a collection or a default value if the index is out of range.
First | Not Available | Returns the first element of a collection, or the first element that satisfies a condition.
FirstOrDefault | Not Available | Returns the first element of a collection, or the first element that satisfies a condition. Returns a default value if no such element exists.
Last | Not Available | Returns the last element of a collection, or the last element that satisfies a condition.
LastOrDefault | Not Available | Returns the last element of a collection, or the last element that satisfies a condition. Returns a default value if no such element exists.
Single | Not Available | Returns the only element of a collection, or the only element that satisfies a condition.
SingleOrDefault | Not Available | Returns the only element of a collection, or the only element that satisfies a condition. Returns a default value if no such element exists or the collection does not contain exactly one element.
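
The element operators differ mainly in what happens when no element, or more than one element, matches. A short sketch with made-up data:

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        int[] numbers = { 5, 8, 12 };
        int[] empty = new int[0];

        Console.WriteLine(numbers.First(n => n > 6));      // Prints: 8
        Console.WriteLine(numbers.Last());                 // Prints: 12
        Console.WriteLine(empty.FirstOrDefault());         // Prints: 0 (default value of int)
        Console.WriteLine(numbers.ElementAtOrDefault(10)); // Prints: 0 (index out of range)

        try
        {
            // Two elements (8 and 12) match, so Single throws.
            numbers.Single(n => n > 6);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("More than one match"); // Prints: More than one match
        }
    }
}
```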


Converting Data Types
Method Name | Query Syntax | Description
AsEnumerable | Not Available | Returns the input typed as IEnumerable.
AsQueryable | Not Available | Converts a (generic) IEnumerable to a (generic) IQueryable.
Cast | Use an explicitly typed range variable (e.g. from string s in …) | Casts the elements of a collection to a specified type.
OfType | Not Available | Filters values, depending on their ability to be cast to a specified type.
ToArray | Not Available | Converts a collection to an array. This method forces query execution.
ToDictionary | Not Available | Puts elements into a Dictionary based on a key selector function. This method forces query execution.
ToList | Not Available | Converts a collection to a List. This method forces query execution.
ToLookup | Not Available | Puts elements into a Lookup (a one-to-many dictionary) based on a key selector function. This method forces query execution.


Concatenation Operations
Method Name | Query Syntax | Description
Concat | Not Available | Concatenates two sequences to form one sequence.


Aggregation Operations
Method Name | Query Syntax | Description
Aggregate | Not Available | Performs a custom aggregation operation on the values of a collection.
Average | Not Available | Calculates the average value of a collection of values.
Count | Not Available | Counts the elements in a collection, optionally only those elements that satisfy a predicate function.
LongCount | Not Available | Counts the elements in a large collection, optionally only those elements that satisfy a predicate function.
Max | Not Available | Determines the maximum value in a collection.
Min | Not Available | Determines the minimum value in a collection.
Sum | Not Available | Determines the sum of the values in a collection.
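
The aggregation operators all execute immediately and return a single value. In the sketch below the numbers are made up, and Aggregate is used to compute a product, which has no dedicated operator.

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        int[] numbers = { 3, 1, 4, 1, 5 };

        Console.WriteLine(numbers.Sum());             // Prints: 14
        Console.WriteLine(numbers.Min());             // Prints: 1
        Console.WriteLine(numbers.Max());             // Prints: 5
        Console.WriteLine(numbers.Count(n => n > 2)); // Prints: 3

        // Aggregate applies a custom accumulator function across the sequence.
        int product = numbers.Aggregate((acc, n) => acc * n);
        Console.WriteLine(product);                   // Prints: 60
    }
}
```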


IEnumerable vs IQueryable

"The Iterator pattern decouples the collection object from the traversal logic. IEnumerable is the base interface for supporting the Iterator pattern over a collection."

LINQ uses two interfaces for iterating through data. Below is a comparison of the two interfaces with details about their implementations and uses.

  • IEnumerable is designed for working with in-memory collections such as List, Dictionary, etc. Its LINQ extension methods (defined in the Enumerable class) take delegates, which are compiled methods that are executed directly.

  • IQueryable is designed for working with out-of-memory data sources, such as databases or web services. Its LINQ extension methods (defined in the Queryable class) create data structures containing the queries, known as expression trees.

  • IEnumerable provides the ability to iterate through a collection by exposing a GetEnumerator method. The enumerator it returns exposes a Current property and MoveNext and Reset methods. When IEnumerable is implemented on a collection, the foreach syntax is enabled for iterating through the collection.

  • IQueryable is derived from IEnumerable and provides the additional functionality to evaluate queries against a specific data source without knowing the type of the data. Adding the ability for queries to work with unknown data types is important when working with remote data sources, such as databases.

IEnumerable vs IQueryable
Feature | IEnumerable | IQueryable
Use | In-memory objects. | Out-of-memory objects.
Query Support | LINQ to Objects, LINQ to XML | LINQ to SQL, LINQ to Entities
Loading | Does not support lazy loading. | Supports lazy loading.
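
The difference between the two interfaces can be observed without a database by wrapping an in-memory list with AsQueryable. This is only a sketch: a real provider such as LINQ to SQL would translate the expression tree into SQL, whereas here the in-memory provider simply compiles and runs it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        var bikes = new List<string> { "BMW", "Indian", "Ducati", "Harley" };

        // Enumerable.Where takes a delegate: Func<string, bool>.
        IEnumerable<string> enumerable = bikes.Where(x => x.Contains("a"));

        // Queryable.Where takes an expression tree: Expression<Func<string, bool>>.
        IQueryable<string> queryable = bikes.AsQueryable().Where(x => x.Contains("a"));

        // The queryable exposes the query as data before it is executed.
        Console.WriteLine(queryable.Expression.NodeType); // Prints: Call

        Console.WriteLine(enumerable.Count()); // Prints: 3
        Console.WriteLine(queryable.Count());  // Prints: 3
    }
}
```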


Expression Trees

An Expression Tree is a data structure that represents code as a tree of nodes, where each node is an expression (e.g. a method call or a binary operation such as x < y). Storing code in a data structure is useful if you want to inspect or modify the code before execution, such as transforming a LINQ query into code that will execute on a remote database. LINQ providers mostly accomplish their execution with expression trees. Expression Trees can also be used to create dynamic code and dynamic queries.

The simplest way to create an Expression Tree is through the use of lambda expressions. A lower-level way to create Expression Trees is with the Expression API. The compiler will translate Expression Trees created with lambda expressions into Expression API calls as part of its translation into intermediate code. The code below shows both of these ways for creating Expression Trees.

Two Ways of Creating an Expression Tree

using System;
using System.Linq.Expressions;

namespace LinqOperators
{
    class Program
    {
        static void Main()
        {
            /**************************************************
             * Expression tree created with Lambda Expression *
             **************************************************/
            Expression<Func<int, int, bool>> myExpression = (x,y) => x < y;           
            Func<int, int, bool> result = myExpression.Compile();

            Console.WriteLine(result(10,11));

            // Compile and run combined
            Console.WriteLine(myExpression.Compile()(10,11));

            /**************************************************
             *  Expression tree created with Expressions API  *
             **************************************************/
            // The string passed to Expression.Parameter is the parameter's name, not its value
            ParameterExpression par1 = Expression.Parameter(typeof(int), "x");
            ParameterExpression par2 = Expression.Parameter(typeof(int), "y");
            BinaryExpression expr2 = Expression.LessThan(par1, par2);

            var f = Expression.Lambda<Func<int, int, bool>>(expr2, new ParameterExpression[] { par1, par2 });
            var func = f.Compile();

            // Prints: Generated lambda expression: (x, y) => (x < y)
            Console.WriteLine("Generated lambda expression: {0}", f);
            // Prints: Evaluated expression tree: True
            Console.WriteLine("Evaluated expression tree: {0}", func(10, 11));
        }
    }
}
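
Because an expression tree is data, it can be inspected and rebuilt before it is compiled. The sketch below walks the nodes of a simple lambda and then builds a modified tree that swaps the comparison operator. It illustrates the idea only and is not how any particular LINQ provider is implemented.

```csharp
using System;
using System.Linq.Expressions;

class Program
{
    static void Main()
    {
        Expression<Func<int, int, bool>> expr = (x, y) => x < y;

        // The body of this lambda is a BinaryExpression node.
        var body = (BinaryExpression)expr.Body;

        Console.WriteLine(body.NodeType);         // Prints: LessThan
        Console.WriteLine(body.Left);             // Prints: x
        Console.WriteLine(body.Right);            // Prints: y
        Console.WriteLine(expr.Parameters.Count); // Prints: 2

        // Build a new tree that reuses the same parameters with a different operator.
        var flipped = Expression.Lambda<Func<int, int, bool>>(
            Expression.GreaterThan(body.Left, body.Right), expr.Parameters);

        Console.WriteLine(flipped);                   // Prints: (x, y) => (x > y)
        Console.WriteLine(flipped.Compile()(10, 11)); // Prints: False
    }
}
```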
