Signum Framework Logo
"Open framework that encourages convention over configuration, using C# code,
not XML files, to model at the right level of abstraction and achieve deadlines.
...but also has a full Linq provider, and syncs the schema for you!"
Login
RSS

Search

»



Main
Index
Map
Videos
Download
Source code
Tutorials
Forum
FAQ



Image



PoweredBy

SelectMany and Join Differences

RSS
Modified on 2009/02/14 15:32 by helen Paths: Documentation Categorized as SignumEngine

SelectMany differences

SelectMany is a very simple operator, however, it creates some problems for people to grasp. You can see more about how SelectMany is meant to work here.

public static IQueryable<TResult> SelectMany<TSource, TResult>(
   this IQueryable<TSource> source, 
   Expression<Func<TSource, IEnumerable<TResult>>> selector)

We use the same approach for SelectMany than Linq to Sql with two small differences:

  • A big part of Linq to Sql code base is, in words of Matt Warren, "to reduce CROSS APPLY's into CROSS JOIN in an opportunistic fashion". We save ourselves from writing this code, instead we use CROSS APPLY any time a SelectMany appears. If someone wants Linq to Signum to work on other DBMS, like SQLS 2000, MySQL or Oracle, this will be the most difficult part.

  • We support OUTER APPLY by adding DefaultIfEmpty at the end of the selector lambda

Let's see an example:

//This shows how to access {{trtlBugDNComments}}
IQueryable<CommentDN> result = Database.Query<BugDN>().SelectMany(b=>b.Comments);// Little trick to access tltrBugDNComments Table

//Or the equivalent using query expressions IQueryable<CommentDN> result = from b in Database.Query<BugDN>() from c in b.Comments select c;

Both examples gets translated to the following Sql:

SELECT t4.Text, t4.Date, t4.idWriter_CustomerDN, t4.idWriter_DeveloperDN
FROM tlBugDN AS t0
CROSS APPLY (
  (SELECT t2.Text, t2.Date, t2.idWriter_CustomerDN, t2.idWriter_DeveloperDN
  FROM trtlBugDNComments AS t2
  WHERE (t0.Id = t2.idBugDN))
) AS t4

Generating the following results:

TextDateWriterSelfModifiedToStringMethodErrorModified
SelectMany is hard in MySql08/09/2008 0:00:00OlmoTrueOlmo: SelectMany is hard in MySqlTrue
Necessary to reuse entity assemblies16/08/2008 0:00:00AlejandroTrueAlejandro: Necessary to reuse entity assembliesTrue
Doesn't wooooork14/10/2008 0:00:00OlmoTrueOlmo: Doesn't wooooorkTrue
Fixed...15/10/2008 0:00:00AntonioTrueAntonio: Fixed...True
We didn't know 2008 has Xmas too26/06/2008 0:00:00LuísTrueLuís: We didn't know 2008 has Xmas tooTrue
Rune support is not in Unicode04/10/2008 0:00:00OscarTrueOscar: Rune support is not in UnicodeTrue
It's ACME industries!25/12/2008 0:00:00OlmoTrueOlmo: It's ACME industries!True

Using DefaultIfEmpty at the end of the selector lambda, the equivalent Sql will use OUTER APPLY instead.

var result = from b in Database.Query<BugDN>()
             from c in b.Comments.DefaultIfEmpty()
             select new { b.Description, c.Text }; 

Gets translated to:

SELECT t0.Description, t4.Text
FROM tlBugDN AS t0
OUTER APPLY (
  (SELECT t2.Text
  FROM trtlBugDNComments AS t2
  WHERE (t0.Id = t2.idBugDN))
) AS t4

Resulting in:

DescriptionText
MySql SupportSelectMany is hard in MySql
Enum Casting
SchemaBuilderSettings supportNecessary to reuse entity assemblies
FileLine with LazyDoesn't wooooork
FileLine with LazyFixed...
Add ElvesSyndicateDN entity
Support for gifs > $1Million
Xmas '08 Budget ReportWe didn't know 2008 has Xmas too
Translate some windows to QuenyaRune support is not in Unicode
Massive Destruction flag on WeaponDN
Weapons do not workIt's ACME industries!
Optional Insurance Policy

Join / GroupJoin differences

We tried to make Linq to Signum as easy for the user as possible. One cool feature of Linq in general is that it provides a unified model for querying in memory objects and the database (sometimes, the hard thing is to know where you are).

On Joins, however, this idea of having an unified model has gone too far, making it hard to use. We have to follow a different approach.

Let's see how Linq to Sql does joins first:

Syntax used by Linq to Objects/Linq to Sql

In Linq there are two kind of joins, Join and GroupJoin:

//Given two sequences (outer & inner) mix them by a common key and using resultSelector combines each possible pair. 
public static IQueryable<TResult> Join<TOuter, TInner, TKey, TResult>(
    this IQueryable<TOuter> outer, 
         IEnumerable<TInner> inner, 
         Expression<Func<TOuter, TKey>> outerKeySelector, 
         Expression<Func<TInner, TKey>> innerKeySelector, 
         Expression<Func<TOuter, TInner, TResult>> resultSelector)

//Given two sequences (outer and inner) mix them by a common key and using resultSelector combines each element in outer with all the elements with the same key on inner. public static IQueryable<TResult> GroupJoin<TOuter, TInner, TKey, TResult>( this IQueryable<TOuter> outer, IEnumerable<TInner> inner, Expression<Func<TOuter, TKey>> outerKeySelector, Expression<Func<TInner, TKey>> innerKeySelector, Expression<Func<TOuter, IEnumerable<TInner>, TResult>> resultSelector)

As you see, there's no explicit way to do outer join with Join operator.

However, GroupJoin returns all the elements on outer collection, even if the group of the elements of the same key in inner is empty. By combining GroupJoin + SelectMany + DefaultIfEmpty you can archive left outer join behaviour.

var q = from c in customers
        join o in orders on c.Key equals o.Key into g //GrouJoin
        from o in g.DefaultIfEmpty() //SelectMany + DefaultIfEmpty
        select new {Name = c.Name, OrderNumber = o == null ? "(no orders)" : o.OrderNumber};

It look to us that these method signatures were designed with Linq to Objects in mind. They promote hierarchical sequences (GroupJoin) and try to centralize the nasty DefaultIfEmpty method. When used in Linq to Sql is has some disadvantages though:

  • It takes 3 operators to make a LEFT OUTER JOIN. In database, outer joins are so common that this is just not acceptable.
  • This approach doesn't work for RIGHT OUTER JOIN, neither for FULL OUTER JOIN.
  • The GroupJoin translation get's affected by why g is actually used. What if it is used twice, with and without the DefaultIfEmpty operator.

At the end, we found the Linq to Sql join strategy is overly complicated. They try too hard to keep the 'they are just objects' abstraction but instead of making it simpler (like with dot joins) they make it much more complex. That's why we are following a different path here.

Syntax used by Linq to Signum



Our approach is closer to Sql. On a Join (or GroupJoin) operator you can mark any of the imput collections (inner & outer) with DefaultIfEmpty. The side marked with DefaultIfEmpty will be allowed to have null values when no counterpart is found on the other side. You can use DefaultIfEmpty on left side, right side, or both!!

The example above, using Linq to Signum strategy will be, just:

var q = from c in customers
        join o in orders.DefaultIfEmpty() on c.Key equals o.Key //Normal Join 
        select new {Name = c.Name, OrderNumber = o == null ? "(no orders)" : o.OrderNumber};

This is just a pedagogic example, it won't work if customers or orders are not Linq to Signum IQueryable.

Linq queries, explicit joins are used when joining with something different than foreign keys (otherwise dot joins are usually more convinient).

Here we will join CommentDN and BugsDN by date.

INNER JOIN

Using comments as defined in SelectMany section:

IQeuryable<CommendDN> comments = Database.Query<BugDN>().SelectMany(b=>b.Comments);

We can easly join by the Date of the comment and the start date of the bug:

var result = from b in Database.Query<BugDN>()
             join c in comments on b.Start equals c.Date 
             select new { b.Description, b.Start, c.Text }; 


The generated sql code looks like this:

SELECT t0.Description, t0.Start, t5.Text
FROM tlBugDN AS t0
INNER JOIN tlBugDN AS t2  -- INNER JOIN
CROSS APPLY (
  (SELECT t4.Text, t4.Date
  FROM trtlBugDNComments AS t4
  WHERE (t2.Id = t4.idBugDN))
) AS t5
  ON (t0.Start = t5.Date)

Returning the following results:

DescriptionStartText
MySql Support08/09/2008 0:00:00SelectMany is hard in MySql
SchemaBuilderSettings support16/08/2008 0:00:00Necessary to reuse entity assemblies
FileLine with Lazy14/10/2008 0:00:00Doesn't wooooork
Xmas '08 Budget Report26/06/2008 0:00:00We didn't know 2008 has Xmas too
Translate some windows to Quenya04/10/2008 0:00:00Rune support is not in Unicode
Weapons do not work25/12/2008 0:00:00It's ACME industries!

LEFT OUTER JOIN

Just adding DefaultIfEmpty() at the end of the second source of the join command, you will get all the Bugs, no matter if they have a Comment on the same date:

var result = from b in Database.Query<BugDN>()
             join c in comments.DefaultIfEmpty() on b.Start equals c.Date 
             select new { b.Description, b.Start, c.Text }; 

Generating Sql:

SELECT t0.Description, t0.Start, t5.Text
FROM tlBugDN AS t0
LEFT OUTER JOIN tlBugDN AS t2
CROSS APPLY (
  (SELECT t4.Text, t4.Date
  FROM trtlBugDNComments AS t4
  WHERE (t2.Id = t4.idBugDN))
) AS t5
  ON (t0.Start = t5.Date)

Resulting data:

DescriptionStartText
MySql Support08/09/2008 0:00:00SelectMany is hard in MySql
Enum Casting23/11/2008 0:00:00
SchemaBuilderSettings support16/08/2008 0:00:00Necessary to reuse entity assemblies
FileLine with Lazy14/10/2008 0:00:00Doesn't wooooork
Add ElvesSyndicateDN entity30/09/2008 0:00:00
Support for gifs > $1Million29/09/2008 0:00:00
Xmas '08 Budget Report26/06/2008 0:00:00We didn't know 2008 has Xmas too
Translate some windows to Quenya04/10/2008 0:00:00Rune support is not in Unicode
Massive Destruction flag on WeaponDN15/11/2008 0:00:00
Weapons do not work25/12/2008 0:00:00It's ACME industries!
Optional Insurance Policy05/09/2008 0:00:00

RIGHT OUTER JOIN

With our syntax, doing a right outer join is as easy as changing DefaultIfEmpty from the second source to the first one:

var result = from b in Database.Query<BugDN>()
             join c in comments.DefaultIfEmpty() on b.Start equals c.Date
             select new { b.Description, b.Start, c.Text }; 

Generating SQL:

SELECT t0.Description, t5.Date, t5.Text
FROM tlBugDN AS t0
RIGHT OUTER JOIN tlBugDN AS t2
CROSS APPLY (
  (SELECT t4.Text, t4.Date
  FROM trtlBugDNComments AS t4
  WHERE (t2.Id = t4.idBugDN))
) AS t5
  ON (t0.Start = t5.Date)

Resulting Data:

DescriptionDateText
MySql Support08/09/2008 0:00:00SelectMany is hard in MySql
SchemaBuilderSettings support16/08/2008 0:00:00Necessary to reuse entity assemblies
FileLine with Lazy14/10/2008 0:00:00Doesn't wooooork
15/10/2008 0:00:00Fixed...
Xmas '08 Budget Report26/06/2008 0:00:00We didn't know 2008 has Xmas too
Translate some windows to Quenya04/10/2008 0:00:00Rune support is not in Unicode
Weapons do not work25/12/2008 0:00:00It's ACME industries!

FULL OUTER JOIN

We can easily turn it into a full outer join, just by adding DefaultIfEmpty in both sides of the join.

There's however an important thing to notice:

Type Mismatch: Nullability of types are handled authomatically by Sql, but in C#, even trying to reduce Type Mismatch on Entities, some operations like outer join can make some columns null. Sometimes you have to manually prepare your anonymous types to support null values:

var result = from b in Database.Query<BugDN>().DefaultIfEmpty()
             join c in comments.DefaultIfEmpty() on b.Start equals c.Date
             select new { b.Description,  Start = (DateTime?)b.Start, Date = (DateTime?)c.Date, c.Text };


The resulting Sql will be:

SELECT t0.Description, t0.Start AS c0, t5.Date AS c1, t5.Text
FROM tlBugDN AS t0
FULL OUTER JOIN tlBugDN AS t2
CROSS APPLY (
  (SELECT t4.Text, t4.Date
  FROM trtlBugDNComments AS t4
  WHERE (t2.Id = t4.idBugDN))
) AS t5
  ON (t0.Start = t5.Date)

The data

DescriptionStartDateText
MySql Support08/09/2008 0:00:0008/09/2008 0:00:00SelectMany is hard in MySql
Enum Casting23/11/2008 0:00:00
SchemaBuilderSettings support16/08/2008 0:00:0016/08/2008 0:00:00Necessary to reuse entity assemblies
FileLine with Lazy14/10/2008 0:00:0014/10/2008 0:00:00Doesn't wooooork
Add ElvesSyndicateDN entity30/09/2008 0:00:00
Support for gifs > $1Million29/09/2008 0:00:00
Xmas '08 Budget Report26/06/2008 0:00:0026/06/2008 0:00:00We didn't know 2008 has Xmas too
Translate some windows to Quenya04/10/2008 0:00:0004/10/2008 0:00:00Rune support is not in Unicode
Massive Destruction flag on WeaponDN15/11/2008 0:00:00
Weapons do not work25/12/2008 0:00:0025/12/2008 0:00:00It's ACME industries!
Optional Insurance Policy05/09/2008 0:00:00
15/10/2008 0:00:00Fixed...

GroupJoin

Using this syntax, GroupJoin is nothing else than grouping and joining, with no influence on the join becoming any kind of outer join. In fact, at a very early stage of the translation pipeline there's a rule that transforms:

//Converts
GroupJoin(outer, inner, o=>f1(o), i=>f2(i), (o, gI)=>f3(o,gI))  
//Into
Join(outer, GroupBy(inner, i=>f2(i), i=>i) , o=>f1(o), g=>g.Key, (o,g)=>f2(o, g))                            


That means that you can even do a full outer join like this:

var result = from b in Database.Query<BugDN>().DefaultIfEmpty()
             join c in comments.DefaultIfEmpty() on b.Start equals c.Date into g
             select new { b.Description,  Start = (DateTime?)b.Start, Count = (int?)g.Count()}; 

Generating this Sql:

SELECT t0.Description, t0.Start AS c0, t7.val1 AS c1
FROM tlBugDN AS t0
FULL OUTER JOIN (
  (SELECT t5.Date, COUNT(*) AS val1
  FROM tlBugDN AS t2
  CROSS APPLY (
    (SELECT t4.Date
    FROM trtlBugDNComments AS t4
    WHERE (t2.Id = t4.idBugDN))
  ) AS t5
  GROUP BY t5.Date)
) AS t7
  ON (t0.Start = t7.Date)

Returning the following data:

DescriptionStartCount
Xmas '08 Budget Report26/06/2008 0:00:001
SchemaBuilderSettings support16/08/2008 0:00:001
Optional Insurance Policy05/09/2008 0:00:00
MySql Support08/09/2008 0:00:001
Support for gifs > $1Million29/09/2008 0:00:00
Add ElvesSyndicateDN entity30/09/2008 0:00:00
Translate some windows to Quenya04/10/2008 0:00:001
FileLine with Lazy14/10/2008 0:00:001
1
Massive Destruction flag on WeaponDN15/11/2008 0:00:00
Enum Casting23/11/2008 0:00:00
Weapons do not work25/12/2008 0:00:001

Creative Commons License Signum Framework Site by Signum Software is licensed under a Creative Commons Attribution 3.0 License.
Powered by ScrewTurn Wiki version 3.0.5.600.