In database querying, the SELECT WHERE NOT EXISTS clause plays a pivotal role in retrieving data based on the absence of certain conditions. This is particularly useful when filtering records that do not have corresponding entries in another table. In this article, we will focus on how to execute the SELECT WHERE NOT EXIST SQL query using LINQ in C#.

To download the source code for this article, you can visit our GitHub repository.

Let’s dive in.

Environment Setup for LINQ Query Demonstrations

To start with, let’s create two model classes Employee class:

Support Code Maze on Patreon to get rid of ads and get the best discounts on our products!
Become a patron at Patreon!
public class Employee
{
    [Key]
    public int Id { get; set; }
    public string? Name { get; set; }
}

And EmployeeTask class:

public class EmployeeTask
{
    [Key]
    public int TaskId { get; set; }
    public int EmployeeId { get; set; }
    public string? Description { get; set; }
    public List<EmployeeTask>? Tasks { get; set; }
}

Now, let’s create an extension method to seed the data:

public static class EmployeeDbContextExtension
{
    public static void AddSeedData(this EmployeeDbContext context)
    {
        context.Employees.AddRange(
            new Employee { Id = 1, Name = "John" },
            new Employee { Id = 2, Name = "Alice" },
            new Employee { Id = 3, Name = "Bob" },
            new Employee { Id = 4, Name = "Eve" }
        );

        context.Tasks.AddRange(
            new EmployeeTask { TaskId = 101, EmployeeId = 1, Description = "Code Review" },
            new EmployeeTask { TaskId = 102, EmployeeId = 3, Description = "Testing" },
            new EmployeeTask { TaskId = 103, EmployeeId = 2, Description = "Documentation" }
        );

        context.SaveChanges();
    }
}

We create an EmployeeDbContextExtension class and define the AddSeedData() method, which adds a collection of employees and tasks to the context.

We recommend going through our Entity Framework Core Series to learn more about Entity Framework Core.

Use LINQ Any Method to Execute the Select Where Not Exist SQL Query

The Any() method in LINQ is a powerful tool for checking if any elements in a collection satisfy a given condition. It returns a boolean value, indicating whether the specified condition holds true for at least one element in the collection. This method is handy when determining if any records match specific criteria, making it an ideal candidate for implementing the SELECT WHERE NOT EXISTS query.

Let’s create a class and define a method:

public static class QueryExecutor
{
    public static IQueryable<Employee> GetUnassignedEmployeesAnyQuerySyntax(EmployeeDbContext context)
    {
        var employees =
            from employee in context.Employees
            where !context.Tasks.Any(task => task.EmployeeId == employee.Id)
            select employee;

        return employees;
    }
}

Here, we create a QueryExecutor class and define a GetUnassignedEmployeesAnyQuerySyntax() method that takes EmployeeDbContext as the input parameter.

Then, we use the where clause to filter the employee based on a condition and Any() method to check if there are any tasks in the Tasks table where the EmployeeId matches the Id of the current employee. Here, we use the LINQ query syntax.

Now, let’s achieve the same functionality using the LINQ method syntax:

public static IQueryable<Employee> GetUnassignedEmployeesAnyMethodSyntax(EmployeeDbContext context)
{
    var employees = context.Employees
        .Where(employee => !context.Tasks
        .Any(task => task.EmployeeId == employee.Id));

    return employees;
}

Here, we create another method inside the QueryExecutor class. Similarly, we use the Where() method filters the employees based on a condition specified by a lambda expression, and we check if there are no tasks in the Tasks table where the EmployeeId matches the Id of the current employee.

Finally, let’s call both the methods from the Program class:

var unassignedEmployees = QueryExecutor.GetUnassignedEmployeesAnyQuerySyntax(context);
Console.WriteLine(unassignedEmployees.ToQueryString());
unassignedEmployees = QueryExecutor.GetUnassignedEmployeesAnyMethodSyntax(context);
Console.WriteLine(unassignedEmployees.ToQueryString());

Here, we invoke the GetUnassignedEmployeesAnyQuerySyntax() and GetUnassignedEmployeesAnyMethodSyntax() methods and print the generated SQL query.

Now, let’s assess the generated SQL query from both the methods:

SELECT "e"."Id", "e"."Name"
FROM "Employees" AS "e"
WHERE NOT EXISTS (
    SELECT 1
    FROM "Tasks" AS "t"
    WHERE "t"."EmployeeId" = "e"."Id")

Here, the WHERE clause incorporates a subquery using the NOT EXISTS condition. The subquery checks for the existence of any records in the Tasks table where the EmployeeId matches the Id of the current employee in the outer query.

Let’s check the unassigned employees:

Name: Eve, Id: 4

Here, we retrieve the unassigned employee named ‘Eve’ with an ID of 4, as we expected.

Execute the Select Where Not Exist SQL Query Using LINQ Join for Left Anti-Join

The Left Anti-Join is a valuable concept in the context of NOT EXISTS scenarios. It involves retrieving records from the left table (primary collection) that do not have corresponding entries in the right table (secondary collection). In LINQ, we can achieve this using the combination of Join and Where clauses, providing a structured and efficient approach to SELECT WHERE NOT EXISTS queries.

To learn more about the join operations in LINQ please visit our article How to Create an Outer Join in LINQ

Now, let’s create a method:

public static IQueryable<Employee> GetUnassignedEmployeesJoinQuerySyntax(EmployeeDbContext context)
{
    var employees =
        from employee in context.Employees
        join task in context.Tasks on
        employee.Id equals task.EmployeeId
        into EmployeeTasks
        from task in EmployeeTasks.DefaultIfEmpty()
        where task == null
        select employee;

    return employees;
}

Here, we use the join keyword to join the Employees table with the Tasks table based on the equality of employee.Id and task.EmployeeId. Then, we store the result of the join operation in the EmployeeTasks variable, which represents a collection of tasks associated with each employee.

Then, we use the from keyword to iterate over the tasks in EmployeeTasks and we apply the DefaultIfEmpty() method to ensure that even if there are no tasks for an employee, the query still includes that employee. Subsequently, we check if the task is null using the where clause to indicate the absence of tasks for the current employee.

Now, let’s write the equivalent LINQ query using method syntax:

public static IQueryable<Employee> GetUnassignedEmployeesJoinMethodSyntax(EmployeeDbContext context)
{
    var employees = context.Employees.GroupJoin(
        context.Tasks, 
        employee => employee.Id, 
        task => task.EmployeeId, 
        (employee, joinedRecords) => new { employee, joinedRecords })
        .SelectMany(x => x.joinedRecords.DefaultIfEmpty(),
        (x, task) => new { x.employee, task })
        .Where(x => x.task == null)
        .Select(x => x.employee);

    return employees;
}

Here, we use the GroupJoin() method to join the Employees table with the Tasks table based on the equality of employee.Id and task.EmployeeId.

Then, we chain the SelectMany() method to flatten the grouped records into a single sequence, combining each employee with its associated tasks (or an empty collection if there are no tasks), and we invoke the DefaultIfEmpty() method to ensure that even employees without tasks are included in the result.

Now, let’s assess the SQL query generated:

SELECT "e"."Id", "e"."Name"
FROM "Employees" AS "e"
LEFT JOIN "Tasks" AS "t" ON "e"."Id" = "t"."EmployeeId"
WHERE "t"."TaskId" IS NULL

The query employs a LEFT JOIN operation to connect the Employees and Tasks tables. Then, the WHERE clause filters the results to include only those where the TaskId column in the Tasks table is NULL. This condition effectively identifies unassigned employees.

Execute the Query With the LINQ Contains Method

The Contains() method is an additional approach to execute SELECT WHERE NOT EXISTS queries. This method is beneficial for simplicity and readability, especially in scenarios involving small datasets.

Now, let’s define a method:

public static IQueryable<Employee> GetUnassignedEmployeesContainsQuerySyntax(EmployeeDbContext context)
{
    var employees =
        from employee in context.Employees
        where context.Tasks.All(task => task.EmployeeId != employee.Id)
        select employee;

    return employees;
}

Here, we’re filtering employees from the context.Employees collection based on a condition. We use the All() method ensures that for each employee, every task in the context.Tasks collection doesn’t share the same EmployeeId. 

This condition acts as a filter, excluding employees already engaged with tasks. The result is a concise list of unassigned employees.

Moving on, let’s write the equivalent for the execution of SELECT WHERE NOT EXISTS using LINQ query:

public static IQueryable<Employee> GetUnassignedEmployeesContainsMethodSyntax(EmployeeDbContext context)
{
    var employees = context
        .Employees
        .Where(employee => !context
            .Tasks
            .Select(task => task.EmployeeId)
            .Contains(employee.Id));

    return employees;
}

We use the Where() method as a discerning filter, examining if any EmployeeId in the context.Tasks collection matches the current employee’s Id. The inclusion of the ! (not) operator and the Contains() method ensures that only employees without matching EmployeeId values are selected.

Moving on, let’s inspect the SQL query generated:

SELECT "e"."Id", "e"."Name"
FROM "Employees" AS "e"
WHERE NOT EXISTS (
    SELECT 1
    FROM "Tasks" AS "t"
    WHERE "t"."EmployeeId" = "e"."Id")

In essence, this query employs a correlated subquery to identify unassigned employees by checking if there are no corresponding records in the Tasks table for each employee in the Employees table. If the subquery doesn’t find any matches, the result set includes the employee, effectively providing a list of unassigned employees.

Use LINQ All Method to Execute the Select Where Not Exist SQL Query

Expanding our exploration of SELECT WHERE NOT EXISTS queries using LINQ in C#, we introduce the combination of Where() with All() methods as an alternative approach. This method particularly suits scenarios where we prioritize simplicity and straightforward logic.

First, let’s create a method:

public static IQueryable<Employee> GetUnassignedEmployeesAllQuerySyntax(EmployeeDbContext context)
{
    var employees =
        from employee in context.Employees
        where context.Tasks.All(task =>task.EmployeeId != employee.Id)
        select employee;

    return employees;
}

Here, we use the All() method to check for each employee, the condition within the lambda expression holds true for all tasks in the context.Tasks collection. Specifically, the condition checks if no tasks EmployeeId matches the Id of the current employee.

Next up, we’ll generate the matching LINQ query using the method syntax:

public static IQueryable<Employee> GetUnassignedEmployeesAllMethodSyntax(EmployeeDbContext context)
{
    var employees = context
        .Employees
        .Where(employee => context.Tasks
        .All(task => task.EmployeeId != employee.Id));

    return employees;
}

We use the Where() method, which incorporates a lambda expression with the All() method, ensuring that the specified condition holds true for every task associated with each employee.

Finally, let’s inspect the SQL query:

SELECT "e"."Id", "e"."Name"
FROM "Employees" AS "e"
WHERE NOT EXISTS (
    SELECT 1
    FROM "Tasks" AS "t"
    WHERE "t"."EmployeeId" = "e"."Id")

Here, we get a similar SQL query.

Benchmark Comparison

Let’s evaluate the four approaches on how to execute SELECT WHERE NOT EXISTS clause that we’ve covered so far. We will perform a benchmark with the BenchmarkDotNet library to measure the time performance for three approaches, using the seed data we defined in our setup:

| Method                           | Mean      | Error     | StdDev    | Median    | Allocated |
|--------------------------------- |----------:|----------:|----------:|----------:|----------:|
| GetUsingAllWithMethodSyntax      |  3.528 us | 0.0351 us | 0.0293 us |  3.521 us |   2.12 KB |
| GetUsingAllWithQuerySyntax       |  3.561 us | 0.0285 us | 0.0238 us |  3.558 us |   2.12 KB |
| GetUsingContainsWithQuerySyntax  |  3.625 us | 0.0499 us | 0.0467 us |  3.609 us |   2.12 KB |
| GetUsingAnyWithMethodSyntax      |  3.693 us | 0.0705 us | 0.1441 us |  3.627 us |   2.16 KB |
| GetUsingAnyWithQuerySyntax       |  4.144 us | 0.1157 us | 0.3356 us |  4.036 us |   2.27 KB |
| GetUsingContainsWithMethodSyntax |  4.219 us | 0.0840 us | 0.1427 us |  4.178 us |   2.44 KB |
| GetUsingJoinWithMethodSyntax     | 15.355 us | 0.1866 us | 0.1654 us | 15.355 us |   6.98 KB |
| GetUsingJoinWithQuerySyntax      | 15.982 us | 0.3122 us | 0.3717 us | 15.882 us |   7.38 KB |

From our results, we see that methods utilizing Any() and All() with both query and method syntax showcasing lower mean execution times, indicating quicker processing.

In addition to the All() and Any() methods, methods utilizing the Contains() method with query syntax demonstrates relatively low mean execution times and memory allocations.

In general, methods using aJoin exhibit higher mean times, but with the method syntax, we see slightly better performance.

The allocated memory varies across methods, with generally lower allocations for Any() and All() approaches compared to Join method.

Notably, GetUsingAllWithMethodSyntax() and GetUsingAnyWithMethodSyntax() methods demonstrate not only faster execution times but also more efficient memory usage, making them favorable choices in this context.

Conclusion

In this article, we considered different methods on how to execute the SELECT WHERE NOT EXISTS using LINQ in C#. In conclusion, specific use case requirements should guide the choice among these approaches. We prefer the Any() or Contains() methods if simplicity and readability are crucial. For scenarios demanding explicit Left Anti-Join behavior or handling complex conditions, the Join approach might be more suitable.

When selecting the most appropriate method for our scenarios, we should consider the trade-offs between performance and flexibility.

Liked it? Take a second to support Code Maze on Patreon and get the ad free reading experience!
Become a patron at Patreon!