Learning Corner #5 – Logical Query Processing 101

One thing I believe it’s important for everyone who is interested in the database business is the way logical query processing works. And it’s interesting to observe that even some experienced developers have problems interpreting how a query gets processed by SQL Server.
In this article, I intend to explain the best I can how logical query processing works.

In order to do that, I will make a short analogy with a real-life situation, that worked for me when I started to get familiar with this topic and which I hope will work for you as well.
OK, before we get to business, let’s make sure everybody is on the same page, by answering some questions:

1. What is logical query processing?

Let’s take a very simple query as an example:

The commands we give to SQL Server with this query are:

a) SELECT (all the columns)
b) FROM (the Products table)

SQL Server does not execute the commands in the order we give them (this would mean to execute first the SELECT command and then the FROM), but it has a specific execution order, which is the reason of this blog post.

The order of the main commands, which should be known by everyone, is the following:

1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. SELECT
6. ORDER BY

2. Why should anyone care about logical query processing?

I hope nobody asks this question, but in case it happens, here are some reasons that just popped out of my head:

  1. You need to know how SQL Server works on the inside, in order to be a good developer
  2. It will be easier to understand performance problems and to identify optimization possibilities
  3. Using aliases in some clauses other than SELECT and NOT using them in others will not be such a matter of luck anymore.

OK, let’s go to our little analogy now.
Let’s say you have a winter cold and go to the pharmacy to buy some treatment. You know from experience that cough syrup works best in situations like this.
Anyway, you go to the lady from the drug store and make the following request: I want to buy cough syrup, because I don’t feel so good, but I’m short on money, so please make sure that it costs less than 5 bucks a bottle. Oh, and I want three bottles, because all good things come in three.
So what does the lady do?
Well, first she will give you the warning that you should not drink all the bottles at once, because you will not feel any better and may wind up in hospital, etc. And then she goes to get the things you requested.

1 She goes to the location in the pharmacy where non-prescription drugs are arranged in shelves, from where she will choose some products to show you.
2 She searches for cough syrups, because that’s what you asked her to give you. Also, the syrups must have a price lower than 5 dollars, so she excludes some expensive ones.
3 Because even with this selection she finds plenty of bottles, she tries to arrange them somehow, to check if they fit all your needs. So she sorts them by their name.
4 After this arrangement was finished, she remembers that you want at least 3 bottles, so she excludes the groups that have less than 3 items.
5 She has finally finished and happily comes back to you with the result: she shows you the syrup bottle and tells you the price and also that they have enough to fit your needs (more than 3).

Of course, in a real-life situation, the lady does not take all these steps, because she is experienced and good at her job and knows by heart the products they have in stock and how many and what you need to feel better. Or she has a database where she does these searches and finds out the results immediately 🙂

Speaking of databases, do the previous steps followed by the pharmacist make sense? Now let’s see what if we had this table in a database and we were the ones who needed to make the query.
Say we have the following (poorly designed) table, in the Pharmacy database:

And there are the following values in it:

logical query processing
Results from the NonPrescriptionDrugs table

What I’m going to do next is write again the steps taken by the pharmacist to get us our syrups, but this time I will also add the SQL construction used to map the real life situation in a query part.

So here goes:

1. She goes to the location in the pharmacy where non-prescription drugs…
FROM [dbo].[NonPrescriptionDrugs]

2. She searches for cough syrups that are less than 5 dollars…
WHERE Category = 'Cough Syrups' AND Price < 5

3. She tries to group the selection so that it makes sense, by name… GROUP BY Name

4. She remembers that you want at least 3 bottles… HAVING SUM(Quantity) > 2

5. She has finally finished and happily comes back to you with the results…
SELECT Name, MIN(Price) AS Price, SUM(Quantity) AS Quantity

6. Because she found two syrups that meet your requests, she orders them on the counter by their price…
ORDER BY Price

So what do you think, does this make sense? I think logical query processing is made that way to imitate real life situations and even if we are used to executing blocks of code from top to bottom, we should change our perspective when writing SQL queries and follow this order.
When you’re not sure of how something in a query is executed, just try to transform the query in a real-life story and see if it becomes easier. For me it works almost every time. Also, it may be easier to split the query on rows in the order of execution, like this:

1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. SELECT
6. ORDER BY

Hope this will help and please let me know your way of handling logical query processing challenging situations (like alias issues, etc.). You can leave your comments below and we will be glad to answer.

If you think you have mastered the logical query processing theory, let’s test your knowledge with the following quiz.

This week’s test has a difficulty level of: EASY
Quiz subject: Logical Query Processing

DifficultyLevel_EASY

 

Good luck!

Leave a Reply

Your email address will not be published. Required fields are marked *

eleven + thirteen =