Tuesday, January 14, 2014

Why column aliases are not recognized by all clauses?

Column aliases are commonly used for relabeling columns for increasing the readability of SQL statements. However aliases cannot be referred with all clauses used in SQL statements. Have a look on below code and its output.

1 SELECT 2 GroupName AS DepartmentGroup 3 , COUNT(Name) AS NumberOfDepartment 4 FROM HumanResources.Department 5 GROUP BY DepartmentGroup

Output:
Msg 207, Level 16, State 1, Line 5
Invalid column name 'DepartmentGroup'.

As you see, GROUP BY clause does not recognize the alias “DepartmentGroup” used for GroupName column. Not only GROUP BY, WHERE and HAVING clauses do not recognize aliases too. You can simply refer the same column name with these clauses without thinking much, however knowing the reason for this will definitely add something to your knowledgebase. Not only that, it will help you to properly construct your SQL statements too.

Possible ways of aliasing columns
There are three ways of aliasing columns with identical output results. One method is to use AS keyword, next is to use equal sign (=), and the other way is positioning the alias immediately following the column name.

1 SELECT 2 GroupName AS DepartmentGroup 3 , COUNT(Name) AS NumberOfDepartment 4 FROM HumanResources.Department 5 GROUP BY GroupName 6 7 SELECT 8 DepartmentGroup = GroupName 9 , COUNT(Name) AS NumberOfDepartment 10 FROM HumanResources.Department 11 GROUP BY GroupName 12 13 SELECT 14 GroupName DepartmentGroup 15 , COUNT(Name) AS NumberOfDepartment 16 FROM HumanResources.Department 17 GROUP BY GroupName

In terms of performance, there is no difference between these three methods but readability. Most prefer AS keyword and it is the recommended way by industrial experts.

Why all clauses cannot refer aliases added?
This is because of the logical order query processing. Clauses like GROUP BY, WHERE, and HAVING are processed prior to SELECT. Therefore the aliases used are unknown to them. Below image shows a SELECT statement. Elements in it are numbered based on the processing order.

Order

Note that you can refer aliases with ORDER BY. The reason for it is, ORDER BY (7) is processed after the SELECT (6). Read more on this at: http://dinesql.blogspot.com/2011/04/logical-query-processing-phases-in.html.

Aliases for expressions: Should I write it again with GROUP BY?
We assign aliases for expressions used in SELECT. However, as the alias cannot be referred with a clause like GROUP BY, same expression has to be duplicated which increases the cost of maintainability. It can be overcome by implementing a table expression without hindering the performance.

1 -- expression is set with 2 -- both column and group by 3 SELECT 4 YEAR(OrderDate) OrderYear 5 , SUM(SubTotal) TotalAmount 6 FROM Sales.SalesOrderHeader 7 GROUP BY YEAR(OrderDate) 8 9 -- expression is set only with 10 -- the column 11 SELECT 12 OrderYear 13 , SUM(SubTotal) TotalAmount 14 FROM (SELECT 15 YEAR(OrderDate) OrderYear 16 , SubTotal 17 FROM Sales.SalesOrderHeader) Sales 18 GROUP BY OrderYear
Aliases for tables
It is possible to have aliases for tables too. It does not support all three ways but AS keyword and adding right after the table name. Here is an example;

1 SELECT Orders.OrderDate 2 , Orders.SubTotal 3 FROM Sales.SalesOrderHeader AS Orders 4 5 SELECT h.PurchaseOrderNumber OrderNumber 6 , p.Name Product 7 , d.LineTotal 8 FROM Sales.SalesOrderHeader h 9 INNER JOIN Sales.SalesOrderDetail d 10 ON h.SalesOrderID = d.SalesOrderID 11 INNER JOIN Production.Product p 12 ON d.ProductID = p.ProductID

No comments: