SQL Server 2005 has added a new format for queries called a Common Table Expression or CTE. CTE's are part of the
SQL:1999 Specification and represent further compliance with this specification by SQL Server. This article will cover using one of the many capabilities of CTE's, implementing recursive functions. A recursive function is a one that iteratively executes itself. Understanding how CTE's implement recursive functions is the first step to understanding the full capabilities of CTE's.
We will first look at the basic CTE syntax, then how recursive algorithms work, and finally how to use CTE's to implement them.
The basic CTE syntax is a WITH clause that is used as a preface to a SELECT, INSERT, UPDATE, or DELETE statement. In this article, we will be looking specifically at using a CTE with a SELECT statement. The figure below shows a simple usage of a CTE.
The CTE always starts with the keyword WITH followed by the name of the CTE. A CTE always produces a set of rows and the names of the columns for those rows following the name of the CTE. The AS keyword always follows the names of the columns.
The definition of a CTE is a SELECT statement, inside of parentheses, that produces a column for each column names listed after the name of the CTE.
The Avogadro CTE in the figure produces a single row that has a single column named constant, which contains Avogadro's number. The SELECT statement that follows the CTE uses it in the same way it would use a view named Avogadro. In fact, you can think of a CTE as a temporary view. The SELECT statement that follows the CTE selects the "constant" column for the single row produced by the CTE and outputs 6.02214199e+23.
A recursive algorithm is one that executes itself. It's easier to see how recursion works in a language like C# than T-SQL, so we use C# first to illustrate the concept then show how that maps into T-SQL. The figure below illustrates a recursive algorithm that calculates the sum of the numbers from first to last. For example, the sum of the numbers from 1 to 3 is 1 + 2 + 3, which is 6.
This program uses a recursive algorithm to calculate the sum of the numbers from two to 100, which is 5,049. It illustrates the three parts of every recursive algorithm; initialization, recursive call, and test for completion.
The "trick" to a recursive algorithm is that the function that implements it must have as parameters both its inputs and any working variables it needs to do its computation. The SumOfNumbers3 function implements the recursive algorithm. Two of its parameters are used for input, first and last, and one is used to hold a working variable, sum.
The SumOfNumbers function initializes the algorithm by setting sum to zero and passing in the first and last numbers when it executes SumOfNumbers3. The first thing the SumOfNumbers3 function does is to check to see if the first number greater than the last number (using negative logic). This is the test for completion. If the first is not greater than last it executes itself again passing in the next sum, that is sum+first, the next number to be added, and first+1, and last so that it can be used in the next test for completion. When the algorithm is complete, as indicated by the test for completion, it returns the sum.
A CTE can also be used to implement a recursive function.
This figure shows the three parts of a recursive algorithm. A CTE executes the SELECT statement that comes before the UNION ALL first and just once. This SELECT initializes the recursion by producing a single row that contains a zero, which initializes the sum, and the first and last parameters.
After it executes the SELECT that comes before the UNION ALL, a CTE executes the SELECT statement that comes after the UNION ALL repeatedly until it returns no rows. This is the recursive call and it returns a single row that consists of the updated sum, the updated first number, and the last number.
Once the first number is greater than the last number it does not produce a row and the CTE stops running. This is the test for completion.
Once the CTE has stopped, it has produced, in effect, a table with a row for each row the CTE has produced. The SELECT statement that follows the CTE selects the sum column from the last row the CTE produced. This row and column contain the sum of the numbers. The figure below shows the flow of the calculations produced by the CTE.
The SumOfNumbers function initializes the algorithm by executing the SELECT before the UNION ALL once and produces a single row that has sum =0, first=1, and last = 3. It then executes the SELECT that comes after the UNION ALL three times, each time producing a row where the sum is the sum from the previous row plus the first number from the previous row, and the first number in the previous first number plus one. For example the first of these SELECT's produces 1=0+1, 2 = 1+1, and 3.
When the forth SELECT statement is executed it produces no row because it's WHERE clause compares the previous row first number (4) to its last number (3) and finds that the first number is the bigger of the two.
Finally, the SELECT statement that follows the CTE selects the last row produced by the CTE, the one where the first number (4) is equal to the last number plus one (3 + 1). The sum column in this row is the sum of the numbers from one to three (6).
Let's recap where we are. A Common Table Expression, or CTE, is a T-SQL syntax for creating what is, in effect, a temporary view. A recursive functioncan be implemented using a special form of a CTE, that combines two Select statements with a UNION ALL. The SumOfNumbers function used in this article is easy toimplement without using a CTE, but it is an archetype that is often used to show the workings of a recursive algorithm.
This is just one capability of a CTE, it has quite a few more rabbits in its hat.
About the author
Dan Sullivan started his professional life at Digital Equipment Corporation working on the development of the PDP-11's. Later, he was the director of e3ngineering at Data Translation Incorporated, which developed software and hardware data acquisition products for PDP-11's and introduced their first products for the IBM PC. Dan was also the chief engineer at Delta Lab Research and introduced their first microprocessor-based sound equipment. For the past 15 years Dan has been an independent software consultant, initially for DOS and now for Microsoft Windows NT. His consulting projects spanned the range from GUI-based products for the electronics manufacturing industry to software support for high data rate MPEG2 processing systems. Almost all the projects he has worked on wove together a database, GUI, engineering data and performance and required mentoring the customer's programming staff. He is the president of Danal Technology Inc., his consulting company.