When you write a query to return the first few rows from a potential result set, you’ll often use the TOP clause. To give a precise meaning to the TOP operation, it will normally be accompanied by an ORDER BY clause. Together, the TOP…ORDER BY construction can be used to precisely identify which top ‘n’ rows should be returned.

Thinking about how this requirement might be implemented in an executable query plan, we might expect to see a Sort iterator followed by a Top. In reality, the query optimizer can often collapse these two related operations into a single iterator, a Sort running in Top N Sort mode.

That is an idea you might find familiar if you read my previous post Row Goals and Grouping, where we saw how a Sort followed by a Stream Aggregate can sometimes be collapsed into a Sort iterator running in Sort Distinct mode.

SQL Server’s normal sorting algorithms are suited to a wide range of ordering requirements. They work extremely well regardless of the data types involved, the size of data to be sorted, or the number of sort keys specified. They also make good use of available memory resources, and can spill to tempdb if required.

It is a common misconception that SQL Server will try to perform a sort entirely in memory if it can. In fact, the algorithms used are more complex: they aim to achieve a balance between memory usage and response time, while maintaining high levels of resource concurrency. Memory is a precious resource in the server, so SQL Server may spill a sort to tempdb, even if sufficient main memory is available.

Sorting with a Row Goal

As sophisticated and highly-tuned as the main sorting algorithms are, they do assume that the full set of sorted rows is always required. When we use a TOP expression, the FAST n query hint, or even an EXISTS clause, we indicate that we would prefer a plan (or part of a plan) optimized to produce the first ‘n’ rows quickly.

Say we have a million-row table, and we just want the first ten rows (in some particular order). It seems wasteful to fully sort all one million rows when we know that a maximum of ten rows will ultimately be needed. SQL Server addresses this problem by providing a second algorithm, optimized to return a small number of rows quickly. This second method must still examine all candidate rows of course, but it knows only to keep the ‘n’ highest-ranked rows as it goes.

It is important to realize that the two sorting approaches are complementary. The second method only works well in a fairly narrow set of circumstances, and is particularly unsuitable for large sorts since it cannot spill to tempdb.

This alternative algorithm can only be used by a Sort iterator running in Top N Sort mode. Normal Sort iterators always use the default sorting method. To be clear, the Top N Sort iterator may use either algorithm. Sadly, the query plan does not currently expose any information to identify which algorithm was used in a given query execution.

Achieving the Right Balance

As you might imagine, determining the optimal algorithm to use in a particular circumstance depends on some complex considerations. You might further imagine that SQL Server performs a number of fairly hairy calculations to determine the optimal choice, and those calculations might depend on heuristics as well as statistical information. No doubt, you may reason, many extremely clever people have given the best years of their careers to find a great way to make this important choice.

In fact, you would be quite wrong about that. SQL Server always uses the alternative algorithm where the top 100 (or fewer) rows are specified. In all other cases, the normal sorting routines are used. This simple heuristic often works well, though it is not at all difficult to engineer situations where it comes unstuck.

This first example is based on a question originally asked on the MSDN forums in December 2009.

At the end of this post, you will find links to the original forum thread (with important contributions from Brad Schulz among others), and subsequent blogs on the subject by Adam Haines and Fabiano Amorim.

As usual, we will need a test table and some sample data. The table has a clustered primary key on the identity column RowNum, an integer SomeId column containing random numbers, and a char(1000) padding column named SomeCode.

Here is the code necessary to create the above table and populate it with 50,000 rows:

    IF OBJECT_ID(N'dbo.TestData', N'U') IS NOT NULL
        DROP TABLE dbo.TestData;

    CREATE TABLE dbo.TestData
    (
        RowNum   integer IDENTITY NOT NULL,
        SomeId   integer NOT NULL,
        SomeCode char(1000) COLLATE LATIN1_GENERAL_CI_AS
                 NOT NULL DEFAULT (REPLICATE('X', 1000)),
        PRIMARY KEY CLUSTERED (RowNum)
    );
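To make the 100-row boundary concrete, here is a minimal sketch that fills the dbo.TestData table described in this post and then runs two TOP…ORDER BY queries on either side of that boundary. The population method (a cross join of sys.all_columns) and the range of the random SomeId values are assumptions made for illustration, not the original script. Both queries should show a Sort running in Top N Sort mode in their plans; since the plan does not reveal which algorithm ran, any difference would have to be inferred from timings or tempdb activity.

```sql
-- Assumption: dbo.TestData already exists with the columns described in the post
-- (RowNum identity, SomeId integer, SomeCode char(1000) with a default).

-- Hypothetical population step: 50,000 rows with pseudo-random SomeId values.
-- The modulo is applied before ABS to avoid overflow on the minimum integer.
INSERT dbo.TestData (SomeId)
SELECT TOP (50000)
    ABS(CHECKSUM(NEWID()) % 50000)
FROM sys.all_columns AS c1
CROSS JOIN sys.all_columns AS c2;

-- 100 rows: at or below the threshold described in the post.
SELECT TOP (100)
    td.RowNum, td.SomeId, td.SomeCode
FROM dbo.TestData AS td
ORDER BY td.SomeId;

-- 101 rows: just above the threshold.
SELECT TOP (101)
    td.RowNum, td.SomeId, td.SomeCode
FROM dbo.TestData AS td
ORDER BY td.SomeId;
```

Running each query with SET STATISTICS TIME ON, or with the actual execution plan enabled, is one way to compare the two cases.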