Looping Without Cursors
Sometimes you need to process a table row-by-row,
and the established approach is to use cursors,
which are verbose, slow, and painful to code
and use.
The Cursor Example
Here is the basic minimum syntax required to
loop through a table and get something done.
The SQL flavor is MS SQL Server, but its not
much better in any other flavor.
-- I coded this off the top of my head, there
-- may be a minor syntax error or two
-- Most of this is pseudo-code, but take
-- note that it is ordered on column1
declare someCursorName cursor for
select column1, column2, column3
from anyTable
ORDER BY column1
-- Have to do this now
open someCursorName
-- Now you need to declare some variables
-- For the example I'm just making everything int
declare @column1 int
, @column2 int
, @column3 int
-- Gosh, we're actually about to start the loop! Finally!
fetch next from someCursorName into @column1,@column2,@column3
while @@fetch_status = 0 begin
-- If you still remember what you actually wanted
-- to do inside the loop, code it here:
-- Repeat this line from the top here again:
fetch next from someCursorName into @column1,@column2,@column3
end
-- Not done yet, these two lines are crucial
close someCursorName
deallocate someCursorName
Call me petty, but what I hate about that code is that I
have to refer to specific columns of interest 3 times (not
counting the declarations). You refer to them in the
cursor declaration and in the two FETCH commands. With
a little clever coding, we can vastly simplify this
and do it only once.
Using An Ordered Column
We can execute the same loop without the cursor if
one of the columns is ordered and unique. Let us say
that column1 is the primary key, and is an auto-incremented
integer. So it is ordered and unique. The code now
collapses down to:
-- I coded this off the top of my head, there
-- may be a minor syntax error or two
-- We can't get around declaring the vars, so do that
declare @column1 int
, @column2 int
, @column3 int
-- If you know a safe value for initialization, you
-- can use the code below. If this is not 100%
-- safe, you must query for the value or it must
-- be supplied from some other source
set @column1 = -1
-- BONUS POINTS: Can this become an infinite loop?
while 1 = 1 begin
-- Now we code the query and exit condition
select TOP 1
@column1 = column1
, @column2 = column2
, @column3 = column3
from anyTable
WHERE column1 > @column1 -- this is what advances the loop
ORDER BY column1
if @@rowcount = 0 begin
break
end
-- Put the actions here
end
Final Notes
The only requirement for this approach is
that you have a unique ordered column.
This usually means a unique key or primary
key. If "column1" is not unique, the loop
will skip all but the first value in each
group.
Also, it is very nice if you know a safe
value to use as an initializer. Without that,
you must query for the minimum value that matches
the condition and then decrement it by one.
Finally, can this loop become infinite? No.
Well, if, in the extremely unlikely situation
that rows are being added to the base table faster
than you are processing them, then yes, it could
go on for a very long time. But if that were
happening I'd say there was a separate problem to
look at.
It should probably go without saying, but if
the particular loop is going to happen very
often, the table should be indexed on your
unique ordered column. If it is a primary key
or you already have a unique constraint it is not
necessary to create an index explicitly because
there will be one as part of the key or constraint.
No comments:
Post a Comment