Stored procedures can grow to be quite
lengthy. I've created procedures that are 500 to 1000 lines or
more. There are a couple of reasons for this, including the limited
coding structure provided by T-SQL. In particular, the limited
error handling in SQL Server 2000 forces constant testing of error
values. This sometimes leads to somewhat awkward procedures. Over
the years I've developed some practices for writing large stored
procedures that I'd like to share with you.
As far as I'm concerned most complex programs
shouldn't be written in T-SQL. They should be in a client program
or a batch program that can be run on a server other than the one
hosting SQL Server. The reasons for this are mostly economic, due
to the cost of running code inside SQL Server and in particular, the
cost of scaling up the SQL Server. However, there are other
economies at work. In particular, the cost of programmer
productivity, training, and operations. These factors may weigh in
favor of T-SQL stored procedures.
Yet another set of factors is performance. A
T-SQL stored procedure runs in the SQL Server database engine and
its access to data is as close as possible. The result can be
staggeringly higher performance, in terms of run-time, than a
program running in another tier of the application.
So I accept that there are times when large
T-SQL stored procedures are necessary. Given that, I'd like to make
them as maintainable as possible. This is done with several
strategies:
- Proper naming of the procedure.
- Documentation in the program header
- Breaking stored procedures down into
subprocedures.
- Formatting the T-SQL for readability.
Previous articles have discussed the naming of
database objects including stored procedures. You might want to go
back and look at
Volume 3 #10
and #11,
which are about naming. In addition, as I started the "Style" theme
for Coding-in-SQL, Volume 3 #9 is about Seperator First Formatting (SFF)
for all T-SQL. I'll assume that you're familiar with these issues
and concentrate on the last two items. Breaking stored procedures
into subprocedures and the layout of large stored procedures.
Back when I was a freshman at Brown and took
Computer Science 100, Professor Van Damm taught us about creating
subroutines. Programs should be readable and almost all subroutines
should be less than a page in length. Of course, the only way to
see your subroutine was on greenbar paper or flipping through your
card deck. So the 66 line subroutine limit sort of got engrained in
my thought process.
Things have changed and we hardly see programs
on paper anymore. The screen is the most common vehicle for reading
most programs and the number of lines that you can see at a time is
much more limited than on a full sheet of paper. I can get around
40 lines on my screen when using SQL Query Analyzer. Of course, I'm
kind of old and I've pumped up the font size so I can read it. I'm
sure some of you can see more lines.
While I generally like to break code into
manageable chunks, there's a down side to this practice: poor
performance. The SQL Server database engine isn't a compiler it's
an interpreter. Calling a stored procedure from another stored
procedure requires that SQL Server at least consider whether it must
create a new plan for the called stored procedure. Since the
driving reason for creating a stored procedure may be the
performance, using a technique that slows you down does the opposite
of the original intent. Overall, I've had to strike a balance
between making sort subroutine type stored procedures and longer
procedures that perform better but might be more difficult to read.
One of the things that I've learned about
breaking any code down is that it isn't really necessary if the code
is linear. That is if there are no loops and the code only flows
from top to bottom, I can live without breaking it down into
subroutines. This helps a great deal when writing T-SQL because a
routine can often be arranged this way.
Error handling is another major factor in the
layout of a stored procedure. As I've written before T-SQL lacks
any sort of structured exception handling. You're really on your
own as far as structure goes. My typical structure for error
handling is to test for errors after every significant statement and
if an error is found, set the error message and then GOTO an error
handing block at the end of the procedure. Yes, you got that right,
I use GOTO all the time to handle errors. Professor Van Damm would
be rolling over in his grave, if he were dead.
What I’ve found over time is that trying to do
error handling in T-SQL without resorting to the GOTO statement
makes for much more convoluted code. Lot’s of indented IF blocks
and the like. In most procedures, I can keep the nesting level to a
minimum and keep the code pretty much linear from beginning to end.
The same holds for many Visual Basic 6 programs. VB6 also lacks
structured exception handling and the “ON ERROR GOTO” statement is a
form goto that you can’t live without. Fortunately, Visual Basic.Net
and C# have very robust structured exception handling.
Each section of my typical long stored
procedure is layed out approximately like this:
-----------------------------------------------------------------
-----------------------------------------------------------------
-- Comments about the section go here.
-- Possibly an additional comment about the next statement
DELETE or SELECT or INSERT …
SELECT
@myError = @@Error, @myRowCount = @@Rowcount
IF
@myError = 0
BEGIN
SET
@NumDelMsr = @myRowCount
SET
@Msg = 'Message describing success …’
exec
usp_AppSession_Trace @TraceMsgID
output,
@TraceSesID
,@myTLevel , 2,@Msg,@TraceSource,@TraceCategory
END
ELSE BEGIN
SET
@Msg = 'message
describing the error …’
exec
usp_AppSession_Trace @TraceMsgID
output,
@TraceSesID
,@myTLevel, 1,@Msg,@TraceSource,@TraceCategor
RAISERROR
(@Msg, 16, 1) – So the
message is in the output.
SET
@WorstError =
CASE WHEN
@WorstError < @myError
THEN
@MyError
ELSE
@WorstError
END
GOTO
GETOUT
END
The double line of dashes set the section apart
from the other sections. That’s followed by the SQL Statement that
really does the work. Next is the statement:
SELECT
@myError = @@Error, @myRowCount = @@Rowcount
This statement is needed after every SQL statement that could
set an error code of interest. Typically these are SELECT, INSERT,
UPDATE, DELETE, and EXEC statements. It’s important to retrieve the
value of @@ERROR immediately after it might have been set because
the next T-SQL Statement is going to reset @@ERROR.
Once the value of @@ERROR and @@ROWCOUNT are
safe in variables, they can be used in determining if the statement
succeeded and what to do about it. Usually for an error, I try and
write a trace statement, leave the message in the results, and GOTO
the GETOUT label, which is at the end of the procedure. If the
error forces a rollback, that may also be necessary. In that case,
trace messages written to the AppSessionTrace table are going to be
wiped out by the rollback, but the message in the procedures output
in the RAISERROR statement will get back to the caller.
The @WorstError variable keeps the worst error
code so that it can be used as the return code for the proc. Of
course, if the first error is all that you’re going to allow, you
can dispense with @WorstError.
At the bottom of the procedure is the GETOUT:
label that is referenced in the GOTO. Here’s the layout of the
typical bottom of a long procedure:
-----------------------------------------------------------------
-----------------------------------------------------------------
-- Wrap Up the Job
GETOUT:
SET
@Msg =
‘Construct a final message here.’
EXEC
usp_AppSession_End @TraceSesID
, 'AppSession|AppSessionTrace|AppSessionMeasurement'
, @Msg
, '|'
RETURN
@WorstError
For a full blown example, I refer you to a
recent article,
Volume 3 #7,
Trace Tables Part V: Purging Trace Tables. The procedure
usp_AppSessionTrace in that issue follows my typical procedure
layout.
The next topic is SQL Server 2005 programming
with the .Net CLR. This should be interesting. I’m making two
presentations about .net CLR Programming at Microsoft’s Code Camp V
on September 24-25. If you’re in the Boston Area, you should
regester and come check it out. I’d love to meet you.
Transact-SQL User-Defined Functions has been published! Take a look at it
now!
Do you have an interesting stored procedure, user-defined-function,
or other sample code? If you'd like to share them please send
them in an email to:
coding-in-sql@NovickSoftware.com
and they might be published in this newsletter. I try and
respond to every request that I get.
Thanks,
Andy
Andrew Novick