Tuning, Optimizing, Increasing and Improving Performance of Asp.Net Application - Part III
Posted on Saturday, January 23, 2010 9:33:20 PM
I gathered these items from around the internet, which is why I have included as many links as possible. This is a kind of checklist to benefit from. For each item in the list I gave the original link to its author's web site.
In this third and last part of the three-part series, I will try to give you some of the precautions we had better pay attention to before going public with our ASP.NET application. These come from my collection from the internet; I just wanted to give you a checklist with as many items as possible. In this third part I focus on MS SQL Server and T-SQL, with some tips and best practices I gathered from my reading. If you know of other useful links (I believe there are some) that would benefit others, please share them in a comment.
2-) DataReaders provide a fast and efficient method of data retrieval. A DataReader is much faster than a DataSet as far as performance is concerned. Prefer a DataReader over a DataSet unless you have a specific reason to use a DataSet.
If you are reading a table sequentially, you should use the DataReader rather than the DataSet. The DataReader object creates a read-only, forward-only stream of data that will increase your application performance because only one row is in memory at a time.
3-) Always use built in .Net data providers. The built in .Net data providers allow you to take advantage of both the .Net framework and the full power of the database.
4-) Always use a config file to store your connection strings. It is always best to store data that might change in a location outside of your application where you can easily update the connection strings. Also encrypting the connection strings is always a good idea from a security standpoint.
5-) Prefer to sort, group and filter on the SQL Server with the ORDER BY, GROUP BY and HAVING clauses rather than in client code. By performing this work on the server side as opposed to the client side you usually save time, because the server can generally perform the work faster and sends back only the rows you need.
6-) You should always try to limit the number of rows in a resultset. This can be performed typically by using the TOP keyword or other similar methods. By limiting the amount of information you send through the wire you make the application seem faster.
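As a small illustration of limiting a resultset, the following sketch assumes the pubs sample database that the later examples in this list use, and returns only the ten most expensive titles rather than the whole table.

```sql
-- Return only the ten most expensive titles instead of every row.
-- Assumes the pubs sample database (dbo.titles with a price column).
SELECT TOP 10 title_id, title, price
FROM dbo.titles
ORDER BY price DESC
```

The ORDER BY matters here: without it, TOP returns an arbitrary set of ten rows.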
7-) It is always best to use CommandBehavior.CloseConnection when you use the ExecuteReader method of a Command object. This allows for better connection pooling as the connections that are opened are returned quickly.
8-) It is best to call Cancel on the associated Command object before closing a DataReader if you have finished reading but rows remain unread. Otherwise the Close method of the DataReader class reads through all remaining rows before it finally closes the object. This is a wasteful use of resources.
9-) It is always best to use a parameterized command (usually a stored procedure) over dynamic SQL queries. This will improve performance and reduce the chance of a SQL injection attack, while also making your code much easier to maintain.
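The same idea can be sketched on the T-SQL side with sp_executesql, which accepts parameters separately from the statement text. This is only an illustration against the pubs sample database; the variable names are made up for the example, and in an ASP.NET application the equivalent would be a command with a parameters collection.

```sql
DECLARE @sql nvarchar(200)
DECLARE @state char(2)
SET @state = 'CA'   -- illustrative value; imagine it comes from user input

-- The value travels as a typed parameter, so it is never parsed as SQL text,
-- and the statement text stays identical between calls, which helps plan reuse.
SET @sql = N'SELECT au_id, au_lname FROM dbo.authors WHERE state = @st'
EXEC sp_executesql @sql, N'@st char(2)', @st = @state
```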
10-) Use database paging over normal paging when displaying a huge amount of data.
It is always best to implement some sort of resultset pagination when dealing with results of 50 or more rows. Although it is not an easy task in most cases, using this technique you can increase performance on both your database server and your client application, as less overhead and network traffic is incurred at any one time.
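One common way to page on the database side is to number the rows and return only the requested slice. The sketch below assumes SQL Server 2005 or later (for ROW_NUMBER) and the pubs sample database; @PageNumber and @PageSize are hypothetical parameters that would normally be passed in by the caller.

```sql
-- Hypothetical paging parameters; in practice these come from the caller.
DECLARE @PageNumber int, @PageSize int
SET @PageNumber = 2
SET @PageSize = 10

-- Number the rows in a stable order, then return only the requested page.
SELECT au_id, au_lname, au_fname
FROM (
    SELECT au_id, au_lname, au_fname,
           ROW_NUMBER() OVER (ORDER BY au_lname, au_fname, au_id) AS RowNum
    FROM dbo.authors
) AS Numbered
WHERE RowNum BETWEEN (@PageNumber - 1) * @PageSize + 1
                 AND @PageNumber * @PageSize
ORDER BY RowNum
```

Including the key column (au_id) at the end of the ordering keeps the row numbers deterministic when two authors share a name.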
11-) Consider SET NOCOUNT ON for SQL Server
Microsoft SQL Server offers a SET option called NOCOUNT. It is off by default, so each operation returns a message reporting the number of rows affected. Many applications do not need this information. If you turn on the NOCOUNT option, stored procedures will not return row-count information, and you save the network overhead of sending it to the client. To set NOCOUNT, insert SET NOCOUNT ON as the first statement in the stored procedure, as shown in the following.
CREATE PROCEDURE dbo.procSample AS
SET NOCOUNT ON
SELECT au_id FROM authors
Running a query to select the author id field returns "(23 row(s) affected)" at the bottom of the query results. On the other hand, running the stored procedure returns only the author id field without the extra message. The reduced network load can be significant if the particular query or update involves a high number of transactions.
12-) Optimize table access with NOLOCK
Not all database access requires full transactional isolation. This is evident in the popularity of the MySQL database product, which does not supply any record-locking capability (although the 4.0 release is supposed to support transactions). A stored procedure, or any access to a database table in SQL, can make considerable performance gains if you use a table hint that lets the SQL engine skip taking locks for a given operation. Be aware that NOLOCK reads uncommitted data, so a query can see rows that are later rolled back; use it only where such dirty reads are acceptable. Take a close look at your applications and you will see many queries that can ignore locking and still return valid information.
Consider the following T-SQL, which shows a stored procedure that loops over the entire set of records in the authors table to obtain a count. Modifying that routine to no longer perform locking yields a tremendous reduction (for 23 records, perhaps a modest reduction) in overhead.
CREATE PROCEDURE dbo.procSample AS
SET NOCOUNT ON
DECLARE @recCount int
SELECT @recCount = COUNT(au_id) FROM authors WITH (NOLOCK)
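13-) Use the asynchronous ADO.NET calls where they fit. ADO.NET 2.0 and later support asynchronous command execution (for example SqlCommand.BeginExecuteReader), and ASP.NET 2.0 and later support asynchronous pages, so a request thread need not block while the database works. If you execute the same command many times, call Prepare on the command so that the statement is prepared once and reused; this can improve performance.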
14-) Keep Your Datasets Lean
Remember that the dataset stores all of its data in memory, and that the more data you request, the longer it will take to transmit across the wire. Therefore, put only the records you need into the dataset.
15-) Use Sequential Access as Often as Possible
With a data reader, use CommandBehavior.SequentialAccess. This is essential for dealing with BLOB data types since it allows data to be read off the wire in small chunks. While you can only work with one piece of the data at a time, the latency for loading a large data type disappears. If you don't need to work with the whole object at once, using SequentialAccess will give you much better performance.
SequentialAccess provides a way for the DataReader to handle rows that contain columns with large binary values (BLOBs). Rather than loading the entire row, SequentialAccess enables the DataReader to load data as a stream. When setting the DataReader to use SequentialAccess, it is important to note the sequence in which you access the fields returned. The default behaviour of the DataReader, which loads an entire row as soon as it is available, allows you to access the fields returned in any order until the next row is read. When using SequentialAccess, however, you must access the different fields returned by the DataReader in order. For example, if your query returns three columns, the third of which is a BLOB, you must read the values of the first and second fields before accessing the BLOB data in the third field. If you access the third field before the first or second fields, the first and second field values will no longer be available. This is because SequentialAccess has modified the DataReader to return data in sequence, and the data will not be available after the DataReader has read past it.
When accessing the data in the BLOB field, use the GetBytes or GetChars typed accessors of the DataReader, which fill an array with data. You can also use GetString for character data, however to conserve system resources you may not want to load an entire BLOB value into a single string variable. You can specify a specific buffer size of data to be returned, and a starting location for the first byte or character to be read from the returned data. GetBytes and GetChars will return a long value, which represents the number of bytes or characters returned. If you pass a null array to GetBytes or GetChars, the long value returned will be the total number of bytes or characters in the BLOB. You can optionally specify an index in the array as a starting position for the data being read.
16-) Do Not Use CommandBuilder at Run Time
CommandBuilder objects such as SqlCommandBuilder and OleDbCommandBuilder are useful when you are designing and prototyping your application. However, you should not use them in production applications. The processing required to generate the commands affects performance.
Manually create stored procedures for your commands, or use the Visual Studio .NET design-time wizard and customize them later if necessary.
18-) Use Stored Procedures Whenever Possible
Stored procedures are highly optimized tools that result in excellent performance when used effectively.
Set up stored procedures to handle inserts, updates, and deletes with the data adapter.
Stored procedures do not have to be interpreted, compiled or even transmitted from the client, and cut down on both network traffic and server overhead. Be sure to use CommandType.StoredProcedure instead of CommandType.Text.
19-) Try to avoid using SQL Server cursors, whenever possible.
SQL Server cursors can result in some performance degradation in comparison with SELECT statements. Try to use a correlated subquery or derived tables if you need to perform row-by-row operations.
Always stick to a set-based approach instead of a procedural approach for accessing and manipulating data. In many cases a cursor can be replaced by a plain SELECT statement. If row-by-row processing is unavoidable, consider a simple WHILE loop over the table instead; it is often reported to be faster than a cursor, though you should measure it for your own workload. For a WHILE loop to replace a cursor you need a column (primary key or unique key) that identifies each row uniquely, and I personally believe every table must have a primary or unique key.
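The WHILE-loop pattern described above might look like the following minimal sketch. The @Orders table variable and its columns are hypothetical and exist only to make the example self-contained; the loop walks the primary key in ascending order instead of opening a cursor.

```sql
-- Hypothetical table with an integer primary key, used only for illustration.
DECLARE @Orders TABLE (OrderID int PRIMARY KEY, Amount money)
INSERT INTO @Orders (OrderID, Amount) VALUES (1, 10.00)
INSERT INTO @Orders (OrderID, Amount) VALUES (2, 25.50)
INSERT INTO @Orders (OrderID, Amount) VALUES (5, 7.25)

DECLARE @CurrentID int, @Amount money

-- Start with the smallest key.
SELECT @CurrentID = MIN(OrderID) FROM @Orders

WHILE @CurrentID IS NOT NULL
BEGIN
    SELECT @Amount = Amount FROM @Orders WHERE OrderID = @CurrentID

    -- Per-row work goes here; PRINT stands in for it.
    PRINT 'Order ' + CAST(@CurrentID AS varchar(10)) + ': ' + CAST(@Amount AS varchar(20))

    -- Move to the next key. MIN returns NULL when no key is left, which ends the loop.
    SELECT @CurrentID = MIN(OrderID) FROM @Orders WHERE OrderID > @CurrentID
END
```

Because the loop relies on the key being unique and ordered, gaps in the key values (such as the jump from 2 to 5) are handled without any special casing.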
20-) Do not depend on undocumented functionality. The reasons being:
- You will not get support from Microsoft, when something goes wrong with your undocumented code
- Undocumented functionality is not guaranteed to exist (or behave the same) in a future release or service pack, thereby breaking your code
21-) Try not to use system tables directly. System table structures may change in a future release. Wherever possible, use the sp_help* stored procedures or INFORMATION_SCHEMA views. There will be situations where you cannot avoid accessing system tables, though!
According to SQL Server Books Online, "these components constitute a published API for obtaining system information from SQL Server. Microsoft maintains the compatibility of these components from release to release. The format of the system tables depends on the internal architecture of SQL Server and might change from release to release. Therefore, you might have to change applications that directly access the system tables before the applications can access a later version of SQL Server."
22-) Make sure you normalize your data at least to third normal form. At the same time, do not compromise on query performance. A little denormalization can help queries perform faster.
23-) Write comments in your stored procedures, triggers and SQL batches generously, whenever something is not very obvious. This helps other programmers understand your code clearly. Do not worry about the length of the comments, as it will not impact the performance, unlike interpreted languages like ASP 2.0.
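24-) Try to avoid the IN clause with a subquery. When you only need to check whether matching rows exist, use EXISTS instead of IN. EXISTS returns only a Boolean (true/false) result and can stop at the first match, whereas IN compares against the values produced by the subquery and is often slower. NULL values in that list also make IN, and especially NOT IN, behave in ways that are easy to get wrong.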
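To make the comparison concrete, here is the same question ("which authors have at least one title?") written both ways against the pubs sample database. This is a sketch for comparison only; on many versions the optimizer produces the same plan for both, so check the execution plan before assuming a difference.

```sql
-- IN form: the subquery produces a list of values to compare against.
SELECT a.au_id, a.au_lname
FROM dbo.authors a
WHERE a.au_id IN (SELECT ta.au_id FROM dbo.titleauthor ta)

-- EXISTS form: only asks whether at least one matching row exists.
SELECT a.au_id, a.au_lname
FROM dbo.authors a
WHERE EXISTS (SELECT 1 FROM dbo.titleauthor ta WHERE ta.au_id = a.au_id)
```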
25-) Avoid DISTINCT/ORDER BY clause : If you do not need the DISTINCT/ORDER BY clause, then try to avoid them. Unnecessary DISTINCT or ORDER BY clauses will cause extra work for the database, so it makes performance slower.
26-) Do not use sp prefix when you write stored procedure.
Do not use the "sp_" prefix in a user-created stored procedure name, as the "sp_" prefix is reserved for system stored procedures. Any stored procedure that has the "sp_" prefix is looked up in the master database first. If a stored procedure with the same name exists in both the user database and a system database, only the one in the system database gets executed.
27-) Do not use SELECT * in your queries. Always write the required column names after the SELECT statement, like SELECT CustomerID, CustomerFirstName, City. This technique results in less disk IO and less network traffic and hence better performance.
28-) Avoid the creation of temporary tables while processing data, as much as possible, as creating a temporary table means more disk IO. Consider advanced SQL or views or table variables of SQL Server 2000 or derived tables, instead of temporary tables. Keep in mind that, in some cases, using a temporary table performs better than a highly complicated query.
29-) Try to avoid wildcard characters at the beginning of a word while searching using the LIKE keyword, as that results in an index scan, which defeats the purpose of having an index. The first of the following statements results in an index scan, while the second results in an index seek:
1. SELECT LocationID FROM Locations WHERE Specialities LIKE '%pples'
2. SELECT LocationID FROM Locations WHERE Specialities LIKE 'A%s'
Also avoid searching with not equals operators (<> and NOT) as they result in table and index scans. If you must do heavy text-based searches, consider using the Full-Text search feature of SQL Server for better performance.
30-) Use 'Derived tables' wherever possible, as they perform better. Consider the following query to find the second highest salary from Employees table:
SELECT MIN(Salary) FROM Employees WHERE EmpID IN (SELECT TOP 2 EmpID FROM Employees ORDER BY Salary Desc)
The same query can be re-written using a derived table as shown below, and it performs twice as fast as the above query:
SELECT MIN(Salary) FROM (SELECT TOP 2 Salary FROM Employees ORDER BY Salary Desc) AS A
This is just an example; the results might differ in different scenarios depending upon the database design, indexes, volume of data and so on. So test all the possible ways a query could be written and go with the most efficient one. With some practice and an understanding of how the SQL Server optimizer works, you will be able to come up with the best possible queries without this trial-and-error method.
31-) While designing your database, keep performance in mind. It is hard to tune performance later, when your database is in production, as it involves rebuilding tables and indexes and rewriting queries. Use the graphical execution plan in Query Analyzer or the SHOWPLAN_TEXT or SHOWPLAN_ALL commands to analyze your queries. Make sure your queries do index seeks instead of index scans or table scans. A table scan or an index scan on a large table is expensive and should be avoided where possible (sometimes, when the table is small or when the whole table needs to be processed, the optimizer will choose a table or index scan).
32-) Prefix the table names with owner names, as this improves readability and avoids unnecessary confusion. Microsoft SQL Server Books Online even states that qualifying table names with owner names helps in execution plan reuse.
33-) Use the more readable ANSI-standard JOIN clauses instead of the old-style joins. With ANSI joins the WHERE clause is used only for filtering data, whereas with old-style joins the WHERE clause handles both the join condition and the filtering. The first of the following two queries shows an old-style join, while the second one shows the ANSI join syntax:
SELECT a.au_id, t.title FROM titles t, authors a, titleauthor ta
WHERE a.au_id = ta.au_id AND ta.title_id = t.title_id AND t.title LIKE '%Computer%'
SELECT a.au_id, t.title FROM authors a
INNER JOIN titleauthor ta ON a.au_id = ta.au_id
INNER JOIN titles t ON ta.title_id = t.title_id WHERE t.title LIKE '%Computer%'
Be aware that the old style *= and =* left and right outer join syntax may not be supported in a future release of SQL Server, so you are better off adopting the ANSI standard outer join syntax.
34-) Views are generally used to show specific data to specific users based on their interest. Views are also used to restrict access to the base tables by granting permission on only views. Yet another significant use of views is that, they simplify your queries. Incorporate your frequently required complicated joins and calculations into a view, so that you do not have to repeat those joins/calculations in all your queries, instead just select from the view.
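As a sketch of wrapping a frequently repeated join in a view, the following uses the pubs sample tables; the view name vwAuthorTitles is made up for this example.

```sql
-- Hypothetical view name; the join itself uses the pubs sample tables.
CREATE VIEW dbo.vwAuthorTitles AS
SELECT a.au_id, a.au_lname, a.au_fname, t.title_id, t.title
FROM dbo.authors a
INNER JOIN dbo.titleauthor ta ON a.au_id = ta.au_id
INNER JOIN dbo.titles t ON ta.title_id = t.title_id
GO

-- Callers select from the view and never repeat the joins.
SELECT au_lname, title
FROM dbo.vwAuthorTitles
WHERE title LIKE 'Computer%'
```

Permissions can then be granted on the view alone, which matches the access-restriction use mentioned above.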
35-) Use 'User Defined Datatypes', if a particular column repeats in a lot of your tables, so that the datatype of that column is consistent across all your tables.
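A user-defined datatype can be registered once and then reused in every table that needs it. The sketch below uses the sp_addtype system procedure, which fits the SQL Server 2000 era of these tips (later versions also offer CREATE TYPE); the type name PhoneNumber and the Contacts table are hypothetical.

```sql
-- Hypothetical type: a phone number stored consistently as varchar(20), nullable.
EXEC sp_addtype 'PhoneNumber', 'varchar(20)', 'NULL'
GO

-- Every column declared with the type gets the same underlying definition.
CREATE TABLE dbo.Contacts (
    ContactID int PRIMARY KEY,
    HomePhone PhoneNumber,
    WorkPhone PhoneNumber
)
```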
36-) Do not let your front-end applications query/manipulate the data directly using SELECT or INSERT/UPDATE/DELETE statements. Instead, create stored procedures, and let your applications access these stored procedures. This keeps the data access clean and consistent across all the modules of your application, at the same time centralizing the business logic within the database.
37-) Try not to use text, ntext datatypes for storing large textual data. 'text' datatype has some inherent problems associated with it. You can not directly write, update text data using INSERT, UPDATE statements (You have to use special statements like READTEXT, WRITETEXT and UPDATETEXT). There are a lot of bugs associated with replicating tables containing text columns. So, if you don't have to store more than 8 KB of text, use char(8000) or varchar(8000) datatypes.
38-) If you have a choice, do not store binary files, image files (Binary large objects or BLOBs) etc. inside the database. Instead store the path to the binary/image file in the database and use that as a pointer to the actual binary file. Retrieving, manipulating these large binary files is better performed outside the database and after all, database is not meant for storing files.
39-) Use char data type for a column, only when the column is non-nullable. If a char column is nullable, it is treated as a fixed length column in SQL Server 7.0+. So, a char(100), when NULL, will eat up 100 bytes, resulting in space wastage. So, use varchar(100) in this situation. Of course, variable length columns do have a very little processing overhead over fixed length columns. Carefully choose between char and varchar depending up on the length of the data you are going to store.
40-) Avoid dynamic SQL statements as much as possible. Dynamic SQL tends to be slower than static SQL, as SQL Server must generate an execution plan every time at runtime. IF and CASE statements come in handy to avoid dynamic SQL. Another major disadvantage of using dynamic SQL is that, it requires the users to have direct access permissions on all accessed objects like tables and views. Generally, users are given access to the stored procedures which reference the tables, but not directly on the tables. In this case, dynamic SQL will not work. Consider the following scenario, where a user named 'dSQLuser' is added to the pubs database, and is granted access to a procedure named 'dSQLproc', but not on any other tables in the pubs database. The procedure dSQLproc executes a direct SELECT on titles table and that works. The second statement runs the same SELECT on titles table, using dynamic SQL and it fails with the following error:
Server: Msg 229, Level 14, State 5, Line 1
SELECT permission denied on object 'titles', database 'pubs', owner 'dbo'.
To reproduce the above problem, use the following commands:
EXEC sp_addlogin 'dSQLuser'
GO
EXEC sp_defaultdb 'dSQLuser', 'pubs'
GO
USE pubs
GO
EXEC sp_adduser 'dSQLuser', 'dSQLuser'
GO
CREATE PROC dSQLProc AS
SELECT * FROM titles WHERE title_id = 'BU1032' --This works
DECLARE @str CHAR(100)
SET @str = 'SELECT * FROM titles WHERE title_id = ''BU1032'''
EXEC (@str) --This fails
GO
GRANT EXEC ON dSQLProc TO dSQLuser
GO
Now login to the pubs database using the login dSQLuser and execute the procedure dSQLproc to see the problem.
41-) Consider the following drawbacks before using IDENTITY property for generating primary keys. IDENTITY is very much SQL Server specific, and you will have problems if you want to support different database backends for your application. IDENTITY columns have other inherent problems. IDENTITY columns run out of numbers one day or the other. Numbers can not be reused automatically, after deleting rows. Replication and IDENTITY columns do not always get along well. So, come up with an algorithm to generate a primary key, in the front-end or from within the inserting stored procedure. There could be issues with generating your own primary keys too, like concurrency while generating the key, running out of values. So, consider both the options and go with the one that suits you well.
42-) Minimize the usage of NULLs, as they often confuse the front-end applications, unless the applications are coded intelligently to eliminate NULLs or convert the NULLs into some other form. Any expression that deals with NULL results in a NULL output. ISNULL and COALESCE functions are helpful in dealing with NULL values. Here's an example that explains the problem:
Consider the following table, Customers which stores the names of the customers and the middle name can be NULL.
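The original example table is not reproduced here, so the following is a minimal sketch of the kind of problem described, using a hypothetical Customers table variable with made-up column names and data. It assumes the default CONCAT_NULL_YIELDS_NULL setting.

```sql
-- Hypothetical Customers table; MiddleName is allowed to be NULL.
DECLARE @Customers TABLE (
    FirstName  varchar(20) NOT NULL,
    MiddleName varchar(20) NULL,
    LastName   varchar(20) NOT NULL
)
INSERT INTO @Customers (FirstName, MiddleName, LastName) VALUES ('Jane', NULL, 'Doe')

-- The NULL middle name turns the whole concatenation into NULL.
SELECT FirstName + ' ' + MiddleName + ' ' + LastName AS FullName
FROM @Customers

-- COALESCE substitutes an empty string when the middle name is missing,
-- so the full name is returned as 'Jane Doe'.
SELECT FirstName + ' ' + COALESCE(MiddleName + ' ', '') + LastName AS FullName
FROM @Customers
```

Placing the trailing space inside the COALESCE argument avoids a double space when the middle name is absent.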
43-) Use Unicode datatypes like nchar, nvarchar, ntext, if your database is going to store not just plain English characters, but a variety of characters used all over the world. Use these datatypes, only when they are absolutely needed as they need twice as much space as non-unicode datatypes.
44-) Always use a column list in your INSERT statements. This helps in avoiding problems when the table structure changes (like adding a column). Here's an example which shows the problem.
Consider the following table:
CREATE TABLE EuropeanCountries(CountryID int PRIMARY KEY,CountryName varchar(25))
Here's an INSERT statement without a column list, that works perfectly:
INSERT INTO EuropeanCountries VALUES (1, 'Ireland')
Now, let's add a new column to this table:
ALTER TABLE EuropeanCountries ADD EuroSupport bit
Now run the above INSERT statement. You get the following error from SQL Server:
Server: Msg 213, Level 16, State 4, Line 1
Insert Error: Column name or number of supplied values does not match table definition.
This problem can be avoided by writing an INSERT statement with a column list as shown below:
INSERT INTO EuropeanCountries (CountryID, CountryName) VALUES (2, 'England')
45-) Perform all your referential integrity checks, data validations using constraints (foreign key and check constraints). These constraints are faster than triggers. So, use triggers only for auditing, custom tasks and validations that can not be performed using these constraints. These constraints save you time as well, as you don't have to write code for these validations and the RDBMS will do all the work for you.
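A declarative version of such checks could look like the sketch below. The Departments and Staff tables and the constraint names are hypothetical; the point is that the foreign key and the CHECK constraint are enforced by the engine with no trigger code.

```sql
-- Hypothetical parent table.
CREATE TABLE dbo.Departments (
    DeptID   int PRIMARY KEY,
    DeptName varchar(50) NOT NULL
)

-- Hypothetical child table: the foreign key enforces referential integrity,
-- and the CHECK constraint validates the data, both without a trigger.
CREATE TABLE dbo.Staff (
    StaffID int PRIMARY KEY,
    DeptID  int NOT NULL
        CONSTRAINT FK_Staff_Departments REFERENCES dbo.Departments (DeptID),
    Salary  money NOT NULL
        CONSTRAINT CK_Staff_Salary CHECK (Salary >= 0)
)
```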
46-) Always access tables in the same order in all your stored procedures/triggers consistently. This helps in avoiding deadlocks. Other things to keep in mind to avoid deadlocks are: Keep your transactions as short as possible. Touch as little data as possible during a transaction. Never, ever wait for user input in the middle of a transaction. Do not use higher level locking hints or restrictive isolation levels unless they are absolutely needed. Make your front-end applications deadlock-intelligent, that is, these applications should be able to resubmit the transaction in case the previous transaction fails with error 1205. In your applications, process all the results returned by SQL Server immediately, so that the locks on the processed rows are released, hence no blocking.
47-) Offload tasks like string manipulations, concatenations, row numbering, case conversions, type conversions etc. to the front-end applications, if these operations are going to consume more CPU cycles on the database server (It's okay to do simple string manipulations on the database end though). Also try to do basic validations in the front-end itself during data entry. This saves unnecessary network round trips.
48-) If back-end portability is your concern, stay away from bit manipulations with T-SQL, as this is very much RDBMS specific. Further, using bitmaps to represent different states of a particular entity conflicts with the normalization rules.
49-) Consider adding a @Debug parameter to your stored procedures. This can be of bit data type. When a 1 is passed for this parameter, print all the intermediate results, variable contents using SELECT or PRINT statements and when 0 is passed do not print debug information. This helps in quick debugging of stored procedures, as you don't have to add and remove these PRINT/SELECT statements before and after troubleshooting problems.
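A minimal sketch of the @Debug idea, using the pubs authors table; the procedure name and its @State parameter are made up for illustration.

```sql
-- Hypothetical procedure; @Debug defaults to 0 so normal callers are unaffected.
CREATE PROCEDURE dbo.procCountAuthorsByState
    @State char(2),
    @Debug bit = 0
AS
SET NOCOUNT ON
DECLARE @recCount int

SELECT @recCount = COUNT(*) FROM dbo.authors WHERE state = @State

-- Intermediate values are printed only when the caller asks for them.
IF @Debug = 1
    PRINT 'State = ' + @State + ', count = ' + CAST(@recCount AS varchar(10))

SELECT @recCount AS AuthorCount
GO

-- Troubleshooting call with debug output switched on.
EXEC dbo.procCountAuthorsByState @State = 'CA', @Debug = 1
```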
50-) Do not call functions repeatedly within your stored procedures, triggers, functions and batches. For example, you might need the length of a string variable in many places of your procedure, but do not call the LEN function whenever it is needed, instead, call the LEN function once, and store the result in a variable, for later use.
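For instance, the length of a string can be computed once and reused, as in this small sketch (the variable names and the sample value are illustrative only):

```sql
DECLARE @title varchar(80), @len int
SET @title = 'The Busy Executive''s Database Guide'

-- Call LEN once and keep the result for later use.
SET @len = LEN(@title)

IF @len > 30
    PRINT 'Long title (' + CAST(@len AS varchar(10)) + ' characters)'

PRINT 'Room left in an 80-character column: ' + CAST(80 - @len AS varchar(10))
```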
51-) Make sure your stored procedures always return a value indicating the status. Standardize on the return values of stored procedures for success and failures. The RETURN statement is meant for returning the execution status only, but not data. If you need to return data, use OUTPUT parameters.
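The split between status and data can be sketched as follows against the pubs titles table. The procedure name and the convention "0 = success, 1 = not found" are assumptions made for this example, not a standard.

```sql
-- Hypothetical procedure: data goes out through an OUTPUT parameter,
-- while RETURN carries only the execution status.
CREATE PROCEDURE dbo.procGetTitlePrice
    @TitleID varchar(6),
    @Price   money OUTPUT
AS
SET NOCOUNT ON

SELECT @Price = price FROM dbo.titles WHERE title_id = @TitleID

IF @@ROWCOUNT = 0
    RETURN 1   -- assumed convention: 1 = title not found

RETURN 0       -- assumed convention: 0 = success
GO

-- The caller receives the status and the data separately.
DECLARE @rc int, @p money
EXEC @rc = dbo.procGetTitlePrice @TitleID = 'BU1032', @Price = @p OUTPUT
SELECT @rc AS ReturnStatus, @p AS Price
```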
52-) If your stored procedure always returns a single row resultset, consider returning the resultset using OUTPUT parameters instead of a SELECT statement, as ADO handles output parameters faster than resultsets returned by SELECT statements.
53-) Always check the global variable @@ERROR immediately after executing a data manipulation statement (like INSERT/UPDATE/DELETE), so that you can rollback the transaction in case of an error (@@ERROR will be greater than 0 in case of an error). This is important, because, by default, SQL Server will not rollback all the previous changes within a transaction if a particular statement fails. This behavior can be changed by executing SET XACT_ABORT ON. The @@ROWCOUNT variable also plays an important role in determining how many rows were affected by a previous data manipulation (also, retrieval) statement, and based on that you could choose to commit or rollback a particular transaction.
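The pattern of checking @@ERROR after each statement might be sketched like this, using the pubs titles table with arbitrary price adjustments. Note that @@ERROR is reset by every statement, so it is copied into a local variable straight away.

```sql
DECLARE @err int

BEGIN TRANSACTION

UPDATE dbo.titles SET price = price * 1.1 WHERE type = 'business'
-- Capture @@ERROR immediately; the next statement would reset it.
SET @err = @@ERROR
IF @err <> 0
BEGIN
    ROLLBACK TRANSACTION
    RETURN
END

UPDATE dbo.titles SET price = price * 0.9 WHERE type = 'psychology'
SET @err = @@ERROR
IF @err <> 0
BEGIN
    ROLLBACK TRANSACTION
    RETURN
END

-- Both statements succeeded, so make the changes permanent.
COMMIT TRANSACTION
```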
54-) To make SQL statements more readable, start each clause on a new line and indent when needed. Following is an example:
SELECT title_id, title
FROM titles
WHERE title LIKE 'Computing%' OR
      title LIKE 'Gardening%'
55-) Though we survived the Y2K, always store 4 digit years in dates (especially, when using char or int datatype columns), instead of 2 digit years to avoid any confusion and problems. This is not a problem with datetime columns, as the century is stored even if you specify a 2 digit year. But it is always a good practice to specify 4 digit years even with datetime datatype columns.
56-) In your queries and other SQL statements, always represent dates in an unambiguous, year-first format such as yyyy/mm/dd. The unseparated yyyymmdd form is the safest choice for datetime values, because SQL Server interprets it the same way regardless of the language and DATEFORMAT settings of the connection. This also prevents the following error, while working with dates:
Server: Msg 242, Level 16, State 3, Line 2
The conversion of a char data type to a datetime data type resulted in an out-of-range datetime value.
57-) As is true with any other programming language, do not use GOTO or use it sparingly. Excessive usage of GOTO can lead to hard-to-read-and-understand code.
58-) Do not forget to enforce unique constraints on your alternate keys.
59-) Always be consistent with the usage of case in your code. On a case insensitive server, your code might work fine, but it will fail on a case sensitive SQL Server if your code is not consistent in case. For example, if you create a table in SQL Server or database that has a case-sensitive or binary sort order, all references to the table must use the same case that was specified in the CREATE TABLE statement. If you name the table as 'MyTable' in the CREATE TABLE statement and use 'mytable' in the SELECT statement, you get an 'object not found' or 'invalid object name' error.
60-) Though T-SQL has no concept of constants (like the ones in C language), variables will serve the same purpose. Using variables instead of constant values within your SQL statements, improves readability and maintainability of your code. Consider the following example:
UPDATE dbo.Orders SET OrderStatus = 5 WHERE OrdDate < '2001/10/25'
The same update statement can be re-written in a more readable form as shown below:
DECLARE @ORDER_PENDING int
SET @ORDER_PENDING = 5
UPDATE dbo.Orders SET OrderStatus = @ORDER_PENDING WHERE OrdDate < '2001/10/25'
61-) Do not use column numbers in the ORDER BY clause, as it impairs the readability of the SQL statement. Further, changing the order of columns in the SELECT list has no impact on the ORDER BY when the columns are referred to by name instead of by number. Consider the following example, in which the second query is more readable than the first one:
SELECT OrderID, OrderDate FROM Orders ORDER BY 2
SELECT OrderID, OrderDate FROM Orders ORDER BY OrderDate