Introduction into Caching in SQL Server 2000
This topic has been in my minds for quite some
time now. Just thought to get them into a understandable article for
public use ... Well, most of the DBA's and many programmers of SQL Server
should have heard of the terms data cahe and procedure cahe in conjunction with
SQL Server. These are the primary building blocks or parts of the caching
mechanism we all talk about. But what is this cache on first place?
SQL Server is configured to use a pool of memory of
the server and it will allocate the majority of this memory pool to hold both
data pages that have been read and the compiled execution plans for all
Transact-SQL statements. This as said is a Dynamic set of memory that gets
allocated. These constitute the data and the procedural cache we mentioned
earlier. In versions prior to SQL Server 2000 these were two
distinct entities and we could manage the memory allocated to each of
them separately. From SQL 2000, these have been merged into one big unit and
SQL Server manages the allocation internally.
Understanding Cache Objects
SQL Server has a couple of roles to manage memory: one
it needs to free up buffers or a inbuilt aging execution plans. Basically
buffer acts as a page in memory with the same footprint of the size as a data
or index. Hence a buffer page is equivalent to a single page on the database.
The aging mechanism is handled by a process called lazywriter.
Mostly familiar to people who have had a close watch on SQL Server. Lazywriter
has been in the SQL Server architecture for quite some time now and acts as a
clocked algorithm that scans the memory for stale or unused objects.
As the lazywriter visits each buffer it will determine
whether that buffer has been referenced since the last lazywriter sweep, it
does this by examining a reference count value in the buffer header, the
reference count is adjusted up by 1 each time a statement references that
buffer. If the reference count is not 0, the buffer will stay in the pool, but
its reference count will be adjusted downward for the next sweep.
The lazywriter will also sweep the buffer pool when the number of pages on the
free list falls below a minimum value, this value is computed as a percentage
of the overall buffer pool size but is always between 128KB and 4MB. SQL Server
will adjust this computed size based on the load on the system and the number
of buffer stalls occurring.
As I had mentioned earlier the data is just a part of
the memory. There are execution plans that are also a part of the memory cache
that need to be monitored. By execution plans we mean these are compiled plans
of stored procedures or they can also mean ad-hoc plans that occur due to
Dynamic SQL (D-SQL) execution. There are also statements that are run using
sp_executeSQL, SQLPrepare/SQLExecute, autoparameterizing feature etc that get
into this memory space. I will not deal with them in detail at this section.
The concept works almost the same way as we worked for data pages. But instead
of reads we associate a cost to the execution plan, mostly a factor of the IO
and the CPU cost to the execution plan. The lazywriter mechanism still
sweeps these data and decrements the value. When the cost associated becomes 0
these plans can be taken off from the memory. One important point I wanted to
specify is that the plans or the data cache is not removed until there is a
need for memory and the SQL Server is running low on memory. Only the need
drives the clearing of the cache.
More on Cache Ojects
The cache objects can be queried using the system
table "syscacheobjects". The schema for the same in SQL Server 2000 as in BOL
(Books Online) is as below:
|
Column
name
|
Data
type
|
Description
|
|
bucketid
|
int
|
Bucket
ID. Value indicates a range from 0 through (directory size - 1). Directory size
is the size of the hash table.
|
|
cacheobjtype
|
nvarchar(34)
|
Type
of object in the cache:
Compiled
Plan
Executable Plan
Parse Tree
Cursor Parse Tree
Extended Stored Procedure
|
|
objtype
|
nvarchar(16)
|
Type
of object:
Stored
Procedure
Prepared statement
Ad hoc query (Transact-SQL submitted as language events from isql or osql, as
opposed to remote procedure calls)
ReplProc (replication procedure)
Trigger
View
Default
User table
System table
Check
Rule
|
|
objid
|
int
|
One
of the main keys used for looking up an object in the cache. This is the object
ID stored in sysobjects for database objects (procedures, views, triggers, and
so on). For cache objects such as ad hoc or prepared SQL, objid is an
internally generated value.
|
|
dbid
|
smallint
|
Database
ID in which the cache object was compiled.
|
|
uid
|
smallint
|
Indicates
the creator of the plan for ad hoc query plans and prepared plans. -2 indicates
the batch submitted does not depend on implicit name resolution and can be
shared among different users. This is the preferred method. Any other value
represents the user ID of the user submitting the query in the database.
|
|
refcounts
|
int
|
Number
of other cache objects referencing this cache object. A count of 1 is the base.
|
|
usecounts
|
int
|
Number
of times this cache object has been used since inception.
|
|
pagesused
|
int
|
Number
of memory pages consumed by the cache object.
|
|
setopts
|
int
|
SET
option settings that affect a compiled plan. These are part of the cache key.
Changes to values in this column indicate users have modified SET options.
These options include:
ANSI_PADDING
FORCEPLAN
CONCAT_NULL_YIELDS_NULL
ANSI_WARNINGS
ANSI_NULLS
QUOTED_IDENTIFIER
ANSI_NULL_DFLT_ON
ANSI_NULL_DFLT_OFF
|
|
langid
|
smallint
|
Language
ID. ID of the language of the connection that created the cache object.
|
|
dateformat
|
smallint
|
Date
format of the connection that created the cache object.
|
|
status
|
int
|
Indicates
whether the cache object is a cursor plan. Currently, only the least
significant bit is used.
|
|
sqlbytes
|
int
|
Length
of name or batch submitted. Can be used to distinguish two names or submitted
batches if the first 128 characters are the same.
|
|
sql
|
nvarchar(256)
|
Procedure
name or first 128 characters of the batch submitted.
|
An interesting schema to take a close look at. Note:
the data cache we discussed earlier will not be a part of this tables
data. One column that struck me for the first time is the "setopt"
column in this table. Very intersting column and values ... Understand that
each of these SET values can potentially change the behaviour of the executing
statement. Classic example is the usage of CONCAT_NULL_YIELDS_NULL can yield
different results based on this option for NULL+"String Value" type of code
blocks. And to get the values of these settings use the below query to get the
SET options associated with a plan generated:
declare @n int
SET @n = 251
SELECT SetOption, sign( p2 ) AS
IsEnabled
FROM (
SELECT 'ANSI_PADDING',
POWER( 2, 0 )
UNION ALL
SELECT 'FORCEPLAN', POWER( 2, 1 )
UNION ALL
SELECT 'CONCAT_NULL_YIELDS_NULL',
POWER( 2, 2 )
UNION
ALL
SELECT 'ANSI_WARNINGS', POWER( 2, 3 )
UNION ALL
SELECT 'ANSI_NULLS', POWER( 2, 4 )
UNION ALL
SELECT 'QUOTED_IDENTIFIER', POWER( 2,
5 )
UNION ALL
SELECT 'ANSI_NULL_DFLT_ON', POWER( 2,
6 )
UNION ALL
SELECT 'ANSI_NULL_DFLT_OFF', POWER(
2, 7 )
) AS p2( setoption, p2
)
WHERE NULLIF( p2 & @n, p2 ) IS
NULL
Note: Most of these are undocumented and a little bit
of testing turning each and every option and viewing the value can lead you to
the apt settings.
Maximizing Caching Capabilities
There are many ways you can tend to use these caching
capabilities of SQL Server unknowingly. But the fact remains we can
intentionally use these feature in our code also. In this section I will deal
with how we can store or PIN our tables of interest in memory using some
documented DBCC Commands.
I don't personally recommend using these commands for
real-time production environment applications. This is because by pinning we
force or use up the memory to pull these data in memory in a persisted way
untill a server/service restart. This means that these data pages stay in the
meory even if they are least used. I also disagree with the school of thought
that says to PIN all the frequently used data. Basically the frequently used
data are always in memory as they are frequently accessed and hence the
lazywriter will not flush them from memory. Hence from my experience I feel PIN
those data that are not frequently used but are costlier to access using an IO.
This situation is most unlikely to happen. Next, in the various myths is the
one that states SQL Server loads the complete table into memory when we PIN
them. Wrong, when we PIN a table and as and when records get accessed the data
pages are PINNED into the memory. This means there can be records that are not
in the memory when a table is pinned but needs an IO. This is possible and
after reading from the disk the data pages get pinned.
DBCC PINTABLE
First in the usage of the PINTABLE command. And the
BOL explanation says it marks a table to be pinned, meaning SQL Server will not
flush the pages for the table from the memory.
--Use
the pubs database
USE
pubs
DECLARE
@dbid INTEGER
SET
@dbid = DB_ID('pubs')
--Determine
id number for the dbo.authors table
DECLARE
@obid INTEGER
SET
@obid = OBJECT_ID('dbo.authors')
--Pin
the dbo.authors table to memory
DBCC
PINTABLE (@dbid,@obid)
GO
DBCC UNPINTABLE
Once we PIN tables into the meory it is our
responsibility to un-pin these data from memory again programatically. And the
code block looks the same except for the last line:
--Use
the pubs database
USE
pubs
DECLARE
@dbid INTEGER
SET
@dbid = DB_ID('pubs')
--Determine
id number for the dbo.authors table
DECLARE
@obid INTEGER
SET
@obid = OBJECT_ID('dbo.authors')
--UnPin
the dbo.authors table to memory
DBCC
UNPINTABLE (@dbid,@obid)
GO
DBCC PROCCACHE
The contents available for the stored procedure cache
can be returned using this command. Interesting command to execute.
--Return
the procedure cache information
DBCC
PROCCACHE
GO
DBCC MEMORYSTATUS
A host of information is displayed by this command.
Gives the usage of memory on the local server in general.
--Use
the master database
USE
master
--Show
memory usage on the database
DBCC
MEMORYSTATUS
GO
Conclusion
More often than not developers or system analysts are
faced with a problem to understand how the memory management is functioning at
the SQL Server. Sometimes these knowledge can help develop and deliver proper
capacity planning for SQL Server. Exploring and troubleshooting SQL Server
Cache is an art and now I've taken you one step closer to mastering the art.
PS: Refer to DBCC
FREEPROCCACHE, DBCC FLUSHPROCINDB, DBCC DROPCLEANBUFFERS, CHECKPOINT for memory
related other commands. I repeat the earlier disclaimer, use these statement
with utmost care under production enviroments. Refer to SQL Server BOL for
futher details.
|