1 |
originally posted 2013-4-4 |
Using uniqueidentifiers (Guids) as primary/foreign keys in SQL Server has its advantages and disadvantages. There are a lot of good discussions of uniqueidentifiers vs. integers as keys out there. We use uniqueidentifiers and the purpose of this example is to address a problem that some encounter using them with large and active databases.
One problem is that primary keys are typically clustered indexes (they are stored in sort order), so if the key for a new record isn’t sort-wise at the end of the index, then the server must move some number of records down to free a slot for the new key. In an extreme case, the server performance can degrade because it’s spending too much time churning the indexes.
Uniqueidentifiers are longer keys (than integers), requiring a bit more work from the server to sort, but the real issue is that uniqueidentifiers generated with newid() are random and therefore not guaranteed to fall at the end of the index. This can be a problem for large tables that experience frequent inserts.
SQL Server exposes the newsequentialid() function that guarantees that 1) a uniqueidentifier is sort-wise larger than the previously generated uniqueidentifier for that SQL instance, so the server can append to the end of the index and avoid the churn, and 2) it retains the uniqueness characteristics that make uniqueidentifiers desirable. The problem with newsequentialid() is that it’s not entirely convenient to use. SQL Server 2005 thru 2008 (and likely 2012) require that the function only be used as a default value for a column.
So while you can do this (SQL 2008 example)…
[OK]
declare @id uniqueidentifier = newid();
This won’t work…
[NO!]
declare @id uniqueidentifier = newsequentialid();
So, if newsequentialid() can only be used as a default for a column and therefore you can only set it when inserting a new record and *not* including that field in the insert statement, then how can you get the value for that column in the newly inserted record?! Sometimes you need it and sometimes you don’t. The example shows how to get the value for those times when you need it. Here’s the table definition, look at the use of newsequentialguid() in the default constraint in the last statement.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
CREATE TABLE [dbo].[MyTable]( [Id] [uniqueidentifier] NOT NULL, [Namespace] [nvarchar](100) NOT NULL, [CreateDateTime] [datetime] NOT NULL, CONSTRAINT [PK_MyTable] PRIMARY KEY CLUSTERED ( [Id] ASC, )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO ALTER TABLE [dbo].[MyTable] ADD CONSTRAINT [DF_MyTable_CreateDateTime] DEFAULT (getdate()) FOR [CreateDateTime] GO ALTER TABLE [dbo].[MyTable] ADD CONSTRAINT [DF_MyTable_Id] DEFAULT (newsequentialid()) FOR [Id] GO |
Insert a record like the statement below, and SQL Server sets the default values for the [Id] and [CreateDateTime] fields. The [CreateDateTime] receives a default value of the current SQL Server time. The [Id] receives a default value of a new sequential uniqueidentifier. This works!
insert [MyTable]([Namespace]) values ('MyNamespace');
Insert a record like the statement below, and SQL Server sets the default value for the [CreateDateTime] field and you’re supplying the value for the [Id] field.
This solves the first problem (you know the value of [Id]), but creates another problem… the value for [Id] isn’t a sequential uniqueidentifier.
1 2 3 |
declare @id uniqueidentifier; set @id = newid(); insert [MyTable]([Namespace],[Id]) values ('MyNamespace', @id); |
Insert a record like the statement below, and SQL Server sets the default values for the [Id] and [CreateDateTime] fields (good so far) and the new record’s [Id] can be retrieved by using a table variable
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
--declare the table variable declare @tab table (id uniqueidentifier) --insert the new record insert [MyTable]([Namespace]) output inserted.id into @tab select N'MyNamespace' --declare the variable to hold the new [Id] value declare @id uniqueidentifier; --fetch the value of [Id] for the newly inserted record into the @id variable select @id = id from @tab |
Now you’ve got the best of both worlds! Here’s a stored procedure that sort of side-steps the issue with newsequentialid().
Even though a stored procedure isn’t always convenient, it works.
1 2 3 4 5 6 7 |
create procedure [dbo].[spNewSequentialId](@id uniqueidentifier out) as begin declare @tab table(dummy bit, id uniqueidentifier default (newsequentialid())); insert @tab (dummy) values (0); select @id=id from @tab end |
Wouldn’t it be great to turn this stored procedure into a function?! Yes it would, but it won’t work because of SQL’s side-effecting rules.
If you try something like this…
[NO!]
1 2 3 4 5 6 7 8 9 10 11 |
--[Won't work!] create function [dbo].[spNewSequentialId]() returns uniqueidentifier as begin declare @id uniqueidentifier; declare @tab table(dummy bit, id uniqueidentifier default (newsequentialid())); insert @tab (dummy) values (0); select @id=id from @tab; return @id; end |
… you’re rewarded for you efforts with the following message
1 2 |
<span style="color: #ff0000;">Msg 443, Level 16, State 1, Procedure spNewSequentialId, Line 6 Invalid use of a side-effecting operator 'newsequentialid' within a function.</span> |
A C# SQL CLR Solution
Here’s an alternate method, using a C# CLR assembly. This is a “Visual C# SQL CLR Database Project For SQL Server 2005-2008″, these project types have some limitations in terms of what libraries they can reference, they’re restricted to using .Net Framework 3.5 SP1, and they induce registration and versioning headaches outside of those with regular T-SQL schema statements.
I put some info at the end regarding registration of the DLL with SQL Server. It’s a little different with 2005 and 2008, and depending on what you’re doing, you may have to deal with assembly trust issues. If you have the energy and patience to get over the initial hump, then a CLR assembly solution just might be the thing for you.
The C#
The SeqGuid class implements the behavior and is called by the Common class, which is called by a SQL database object. The Common class is the one to which SQL objects are pointed (more on that below).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 |
[assembly: AllowPartiallyTrustedCallers] namespace MyStuff { public static class Common { public static Guid NewSequentialGuid() { return SeqGuid.NewSequentialGuid(); } } } ... using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Runtime.InteropServices; namespace MyStuff { public class SeqGuid { [DllImport("rpcrt4.dll", SetLastError = true)] static extern int UuidCreateSequential(out Guid guid); public static Guid NewSequentialGuid() { Guid g; if (UuidCreateSequential(out g) != 0) throw new System.ComponentModel.Win32Exception(System.Runtime.InteropServices.Marshal.GetLastWin32Error()); else { byte[] bytes = g.ToByteArray(); Array.Reverse(bytes, 0, 4); Array.Reverse(bytes, 4, 2); Array.Reverse(bytes, 6, 2); return new Guid(bytes); } } } } |
T-SQL CLR Assembly Registration (arcane incantations)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 |
--Note: To use ":setvar" command variables from the SQL Server Management Studio, --you'll have to check the "By default, open new queries in SQLCMD mode" box from --Tools->Options->Query Execution and you may have to open a new query window. :setvar User "database user name goes here" :setvar Db "database name goes here" :setvar AssemblyName "assembly name goes here (the full logical name, not the file name)" :setvar AssemblyPath "path to the assembly file, e.g. c:\projects\mystuff.dll" use [master] go alter authorization on database::$(Db) TO $(User); go GRANT UNSAFE ASSEMBLY TO $(User) go use root_properties go exec sp_changedbowner '$(User)' go sp_configure 'show advanced options', 1; go reconfigure; go sp_configure 'clr enabled', 1; go reconfigure; go alter database $(Db) set trustworthy on; go create assembly [$(Assembly)] from '$(AssemblyPath)' with permission_set = unsafe; go |
If that worked, then create database objects that call the CLR function. You should see the assembly in the Programmability->Assemblies under your database in SQL Server Management Studio. Keep in mind that your case may require more or less work for registration. There’s usually at least one wrench in the gears to remove
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
--a scalar function to get a new sequential guid. This calls the registered C# assembly create function [dbo].[fNewSequentialGuid]() returns uniqueidentifier as external name [<AssemblyName>].[MySqlStuff.Common].[NewSequentialGuid] GO --a table-valued function to get a new sequential guid create function [dbo].[tfGetNewSequentialGuid]() returns @tab table([Id] uniqueidentifier) as begin insert @tab select [dbo].[fNewSequentialGuid](); return; end GO |
If all went to plan, the new C# function should be callable from T-SQL
select [dbo].[fNewSequentialGuid]() --as a regular function that returns a scalar
or
select [Id] from [dbo].[tfGetNewSequentialGuid]() --as a table-valued function that returns a single-column, single-row table
Note: In cases where a CLR function is to be called between databases, or even instances (requires a linked server to be defined), you may have to wrap the SQL object
for it to be usable. I don’t know why, and there may be another really easy configuration solution for this, but I never got down to the brass tacks of the issue.
1 2 3 4 5 6 7 |
--a scalar function that wraps the [dbo].[fNewSequentialGuid](). It *should* be visible to other databases. create function [dbo].[fNewSequentialGuidX]() returns uniqueidentifier begin return [dbo].[fNewSequentialGuid](); end GO |