Home > T - SQL > IDENTITY IN SQL

IDENTITY IN SQL

Exploring each table in SSMS, just to check whether a particular table has IDENTITY property set or not i.e. presence of any IDENTITY column in the given table; might be a cumbersome task, especially when the number of tables in the given database is more than hundred. We need some shortcut to perform this type of tasks. Let’s discuss what identity property is and how to use some shortcuts i.e. custom procedures to get what we want.

What is IDENTITY Column?

This is same as any other regular column we define in T-SQL while creating table. But it is associated with a special type of property called IDENTITY which gives it some special powers J and is set while creating the column. Let’s check what these special powers are:

1. IDENTITY columns got populated automatically. No need to insert data into IDENTITY columns explicitly (but rules are made to be broken ☺, we can insert and will explain how in below sections).
2. It cannot accept NULL values. As soon as you define a column as IDENTITY, it becomes a NOT NULL column.
3. It contains UNIQUE values; (although it’s not always true, it is meant to have unique values but we can have duplicates if we want, shown in below sections.)
4. Moreover, it is the best tool to generate sequence numbers like 1, 2, 3, … N (although we have other methods to do the same job :)  )

All powers come with some short of limitation with it.

1. IDENTITY column supports only numerical data type excluding float or double. It supports only int, bigint, smallint, tinyint, or decimal / numeric with a scale of 0.

2. No Unique constraint. The data populated in the IDENTITY column cannot guarantee of its uniqueness. Duplicate data can be inserted into IDENTITY columns explicitly.

3. No GUI indicator in SSMS to detect IDENTITY column within a table.

Hard to say whether userId in the above figure is an IDENTITY column or not.




Now let’s check out how to create IDENTITY column. It’s very simple; we have to just specify the IDENTITY property to a column while defining it. It can be done while creating or altering table but not while altering column.

Syntax:

IDENTITY ( seed, increment )

It takes two parameters-

  • Seed: The starting point of the counter i.e. the sequence will start from 5 and will increment for each record inserted.  Default value is 1.
  • Increment: The value to be added to the seed value / previous IDENTITY value, for each insertion of the record. It can accept negative values too, in case the requirement is to decrement the IDENTITY value for each insertion. Default value is 1.

Only one argument is not acceptable. Either you have to specify Seed and Increment both or neither.

CREATE TABLE <Table_Name>
(
 <Identity_Column_Name> int IDENTITY(1,1) [NOT NULL  -- Optional],
 <Column2> varchar (100),
 <Column3> char(1),
 .......
);
 
-- Or,
ALTER TABLE <Table_Name>
ADD <Identity_Column_Name> int IDENTITY(1,1) [NOT NULL  -- Optional]

Example:

CREATE TABLE [dbo].[DimUser]
(
	[userId] int Identity(1,1) ,
	[userName] varchar(100) NULL
)
 
Or,
CREATE TABLE [dbo].[DimUser]
(
	[userName] varchar(100) NULL
)
GO
 
ALTER TABLE [dbo].[DimUser]
ADD [userId] int IDENTITY(1,1)

If you want to set IDENTITY property to an existing column, then you have to drop the existing column and create/ add new column with the same name along with IDENTITY property set.

Let’s insert some data.

INSERT  INTO [dbo].[DimUser]
VALUES ('Arun'),
('John'),
('Bunty');
 
SELECT [userId],[userName]
FROM   DimUser;

Let’s do some modifications to the create script so that it would increment by 2 starting from 0.

CREATE TABLE [dbo].[DimUser]
(
	[userId] int IDENTITY(0,2) ,
	[userName] varchar(100) NULL
)
 
GO
INSERT  INTO [dbo].[DimUser]
VALUES ('Arun'),
('John'),
('Bunty');
GO
 
SELECT [userId],[userName]
FROM   DimUser;

Now let’s check out if it is decrementing or not.

CREATE TABLE [dbo].[DimUser]
(
	[userId] int IDENTITY(1,-1) ,
	[userName] varchar(100) NULL
)
 
GO
INSERT  INTO [dbo].[DimUser]
VALUES ('Arun'),
('John'),
('Bunty'),
('Stenly'),
('Kumar');
GO
 
SELECT [userId],[userName]
FROM   DimUser;

Once you created an IDENTITY column, you might have noticed that the Null ability property is set to NOT NULL automatically because IDENTITY column does not support NULL values.

IDENTITY columns cannot have DEFAULT constraints.

How to control IDENTITY property?

Till now we know what IDENTITY is and how to set IDENTITY. But sometimes we need more flexibility than only initiating the Identities while creating.  Let’s check out the features coming  with IDENTITY columns.

1. $IDENTITY: In order to reference IDENTITY columns in a table we can use $IDENTITY in our T-SQL statement i.e. in SELECT clause, WHERE clause, GROUP BY, ORDER BY, etc. Although, we can use the IDENTITY column name itself, but $IDENTITY is useful in case we don’t know the name of the IDENTITY column within a table.

Example:

SELECT   $IDENTITY,  --Global Identifier
         [userId],   --Identity Column
         [userName]
FROM     DimUser
WHERE    $IDENTITY BETWEEN -2 AND 0
ORDER BY $IDENTITY DESC;

In case you are joining with other tables which is having the IDENTITY columns too, $IDENTITY can be referred by the alias of the corresponding table.

SELECT A.$IDENTITY, --Identity from first table
       B.$IDENTITY, --Identity from second table
       [userId],
       [userName]
FROM   dbo.DimUser AS A
       INNER JOIN
       dbo.DimUser1 AS B
       ON A.userId = B.userId1;

 

2. @@IDENTITY: This is very useful to get the current IDENTITY valueand is very flexible to use when we are performing some operations on the basis of last identity value inserted in given table. By using @@IDENTITY variable (I would rather say it as a function, you can see it under Programmability Functions System Functions Other Functions), we can avoid hitting the table to get the max identity value.

Example:

SELECT [userId],[userName]
FROM   [dbo].[DimUser];
GO
SELECT @@IDENTITY as [@@IDENTITY];

Drawbacks:

  • It works only after Insertion.
  • The value can be tampered by Insert statements executing on other tables.
  • The Scope of @@IDENTITY is valid within the query session (i.e. within one query window)

3. SCOPE_IDENTITY(): This is similar to @@IDENTITY but with a small difference. This too return the current Identity id with respect to Identity column but it works under the scope. Within one query session we may have different scopes. For example, inserting to Table A might trigger one more insertion statement for Table B. This is called two scopes.

Or, let’s check with another example, in SSMS query window; execute one insert statement for table A. After that execute one stored procedure in the same query window which in turn execute insert statement for table B. This too also called dual scope. One scope within the query window and the other scope inside stored procedure.

Let’s check how Scope_Identity() function got impacted in both the scopes.

Let’s prepare two tables:

--DROP table [dbo].[DimUser]
CREATE TABLE [dbo].[DimUser]
(
	[userId] int IDENTITY(1,1) ,
	[userName] varchar(100) NULL
)
 
GO
--DROP table [dbo].[DimUser1]
CREATE TABLE [dbo].[DimUser1]
(
	[userId1] int IDENTITY(1,1) ,
	[userName1] varchar(100) NULL
)
GO

Let’s create a SP named “sp_InsertData” as:

CREATE PROCEDURE sp_InsertData
AS
BEGIN
    INSERT  INTO [dbo].[DimUser1]
    VALUES ('xyz'),
    ('pqo'),
    ('abc');
END

[dbo].[DimUser1] is another table created similar to [dbo].[DimUser] stated above.

Now, let’s execute our sample query:

INSERT  INTO [dbo].[DimUser]
VALUES ('Arun'),
('John'),
('Bunty'),
('Stenly'),
('Kumar');
GO
EXEC sp_InsertData
 
SELECT @@IDENTITY AS [@@IDENTITY],
       SCOPE_IDENTITY() AS [SCOPE_IDENTITY];

Notice that Scope_Identity returns 5 i.e. actual records inserted in DimUser in current scope. But @@IDENTITY returns 3 i.e. no. of records inserted inside stored procedure. Hence, Scope_Identity might not be so reliable to do operations with respect to IDENTITY values.

The values will differ even if you insert records in the same table inside stored procedure to that in the query window. (Readers are requested to do this as practiceJ.) Because Scope_Identity is not able see what is happening outside its scope.

4. IDENT_CURRENT: One more amazing and flawless function to get the current Identity value for a given table. This accepts only one parameter i.e. the table name, for which the identity value needs to be extracted.  The scope is not a boundary here. It can work anywhere.

SELECT IDENT_CURRENT('[dbo].[DimUser]')  AS DimUserIdentity,
       IDENT_CURRENT('[dbo].[DimUser1]') AS DimUser1Identity;

If the table doesn’t have Identity column, it will return NULL.

5. IDENT_SEED and IDENT_INCR: These two functions will return the initial Identity configuration related to any table.

--DROP table [dbo].[DimUser]
CREATE TABLE [dbo].[DimUser]
(
	[userId] int Identity(5,2) ,
	[userName] varchar(100) NULL
)
INSERT  INTO [dbo].[DimUser]
VALUES ('Arun'),
('John'),
('Bunty'),
('Stenly'),
('Kumar');
GO
SELECT IDENT_SEED('[dbo].[DimUser]') AS [IDENT_SEED],
       IDENT_INCR('[dbo].[DimUser]') AS [IDENT_INCR]

6. OBJECTPROPERTY: It is helpful just to check whether a given table has any identity column associated or not. Returns 1 for True and 0 for False.

SELECT OBJECTPROPERTY(OBJECT_ID('[dbo].[DimUser]'), 'TableHasIdentity') AS [TableHasIdentity];

It takes two parameters, one is the object_id for the given table and the other is the property name to check i.e. ‘TableHasIdentity’ property in our case. It might return NULL, if the provided table name or property name is wrong.

 

7. COLUMNPROPERTY: Another function to check whether a given column for a given table is an IDENTITY column or not. Returns 1 for True and 0 for False.

SELECT COLUMNPROPERTY(OBJECT_ID('[dbo].[DimUser]'), 'userId', 'IsIdentity') AS [IsIdentity];

It takes three parameters, id of table name, column name and property to check. It might return NULL if the provided table name or column name or property name is wrong.

 

8. IDENTITYPROPERTY: Mystery for me. This is not documented yet (as per my awareness :) ). It takes two parameters as below.

Will update the post as soon as I get some solid information about it.

 

DBCC CHECKIDENT: This is one of the DBCC (Database Console Command) statements which can perform the actions:

  • Shows current IDENTITY value for a given table.
  • Reseed IDENTITY to any new value provided.
  • Auto Reseed IDENTITY values if the IDENTITY value is less than the max value in the IDENTITY column.

Syntax:

DBCC CHECKIDENT 
( 
        table_name
        [ , { NORESEED | { RESEED [ , new_reseed_value ] } } ]
)
[ WITH NO_INFOMSGS ]

Example:  To check current IDENTITY value of a given table.

DBCC CHECKIDENT ('[dbo].[DimUser]', NORESEED)

OUTPUT

Example: To reseed IDENTITY value to a new value.

DBCC CHECKIDENT ('[dbo].[DimUser]', RESEED, 3);

It will reseed the identity value to 3 so that next time when any record will be inserted it will start from (3 + Incremented value) if the table is not empty otherwise it will start from 3 only.

--DROP table [dbo].[DimUser]
CREATE TABLE [dbo].[DimUser]
(
	[userId] int Identity(1,1) ,
	[userName] varchar(100) NULL
)
GO
INSERT  INTO [dbo].[DimUser]
VALUES ('Arun'),
('John'),
('Bunty'),
('Stenly'),
('Kumar');
GO
DBCC CHECKIDENT ('[dbo].[DimUser]', RESEED, 3);
GO
INSERT  INTO [dbo].[DimUser]
VALUES ('Arun Mallick'),
('John Roy')
GO
SELECT [userId],[userName]
FROM   DimUser;

Note that userId 4 and 5 are repeated twice.  Especially this type of scenario might result in duplicate values in Identity columns.

Example: Auto resetting Identity value if it is not correct.  This will work only if the Identity value is less than the MAX value in the identity column and the table should not be empty.  It will auto reseed the Identity value to the MAX value of the identity column.

DBCC CHECKIDENT ('[dbo].[DimUser]', RESEED)
-- Or,
DBCC CHECKIDENT ('[dbo].[DimUser]')

By default, the second parameter is “RESEED”. Hence, be careful while running DBCC CHECKIDENT.

Example:

--DROP table [dbo].[DimUser]
CREATE TABLE [dbo].[DimUser]
(
	[userId] int IDENTITY(1,1) ,
	[userName] varchar(100) NULL
)
GO
INSERT  INTO [dbo].[DimUser]
VALUES ('Arun'),
('John'),
('Bunty'),
('Stenly'),
('Kumar');
GO
DBCC CHECKIDENT ('[dbo].[DimUser]', RESEED, 3);
GO

OUTPUT

The current IDENTITY value is 3. Now if we run DBCC CHECKIDENT (with RESEED, optional) only. It will automatically reset it to the max IDENTITY value.

DBCC CHECKIDENT ('[dbo].[DimUser]')

We inserted five records initially before manipulating the IDENTITY values; hence the max value in the IDENTITY column is 5.

SET IDENTITY_INSERT: You might have noticed that we cannot insert the values explicitly to the IDENTITY columns by insert statements, it will throw error. But sometimes we might need to do so. What if the requirement is to copy all data exactly what is in the source table to the destination table even if it has some identity columns. Hence, we need to break the ruleJ.  We need to insert data to the identity columns by using insert statements and that we can do with the help of “SET IDENTITY_INSERT”.

This is a type of global flag for a particular session, which allows us to insert values to identity columns to any one table at one time.

i.e. within a session, if one table got entrance to IDENTITY_INSERT, then no other table within that session can gain IDENTITY_INSERT. To give IDENTITY_INSERT permission to other tables, one has to remove the IDENTITY_INSERT from the first table.

Syntax:

To set IDENTITY_INSERT:

SET IDENTITY_INSERT <Table_Name> ON

To Reset / Unset IDENTITY_INSERT:

SET IDENTITY_INSERT <Table_Name> OFF

Example:

--Setting IDENTITY_INSERT ON
SET IDENTITY_INSERT dbo.DimUser ON
INSERT INTO [TestDB].[dbo].[DimUser]
           (userId, userName)
     VALUES(6,'Ram')
 
--Resetting IDENTITY_INSERT so that to apply IDENTITY_INSERT on other tables
SET IDENTITY_INSERT dbo.DimUser OFF
SET IDENTITY_INSERT dbo.DimUser1 ON
INSERT INTO [TestDB].[dbo].[DimUser1]
           (userId1, userName1)
     VALUES(6,'Ram')
SET IDENTITY_INSERT dbo.DimUser1 OFF

Things to remember:

  • We cannot set IDENTITY_INSERT for tables with no identity columns
  • IDENTITY_INSERT works only within the session
  • Setting IDENTITY_INSERT in one session and inserting in another session will not work.
  • To SET IDENTITY_INSERT in subsequent tables, we need to reset the IDENTITY_INSERT from the previous table.
  • We can set IDENTITY_INSERT for more than one tables in different sessions i.e. one table for each session.
  • Need to specify the insert column list explicitly for IDENTITY_INSERT i.e. column names are required along with Insert statements.

 

Impact of Transactions / Rollback on IDENTITY: Here is one interesting fact about Identity that it will not reset automatically to its initial value after a ROLLBACK, i.e. Rollback cannot affect Identity values.

Example:

BEGIN TRAN
 
INSERT  INTO [dbo].[DimUser]
VALUES ('Arun'),
('John'),
('Bunty'),
('Stenly'),
('Kumar');
 
ROLLBACK
 
SELECT * FROM DimUser
SELECT IDENT_CURRENT('[dbo].[DimUser]')  AS DimUserIdentity

Even though our transaction got roll backed, the identity value still remain same. Hence, Rollback does not affect Identity. We need to explicitly reset identity if required by DBCC CHECKIDENT.

Stored Procedure to retrieve IDENTITY information from a database:

In real scenario, we face many problems while dealing with large volume of data and tables, especially when the number of tables is more than hundred. In data warehouse we maintain surrogate keys in our dimension tables which are basically the IDENTITY keys and are got populated automatically. It is quite true that we might miss to define IDENTITY property for some of the dimensions which might lead to unexpected results while loading data to its corresponding fact tables.

Hence, I have come up with one stored procedure that can be used to monitor this type of flaw while scripting for dimensions or fact. It basically checks for all identity columns in a given database and list them down for reference.

SP Name [dbo].[usp_RetrieveIdentityColumns]

Parameters@TableFilter (Optional parameter to filter the no. of tables to list)

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[usp_RetrieveIdentityColumns]') AND type in (N'P', N'PC'))
DROP PROCEDURE [dbo].[usp_RetrieveIdentityColumns]
GO
 
/*
 
EXEC [usp_RetrieveIdentityColumns] '%Dim%'
 
*/
CREATE PROCEDURE [dbo].[usp_RetrieveIdentityColumns]
@TableFilter VARCHAR (30)=''
AS
BEGIN
    SET NOCOUNT ON;
    IF LTRIM(RTRIM(ISNUll(@TableFilter, ''))) = ''
        SET @TableFilter = '';
    SELECT   ST.[name] AS "Table Name",
             SC.[name] AS "Column Name",
             CASE WHEN SC.status = 0x80 THEN 'True' ELSE 'False' END AS "Is Identity",
             IDENT_CURRENT(ST.[name]) AS "Current Identity Value",
             IDENT_SEED(ST.[name]) AS "Identity Seed",
             IDENT_INCR(ST.[name]) AS "Identity Increment",
             SType.[name] AS "Data Type",
             SC.length AS "Data Type Size"
    FROM     sys.tables AS ST
             INNER JOIN
             sys.syscolumns AS SC
             ON ST.[object_id] = SC.[id]
             INNER JOIN
             Sys.Types AS SType
             ON SType.user_type_id = SC.xusertype
    WHERE    SC.status = 0x80
             AND ST.[name] <> 'sysdiagrams'
             AND (@TableFilter = ''
                  OR ST.[name] LIKE @TableFilter)
    ORDER BY ST.[name], SC.colid;
    SET NOCOUNT OFF;
END
 
GO

Example:

EXEC [usp_RetrieveIdentityColumns]

EXEC [usp_RetrieveIdentityColumns] '%Product%'

 

 





Code Description:

System tables that are used to get the result set are:

  • sys.tables                      [For table related information]
  • sys.syscolumns          [For Column related information]
  • sys.types                       [For Data type related information]

The [sys].[syscolumns] table has a special column named “status” which flags special properties for columns. The value 0×80 under “status” column states that the column has IDENTITY property associated with it.

The above procedure can be done by using sys.columns instead of sys.syscolumns. (Expecting the readers to do it for practice.  :) )

Waiting for your valuable comments and suggestions.

Categories: T - SQL
  1. ramana
    August 21st, 2011 at 18:14 | #1

    Superb material. Looks like the whole encyclopedia of Identity is here. Very well explained.

  2. jorge
    August 21st, 2011 at 20:15 | #2

    Thanks for such nice explanation.

  3. Robyn
    August 24th, 2011 at 22:12 | #3

    nice article. Is there any alternative to Identity column. Basically I want to track each identity value while inserting to a table and use it somewhere else.

    Thanks,

  4. Satish
    September 8th, 2011 at 10:46 | #4

    I am reading your(SQL Lion) articles regularly all those are superb i can get more information. Thanks for your Wonderful articles

  5. Satish
    September 8th, 2011 at 10:46 | #5

    Satish :
    I am reading your(SQL Lion) articles regularly all those are superb i can get more information. Thanks for your Wonderful articles

  6. Grizzly
    September 12th, 2011 at 15:47 | #6

    Hats off to wohever wrote this up and posted it.

  7. Kannan
    September 15th, 2011 at 18:12 | #7

    You are rocking man! Excellent article i seen ever.

  8. AV
    October 23rd, 2011 at 08:54 | #8

    Great information. ONe of the best article ever on identity. Please post more of this.
    I am trying to do peer-to-peer replication in sql 2008. Is there any way in SQL server, I can change the increment in identity without having to create new table or create new column and migrate the data.
    Something like changing syscolumns table.
    I tried using DAC connection and setting sp_configure and all that but SQL 2008 is still blocking ad-hoc updates to sys tables.
    Wish anyone new how to change that.
    Thank you,
    AV

  9. October 23rd, 2011 at 09:36 | #9

    @AV,
    if you only needs a sequential number then i would suggest to go for ROW_NUMBER() [ http://www.sqllion.com/2010/07/row_number/ ] , but if you are looking for changing the identity value then go for dbcc checkident.

    And syscolumns is a system view and not a table. The corresponding table for this is sys.syscolrdb, which is not accessible normally. Even after using DAC mode, you can only see the table data but you cannot modify it.

    If you want to modify anything on system tables, then you have to restart your sql instance service in single user mode. Try this command » sqlservr.exe -m

    After that you can update the table using DAC mode. But this might create problems in sql server caching mechanisms. Hence, i wouldn’t suggest you to do this. Rest up to you.

    Thanks,
    Arun

  10. Aravind
    March 10th, 2012 at 12:04 | #10

    No Words!!!!!!!!! Thanks a ton…………
    :-) :-) :-) :-) :-) :-) :-) :-) :-)

  11. Charudatta
    June 13th, 2012 at 22:49 | #11

    I have not read such a detailed Information about Identity anywhere ..Too Good :)

  12. behzadabf
    April 24th, 2013 at 04:05 | #12

    OoOoh..!!! thankssssssssss…….

  13. Dinesh Sharma
    June 18th, 2013 at 12:04 | #13

    Very Nice Article Thanks Bro… :) :)

  14. Abie
    August 16th, 2013 at 03:03 | #14

    A table in SQL Server 2012 is used by cashiers, only to call-up sales that were submitted by salespeople. The idea is to keep the table empty. Whenever a sale is submitted by salespeople, the new sale appears as a record in the table and the cashier uses the identity increment number that was automatically generated to begin processing the check-out. This automatically generated number is becoming too large. We are looking for a way to limit these numbers in such a way that when the lowest number is used, the next record will re-use it. Is this possible?

  15. Mike
    August 21st, 2013 at 14:33 | #15

    The $Identity construct is welcome news to me. I have been using
    Select * from information_schema.columns where
    ColumnProperty(object_id(Table_name), COLUMN_NAME, ‘IsIdentity’) = 1
    to do the same job – $Identity is a nice shortcut that I did not know about.
    Identity columns in a replicated system are also a whole lot of fun. We generally set up our replicated identity columns manually to use an increment of 2 and set up one server to use odd values and the other one to use even values. It saves surprises across replicated systems as it guarantees no clashes.
    It is nice to read an article that covers the topic so well.

  1. No trackbacks yet.
You must be logged in to post a comment.