Monday, October 24, 2016

Disable PRIMARY KEY before data loading

In order to improve the loading performance, specifically when a data warehouse or a large table is loaded, it is recommended to disable constraints such as PRIMARY KEY, UNIQUE KEY and CHECK and indexes if the data is already cleansed and validated. Although we heavily practice this in data warehousing, we do have some limitations on OLTP databases because the way we structure tables is different from the way we do with data warehouses.

Can we disable primary key of the table and continue with data loading? The answer is, Yes and No.

Remember, when you mark one of your table columns as the primary key, it creates a Clustered Index for the column you picked. If the Clustered Index is already created (before creating the primary key), then it creates a Non-Clustered Index. If the primary key is created with a Clustered Index and if you disable it, table will not be able to accessed because the main structure is disabled. However, if the primary key is created with a non-clustered index, you can disable it and still work with it.

USE tempdb;
GO

-- create a table with a primary key which is clustered
CREATE TABLE dbo.Customer
(
 CustomerID int identity(1,1) not null
 , FirstName varchar(100) not null
 , Constraint PK_Customer PRIMARY KEY CLUSTERED (CustomerID)
);

-- create a table with a primary key which is nonclustered
CREATE TABLE dbo.Employee
(
 EmployeeID int identity(1,1) not null
 , FirstName varchar(100) not null
 , Constraint PK_Employee PRIMARY KEY NONCLUSTERED (EmployeeID)
);

-- Insert some records
INSERT INTO dbo.Customer
 (FirstName) VALUES ('Dinesh'), ('Yeshan');
INSERT INTO dbo.Employee
 (FirstName) VALUES ('Jane'), ('Jack');

-- checking records
-- This should return all recorded we inserted
SELECT * FROM dbo.Customer;
SELECT * FROM dbo.Employee;

-- Disable the primary key of the Customer
-- by disabling associated index
ALTER INDEX PK_Customer ON dbo.Customer DISABLE;

-- Now following statements are not possible
INSERT INTO dbo.Customer
 (FirstName) VALUES ('Kate'), ('Mihen');
SELECT * FROM dbo.Customer;

-- Disable the primary key of the Employee
-- by disabling the associated index
ALTER INDEX PK_Employee ON dbo.Employee DISABLE;

-- These statements work without any issue
INSERT INTO dbo.Employee
 (FirstName) VALUES ('Nazir'), ('Daniel');
SELECT * FROM dbo.Employee;

-- Enabling both
ALTER INDEX PK_Customer ON dbo.Customer REBUILD;
ALTER INDEX PK_Employee ON dbo.Employee REBUILD;

Same goes to Unique Constraint as well.

Note that this is possible with data warehouse because, as a best practice, we do not (always, there are exceptions) make the column marked as primary key as the clustered key. For an example, we may create a primary key using all foreign keys columns but we create the clustered index using DataKey column.

No comments: