After a recent database review session our development dba pointed out to me that he wants every table's primary key to be a identity column. I did not like this suggestion because I believe when dealing with reference tables the primary key should the same as the logical key. For transactional tables it makes more sense to the link the primary key to a physical key. Well after some research his logic starts to make sense. Here is why.
Primary keys
The primary key is nothing else then a physical unique identifier for every single row. Most RDBMs have this key implemented for internal use, but not available for the programmers use.
The primary key must to be stable. With other words it can NEVER be changed.
The primary key must never be the same as the logical key, because the logical key is made up of entities whose values can change.
Way too often databases contain many tables where an additional field was added to force the uniqueness of the logical key: e.g. A_ID got a SEQ_NUM added for that sole purpose. Meaning keys represent a hierarchy among the data. This is inflexible and limits the number of levels. And it is plain wrong.
If a logical key is used as primary key and this key is changed all foreign key references to this table have to be updated. Migration of logical keys used as primary keys will have terrible implications on database flexibility. A primary key independent of all attributes or relationships allows changes to the logical key without cascading changes. It allows maintaining data easily and sufficiently.
Another major advantage is the minimal impact when adding attributes to the logical key. Since entities are defined by relationships programmers are tempted to use foreign key columns in the primary key. As stated above this is bad practice because changing the primary key will lead to major changes in all client applications and stored procedures. When using a real primary key there would be only the change to the logical key in the involved tables and any place where inserts and updates happen. All references to primary keys stay the way they are.
Primary keys have to be
- unique
- not null
- stable, meaning unchanged over their lifetime
- different from the logical key
- internally assigned, else they belong in the logical key
- logical meaningless, to avoid the use of them
Do all tables need to have primary keys?
Yes. Just do it!
When a table contains rows which have to be referenced from other tables in the database then that table absolutely needs a primary key. In all other cases it depends on the way programmers use the data. In cases it will make sense to add the primary key to simplify the physical view of the data. From past experience we know that systems and their uses change.
To be future-compatible always adding the primary key is best practice. Primary keys should be integers filled from one generator per DB.
Benefits of primary keys
- Improved speed
- Improved flexibility
- Simplification of the database schema
- Easier transition to databases from other vendors because foreign key constraints are simplified
There is only one major disadvantage: It takes a lot more time to implement the use of primary keys into an existing system then starting with primary keys right from the beginning. And it has to be done with a lot of discipline.
Taken from How to speed up a database which has gotten slow