Taking a step back ... Database Normalisation 1NF, 2NF, 3NF, BCNF and 4NF (Part 2)
In part 1 we covered chaos to 3NF normalisation. We can go to higher levels of normal forms ... although this is not my area of expertise at all, let's explore BCNF and 4NF.
---------------------------------------------------------------------------------
Boyce-Codd Normal Form (BCNF)
The BCNF normal form is a special case of 3NF and BCNF can only be violated if a 3NF compliant table contains more than one candidate key. A candidate key has the same characteristics as a primary key, but for whatever reason, was not chosen as the PK.
Characteristics
- 3NF (special case)
- Every determinant is a candidate key
Scenario
We see the following dependencies:
Although there are no partial or transitive dependencies, which makes the above scenario 3NF compliant, the condition C--> B denotes that a non-key attribute determines part of the PK, resulting in BCNF conditions to fail.
Steps
- Convert PK to A+C, because due to C --> B C is a superset of B.
- Decompose the table using the standard normalisation steps.
- Now the tables are 3NF and BCNF compliant.
---------------------------------------------------------------------------------
Fourth Normal Form (4NF)
Characteristics
- 3NF
- No independent multi valued dependencies
Steps
- Ensure that all attributes are dependent on the primary key.
- Ensure that all attributes are independent of each other.
- Ensure that no row contains two or more multi valued facts about an entity.
In my opinion, our 3NF dependency diagram we concluded with in part 1, has a problem in the following area:
Using 4NF normalisation we would end up with the following dependency diagram ... I think.
This is my view, I would like to hear from the DB gurus whether the assumption is bad, ok or good.
---------------------------------------------------------------------------------
Apparently one can go further with the normalisation, but this is unknown territory for me. Remember that normalisation is not always desired. While it assists us in creating good database designs, it can impact efficiency. Remember to balance good design with raw performance, as these do not typically go hand in hand.