Join Dependencies and Fifth Normal Form (5NF)
Fifth Normal Form (5NF)
Definition 1 :
A relation R is in 5NF if and only if every join dependency in R is implied by the candidate keys of R.
Definition 2 :
A relation decomposed into two relations must have
loss-less join Property, which ensures that no spurious or extra tuples are generated, when relations are reunited through a natural join.
What is a Join Dependency(JD) ??
Let
R be a relation. Let
A, B, …, Z be arbitrary subsets of
R’s attributes.
R satisfies the
JD
* ( A, B, …, Z )
if and only if
R is equal to the join of its projections on
A, B, …, Z.
A join dependency
JD(R1, R2, …, Rn) specified on relation schema
R, is a trivial JD, if one of the relation schemas
Ri in
JD(R1, R2, ….,Rn) is equal to R.
Join dependency is used in the following case :
When there is no lossless join decomposition of
R into two relation schemas, but there is a lossless join decompositions of
R into more than two relation schemas.
Point : A join dependency is very difficult in a database, hence normally not used.
Negative Example :
Consider a relation
ACP(Agent, Company, Product)
| ACP : |
|
Meaning of the tuples |
| Agent(A) |
Company(C) |
Product(P) |
⇒ |
Agent sells Company’s Products. |
| A1 |
PQR |
Nut |
⇒ |
A1 sells PQR’s Nuts and Screw. |
| A1 |
PQR |
Screw |
| A1 |
XYZ |
Bolt |
⇒ |
A1 sells XYZ’s Bolts. |
| A2 |
PQR |
Bolt |
⇒ |
A2 sells PQR’s Bolts. |
The table is in 4 NF as it does not contain multivalued dependency. But the relation
contains redundancy as
A1 is an agent for PQR twice. But there is no way of eliminating this redundancy without losing information.
Suppose that the table is decomposed into its two relations,
R1 and
R2.
R1 :
| Agent |
Company |
| A1 |
PQR |
| A1 |
XYZ |
| A2 |
PQR |
|
R2 :
| Agent |
Product |
| A1 |
Nut |
| A1 |
Screw |
| A1 |
Bolt |
| A2 |
Bolt |
|
The redundancy has been eliminated by decomposing
ACP relation, but the information about which companies make which products and which agents supply which product has been lost.
The natural join of these relations over the ‘agent’ columns is:
| R12 : |
| Agent |
Company |
Product |
| A1 |
PQR |
Nut |
| A1 |
PQR |
Screw |
| A1 |
PQR |
Bolt |
| A1 |
XYZ |
Nut |
| A1 |
XYZ |
Screw |
| A1 |
XYZ |
Bolt |
| A2 |
PQR |
Bolt |
Hence, the decomposition of
ACP is a lossy join decomposition as the natural join table is spurious, since it contains extra tuples(shaded) that gives incorrect information.
But now, suppose the original relation
ACP is decomposed into 3 relations :
- R1(Agent, Company)
- R2(Agent, Product)
- R3(Company, Product)
The result of the natural join of
R1 and
R2 over ‘Agent’ (already Calculated
R12) and then, natural join of
R12 and
R3 over ‘Company’ & ‘Product’ is –
| R123 : |
| Agent |
Company |
Product |
| A1 |
PQR |
Nut |
| A1 |
PQR |
Screw |
| A1 |
PQR |
Bolt |
| A1 |
XYZ |
Bolt |
| A2 |
PQR |
Bolt |
Again, we get an extra tuple shown as by shaded portion.
Hence, it has to be accepted that it is not possible to eliminate all redundancies using normalization techniques because it cannot be assumed that all decompositions will be non-loss. Hence again, the decomposition of
ACP is a lossy join decomposition
Positive Example :
Consider the above schema, but with a different case as “if a company makes a product and an agent is an agent for that company, then he always sells that product for the company”. Under these circumstances, the
ACP table is shown as :
| ACP : |
| Agent |
Company |
Product |
| A1 |
PQR |
Nut |
| A1 |
PQR |
Bolt |
| A1 |
XYZ |
Nut |
| A1 |
XYZ |
Bolt |
| A2 |
PQR |
Nut |
The relation ACP is again decompose into 3 relations. Now, the natural Join of all the three relations will be shown as :
| R1 : |
| Agent |
Company |
| A1 |
PQR |
| A1 |
XYZ |
| A2 |
PQR |
|
| R3 : |
| Company |
Product |
| PQR |
Nut |
| PQR |
Bolt |
| XYZ |
Nut |
| XYZ |
Bolt |
|
| R2 : |
| Agent |
Product |
| A1 |
Nut |
| A1 |
Bolt |
| A2 |
Nut |
|
| Result of Natural Join of R1 and R3 over ‘Company’ and
then Natural Join of R13 and R2 over ‘Agent’and ‘Product’
⇓ |
| R123 : |
| Agent |
Company |
Product |
| A1 |
PQR |
Nut |
| A1 |
PQR |
Bolt |
| A1 |
XYZ |
Nut |
| A1 |
XYZ |
Bolt |
| A2 |
PQR |
Nut |
|
Hence, in this example, all the redundancies are eliminated, and the decomposition of
ACP is a lossless join decomposition. Hence the relation is in 5NF as it does not violate the property of lossless join.]]>