Article ID: 100139
Novice: Requires knowledge of the user interface on single-user computers.
This article explains the basics of database normalization terminology. A basic understanding of this terminology is helpful when discussing the design of a relational database.
NOTE: Microsoft also offers a WebCast that discusses the basics of database normalization. To view this WebCast, please visit the following Microsoft Web site:
http://support.microsoft.com/servicedesks/webcasts/wc060600/wc060600.asp?fr=1
NOTE: To see this information for Microsoft Access 2000, please see the following article in the Microsoft Knowledge Base:
209534 (http://support.microsoft.com/kb/209534/EN-US/) ACC2000: Database Normalization Basics
Description of Normalization
Normalization is the process of organizing data in a database. This includes creating tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating two factors: redundancy and inconsistent dependency.
Redundant data wastes disk space and creates maintenance problems. If data that exists in more than one place must be changed, the data must be changed in exactly the same way in all locations. A customer address change is much easier to implement if that data is stored only in the Customers table and nowhere else in the database.
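As a rough illustration of the point above, the following Python sketch uses the standard sqlite3 module with an in-memory database. The table and column names (Customers, Orders, CustomerID, Address) and the sample rows are assumptions made for this example only. Because the address is stored only in Customers, one UPDATE is enough to change it for every order.

```python
import sqlite3

# Hypothetical schema for illustration: the address is stored once, in Customers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (
        CustomerID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL,
        Address    TEXT NOT NULL
    );
    CREATE TABLE Orders (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID),
        Item       TEXT NOT NULL
    );
    INSERT INTO Customers VALUES (1, 'Contoso', '1 Old Street');
    INSERT INTO Orders VALUES (10, 1, 'Widget'), (11, 1, 'Gadget');
""")

# One UPDATE in one place; no order row has to be touched.
conn.execute(
    "UPDATE Customers SET Address = ? WHERE CustomerID = ?",
    ("2 New Avenue", 1),
)

# Every order picks up the new address through the join.
order_addresses = conn.execute("""
    SELECT o.OrderID, c.Address
    FROM Orders AS o
    JOIN Customers AS c ON c.CustomerID = o.CustomerID
    ORDER BY o.OrderID
""").fetchall()
print(order_addresses)
```

If the address were instead copied into each Orders row, the same change would need one update per copy, and any copy that was missed would leave the data inconsistent.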
What is an "inconsistent dependency"? While it is intuitive for a user to look in the Customers table for the address of a particular customer, it may not make sense to look there for the salary of the employee who calls on that customer. The employee's salary is related to, or dependent on, the employee and thus should be moved to the Employees table. Inconsistent dependencies can make data difficult to access; the path to find the data may be missing or broken.
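The dependency described above can be sketched in Python with the standard sqlite3 module. The schema is hypothetical: the SalesRepID column, the rep_salary helper, and the sample values are assumptions for illustration. The salary is stored with the employee it depends on, and the customer row only carries a key that gives a reliable path to it.

```python
import sqlite3

# Hypothetical schema: salary depends on the employee, so it lives in Employees.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL,
        Salary     INTEGER NOT NULL
    );
    CREATE TABLE Customers (
        CustomerID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL,
        Address    TEXT NOT NULL,
        SalesRepID INTEGER REFERENCES Employees(EmployeeID)
    );
    INSERT INTO Employees VALUES (7, 'Davolio', 52000);
    INSERT INTO Customers VALUES (1, 'Contoso', '2 New Avenue', 7);
""")


def rep_salary(customer_id):
    """Return the salary of the employee who calls on a customer, or None."""
    row = conn.execute("""
        SELECT e.Salary
        FROM Customers AS c
        JOIN Employees AS e ON e.EmployeeID = c.SalesRepID
        WHERE c.CustomerID = ?
    """, (customer_id,)).fetchone()
    return row[0] if row else None


print(rep_salary(1))
```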
There are a few rules for database normalization. Each rule is called a "normal form." If the first rule is observed, the database is said to be in "first normal form." If the first three rules are observed, the database is considered to be in "third normal form." Although other levels of normalization are possible, third normal form is considered the highest level necessary for most applications.
As with many formal rules and specifications, real world scenarios do not always allow for perfect compliance. In general, normalization requires additional tables and some customers find this cumbersome. If you decide to violate one of the first three rules of normalization, make sure that your application anticipates any problems that could occur, such as redundant data and inconsistent dependencies.
NOTE: The following descriptions include examples.
First Normal Form
For example, suppose an inventory record tracks two possible suppliers in a pair of similar fields, such as Vendor Code 1 and Vendor Code 2. What happens when you add a third vendor? Adding a field is not the answer; it requires program and table modifications and does not smoothly accommodate a dynamic number of vendors. Instead, place all vendor information in a separate table called Vendors, then link inventory to vendors with an item number key, or vendors to inventory with a vendor code key.
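A minimal sketch of the separate Vendors table, written in Python with the standard sqlite3 module. It takes the first option mentioned above and links vendors to inventory through an item number key. The column names, the vendors_for helper, and the sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical schema: vendors are rows in their own table, not numbered fields.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Inventory (
        ItemNumber  INTEGER PRIMARY KEY,
        Description TEXT NOT NULL
    );
    CREATE TABLE Vendors (
        VendorCode TEXT NOT NULL,
        ItemNumber INTEGER NOT NULL REFERENCES Inventory(ItemNumber),
        PRIMARY KEY (VendorCode, ItemNumber)
    );
    INSERT INTO Inventory VALUES (500, 'Hex bolt');
    INSERT INTO Vendors VALUES ('V01', 500), ('V02', 500);
""")


def vendors_for(item_number):
    """Return the vendor codes recorded for one inventory item."""
    rows = conn.execute(
        "SELECT VendorCode FROM Vendors WHERE ItemNumber = ? ORDER BY VendorCode",
        (item_number,),
    )
    return [code for (code,) in rows]


before = vendors_for(500)
# A third vendor is just another row; no table or program change is needed.
conn.execute("INSERT INTO Vendors VALUES (?, ?)", ("V03", 500))
after = vendors_for(500)
print(before, after)
```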
Second Normal Form
Third Normal Form
For example, in an Employee Recruitment table, a candidate's university name and address may be included. But you need a complete list of universities for group mailings. If university information is stored in the Candidates table, there is no way to list universities with no current candidates. Create a separate Universities table and link it to the Candidates table with a university code key.
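The university example can be sketched in Python with the standard sqlite3 module. The column names and sample rows are assumptions for illustration. With universities in their own table, the mailing list no longer depends on which candidates happen to exist.

```python
import sqlite3

# Hypothetical schema: university details live in Universities, keyed by a code.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Universities (
        UniversityCode TEXT PRIMARY KEY,
        Name           TEXT NOT NULL,
        Address        TEXT NOT NULL
    );
    CREATE TABLE Candidates (
        CandidateID    INTEGER PRIMARY KEY,
        Name           TEXT NOT NULL,
        UniversityCode TEXT REFERENCES Universities(UniversityCode)
    );
    INSERT INTO Universities VALUES
        ('UA', 'University A', '1 Campus Way'),
        ('UB', 'University B', '9 College Road');
    INSERT INTO Candidates VALUES (1, 'Lee', 'UA');
""")

# The complete list for a group mailing comes straight from Universities.
mailing_list = conn.execute(
    "SELECT Name, Address FROM Universities ORDER BY UniversityCode"
).fetchall()

# Universities with no current candidates are still present and easy to find.
without_candidates = [name for (name,) in conn.execute("""
    SELECT u.Name
    FROM Universities AS u
    LEFT JOIN Candidates AS c ON c.UniversityCode = u.UniversityCode
    WHERE c.CandidateID IS NULL
    ORDER BY u.UniversityCode
""")]
print(mailing_list)
print(without_candidates)
```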
EXCEPTION: Adhering to the third normal form, while theoretically desirable, is not always practical. If you have a Customers table and you want to eliminate all possible interfield dependencies, you must create separate tables for cities, ZIP codes, sales representatives, customer classes, and any other factor that may be duplicated in multiple records. In theory, normalization is worth pursuing; however, many small tables may degrade performance or exceed open file and memory capacities.
It may be more feasible to apply third normal form only to data that changes frequently. If some dependent fields remain, design your application to require the user to verify all related fields when any one is changed.
Other Normalization Forms
Fourth normal form and fifth normal form do exist, as does Boyce-Codd normal form (BCNF), a stricter version of third normal form, but they are rarely considered in practical design. Disregarding these rules may result in less than perfect database design, but should not affect functionality.
**********************************
Examples of Normalized Tables
**********************************

Normalization Examples:

Unnormalized table:

Student#   Advisor   Adv-Room   Class1   Class2   Class3
-------------------------------------------------------
1022       Jones     412        101-07   143-01   159-02
4123       Smith     216        201-01   211-02   214-01
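One possible decomposition of the unnormalized table above is sketched here in plain Python. The variable names (first_nf, students, registration, faculty) are assumptions for this example, and the sketch assumes that each advisor has exactly one room, as the sample rows suggest.

```python
# Rows copied from the unnormalized table:
# (Student#, Advisor, Adv-Room, Class1, Class2, Class3)
unnormalized = [
    (1022, "Jones", 412, "101-07", "143-01", "159-02"),
    (4123, "Smith", 216, "201-01", "211-02", "214-01"),
]

# First normal form: remove the repeating Class1..Class3 group,
# so that each row holds exactly one class.
first_nf = [
    (student, advisor, room, cls)
    for student, advisor, room, *classes in unnormalized
    for cls in classes
]

# Second normal form: advisor and room depend on the student alone,
# not on the (student, class) pair, so split them from the registrations.
students = sorted({(s, a, r) for s, a, r, _ in first_nf})
registration = [(s, c) for s, _, _, c in first_nf]

# Third normal form: the room depends on the advisor, not on the student,
# so move it into a table keyed by advisor.
students_3nf = sorted({(s, a) for s, a, _ in students})
faculty = sorted({(a, r) for _, a, r in students})

print(students_3nf)
print(faculty)
print(registration)
```

After the last step, an advisor's room is recorded once, so a room change is a single edit no matter how many students that advisor has.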
For additional information about designing a database, see the following article in the Microsoft Knowledge Base:
234208 (http://support.microsoft.com/kb/234208/EN-US/) ACC2000: "Understanding Relational Database Design" Document Available in Download Center
"FoxPro 2 A Developer's Guide," Hamilton M. Ahlo Jr. et al., pages 220-225, M & T Books, 1991
"Using Access for Windows," Roger Jennings, pages 799-800, Que Corporation, 1993
Article ID: 100139 - Last Review: June 17, 2014 - Revision: 3.0
Retired KB Content Disclaimer
This article was written about products for which Microsoft no longer offers support. Therefore, this article is offered "as is" and will no longer be updated.