A well-designed database makes every query simpler and faster. A poorly-designed one creates bugs, slow queries, and painful migrations forever. Here are the fundamentals every developer should know before creating tables.

1. Normalization โ€” Reduce Redundancy

Normalization eliminates duplicate data and prevents update anomalies. You need at least 3NF (Third Normal Form) for most applications.

1NF: Atomic Values, No Repeating Groups

-- โŒ Denormalized: multiple phone numbers in one field
| id | name  | phones              |
| 1  | Alice | "555-0001, 555-0002"|

-- โœ… 1NF: each value is atomic, or use a separate table
| id | name  |
| 1  | Alice |
| id | user_id | phone     |
| 1  | 1       | 555-0001  |
| 2  | 1       | 555-0002  |

2NF: No Partial Dependencies

Every non-key column must depend on the WHOLE primary key, not part of it. This only applies to tables with composite keys.

-- โŒ 2NF violation: course_name depends only on course_id, not the full key
| student_id | course_id | course_name | grade |
| 1          | CS101     | Intro CS    | A     |

-- โœ… 2NF: split into two tables
Students: student_id โ†’ course_id โ†’ grade
Courses: course_id โ†’ course_name

3NF: No Transitive Dependencies

Non-key columns must not depend on other non-key columns.

-- โŒ 3NF violation: city_population depends on city, not directly on the key
| student_id | city    | city_population |
| 1          | Boston  | 675000          |

-- โœ… 3NF: city_population in a cities table
Students: student_id โ†’ city_id
Cities: city_id โ†’ name, population

2. Indexing โ€” Speed Up Queries

Index TypeBest ForExample
B-Tree (default)Equality, range, sortingWHERE email = ?, ORDER BY created_at
CompositeMulti-column queriesWHERE user_id = ? AND status = ?
PartialFiltering by conditionWHERE deleted_at IS NULL
Full-text (GIN/GiST)Text searchWHERE body @@ to_tsquery('typescript')
UniqueEnforce uniquenessUNIQUE(email)

Index Rules of Thumb

  • Index WHERE and JOIN columns. Every foreign key gets an index.
  • Composite index column order matters. Put the most selective column first.
  • Don't over-index. Each index slows down INSERT/UPDATE/DELETE.
  • Use EXPLAIN ANALYZE. Verify the index is actually being used.

3. Relationship Types

-- One-to-Many (most common):
CREATE TABLE posts (
  id SERIAL PRIMARY KEY,
  user_id INT REFERENCES users(id),  -- FK to users
  title TEXT NOT NULL
);  -- One user โ†’ many posts

-- Many-to-Many (use junction table):
CREATE TABLE post_tags (
  post_id INT REFERENCES posts(id),
  tag_id INT REFERENCES tags(id),
  PRIMARY KEY (post_id, tag_id)
);  -- One post โ†’ many tags, one tag โ†’ many posts

-- One-to-One (rare, use for optional extension):
CREATE TABLE user_profiles (
  user_id INT PRIMARY KEY REFERENCES users(id),
  bio TEXT
);  -- One user โ†’ one profile

4. Common Schema Design Mistakes

MistakeWhy It's BadFix
Using VARCHAR for everythingNo constraints, wasted spaceUse appropriate types (UUID, INT, TIMESTAMPTZ, TEXT)
Storing JSON blobs instead of columnsCan't index, can't query efficientlyRelational columns first, JSONB only for truly dynamic data
No TIMESTAMPTZTimezone bugs are a nightmareAlways use TIMESTAMPTZ, store UTC
Missing foreign key constraintsOrphaned data, referential chaosAlways add FK constraints (ON DELETE CASCADE or SET NULL)
EAV (Entity-Attribute-Value)Unqueriable soupUse JSONB for dynamic fields per row, or normal columns

5. Choosing Your Primary Key

StrategyProsCons
UUID v4No collisions, client-generated, no sequence contentionLarger (16 bytes), fragmented index, slower joins
UUID v7Time-ordered, all UUID benefitsSlightly more complex generation
Auto-increment INTSmall index, fast joins, orderedPredictable, can't merge across servers, exposes count
Auto-increment BIGINTSame as INT, won't overflowSame cons. Use this over INT for new projects.
Nano ID / CUID2URL-safe, collision-resistantString-based (slower than UUID in Postgres)

Recommendation: UUID v7 for distributed systems and public-facing IDs. BIGINT for internal tables. Avoid exposing auto-increment IDs in URLs.

Bottom line: Normalize to 3NF, index every FK and query column, use appropriate data types, and always have foreign key constraints. A well-designed schema is cheaper to fix now than later. See also: database comparison and ORM comparison.