Database Migrations
This guide covers how to evolve your database schema over time using Proper's migration system.
After reading this guide, you will know:
- What a migration is and when you need one.
- How Proper's model-driven workflow differs from frameworks where you write migrations by hand.
- How to generate, write, and apply migrations.
- The full
migratorAPI for creating tables, changing fields, adding indexes, and running raw SQL. - How to roll back, target a specific migration, and check status.
- How to manage migrations across multiple databases.
- How to squash old migrations and bring an existing database under management.
1. Migration Overview
1.1 What a Migration Is
A migration is a Python file that describes a change to your database schema - creating a table, adding a column, renaming a field, dropping an index, or any combination of these. Each migration knows how to apply the change and how to undo it.
Migrations exist for two reasons:
- Reproducibility: every developer (and every environment) ends up with the same schema by running the same migrations in the same order.
- Safety: you can change a live database without losing data, because each step is small, reviewable, and reversible.
1.2 Your Model File IS the Schema
This is the most important thing to understand about migrations in Proper, the model is always the source of truth, and migrations are an artifact derived from changes to it.
# models/book.py
class Book(BaseModel):
title = pw.CharField()
author = pw.CharField()
published = pw.BooleanField(default=False)
You don't usually write a migration by hand. Instead, you edit the model and let Proper generate a migration that brings the database up to date with what the model now says.
The cycle looks like this:
- Edit the model - add a field, rename one, drop one, change a default.
- Generate the migration -
proper db create [name]diffs your models against the migration history and writes a migration file. - Review the file - the generated code lives in
db/main/. Read it, edit if needed. - Apply it -
proper db migrate.
Note
You can still write migrations by hand when you need to (see Writing a Migration by Hand), but day-to-day you won't.
1.3 Where Migrations Live
Migration files live in db/<database_name>/. For a single-database app, that means db/main/:
db/
└── main/
├── 001_initial.py
├── 002_add_book.py
└── 003_rename_book_status.py
Filenames are NNN_name.py, where NNN is a three-digit zero-padded sequence number. Migrations run in numeric order.
Sequence numbers can collide on branches
Unlike other framework's timestamp-based naming, Proper uses sequential numbers.
If two developers each create migration 005_*.py on separate branches, you'll need to renumber one of them when merging.
For most teams this is a minor annoyance; for large teams it's worth a short merge convention (e.g. always renumber to the highest known sequence on the target branch).
The history of which migrations have been applied is stored inside the database itself, in a small bookkeeping table called migratehistory that records which migration steps have been run.
2. Generating a Migration
2.1 Auto-Generating From Model Changes
The everyday command is:
proper db create [name]
This diffs every model imported in models/__init__.py against the migration history and writes a new migration file that captures the differences.
For example, if you've just added a published field to Book:
class Book(BaseModel):
title = pw.CharField()
author = pw.CharField()
published = pw.BooleanField(default=False) # newly added
Then:
proper db create add_published_to_book
Will produce something like db/main/004_add_published_to_book.py containing the migrate() and rollback() calls needed to add (and undo) that column.
If you don't pass a name, the migration is named auto:
proper db create
# => db/main/004_auto.py
This works, but it's worth giving migrations meaningful names - they'll show up in code review, deploy logs, and your future self's git blame.
2.2 Naming Migrations
Conventionally, migration names are snake_case, short, and describe what the migration does - not why:
| Good | Less good |
|---|---|
add_published_to_book |
book_changes |
rename_user_email_to_address |
fix_user |
drop_legacy_session_table |
cleanup |
create_comment |
comments |
You can use any verb that fits: add_, remove_, rename_, drop_, create_, change_, backfill_.
2.3 The proper g model Shortcut
When you generate a brand-new model with proper g model - or as a part of a resource with proper g resource - you can also create the corresponding migration in one step by passing --migration:
proper g model Comment body:text book:foreign --migration
This generates models/comment.py, adds it to models/__init__.py, and then runs proper db create create_comment for you. It's the fastest way to add a new table.
Without --migration, only the model file is created; you'd then run proper db create separately.
2.4 Writing a Migration by Hand
Sometimes you need a migration that isn't a straightforward model diff - backfilling data, running a one-off SQL fix, transforming column values, or coordinating a schema change with application code.
For these cases, generate an empty migration with a descriptive name and edit it directly:
proper db create backfill_book_slugs
Then open db/main/NNN_backfill_book_slugs.py and write the body of migrate() and rollback() yourself. See Data Migrations: Running Python for the data-migration pattern.
3. Anatomy of a Migration File
Every generated file follows the same shape.
3.1 The migrate() and rollback() Functions
"""Peewee migrations -- 002_add_book.
Some examples (model - class or model name)::
> Model = migrator.orm['table_name']
> migrator.create_model(Model)
> migrator.add_fields(model, **fields)
> ...
"""
import peewee as pw
from peewee_migrate import Migrator
def migrate(migrator: Migrator, database: pw.Database, *, fake=False):
"""Write your migrations here."""
migrator.create_model(Book)
def rollback(migrator: Migrator, database: pw.Database, *, fake=False):
"""Write your rollback migrations here."""
migrator.remove_model(Book)
migrate() runs when the migration is applied. rollback() runs when it's rolled back. Both directions are explicit - when you generate a migration from a model diff, Proper auto-fills both, but you should always read both before merging the file.
The fake keyword argument lets you mark a migration as applied without actually running it. You usually trigger this from the CLI (--fake), not from inside the function body - see The --fake Flag.
3.2 The migrator Object
The migrator argument is your toolkit. Every operation you'd want to do - create a table, add a column, drop an index - is a method on it:
| Method | What it does |
|---|---|
create_model(Model) |
Create a table from a Peewee model class. |
remove_model(Model, cascade=True) |
Drop a table. |
add_fields(Model, **fields) |
Add one or more columns to a table. |
change_fields(Model, **fields) |
Change a column's type or options. |
remove_fields(Model, *names, cascade=True) |
Drop one or more columns. |
rename_field(Model, old_name, new_name) |
Rename a column. |
rename_table(Model, new_table_name) |
Rename a table. |
add_index(Model, *cols, unique=False) |
Add an index over one or more columns. |
drop_index(Model, *cols) |
Drop an index. |
add_not_null(Model, *names) |
Add NOT NULL to existing columns. |
drop_not_null(Model, *names) |
Remove NOT NULL from columns. |
add_default(Model, name, default) |
Set or change a column default. |
add_constraint(Model, name, sql) |
Add a named CHECK constraint. |
drop_constraints(Model, *names) |
Drop named constraints. |
sql(query, *params) |
Run raw SQL. |
run(func, *args, **kwargs) |
Run an arbitrary Python function. |
You can also fetch a model by table name with migrator.orm['<table_name>']. This is useful inside data migrations, where you want to operate on the schema as it was at this migration's point in time - see Data Migrations: Running Python.
3.3 Reading a Generated File End-to-End
Here's a real migration that adds a Comment table with a foreign key to Book:
"""Peewee migrations -- 003_create_comment."""
import peewee as pw
from peewee_migrate import Migrator
def migrate(migrator: Migrator, database: pw.Database, *, fake=False):
@migrator.create_model
class Comment(pw.Model):
id = pw.AutoField()
body = pw.TextField()
book = pw.ForeignKeyField(
column_name="book_id",
field="id",
model=migrator.orm["book"],
on_delete="CASCADE",
)
class Meta:
table_name = "comment"
def rollback(migrator: Migrator, database: pw.Database, *, fake=False):
migrator.remove_model(migrator.orm["comment"])
A few things to notice:
- The model class inside
migrate()is not your application'sCommentmodel. It's a snapshot of what the table should look like at this point in history. This is why migrations importpeeweedirectly and don't import your real models - if yourCommentmodel changes later, this migration must still describe the original shape. - Foreign keys reference the related model through
migrator.orm["<table_name>"], not by importing the model class. Same reason. @migrator.create_modelis a decorator form;migrator.create_model(MyModel)as a regular call works too.
4. Writing Migrations
This section walks through every common operation. All examples assume you're inside a migrate() function with a migrator argument available.
4.1 Creating and Removing Tables
@migrator.create_model
class Tag(pw.Model):
id = pw.AutoField()
name = pw.CharField(unique=True)
class Meta:
table_name = "tag"
To drop a table:
migrator.remove_model(migrator.orm["tag"], cascade=True)
cascade=True drops dependent objects (like foreign keys pointing at this table). Without it, the database will refuse to drop a table that other tables depend on.
4.2 Adding, Changing, and Removing Fields
Add one or more columns to an existing table:
migrator.add_fields(
migrator.orm["book"],
isbn=pw.CharField(null=True),
page_count=pw.IntegerField(default=0),
)
Change a column's type or options:
migrator.change_fields(
migrator.orm["book"],
title=pw.CharField(max_length=500),
)
Remove columns:
migrator.remove_fields(migrator.orm["book"], "isbn", "page_count")
Warning
remove_fields is destructive - once a column is dropped, the data in it is gone. Make sure you really want this before merging, and that the corresponding model attribute has already been removed (or the model would still try to read a column that no longer exists).
4.3 Renaming Tables and Fields
migrator.rename_field(migrator.orm["user"], "email", "email_address")
migrator.rename_table(migrator.orm["user"], "account")
Renames preserve data - the column or table is renamed in place, not dropped and recreated.
4.4 Indexes
# Single-column index
migrator.add_index(migrator.orm["book"], "title")
# Composite index
migrator.add_index(migrator.orm["session"], "user_id", "revoked")
# Unique index
migrator.add_index(migrator.orm["book_tag"], "book_id", "tag_id", unique=True)
# Drop an index
migrator.drop_index(migrator.orm["book"], "title")
For background on when to add an index, see the Peewee ORM guide's Indexes and Constraints section.
4.5 Constraints: NOT NULL, Defaults, and CHECK
Add NOT NULL to one or more existing columns:
migrator.add_not_null(migrator.orm["book"], "isbn")
This will fail at the database level if any row currently has NULL in that column - backfill the data first (see Data Migrations: Running Python).
Drop NOT NULL:
migrator.drop_not_null(migrator.orm["book"], "isbn")
Set or change a column default:
migrator.add_default(migrator.orm["book"], "views", 0)
Add a named CHECK constraint:
migrator.add_constraint(
migrator.orm["book"],
"book_views_non_negative",
"views >= 0",
)
Drop one:
migrator.drop_constraints(migrator.orm["book"], "book_views_non_negative")
4.6 Foreign Keys
Foreign keys are added like any other column, using pw.ForeignKeyField and referencing the parent table through migrator.orm:
migrator.add_fields(
migrator.orm["comment"],
user=pw.ForeignKeyField(
column_name="user_id",
field="id",
model=migrator.orm["user"],
on_delete="SET NULL",
null=True,
),
)
on_delete accepts the same values as in your model definitions: "CASCADE", "SET NULL", "RESTRICT", "SET DEFAULT", or "NO ACTION". Foreign keys automatically create an index on the referencing column, so you don't need a separate add_index call.
4.7 Running Raw SQL
When migrator doesn't have a method for what you need (database-specific features, complex updates, working with views), drop down to SQL:
migrator.sql("CREATE EXTENSION IF NOT EXISTS pg_trgm")
migrator.sql(
"UPDATE book SET status = %s WHERE status IS NULL",
"draft",
)
The second argument is a tuple of bind parameters. Always use parameters - never interpolate user data into the SQL string yourself.
4.8 Data Migrations: Running Python
Some migrations aren't about schema at all - they're about reshaping the data already in your tables. Use migrator.run():
def backfill_slugs(migrator):
Book = migrator.orm["book"]
for book in Book.select().where(Book.slug.is_null()):
book.slug = slugify(book.title)
book.save(only=[Book.slug])
def migrate(migrator: Migrator, database: pw.Database, *, fake=False):
migrator.add_fields(
migrator.orm["book"],
slug=pw.CharField(null=True),
)
migrator.run(backfill_slugs, migrator)
migrator.add_not_null(migrator.orm["book"], "slug")
The pattern above is the canonical "add a NOT NULL column to existing rows" workflow:
- Add the column as nullable.
- Backfill values into every existing row.
- Promote the column to
NOT NULL.
Always go through migrator.orm
Inside a data migration, fetch models with migrator.orm["<table>"] instead of importing your application model. The orm snapshot reflects the schema as of this migration. Importing your real model would couple the migration to whatever version of the model exists when the migration runs - which can break old migrations months later.
5. Running Migrations
5.1 Applying Pending Migrations
proper db migrate
Runs every migration that hasn't been applied yet, in order. With no arguments, it migrates all registered databases. To target one:
proper db migrate --db main
If there's nothing to apply, it prints No pending migrations found. and exits.
5.2 Migrating to a Specific Target
proper db migrate_to 003_create_comment --db main
Applies migrations up to and including 003_create_comment, then stops. Useful when you're stepping forward through history during testing or debugging.
5.3 Rolling Back
proper db rollback --db main
Reverses the most recently applied migration by running its rollback() function. Roll back one migration at a time - running this command repeatedly steps backward through history.
Warning
Rollbacks rely on the rollback() function being correct. The auto-generated rollback covers most schema operations, but data migrations and raw-SQL migrations need a hand-written rollback (or an honest "this is irreversible" message). Always read the rollback before merging.
5.4 Checking Status
proper db todo --db main # migrations not yet applied
proper db done --db main # migrations already applied
Both print one filename per line.
5.5 The --fake Flag
proper db migrate --fake --db main
proper db migrate_to 003_create_comment --fake --db main
--fake updates the migratehistory table as if the migration ran, without actually executing it. There are two cases where this is the right tool:
- Adopting an existing database: you're bringing a database that already has the correct schema under Proper's migration system. Run
proper db create initialagainst your current models, thenproper db migrate --faketo mark it as applied without trying (and failing) to create tables that already exist. - Recovering from a partial migration: a migration ran halfway, you fixed the database state by hand, and you want to mark it as complete.
Outside those cases, don't use --fake. The history table will lie about what's actually in the database.
6. Working with Multiple Databases
If your application has more than one database registered in config/storage.py, each one gets its own folder and its own migration history.
6.1 One Folder Per Database
db/
├── main/
│ ├── 001_initial.py
│ └── 002_add_book.py
└── analytics/
├── 001_initial.py
└── 002_add_page_view.py
Sequence numbers reset per database - db/main/001_initial.py and db/analytics/001_initial.py are independent.
6.2 The --db Flag
Every proper db subcommand accepts --db <name> to target one specific database:
proper db create add_page_view --db analytics
proper db migrate --db analytics
proper db rollback --db analytics
proper db todo --db analytics
The default is --db main.
6.3 Migrating Every Database at Once
proper db migrate (with no --db flag) applies pending migrations for every database in turn:
proper db migrate
# Running migrations for 'main':
# db/main/004_add_isbn.py
#
# Running migrations for 'analytics':
# db/analytics/003_add_referrer.py
This is what you want during deployment - a single command brings every database up to the version your code expects.
For more on registering and using multiple databases, see the Peewee ORM guide's Using Multiple Databases section.
7. Seed Data
A seed is a small Python script that puts canonical data into your database - default roles, reference data, the first admin user. Seeds differ from migrations in three important ways: they're idempotent (safe to re-run any time), they describe data your current schema needs, and they're not tied to a specific moment in history.
Seeds vs Migrations
Seeds are operations you want to run on every fresh database, forever. Migrations are operations tied to a single moment in your schema's history.
Default roles are a seed. Backfilling existing rows after adding a column is a migration (see Data Migrations: Running Python).
7.1 Where Seeds Live
Seeds live in db/seeds/, parallel to the per-database migration folders:
db/
├── main/ # migrations for "main"
│ ├── 001_initial.py
│ └── 002_add_book.py
└── seeds/
├── __init__.py
├── roles.py
└── admin_user.py
For applications with multiple databases, mirror the per-database layout:
db/
├── main/
├── analytics/
└── seeds/
├── main/
│ ├── __init__.py
│ ├── roles.py
│ └── admin_user.py
└── analytics/
├── __init__.py
└── reference_paths.py
Seeds run in the order they're imported in __init__.py, not alphabetically. This lets you make dependencies explicit - if admin_user needs the admin role to exist first, import roles before admin_user:
# db/seeds/__init__.py
from . import roles # noqa
from . import admin_user # noqa
The import list is the dependency graph. Reorder it to reorder the seed runs.
7.2 Writing a Seed
Each seed file exports a seed() function and declares which environments it runs in:
# db/seeds/roles.py
from myapp.models import Role
# APP_ENV values where this seed should run.
envs = ("dev", "test", "prod")
def seed():
"""The roles the auth system expects to exist."""
for name in ["admin", "editor", "viewer"]:
Role.get_or_create(name=name)
seed() takes no arguments and returns nothing. Whatever it does, it should leave the database in the same state every time you run it - which is what idempotency means in practice.
7.3 Idempotency
A seed must be safe to run a second, third, or hundredth time without changing the database after the first run. The standard pattern is Peewee's get_or_create, which returns a tuple (instance, created) - the first run creates the row, every subsequent run finds it:
Role.get_or_create(name="admin")
For seeds with attributes beyond the lookup key, pass defaults:
user, created = User.get_or_create(
email="admin@example.com",
defaults={
"name": "Admin",
"password_hash": hash_password("changeme"),
},
)
defaults are only used when creating. If the row already exists, the existing values are kept untouched - so re-running the seed won't reset a password that's been changed since.
For checks that don't fit get_or_create, fall back to an explicit existence test:
if not Tag.select().where(Tag.name == "featured").exists():
Tag.create(name="featured")
Warning
Avoid plain Model.create(...) in seeds. The second run will hit a unique-constraint error and stop the whole seed before later inserts get a chance.
7.4 Environment-Aware Seeds
The envs tuple is the only knob that controls when a seed runs. APP_ENV is set automatically per context:
| Context | APP_ENV |
|---|---|
| Local development | dev |
| Tests | test |
| Production | prod |
Common combinations:
envs = ("dev", "test", "prod")- runs in every context. Use this for data your tests rely on (e.g. default roles).envs = ("dev", "prod")- runs everywhere except tests. Use this for essential seeds whose test counterparts you'd rather control via fixtures.envs = ("dev",)- runs only on a developer's machine. Use this for sample or demo content.envs = ("test",)- runs only during tests. Rare - tests usually use fixtures.
If APP_ENV isn't in the tuple, the seed is skipped, and the CLI reports it as such.
There is no --force flag. The envs tuple is the absolute environment boundary - even invoking a seed by name (proper db seed sample_books) honors it. To run a guarded seed in another environment, change the tuple in the file.
7.5 Running Seeds
proper db seed # run every seed for "main", in import order
proper db seed --db main # same, explicit
proper db seed roles # run a single seed by name
proper db seed --db analytics # target a different database
The output reports each seed's outcome:
db/seeds/roles.py - ran
db/seeds/admin_user.py - ran
db/seeds/sample_books.py - skipped (envs=("dev",), APP_ENV=prod)
Warning
proper db migrate does not run seeds. The two are deliberately separate - migrate is purely about schema, and seeds are always opt-in. Always run proper db seed explicitly.
7.6 Generating a Seed
proper g seed roles
Creates db/seeds/roles.py with a seed() skeleton and a default envs = ("dev", "test", "prod") tuple, and adds the import line to db/seeds/__init__.py for you. A seed file that isn't registered in __init__.py will never run, so always either use the generator or remember to add the import yourself.
For a multi-database app, target a specific database with --db:
proper g seed reference_paths --db analytics
7.7 Seeds and Tests
APP_ENV is set to test during test runs, but seeds are not run automatically. If a test needs canonical data, call the relevant seed() function from a fixture:
# tests/conftest.py
import pytest
from db.seeds.roles import seed as seed_roles
@pytest.fixture(autouse=True)
def setup_default_data():
seed_roles()
In general, tests should use fixtures (model factories, explicit setup in the test file) rather than seeds. Seeds describe the canonical state of your application; tests describe the canonical state of one specific test. Mixing the two tends to leak unrelated data into tests and make failures harder to diagnose.
8. Managing Migration History
8.1 The migratehistory Table
Every database under migration control has a small bookkeeping table named migratehistory. It stores one row per migration that has been applied, in the order they ran. Proper reads this table to figure out which migrations are pending and which have already been done.
You shouldn't need to touch it directly - proper db todo, proper db done, and --fake are the tools for working with migration state. But if you ever need to inspect it (during a recovery, for instance), it's just a regular table.
8.2 Don't Edit Applied Migrations
Once a migration has been merged and applied to any environment, treat it as immutable. Editing it after the fact won't re-run it on environments where it's already been applied - those environments will be running on the old version of the file forever, and your databases will silently drift.
If you need to fix something a previous migration got wrong, write a new migration that corrects it.
8.3 Squashing Old Migrations
Over time, the db/main/ folder grows. After a few hundred migrations, applying them on a fresh database becomes slow, and the history is more noise than signal. To collapse all current migrations into a single new starting point:
proper db merge initial --db main
This generates a single new migration (named initial here, but you can choose any name) that recreates the current state of the schema in one step, and marks the previous history as superseded. Existing environments where the old migrations have already run aren't affected - they continue to skip already-applied entries - but new environments only need to run the squashed migration.
Squash sparingly. It's an irreversible operation in the sense that the old history is rewritten; do it on a calm branch, with the team aware.
8.4 Bringing an Existing Database Under Management
If you're adopting Proper for a database that already has tables you didn't create through migrations, the workflow is:
- Write your models to match the existing schema.
- Run
proper db create initial- this generates a migration that creates all those tables. - Run
proper db migrate --fake- this marks the migration as applied without actually running it (since the tables already exist).
From there, every future change goes through the normal migration flow.
9. Common Patterns and Pitfalls
9.1 Adding a NOT NULL Column to a Populated Table
You can't just add a NOT NULL column to a table that already has rows - the database has nothing to put in the new column for the existing data. The fix is a three-step migration, all in the same file:
def migrate(migrator: Migrator, database: pw.Database, *, fake=False):
# Step 1: add the column as nullable
migrator.add_fields(
migrator.orm["book"],
slug=pw.CharField(null=True),
)
# Step 2: backfill every existing row
def backfill(migrator):
Book = migrator.orm["book"]
for book in Book.select():
book.slug = slugify(book.title)
book.save(only=[Book.slug])
migrator.run(backfill, migrator)
# Step 3: enforce NOT NULL now that every row has a value
migrator.add_not_null(migrator.orm["book"], "slug")
If your data set is large, consider doing the backfill in a separate, dedicated migration so it can be re-run or paused independently.
9.2 Renaming a Column Without Downtime
A column rename is one operation in the database, but if your application code is still reading and writing the old name, requests will fail during deploy. The safe pattern is two migrations spread over two deploys:
- Deploy A: add the new column, copy data into it, keep the old column. App code writes to both columns and reads from the new one.
- Deploy B: once Deploy A is fully rolled out, drop the old column.
For small apps and short maintenance windows, a single in-place rename_field is fine. The two-deploy pattern is for production systems that can't afford a stale-read window.
9.3 Writing Rollback Functions You Can Trust
The auto-generated rollback() is correct for any schema operation migrator knows about. It's not magic for the operations you write yourself:
migrator.sql(...)- auto-generation has no way to invert raw SQL. Write the inverse SQL inrollback()yourself, or document that the migration is irreversible.migrator.run(...)- same. If you backfilled data, the rollback can't unbackfill it. Decide whether the rollback should clear the column, leave it, or raise.- Destructive operations (
remove_fields,remove_model) - the rollback can recreate the column or table, but the data is gone. State this in a comment, so a future reader doesn't expect a rollback to restore lost rows.
9.4 Long-Running Migrations on Large Tables
ALTER TABLE on a multi-million-row table can take minutes (or hours) and lock writers during that time. A few practical tips:
- For PostgreSQL, set a
lock_timeoutandstatement_timeoutat the start of the migration, so a stuck migration fails fast instead of holding a lock indefinitely. - For MySQL, prefer online schema-change tools (e.g.
pt-online-schema-change,gh-ost) for very large tables - wrap the operation in amigrator.sql(...)call that invokes the tool, or run the change out-of-band and then mark the migration--fake. - Split big data backfills into batches inside
migrator.run(...). A singleUPDATEover ten million rows is one transaction; a hundred batches of a hundred thousand rows is a hundred small transactions, much friendlier to the database.