Friday afternoon. Sprint ends in 4 hours. QA needs staging data.
We had just shipped a new multi-tenant billing module. The schema was clean: organizations, users, subscriptions, invoices, invoice_line_items. We needed hundreds of realistic rows, all with proper foreign key relationships.
My senior developer had started a seed script Wednesday. By Friday morning, he was still debugging:
- "Why is this invoice orphaned?"
- "Did I generate the customer IDs in the right order?"
- "How do I ensure every invoice_line_item points to a real invoice?"
Seed scripts are a nightmare of coordination. Parent tables must generate first. Child tables need to reference valid parent IDs. One mistake and your entire dataset is broken.
Then someone suggested FakerForge.
10 minutes later, our staging database was fully seeded with relationship-aware data. Real foreign keys, realistic values, zero orphaned records. We shipped on time.
The Problem with Manual Seed Scripts
Most teams write seed scripts like this:
-- seeds/initial-data.sql
INSERT INTO organizations (id, name, country)
VALUES (1, 'Acme Corp', 'US');
INSERT INTO users (id, organization_id, email, name)
VALUES (1, 1, '[email protected]', 'John');
INSERT INTO subscriptions (id, organization_id, plan)
VALUES (1, 1, 'pro');
INSERT INTO invoices (id, subscription_id, amount)
VALUES (1, 1, 99.99);
-- ... hundreds more lines, manually coordinated
This seems fine for 10 rows. But at scale, it breaks:
Problem 1: Manual Coordination
You have to manually ensure every foreign key references a row that was already inserted. One mistake and the entire script fails.
Problem 2: Doesn't Scale
Writing 1,000 invoice lines manually isn't practical. You end up with loops that are error-prone.
Problem 3: Goes Stale
Two months later, someone adds a new column: invoices.payment_status. Your old seed script doesn't include it. If the column is NOT NULL with no default, the old INSERT statements fail, and staging stays broken until someone updates the script.
Problem 4: Not Realistic
Hand-written data is uniform and predictable. The same placeholder emails and $99.99 amounts repeat on every row, so edge cases never get exercised.
Problem 5: Merge Conflicts
When two developers add rows to the same seed file, merge conflicts are a nightmare.
The core issue: seed scripts coordinate relationships by hand, and humans are bad at doing that at scale.
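To make the "orphaned invoice" failure concrete, here is a minimal sketch of how a hand-rolled seed loop can go wrong and how to spot it. It uses Python's built-in sqlite3 module and a trimmed-down pair of tables purely for illustration; the loop and its off-by-a-few bug are invented for this example, not taken from our actual script.

```python
import sqlite3

# In-memory database with a trimmed-down version of the two tables involved.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, plan TEXT);
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        subscription_id INTEGER REFERENCES subscriptions(id),
        amount REAL
    );
""")

# A hand-rolled seed loop: only subscriptions 1-3 exist, but the invoice loop
# assumes there are 5. SQLite does not enforce foreign keys unless
# PRAGMA foreign_keys = ON is set, so the bad rows are inserted silently.
conn.executemany(
    "INSERT INTO subscriptions (id, plan) VALUES (?, ?)",
    [(i, "pro") for i in range(1, 4)],
)
conn.executemany(
    "INSERT INTO invoices (id, subscription_id, amount) VALUES (?, ?, ?)",
    [(i, (i % 5) + 1, 99.99) for i in range(1, 11)],
)

# Anti-join: invoices whose subscription_id matches no subscription row.
orphans = conn.execute("""
    SELECT i.id, i.subscription_id
    FROM invoices AS i
    LEFT JOIN subscriptions AS s ON s.id = i.subscription_id
    WHERE s.id IS NULL
    ORDER BY i.id
""").fetchall()

print(orphans)  # each tuple is (invoice id, missing subscription id)
```

Four of the ten invoices end up pointing at subscriptions 4 and 5, which were never created. On a database that enforces foreign keys, the same loop would abort partway through instead.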
How FakerForge Actually Works
FakerForge flips the approach. Instead of writing data, you describe your schema and FakerForge generates the data.
Step 1: Paste Your Schema
CREATE TABLE organizations (
id BIGINT PRIMARY KEY,
name VARCHAR(255),
country VARCHAR(2),
created_at TIMESTAMP
);
CREATE TABLE users (
id BIGINT PRIMARY KEY,
organization_id BIGINT REFERENCES organizations(id),
email VARCHAR(255),
name VARCHAR(255),
created_at TIMESTAMP
);
CREATE TABLE subscriptions (
id BIGINT PRIMARY KEY,
organization_id BIGINT REFERENCES organizations(id),
plan VARCHAR(50),
monthly_cost DECIMAL(10,2),
created_at TIMESTAMP
);
CREATE TABLE invoices (
id BIGINT PRIMARY KEY,
subscription_id BIGINT REFERENCES subscriptions(id),
amount DECIMAL(10,2),
status VARCHAR(20),
created_at TIMESTAMP
);
CREATE TABLE invoice_line_items (
id BIGINT PRIMARY KEY,
invoice_id BIGINT REFERENCES invoices(id),
product_id BIGINT,
quantity INT,
unit_price DECIMAL(10,2)
);
FakerForge parses this automatically. It detects:
- All 5 tables
- All column types
- All primary keys
- All foreign key relationships
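FakerForge's parser isn't public as far as I know, so the following is only a rough sketch of what "detecting tables, columns, primary keys, and foreign keys" can look like. It assumes one column per line and inline REFERENCES clauses, as in the schema above; the function name parse_schema and the regexes are my own, and a real parser would need to handle far more SQL than this.

```python
import re

DDL = """
CREATE TABLE organizations (
  id BIGINT PRIMARY KEY,
  name VARCHAR(255)
);
CREATE TABLE users (
  id BIGINT PRIMARY KEY,
  organization_id BIGINT REFERENCES organizations(id),
  email VARCHAR(255)
);
"""

# Matches "CREATE TABLE name ( ...body... );" and captures name and body.
TABLE_RE = re.compile(r"CREATE\s+TABLE\s+(\w+)\s*\((.*?)\)\s*;", re.S | re.I)
# Matches an inline "REFERENCES parent(column)" clause.
FK_RE = re.compile(r"REFERENCES\s+(\w+)\s*\((\w+)\)", re.I)


def parse_schema(ddl):
    """Return {table: {"columns": {name: type}, "pk": [...], "fks": {col: (table, col)}}}."""
    schema = {}
    for table, body in TABLE_RE.findall(ddl):
        info = {"columns": {}, "pk": [], "fks": {}}
        for raw in body.splitlines():
            line = raw.strip().rstrip(",")
            if not line:
                continue
            # Assumption: each line is "<column> <type> [constraints...]".
            name, col_type = line.split()[:2]
            info["columns"][name] = col_type
            if "PRIMARY KEY" in line.upper():
                info["pk"].append(name)
            fk = FK_RE.search(line)
            if fk:
                info["fks"][name] = (fk.group(1), fk.group(2))
        schema[table] = info
    return schema


schema = parse_schema(DDL)
print(schema["users"]["fks"])
```

The foreign key map is the piece that matters most: it is what lets a generator work out which tables depend on which.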
Step 2: Set Row Counts
You tell FakerForge how much data you want:
- organizations: 25 rows
- users: 500 rows
- subscriptions: 50 rows
- invoices: 2,000 rows
- invoice_line_items: 10,000 rows
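It helps to sanity-check what those counts imply per parent. The sketch below just divides child rows by parent rows using the numbers above. It gives averages only; how FakerForge actually spreads children across parents isn't something I can vouch for.

```python
# Row counts from the list above.
ROW_COUNTS = {
    "organizations": 25,
    "users": 500,
    "subscriptions": 50,
    "invoices": 2_000,
    "invoice_line_items": 10_000,
}

# Child table -> parent table, from the schema's foreign keys.
PARENT_OF = {
    "users": "organizations",
    "subscriptions": "organizations",
    "invoices": "subscriptions",
    "invoice_line_items": "invoices",
}

# Average number of child rows per parent row.
fan_out = {
    child: ROW_COUNTS[child] / ROW_COUNTS[parent]
    for child, parent in PARENT_OF.items()
}

for child, ratio in fan_out.items():
    print(f"{child}: {ratio:g} per {PARENT_OF[child]} row")
```

That works out to 20 users and 2 subscriptions per organization, 40 invoices per subscription, and 5 line items per invoice on average. If 40 invoices per subscription sounds high for your product, adjust the counts before generating.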
Step 3: Generate
FakerForge handles relationship-aware generation automatically. According to the docs, it:
"Respects table relationships so child rows reference valid parent rows... Parent tables generate first. Child tables generate after parent IDs exist."
Result: Every invoices.subscription_id points to a real subscription. Every invoice_line_items.invoice_id points to a real invoice. Zero orphaned records. No manual coordination needed.
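The "parents first, children after" rule quoted above is a topological sort over the foreign key graph. Here is a minimal sketch of that idea using Python's standard graphlib module (3.9+). This is my own illustration of the general technique, not FakerForge's implementation; the generate function, the small row counts, and the uniform random choice of parent are all assumptions.

```python
import random
from graphlib import TopologicalSorter

# Table -> {foreign key column: parent table}, from the schema above.
FOREIGN_KEYS = {
    "organizations": {},
    "users": {"organization_id": "organizations"},
    "subscriptions": {"organization_id": "organizations"},
    "invoices": {"subscription_id": "subscriptions"},
    "invoice_line_items": {"invoice_id": "invoices"},
}

# Small counts so the example runs instantly.
ROW_COUNTS = {
    "organizations": 3,
    "users": 10,
    "subscriptions": 5,
    "invoices": 20,
    "invoice_line_items": 50,
}


def generate(foreign_keys, row_counts, seed=0):
    """Generate id/foreign-key skeleton rows, parents before children."""
    rng = random.Random(seed)
    # TopologicalSorter expects node -> set of predecessors; static_order()
    # yields every predecessor (parent table) before its dependents.
    graph = {table: set(fks.values()) for table, fks in foreign_keys.items()}
    order = list(TopologicalSorter(graph).static_order())

    data = {}
    for table in order:
        rows = []
        for row_id in range(1, row_counts[table] + 1):
            row = {"id": row_id}
            for column, parent in foreign_keys[table].items():
                # The parent's rows already exist, so any pick is a valid id.
                row[column] = rng.choice(data[parent])["id"]
            rows.append(row)
        data[table] = rows
    return order, data


order, data = generate(FOREIGN_KEYS, ROW_COUNTS)
print(order)
```

Because a child's foreign key is always drawn from ids that have already been generated, orphaned rows cannot occur by construction. A circular reference between tables would make TopologicalSorter raise CycleError, which this sketch does not handle.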
Step 4: Export
Download the output in whatever format you need:
- SQL (direct import with psql or mysql)
- JSON (for document databases or APIs)
- CSV (for spreadsheets or analysis)
- XML (for legacy systems)
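For a sense of what those formats contain, here is a small sketch that renders the same rows as SQL, JSON, and CSV using only the Python standard library. The helper names are hypothetical and the output layout is my guess at a sensible shape, not a description of FakerForge's actual export files.

```python
import csv
import io
import json

rows = [
    {"id": 1, "name": "Acme Corp", "country": "US"},
    {"id": 2, "name": "O'Brien Ltd", "country": "IE"},
]


def sql_literal(value):
    """Render a Python value as a SQL literal, doubling single quotes."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def to_sql(table, rows):
    """One INSERT statement per row."""
    columns = list(rows[0])
    statements = []
    for row in rows:
        values = ", ".join(sql_literal(row[c]) for c in columns)
        statements.append(
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({values});"
        )
    return "\n".join(statements)


def to_json(rows):
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Header line followed by one line per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_sql("organizations", rows))
print(to_csv(rows))
```

The apostrophe in the second row is there on purpose: quoting is exactly the kind of detail hand-written seed files tend to get wrong.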
Why This Works Better Than Seed Scripts
Relationship Awareness Is Automatic
You don't manually coordinate parent-child generation. FakerForge handles it.
Realistic Data
FakerForge uses faker mappings. According to the docs, you can configure which faker method applies to each column. A column named email generates realistic emails. A column named country generates real country codes. A column named amount generates realistic decimal values.
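Conceptually, a faker mapping is a lookup from column name to a value generator. The sketch below shows that idea with Python's standard library only. The GENERATORS table, the name lists, and the value ranges are invented for illustration, and I'm not claiming this matches how FakerForge's mappings are configured.

```python
import random
from decimal import Decimal

rng = random.Random(42)  # fixed seed so runs are repeatable

FIRST_NAMES = ["Ana", "Wei", "Priya", "Tomas", "Fatima"]
LAST_NAMES = ["Silva", "Chen", "Patel", "Novak", "Khan"]

# Column name -> zero-argument function producing a plausible value.
GENERATORS = {
    "email": lambda: (
        f"{rng.choice(FIRST_NAMES).lower()}."
        f"{rng.choice(LAST_NAMES).lower()}{rng.randint(1, 99)}@example.com"
    ),
    # Two-letter ISO 3166-1 alpha-2 codes, fitting a VARCHAR(2) column.
    "country": lambda: rng.choice(["US", "DE", "IN", "BR", "JP"]),
    # Amounts between 5.00 and 2500.00, built from cents to avoid float error.
    "amount": lambda: Decimal(rng.randint(500, 250_000)) / 100,
}


def value_for(column):
    """Return a generated value for a known column name, else None."""
    generator = GENERATORS.get(column)
    return generator() if generator else None


sample = {column: value_for(column) for column in ("email", "country", "amount")}
print(sample)
```

Open-source libraries such as Python's Faker package offer much richer generators for the same job; the point here is only the column-to-generator lookup.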
Not the same placeholder email and $99.99 forever.
Scalable
Generating 1M rows takes the same steps as generating 100, and no hand-written script balloons to thousands of lines.
Easy to Maintain
Your schema changes (new column, new table). You re-run generation. Done. Your "seed" is always in sync with your schema.
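The reason schema-driven generation stays in sync is that the column list is read from the schema rather than typed into a script. A small sketch of that principle, using sqlite3 and its PRAGMA table_info as a stand-in for whatever database you run (the column_names helper is hypothetical):

```python
import sqlite3


def column_names(conn, table):
    """Read column names from the live schema instead of hard-coding them."""
    # PRAGMA table_info returns one row per column; index 1 is the name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount REAL)")
before = column_names(conn, "invoices")

# Two months later, someone adds a column.
conn.execute("ALTER TABLE invoices ADD COLUMN payment_status TEXT")
after = column_names(conn, "invoices")

print(before)
print(after)
```

A generator built on the second call picks up payment_status with no code change, whereas a static INSERT list would have to be edited by hand.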
No Merge Conflicts
No checked-in SQL files to fight over. You generate data on demand.
The Free Tier Is Actually Useful
FakerForge has a free plan. According to their docs, you get:
- Up to 100 rows per table
- 1 database
- 10 tables per database
For most small teams validating the approach, that's enough to seed a realistic feature set.
Paid plans unlock higher row counts if you need production-scale load testing data.
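Before committing to a plan, it's worth checking your target row counts against those limits. This sketch uses the free-tier numbers quoted above and the row counts from Step 2; the function name is made up, and you should confirm the current limits in FakerForge's own docs.

```python
# Free-tier limits as quoted above.
FREE_TIER_MAX_ROWS_PER_TABLE = 100
FREE_TIER_MAX_TABLES = 10

# Row counts from Step 2.
PLANNED = {
    "organizations": 25,
    "users": 500,
    "subscriptions": 50,
    "invoices": 2_000,
    "invoice_line_items": 10_000,
}


def over_free_tier(row_counts):
    """Return (too_many_tables, {table: rows}) for tables over the row cap."""
    too_many_tables = len(row_counts) > FREE_TIER_MAX_TABLES
    over_cap = {
        table: rows
        for table, rows in row_counts.items()
        if rows > FREE_TIER_MAX_ROWS_PER_TABLE
    }
    return too_many_tables, over_cap


too_many_tables, over_cap = over_free_tier(PLANNED)
print(too_many_tables, over_cap)
```

For the counts in this post, five tables fit comfortably, but users, invoices, and invoice_line_items exceed 100 rows, so the full dataset needs a paid plan while a scaled-down version fits the free one.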
The Real Win: It Just Works
The best part? Relationship-aware generation means it "just works." You don't need to think about:
- Table generation order
- Foreign key validity
- Orphaned records
- Edge cases in your relationships
FakerForge's algorithm handles it. You get back valid, consistent data you can use immediately.
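If you want to confirm that for yourself before handing data to QA, most databases can check referential integrity after a bulk load. As one example, SQLite has PRAGMA foreign_key_check, shown below through Python's sqlite3 module. The tiny dataset, including its one deliberately bad row, is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE invoice_line_items (
        id INTEGER PRIMARY KEY,
        invoice_id INTEGER REFERENCES invoices(id),
        quantity INTEGER
    );
    INSERT INTO invoices VALUES (1, 99.99), (2, 10.00);
    -- Line item 3 deliberately points at invoice 7, which does not exist.
    INSERT INTO invoice_line_items VALUES (1, 1, 2), (2, 2, 1), (3, 7, 4);
""")

# One result row per violation; the first three fields are the child table,
# the offending rowid, and the parent table being referenced.
violations = conn.execute("PRAGMA foreign_key_check").fetchall()

for child_table, rowid, parent_table, *_ in violations:
    print(f"{child_table} row {rowid} references a missing {parent_table} row")
```

An empty result means every foreign key resolves. On PostgreSQL or MySQL, where foreign keys are enforced at insert time, a clean import of the SQL export tells you the same thing.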
Bottom Line
Manual seed scripts are technical debt you don't have to carry. They're slow to write, hard to maintain, break constantly, and don't scale.
FakerForge replaces them with schema-based generation. Paste your schema, set row counts, download data. Relationship-aware. Realistic. Scalable.
Get started free. No credit card required.
Your team will spend 10 minutes instead of two days. And you'll never write a seed script again.