Fabricator
A modern, enterprise-grade CSV data generator for system-of-record testing
Fabricator is a powerful command-line tool that generates realistic CSV test data for system-of-record (SOR) platforms. Built with a robust pipeline architecture and comprehensive validation, it transforms YAML definitions into consistent, relationship-aware CSV datasets.
Quick Start
# Download the latest release for your platform
curl -L https://github.com/SGNL-ai/fabricator/releases/latest/download/fabricator-linux -o fabricator
chmod +x fabricator
# Generate test data from a YAML definition
./fabricator -f examples/sample.yaml -n 1000 -o ./test-data
# Output:
# ✓ Generated 16 CSV files with 1000 rows each
# ✓ All relationships consistent across files
# ✓ Entity-relationship diagram created
Architecture
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   YAML Input    │───▶│    Validation    │───▶│    Pipeline     │
│                 │    │                  │    │                 │
│ • Entities      │    │ • JSON Schema    │    │ • Phase 1: IDs  │
│ • Attributes    │    │ • Business Logic │    │ • Phase 2: Rels │
│ • Relationships │    │ • 96% Template   │    │ • Phase 3: Data │
└─────────────────┘    │   Compatibility  │    └─────────────────┘
                       └──────────────────┘             │
                                                        ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   CSV Output    │◀───│    Validation    │◀───│   Data Model    │
│                 │    │                  │    │                 │
│ • Multi-file    │    │ • Referential    │    │ • Graph         │
│ • Consistent    │    │   Integrity      │    │ • Entities      │
│ • Realistic     │    │ • Uniqueness     │    │ • Relationships │
└─────────────────┘    └──────────────────┘    └─────────────────┘
Features
Robust Architecture
- Pipeline-based processing with clean separation of concerns
- Comprehensive validation with JSON Schema + business logic layers
- Graph-based dependency resolution with topological sorting
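To illustrate the dependency-resolution idea, here is a minimal sketch of a topological sort (Kahn's algorithm) over entity dependencies. It is not Fabricator's actual implementation: the `topoSort` function, the shape of the `deps` map, and the entity names are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// topoSort orders entities so that each one appears after every entity it
// depends on (Kahn's algorithm). deps maps an entity to the entities it
// references. An error is returned if the dependencies contain a cycle.
func topoSort(deps map[string][]string) ([]string, error) {
	inDegree := make(map[string]int)        // unmet dependencies per entity
	dependents := make(map[string][]string) // reverse edges: ref -> entities using it
	for entity, refs := range deps {
		if _, ok := inDegree[entity]; !ok {
			inDegree[entity] = 0
		}
		for _, ref := range refs {
			if _, ok := inDegree[ref]; !ok {
				inDegree[ref] = 0
			}
			inDegree[entity]++
			dependents[ref] = append(dependents[ref], entity)
		}
	}

	// Start with entities that depend on nothing; sort for deterministic output.
	var ready []string
	for entity, n := range inDegree {
		if n == 0 {
			ready = append(ready, entity)
		}
	}
	sort.Strings(ready)

	var order []string
	for len(ready) > 0 {
		entity := ready[0]
		ready = ready[1:]
		order = append(order, entity)
		next := dependents[entity]
		sort.Strings(next)
		for _, d := range next {
			inDegree[d]--
			if inDegree[d] == 0 {
				ready = append(ready, d)
			}
		}
	}
	if len(order) != len(inDegree) {
		return nil, fmt.Errorf("dependency cycle detected among entities")
	}
	return order, nil
}

func main() {
	// Hypothetical entities: memberships reference both users and groups.
	deps := map[string][]string{
		"users":       {},
		"groups":      {},
		"memberships": {"users", "groups"},
	}
	order, err := topoSort(deps)
	fmt.Println(order, err)
}
```

Generating entities in this order means referenced IDs already exist by the time a dependent entity needs them, which is one plausible reason to sort before generating data.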
Data Generation
- Realistic test data with type-aware field generation
- Relationship consistency across all CSV files
- Variable cardinalities (1:1, 1:N, N:1, N:M) with auto-detection
- Configurable data volume from small samples to large datasets
- Per-entity row counts via configuration files for flexible test scenarios
Validation & Quality
- YAML schema validation using industry-standard JSON Schema
- Relationship integrity checking across entities
- Uniqueness constraint validation
- Production template compatibility (96% of SGNL catalog templates supported)
User Experience
- Colorful CLI output with progress indicators
- SVG diagram generation for entity-relationship visualization
- Detailed error messages with actionable guidance
- Multiple operation modes (generate, validate-only, diagram-only)
Installation
Requires Go 1.24.3 or higher.
From Source
# Clone the repository
git clone https://github.com/SGNL-ai/fabricator.git
cd fabricator
# Build the project
make build
The binary will be built to build/fabricator.
From GitHub Releases
Pre-built binaries for Linux, macOS (Intel and Apple Silicon), and Windows are automatically generated for each release and can be downloaded from the GitHub Releases page.
# For macOS Intel
curl -L https://github.com/SGNL-ai/fabricator/releases/latest/download/fabricator-macos-intel -o fabricator
chmod +x fabricator
./fabricator --version
# For macOS Apple Silicon (M1/M2/M3)
curl -L https://github.com/SGNL-ai/fabricator/releases/latest/download/fabricator-macos-apple-silicon -o fabricator
chmod +x fabricator
./fabricator --version
# For Linux
curl -L https://github.com/SGNL-ai/fabricator/releases/latest/download/fabricator-linux -o fabricator
chmod +x fabricator
./fabricator --version
For Windows users, download the fabricator-windows.exe file from the releases page.
Usage
# Basic usage (short options)
./build/fabricator -f <yaml-file> [-o <dir>] [-n <count>] [-a]
# Basic usage (long options)
./build/fabricator --file <yaml-file> [--output <dir>] [--num-rows <count>] [--auto-cardinality]
# View version information
./build/fabricator -v
Command Line Options
| Short Flag | Long Flag | Description | Default |
|---|---|---|---|
| -f | --file | Path to the YAML definition file (required) | - |
| -o | --output | Directory to store generated CSV files | "output" |
| -n | --num-rows | Number of rows to generate for each entity | 100 |
| -c | --count-config | Path to row count configuration YAML file | - |
| -a | --auto-cardinality | Enable automatic cardinality detection | false |
| -d | --diagram | Generate Entity-Relationship diagram | true |
| | --validate | Validate relationships in CSV files | true |
| | --validate-only | Validate existing CSV files without generation | false |
| -v | --version | Display version information | - |
Examples
# Generate CSV files from example.yaml with 500 rows per entity
./build/fabricator -f example.yaml -n 500 -o data/sgnl
# Using long-form options
./build/fabricator --file example.yaml --num-rows 1000 --output export/data
# Generate CSV files with automatic cardinality detection for relationships
./build/fabricator -f example.yaml -n 200 -a
# Using long-form options with auto-cardinality
./build/fabricator --file example.yaml --num-rows 500 --auto-cardinality --output data/variable-cardinality
# Generate CSV files but disable ER diagram generation
./build/fabricator -f example.yaml --diagram=false
# Validate existing CSV files without generating new data
./build/fabricator -f example.yaml -o existing/csv/data --validate-only
# Validate existing CSV files and generate an ER diagram
./build/fabricator -f example.yaml -o existing/csv/data --validate-only --diagram
Per-Entity Row Count Configuration
Fabricator now supports specifying different row counts for each entity using a configuration file, providing flexibility for realistic test data scenarios.
Generate a Configuration Template
# Generate a row count configuration template from your SOR YAML
./build/fabricator init-count-config -f example.yaml > counts.yaml
This creates a YAML file with all entities and default row counts:
# Row count configuration for fabricator
# Generated from: example.yaml
# Last updated: 2025-10-30 14:30:00
# Entity: users
# Description: User accounts
users: 100
# Entity: groups
# Description: User groups
groups: 100
# Entity: permissions
# Description: Access permissions
permissions: 100
Customize Row Counts
Edit the generated file to specify different row counts per entity:
users: 1000 # Large user base
groups: 50 # Fewer groups
permissions: 200 # Moderate permissions
Generate CSVs with Custom Counts
# Use the configuration file to generate CSVs
./build/fabricator -f example.yaml --count-config counts.yaml -o output/
Additional Options
| Flag | Long Flag | Description |
|---|---|---|
| -c | --count-config | Path to row count configuration YAML file |
Note: The --count-config and -n flags are mutually exclusive. Use one or the other, not both.
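As a rough illustration of how a CLI can enforce this kind of mutual exclusion, the sketch below uses Go's standard `flag` package. It is an assumption-laden example, not Fabricator's own flag handling: the `parseCounts` function and the flag set name are hypothetical, and only the flag names and the default of 100 come from the tables above.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
)

// parseCounts parses a hypothetical subset of the CLI flags and rejects
// invocations that set both -n/--num-rows and -c/--count-config.
func parseCounts(args []string) (numRows int, countConfig string, err error) {
	fs := flag.NewFlagSet("fabricator-sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the example output
	fs.IntVar(&numRows, "n", 100, "rows per entity")
	fs.IntVar(&numRows, "num-rows", 100, "rows per entity")
	fs.StringVar(&countConfig, "c", "", "row count configuration file")
	fs.StringVar(&countConfig, "count-config", "", "row count configuration file")
	if err = fs.Parse(args); err != nil {
		return 0, "", err
	}

	// Visit reports only flags that were explicitly set, which distinguishes
	// an explicit "-n 100" from the default value of 100.
	set := map[string]bool{}
	fs.Visit(func(f *flag.Flag) { set[f.Name] = true })
	if (set["n"] || set["num-rows"]) && (set["c"] || set["count-config"]) {
		return 0, "", errors.New("--count-config and -n are mutually exclusive")
	}
	return numRows, countConfig, nil
}

func main() {
	_, _, err := parseCounts([]string{"-n", "500", "--count-config", "counts.yaml"})
	fmt.Println(err)
}
```

Checking which flags were explicitly set, rather than comparing values against defaults, avoids misreading a user-supplied value that happens to equal the default.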
Use Cases
Realistic Data Distributions:
employees: 10000
departments: 100
managers: 500
Performance Testing:
transactions: 1000000
customers: 100000
products: 10000
Minimal Test Data:
users: 5
groups: 2
permissions: 3
YAML Format
The YAML file should define a system-of-record structure, including:
- Entities with attributes
- Relationships between entities
- External IDs that will be used for CSV filenames
Each entity in the YAML file will result in a corresponding CSV file, with the filename derived from the entity's externalId.
Generated Data & Validation
The tool provides the following functionality:
- CSV Generation:
  - CSV files named after each entity's external ID (without the namespace prefix)
  - Headers matching the entity's attribute external IDs
  - Consistent data across relationships between entities
  - Variable cardinality relationships (with the -a flag)
  - Realistic test data based on attribute names and types
- CSV Validation (via --validate-only):
  - Checks existing CSV files against a YAML definition
  - Validates relationship consistency across entities
  - Verifies unique constraint requirements are met
  - Helpful for validating production or manually-created data exports
  - Use with the existing output directory containing CSV files
- Entity-Relationship Diagram (enabled by default):
  - SVG visualization of all entities and their relationships
  - Color-coded entities with attributes listed
  - Primary keys (uniqueId attributes) highlighted
  - Relationship cardinality indicators (1:1, 1:N, N:1, N:M)
  - Can be disabled with --diagram=false
  - Works in both generation and validation-only modes
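The two validation checks listed above (uniqueness and relationship consistency) can be sketched in a few lines of Go. This is a simplified illustration under stated assumptions, not the tool's validator: the helper names `column`, `duplicates`, and `orphans`, the CSV contents, and the `userId` column are all made up for the example.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// column parses CSV text and returns every value under the named header.
func column(data, header string) ([]string, error) {
	rows, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(rows) == 0 {
		return nil, fmt.Errorf("empty CSV")
	}
	idx := -1
	for i, h := range rows[0] {
		if h == header {
			idx = i
		}
	}
	if idx < 0 {
		return nil, fmt.Errorf("column %q not found", header)
	}
	values := make([]string, 0, len(rows)-1)
	for _, row := range rows[1:] {
		values = append(values, row[idx])
	}
	return values, nil
}

// duplicates returns each value that occurs more than once (uniqueness check).
func duplicates(values []string) []string {
	seen := map[string]int{}
	var dups []string
	for _, v := range values {
		seen[v]++
		if seen[v] == 2 {
			dups = append(dups, v)
		}
	}
	return dups
}

// orphans returns foreign-key values that match no primary key
// (referential integrity check).
func orphans(foreignKeys, primaryKeys []string) []string {
	known := make(map[string]bool, len(primaryKeys))
	for _, k := range primaryKeys {
		known[k] = true
	}
	var missing []string
	for _, fk := range foreignKeys {
		if !known[fk] {
			missing = append(missing, fk)
		}
	}
	return missing
}

func main() {
	// Hypothetical CSV contents with one duplicate ID and one dangling reference.
	users := "id,name\nu1,Ada\nu2,Grace\nu2,Edsger\n"
	memberships := "id,userId\nm1,u1\nm2,u9\n"
	ids, _ := column(users, "id")
	fks, _ := column(memberships, "userId")
	fmt.Println("duplicate ids:", duplicates(ids))
	fmt.Println("orphaned userIds:", orphans(fks, ids))
}
```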
The data generator intelligently creates appropriate values based on field names:
- ID fields get unique identifiers
- Name fields get contextual names based on entity types (e.g., person names for users, company names for organizations)
- Date fields get properly formatted dates
- Email fields get valid email addresses
- Boolean fields get true/false values
- Numeric fields get appropriate numbers
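A name-based generator along these lines might look like the sketch below. The matching rules, value formats, and the `valueFor` and `newRNG` helpers are illustrative assumptions only; Fabricator's real heuristics are likely richer (for example, they also take entity types into account).

```go
package main

import (
	"fmt"
	"math/rand"
	"strconv"
	"strings"
	"time"
)

// newRNG returns a seeded generator so that runs are reproducible.
func newRNG(seed int64) *rand.Rand { return rand.New(rand.NewSource(seed)) }

// valueFor picks a plausible value for a field from its name alone.
// The rules below are crude assumptions made for this example.
func valueFor(field string, row int, rng *rand.Rand) string {
	lower := strings.ToLower(field)
	switch {
	case lower == "id" || strings.HasSuffix(field, "Id"):
		return fmt.Sprintf("%s-%06d", field, row) // unique per row
	case strings.Contains(lower, "email"):
		return fmt.Sprintf("user%d@example.com", row)
	case strings.Contains(lower, "date"):
		// Fixed start date plus a random day offset, formatted as YYYY-MM-DD.
		start := time.Date(2024, time.January, 1, 0, 0, 0, 0, time.UTC)
		return start.AddDate(0, 0, rng.Intn(365)).Format("2006-01-02")
	case strings.Contains(lower, "enabled") || strings.Contains(lower, "active"):
		return strconv.FormatBool(rng.Intn(2) == 1)
	case strings.Contains(lower, "count") || lower == "age":
		return strconv.Itoa(rng.Intn(1000))
	case strings.Contains(lower, "name"):
		names := []string{"Ada Lovelace", "Grace Hopper", "Alan Turing"}
		return names[rng.Intn(len(names))]
	default:
		return fmt.Sprintf("%s_%d", field, row)
	}
}

func main() {
	rng := newRNG(42)
	for _, field := range []string{"userId", "email", "createdDate", "enabled", "loginCount", "displayName"} {
		fmt.Printf("%s = %s\n", field, valueFor(field, 1, rng))
	}
}
```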
Relationship Cardinality
When the auto-cardinality feature is enabled (-a flag), Fabricator automatically detects and generates appropriate cardinality for entity relationships:
- 1:1 relationships - Simple one-to-one mappings between entities
- 1:N relationships - One entity related to multiple instances of another entity
- N:1 relationships - Multiple entities related to a single instance of another entity
Cardinality detection is based on:
- Entity metadata (primary detection method)
  - Fields with uniqueId: true are used to identify key relationships
  - When a relationship links a unique ID to a non-unique field, cardinality is automatically determined
- Field naming patterns (fallback method)
  - Field names ending with "Id" typically indicate N:1 relationships
  - Plural field names or names ending with "Ids" suggest 1:N relationships
Without the -a flag, all relationships default to 1:1 cardinality.
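The naming-pattern fallback can be sketched as a small function. This is only an approximation under assumptions: `guessCardinality` is a hypothetical name, the plural check is deliberately crude (any trailing "s"), and the metadata-based primary method is not modeled here.

```go
package main

import (
	"fmt"
	"strings"
)

// guessCardinality applies the field-naming fallback: "Ids" or a plural name
// suggests 1:N, "Id" suggests N:1, and everything else (or auto-detection
// being disabled) falls back to 1:1.
func guessCardinality(field string, autoCardinality bool) string {
	if !autoCardinality {
		return "1:1" // without -a, every relationship defaults to 1:1
	}
	switch {
	case strings.HasSuffix(field, "Ids"):
		return "1:N"
	case strings.HasSuffix(field, "Id"):
		return "N:1"
	case strings.HasSuffix(field, "s"): // crude plural check, an assumption
		return "1:N"
	default:
		return "1:1"
	}
}

func main() {
	for _, field := range []string{"managerId", "groupIds", "members", "owner"} {
		fmt.Printf("%-10s auto=%s default=%s\n",
			field, guessCardinality(field, true), guessCardinality(field, false))
	}
}
```

A real implementation would need a better plural test, since names such as "status" or "address" also end in "s".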
Performance
Fabricator is designed for efficiency and can handle large datasets:
| Dataset Size | Entities | Time | Memory |
|---|---|---|---|
| Small | 5 | <1s | <50MB |
| Medium | 16 | 2-5s | <100MB |
| Large | 50 | 10-30s | <500MB |
Benchmarks (16 entities, complex relationships):
- 1,000 rows/entity: ~3 seconds, 16 CSV files
- 10,000 rows/entity: ~15 seconds, consistent relationships
- 100,000 rows/entity: ~2 minutes, 1.6M total records
Development
Prerequisites for Development
- Go 1.23+ (tested with 1.23 and 1.24)
- golangci-lint for code quality
- Pre-commit hooks (optional but recommended)
Development Commands
# Run tests
make test
# Run tests with coverage
make coverage
# Format code
make fmt
# Static analysis
make vet
# Run linter
make lint
# Run all checks (CI pipeline)
make ci
# Security scanning
gosec ./...
govulncheck ./...
Contributing
See CONTRIBUTING.md for detailed development guidelines, architecture documentation, and contribution workflow.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Directories
| Path | Synopsis |
|---|---|
| cmd/fabricator (command) | |
| pkg/generators/model | Package model is a generated GoMock package. |