bacalhau

command module
v1.8.0
Published: Jun 23, 2025 License: Apache-2.0 Imports: 10 Imported by: 0

README

Bacalhau

Globally Distributed Compute Orchestrator ⚡
Compute Over Data (CoD)



What is Bacalhau?

Bacalhau is an open-source distributed compute orchestration framework designed to bring compute to the data. Instead of moving large datasets around networks, Bacalhau makes it easy to execute jobs close to the data's location, drastically reducing latency and resource overhead.

Why Bacalhau?

  • Fast job processing: Jobs in Bacalhau are processed where the data was created, and all jobs are parallel by default
  • 💰 Low cost: Reduce (or eliminate) ingress/egress costs since jobs are processed closer to the source
  • 🔒 Secure: Data scrubbing and security can happen before migration, with a granular, code-based permission model
  • 🚛 Large-scale data: Process petabytes of data efficiently without massive data transfers
  • 🏢 Data sovereignty: Process sensitive data within security boundaries without requiring it to leave your premises
  • 🤝 Cross-organizational computation: Allow specific vetted computations on protected datasets without exposing raw data

Key Features

  1. Single Binary Simplicity: Bacalhau is a single self-contained binary that functions as a client, orchestrator, and compute node—making it incredibly easy to set up and scale

  2. Modular Architecture: Support for multiple execution engines (Docker, WebAssembly) and storage providers through clean interfaces

  3. Orchestrator-Compute Model: A dedicated orchestrator coordinates job scheduling, while compute nodes run tasks

  4. Flexible Storage Integrations: Integrates with S3, HTTP/HTTPS, IPFS, and local storage systems

  5. Multiple Job Types: Support for batch, ops, daemon, and service job types for different workflow requirements

  6. Declarative & Imperative Submissions: Define jobs in YAML (declarative) or pass arguments via CLI (imperative)

  7. Publisher Support: Output results to local volumes, S3, or other storage backends
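
The declarative and imperative submission styles from item 6 can be sketched as follows. This is a minimal illustration, not an official example: the file name `hello-job.yaml` is hypothetical, and the YAML field names (`Type`, `Count`, `Tasks`, `Engine`, `Params`) are assumptions based on the Bacalhau job specification, so check them against the documentation for your version. The script only writes the job file unless a `bacalhau` binary is on the PATH, and submission additionally assumes the CLI is configured to reach an orchestrator.

```shell
#!/usr/bin/env sh
# Hypothetical file name for the declarative job definition.
JOB_FILE="${JOB_FILE:-hello-job.yaml}"

# Declarative: describe a batch job that runs one Docker task.
# Field names are assumptions; verify them against the job specification docs.
cat > "$JOB_FILE" <<'EOF'
Name: hello-job
Type: batch
Count: 1
Tasks:
  - Name: main
    Engine:
      Type: docker
      Params:
        Image: ubuntu:latest
        Entrypoint:
          - /bin/bash
        Parameters:
          - -c
          - echo "hello from bacalhau"
EOF

if command -v bacalhau >/dev/null 2>&1; then
  # Declarative submission: the job is read from the YAML file.
  bacalhau job run "$JOB_FILE"
  # Imperative submission: the same kind of job passed as CLI arguments.
  bacalhau docker run ubuntu:latest echo "hello from bacalhau"
else
  # No CLI available: leave the job file in place for later submission.
  echo "bacalhau not found; wrote $JOB_FILE only"
fi
```

The declarative form is easier to keep in version control and review, while the imperative form is convenient for one-off experiments.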

Getting Started

Quick Installation

# Install Bacalhau CLI (Linux/macOS)
curl -sL https://get.bacalhau.org/install.sh | bash

# Verify installation
bacalhau version

For the complete quick start guide, including running your first job, see our Quick Start Documentation.

Use Cases

Bacalhau's distributed compute framework enables a wide range of applications:

  • Log Processing: Process logs efficiently at scale by running distributed jobs directly at the source
  • Distributed Data Warehousing: Query and analyze data across multiple regions without moving large datasets
  • Fleet Management: Efficiently manage distributed nodes across multiple environments
  • Distributed Machine Learning: Train and deploy ML models across a distributed compute fleet
  • Edge Computing: Run compute tasks closer to the data source for applications requiring low latency

Documentation

📚 Read the Bacalhau docs guide here! 📚

The Bacalhau documentation contains all the information you need to get started.

Community & Contributing

Bacalhau has a very friendly community, and we are always happy to help:

  • Join the Slack Community and go to the #general channel - it is the easiest way to engage with other members in the community and get help

If you are interested in contributing to the Bacalhau project, we are excited to hear your feedback!

Open Source

This repository contains the Bacalhau software, covered under the Apache-2.0 license, except where noted (any Bacalhau logos or trademarks are not covered under the Apache License and should be explicitly noted by a LICENSE file).

Bacalhau is a product produced from this open source software, exclusively by Expanso, Inc. It is distributed under our commercial terms.

Others are allowed to make their own distribution of the software, but they cannot use any of the Bacalhau trademarks, cloud services, etc.

We explicitly grant permission for you to make a build that includes our trademarks while developing Bacalhau software itself. You may not publish or share the build, and you may not use that build to run Bacalhau software for any other purpose.

We have borrowed the above Open Source clause from the excellent System Initiative.

Documentation

There is no documentation for this package.

Directories

Path                              Synopsis
apps
cmd
cli
util/printer                      Package printer provides functionality for printing job events and progress.
util/templates                    Package templates provides utilities for formatting CLI help text.
pkg
bacerrors                         Package bacerrors provides a rich error type for detailed error handling in Go applications.
bidstrategy                       Package bidstrategy is a generated GoMock package.
compute                           Package compute is a generated GoMock package.
compute/store                     Package store is a generated GoMock package.
jobstore                          Package jobstore is a generated GoMock package.
lib/backoff                       Package backoff is a generated GoMock package.
lib/ncl                           Package ncl is a generated GoMock package.
lib/watcher                       Package watcher is a generated GoMock package.
logger                            DataFrame wraps the byte framing used by docker to multiplex output from a docker container when there is no TTY in use.
nats/stream                       Package stream provides a NATS client for streaming records between clients with asynchronous response handling.
orchestrator                      Package orchestrator is a generated GoMock package.
orchestrator/nodes                Package nodes is a generated GoMock package.
s3
sso
storage/inline                    Package inline provides a storage abstraction that stores data for use by Bacalhau jobs within the storage spec itself, without needing any connection to an external storage provider.
swagger                           Package swagger Code generated by swaggo/swag.
test/scenario                     Package scenario provides a high-level testing framework for running Bacalhau jobs in different configurations and making assertions against the results.
test/teststack                    Package testutils collects common test utilities.
transport/bprotocol/compute       Package compute provides transport layer implementation for compute nodes using the legacy bprotocol over NATS.
transport/bprotocol/orchestrator  Package orchestrator provides transport layer implementation for orchestrator nodes using the legacy bprotocol over NATS.
transport/nclprotocol             Package nclprotocol is a generated GoMock package.
version                           Package version provides information about what Bacalhau was built from.
