Automated backup with scheduling
This guide turns the one-off dump and upload flow from the per-database tutorials into a scheduled backup. It combines three pieces:
- A dump script that writes your database to ./db-dumps (you drop in the command from your database tutorial).
- A small Go uploader built on the BaaS SDK that uploads ./db-dumps as a snapshot and applies a retention policy.
- A schedule (cron or systemd timer) that runs it nightly.
cron / timer
    │
    ▼
backup.sh ──▶ make_dump() ──▶ ./db-dumps/app.dump       (Stage 1: your DB)
    │
    └──────▶ lh-backup (Go) ──▶ snapshot on Lighthouse  (Stage 2: SDK)
                            └──▶ prune old snapshots
Prerequisites
- Go 1.24+ and the SDK installed: go get github.com/lighthouse-web3/baas-go-sdk@latest
- An API key scoped to backup:write, backup:read, snapshots:read (see API Keys).
- The dump tool for your database (e.g. pg_dump, mysqldump, mongodump, aws).
Export your credentials (the job reads these from the environment):
export LH_API_KEY="lh_xxxxxxxxxxxxxxxxxxxxxxxx"
export LH_WORKSPACE_ID="550e8400-e29b-41d4-a716-446655440000"
1. The reusable Go uploader
This program uploads the ./db-dumps directory and prunes old snapshots. It is database-agnostic: you never edit it, and it uploads whatever the dump script produced.
Create lh-backup/main.go:
package main

import (
	"log"
	"os"
	"strconv"
	"strings"
	"time"

	sdkclient "github.com/lighthouse-web3/baas-go-sdk/client"
	sdktypes "github.com/lighthouse-web3/baas-go-sdk/types"
)

func main() {
	apiKey := mustEnv("LH_API_KEY")
	workspaceID := mustEnv("LH_WORKSPACE_ID")
	dumpDir := envOr("LH_DUMP_DIR", "./db-dumps")
	description := envOr("LH_DESCRIPTION", "scheduled db backup "+time.Now().UTC().Format(time.RFC3339))

	client, err := sdkclient.NewBackupClient(sdkclient.BackupClientOptions{
		APIURL:      "https://baas-api.lighthouse.storage",
		APIKey:      apiKey,
		WorkspaceID: workspaceID,
	})
	if err != nil {
		log.Fatalf("client init: %v", err)
	}

	// Upload the dump directory as a new snapshot.
	snap, err := client.Backup([]string{dumpDir}, &sdktypes.BackupOptions{
		Description: description,
		Tags:        parseTags(os.Getenv("LH_TAGS")), // e.g. "env=prod,db=app_db"
		OnProgress: func(e sdktypes.ProgressEvent) {
			log.Printf("[%s] %d/%d stored=%dB", e.Phase, e.Current, e.Total, e.StoredBytes)
		},
	})
	if err != nil {
		log.Fatalf("backup failed: %v", err)
	}
	log.Printf("✅ snapshot %s: %d chunks, %d bytes", snap.SnapshotID, snap.TotalChunks, snap.TotalSize)

	// Optional retention: keep the latest N snapshots (set LH_KEEP_LATEST=0 to skip).
	if keep := envInt("LH_KEEP_LATEST", 14); keep > 0 {
		res, err := client.PruneSnapshots(sdktypes.PruneRequest{
			KeepLatest: &keep,
			DryRun:     false,
		})
		if err != nil {
			log.Printf("prune warning: %v", err) // don't fail the backup over a prune error
		} else {
			log.Printf("🧹 pruned %d old snapshot(s), keeping latest %d", res.Count(), keep)
		}
	}
}

// mustEnv returns the value of a required variable or exits with an error.
func mustEnv(k string) string {
	v := os.Getenv(k)
	if v == "" {
		log.Fatalf("missing required env var %s", k)
	}
	return v
}

// envOr returns the value of k, or def when k is unset or empty.
func envOr(k, def string) string {
	if v := os.Getenv(k); v != "" {
		return v
	}
	return def
}

// envInt parses k as an integer, falling back to def when k is unset or not a number.
func envInt(k string, def int) int {
	if v := os.Getenv(k); v != "" {
		if n, err := strconv.Atoi(v); err == nil {
			return n
		}
	}
	return def
}

// parseTags turns "k1=v1,k2=v2" into a map; pairs without "=" are skipped.
func parseTags(s string) map[string]string {
	tags := map[string]string{}
	for _, pair := range strings.Split(s, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue
		}
		if k, v, ok := strings.Cut(pair, "="); ok {
			tags[strings.TrimSpace(k)] = strings.TrimSpace(v)
		}
	}
	return tags
}
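To make the environment handling concrete, the sketch below copies parseTags and envInt from lh-backup/main.go and prints what they return for a few sample values. The sample inputs are illustrative only. It shows two behaviors that are easy to miss: tag pairs without an = are dropped silently, and a non-numeric LH_KEEP_LATEST falls back to the default instead of failing the job.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseTags is copied from lh-backup/main.go.
func parseTags(s string) map[string]string {
	tags := map[string]string{}
	for _, pair := range strings.Split(s, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue
		}
		if k, v, ok := strings.Cut(pair, "="); ok {
			tags[strings.TrimSpace(k)] = strings.TrimSpace(v)
		}
	}
	return tags
}

// envInt is copied from lh-backup/main.go.
func envInt(k string, def int) int {
	if v := os.Getenv(k); v != "" {
		if n, err := strconv.Atoi(v); err == nil {
			return n
		}
	}
	return def
}

func main() {
	// Whitespace is trimmed, empty pairs are skipped, and "broken" has no "=".
	tags := parseTags(" env=prod , db=app_db ,, broken ")
	fmt.Println(len(tags), tags["env"], tags["db"]) // prints "2 prod app_db"

	// 0 is a valid number, so the uploader skips pruning.
	os.Setenv("LH_KEEP_LATEST", "0")
	fmt.Println(envInt("LH_KEEP_LATEST", 14)) // prints "0"

	// A value that is not an integer falls back to the default.
	os.Setenv("LH_KEEP_LATEST", "two weeks")
	fmt.Println(envInt("LH_KEEP_LATEST", 14)) // prints "14"
}
```

If you would rather fail loudly on a mistyped LH_KEEP_LATEST, one option is to have envInt log the parse error before returning the default.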
Build it once into a static binary:
cd lh-backup
go mod init lh-backup && go mod tidy
go build -o ../bin/lh-backup .
cd ..
You now have ./bin/lh-backup. It needs only LH_API_KEY and LH_WORKSPACE_ID (plus optional LH_DUMP_DIR, LH_TAGS, LH_DESCRIPTION, LH_KEEP_LATEST).
command 'go' not found?
If you installed Go from the official tarball into /usr/local/go, non-interactive shells (and CI) won't have it on PATH. Call the toolchain by absolute path: /usr/local/go/bin/go build -o ../bin/lh-backup . (the trailing dot is the package path). Building once into a static binary is the point: the scheduler then runs bin/lh-backup directly and never needs Go installed at runtime. See Troubleshooting.
2. The dump-and-upload script
This wrapper is the only file you customize per database. Edit make_dump() with the command from your database tutorial; everything else stays the same.
Create backup.sh:
#!/usr/bin/env bash
set -euo pipefail

# ── Paths ──────────────────────────────────────────────
ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
export LH_DUMP_DIR="${ROOT}/db-dumps"
UPLOADER="${ROOT}/bin/lh-backup"
mkdir -p "$LH_DUMP_DIR"

# ── Credentials & policy (or set these in the environment) ──
: "${LH_API_KEY:?set LH_API_KEY}"
: "${LH_WORKSPACE_ID:?set LH_WORKSPACE_ID}"
export LH_TAGS="${LH_TAGS:-env=prod,db=app_db}"
export LH_KEEP_LATEST="${LH_KEEP_LATEST:-14}"

# ── Stage 1: create the dump ───────────────────────────
# Replace the body with the command from your database tutorial.
make_dump() {
  export PGPASSWORD="${DB_PASSWORD:?set DB_PASSWORD}"
  pg_dump --host=127.0.0.1 --port=5432 --username=postgres \
    --format=custom --file="${LH_DUMP_DIR}/app.dump" app_db
}

echo "[$(date -u +%FT%TZ)] creating dump…"
make_dump

# ── Stage 2: upload + prune via the SDK ────────────────
echo "[$(date -u +%FT%TZ)] uploading to Lighthouse…"
"$UPLOADER"
echo "[$(date -u +%FT%TZ)] backup complete."
Make it executable and test it by hand first:
chmod +x backup.sh
./backup.sh
You should see the dump run, an upload that ends with a "snapshot …" log line, and a prune summary. Confirm the snapshot in the portal under Backup Sources (how to view).
Per-database make_dump() bodies
Drop in the one that matches your database (full context in each tutorial):
# PostgreSQL
make_dump() {
  export PGPASSWORD="$DB_PASSWORD"
  pg_dump --host=127.0.0.1 --port=5432 --username=postgres \
    --format=custom --file="${LH_DUMP_DIR}/app.dump" app_db
}

# MySQL / MariaDB
make_dump() {
  mysqldump --host=127.0.0.1 --port=3306 --user=root --password="$DB_PASSWORD" \
    --single-transaction --quick --routines --triggers \
    app_db > "${LH_DUMP_DIR}/app.sql"
}

# SQLite
make_dump() {
  sqlite3 /path/to/app.db ".backup '${LH_DUMP_DIR}/app.sqlite'"
}

# MongoDB
make_dump() {
  mongodump --uri="$MONGO_URI" --archive="${LH_DUMP_DIR}/app.archive" --gzip
}

# Amazon DynamoDB
make_dump() {
  aws dynamodb scan --table-name app_table --output json > "${LH_DUMP_DIR}/app_table.json"
}

# Amazon S3
make_dump() {
  aws s3 sync s3://your-bucket "${LH_DUMP_DIR}/your-bucket" --delete
}
3. Schedule it
Option A: cron (simplest)
Store secrets in a file the scheduler sources, so they aren't committed or visible in ps. Create /etc/lighthouse-backup.env (mode 600):
LH_API_KEY=lh_xxxxxxxxxxxxxxxxxxxxxxxx
LH_WORKSPACE_ID=550e8400-e29b-41d4-a716-446655440000
DB_PASSWORD=your_db_password
LH_TAGS=env=prod,db=app_db
LH_KEEP_LATEST=14
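Because this file holds the API key and database password, it can be worth checking its permissions before a job trusts it. The following is a minimal sketch of such a check, not part of the SDK: tooOpen and checkEnvFile are hypothetical helper names, and the rule it applies (no group or world permission bits) is one reasonable reading of "mode 600".

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
)

// tooOpen reports whether any group or world permission bit is set.
func tooOpen(perm fs.FileMode) bool {
	return perm.Perm()&0o077 != 0
}

// checkEnvFile returns an error if path cannot be read or is accessible
// to anyone other than its owner.
func checkEnvFile(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return err
	}
	if tooOpen(info.Mode()) {
		return fmt.Errorf("%s has mode %04o; run: chmod 600 %s",
			path, uint32(info.Mode().Perm()), path)
	}
	return nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: envcheck /etc/lighthouse-backup.env")
		return
	}
	if err := checkEnvFile(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, "env file check failed:", err)
		os.Exit(1)
	}
	fmt.Println("env file permissions look fine")
}
```

Run it once after creating the file, or from a deployment check; it reads only the file's metadata, never its contents.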
Add a crontab entry to run nightly at 02:30 and log output:
# crontab -e
30 2 * * * set -a; . /etc/lighthouse-backup.env; set +a; /opt/app/backup.sh >> /var/log/lh-backup.log 2>&1
set -a; . file; set +a exports every variable from the env file for the duration of the job.
Option B: systemd timer (better logging & reliability)
/etc/systemd/system/lh-backup.service:
[Unit]
Description=Lighthouse database backup
After=network-online.target
Wants=network-online.target
[Service]
Type=oneshot
EnvironmentFile=/etc/lighthouse-backup.env
WorkingDirectory=/opt/app
ExecStart=/opt/app/backup.sh
/etc/systemd/system/lh-backup.timer:
[Unit]
Description=Run Lighthouse database backup nightly
[Timer]
OnCalendar=*-*-* 02:30:00
Persistent=true
[Install]
WantedBy=timers.target
Adjust OnCalendar to taste. For example, to run every hour on the hour:
OnCalendar=*-*-* *:00:00
A systemd service does not inherit your login shell's PATH. Reference the uploader and dump tools by absolute path inside backup.sh (the template already resolves bin/lh-backup from the script's own directory), and keep secrets in EnvironmentFile= as shown.
Enable and start:
sudo systemctl daemon-reload
sudo systemctl enable --now lh-backup.timer
systemctl list-timers lh-backup.timer # confirm next run
journalctl -u lh-backup.service -n 50 # view logs after it runs
Persistent=true means a missed run (machine was off) fires on next boot.
How incremental backups keep this cheap
Each run overwrites the same dump file, and the SDK:
- chunks the dump (FastCDC) and uploads only chunks it hasn't seen before (dedup),
- compresses packs before upload,
- creates a fresh snapshot that references both reused and new chunks.
So a daily job on a slowly-changing database uploads only the delta, while every snapshot remains a complete, independent restore point.
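The effect of chunk-level dedup can be illustrated with a toy model. The sketch below is not the SDK's implementation: it uses a simple windowed-hash boundary rule as a stand-in for FastCDC, assumes SHA-256 as the chunk identifier, skips compression entirely, and all names (split, store, backup, insertAt, sampleData) are made up for the example. It "backs up" a buffer twice, with eight bytes inserted in the middle before the second run, and counts how many chunks are new each time.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"hash/fnv"
)

const (
	window = 16    // bytes of trailing context that decide a boundary
	mask   = 0x1FF // cut when the low 9 hash bits are zero (~512-byte chunks on average)
)

// split cuts data wherever the hash of the trailing window matches the mask.
// Boundaries depend only on nearby content, so an insert does not shift
// every later chunk. Real chunkers also enforce minimum and maximum sizes.
func split(data []byte) [][]byte {
	var chunks [][]byte
	start := 0
	for i := window; i <= len(data); i++ {
		h := fnv.New32a()
		h.Write(data[i-window : i])
		if h.Sum32()&mask == 0 {
			chunks = append(chunks, data[start:i])
			start = i
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:]) // trailing remainder
	}
	return chunks
}

// store is a toy chunk store keyed by SHA-256, standing in for remote storage.
type store map[[32]byte]bool

// backup "uploads" only chunks the store has not seen before and returns
// the chunk count of this snapshot and the number actually uploaded.
func (s store) backup(data []byte) (total, uploaded int) {
	for _, c := range split(data) {
		total++
		id := sha256.Sum256(c)
		if !s[id] {
			s[id] = true
			uploaded++
		}
	}
	return total, uploaded
}

// sampleData returns n deterministic pseudo-random bytes (xorshift32).
func sampleData(n int) []byte {
	out := make([]byte, n)
	x := uint32(2463534242)
	for i := range out {
		x ^= x << 13
		x ^= x >> 17
		x ^= x << 5
		out[i] = byte(x >> 24)
	}
	return out
}

// insertAt returns a copy of data with extra inserted at pos.
func insertAt(data []byte, pos int, extra []byte) []byte {
	out := make([]byte, 0, len(data)+len(extra))
	out = append(out, data[:pos]...)
	out = append(out, extra...)
	return append(out, data[pos:]...)
}

func main() {
	s := store{}
	day1 := sampleData(64 * 1024)
	day2 := insertAt(day1, len(day1)/2, []byte("NEW ROWS"))

	t1, u1 := s.backup(day1)
	t2, u2 := s.backup(day2)
	fmt.Printf("day 1: %d chunks, %d uploaded\n", t1, u1)
	fmt.Printf("day 2: %d chunks, %d uploaded\n", t2, u2)
}
```

On the second run only the chunks that overlap the inserted bytes are new; everything before and after the edit hashes to chunks the store already holds, while the day 2 snapshot still lists every chunk it needs for a full restore.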
Verify & recover
- List snapshots / inspect / restore: see Upload Backup Data and Manage Snapshots.
- Run a recovery drill on a schedule (e.g. monthly): restore the latest snapshot into a scratch directory and load it into a throwaway database, following the restore steps in your database tutorial.
Hitting errors?
A few you may run into the first time you wire this up:
- 413 Storage limit exceeded (even if the workspace shows free space): prune old snapshots (LH_KEEP_LATEST), or point the job at a fresh workspace (LH_WORKSPACE_ID), which starts at 0 bytes used.
- command 'go' not found in the timer/cron run: build the static binary and call it by absolute path.
Full list with fixes: Troubleshooting.
Operational checklist
- Test ./backup.sh by hand before scheduling it.
- Keep secrets in a 600-mode env file, never in the script or repo.
- Use a dedicated API key per job with only backup:write, backup:read, snapshots:read.
- Give the key a hard expiry and rotate it (see API Keys).
- Set LH_KEEP_LATEST to bound storage against your workspace limit.
- Alert on a non-zero exit code from backup.sh (the script uses set -e).
- Run a periodic recovery drill: an untested backup is not a backup.
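For the exit-code item, cron in particular gives you nothing beyond the log file, so something has to look at the exit status. One way to do that is a small wrapper that runs the job and raises an alert on failure. The sketch below assumes a hypothetical alert function that only writes to stderr; in practice you would replace it with your pager, email, or webhook call. runJob and the run-and-alert binary name are likewise made up for the example.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// runJob runs the command, streams its output, and returns its exit code.
// It returns -1 if the command could not be started or was killed by a signal.
func runJob(name string, args ...string) int {
	cmd := exec.Command(name, args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	err := cmd.Run()
	if err == nil {
		return 0
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode()
	}
	return -1
}

// alert is a placeholder: swap in whatever notification channel you use.
func alert(msg string) {
	fmt.Fprintln(os.Stderr, "ALERT:", msg)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: run-and-alert /opt/app/backup.sh")
		return
	}
	code := runJob(os.Args[1], os.Args[2:]...)
	if code != 0 {
		alert(fmt.Sprintf("backup job %q failed with exit code %d", os.Args[1], code))
		os.Exit(1)
	}
}
```

Point the crontab entry or ExecStart= at the wrapper instead of backup.sh; because the script uses set -e, any failed stage surfaces here as a non-zero code.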