Simple ClickHouse backup agent

Learn how to back up and restore ClickHouse data


In today’s data-driven world, managing and protecting your data is paramount. We have already learned how to back up PostgreSQL here. Now it’s time to discuss ClickHouse, an open-source columnar database management system renowned for its high performance and scalability.

I will explain a Dockerized cron job designed to facilitate the backup of ClickHouse databases, whether they live on a single host or in a clustered environment with shards and replicas.

The agent is based on the Alpine Docker image and uses the clickhouse-backup tool, which can run as a REST API service. Logrotate is included to manage the log files produced by the backup agent.

The agent does the following:

  • creates scheduled FULL or DIFF backups (POST to /backup/create)
  • checks “create backup” action status before every upload (GET to /backup/status)
  • uploads each backup to remote storage (POST to /backup/upload/)
  • checks and waits until upload operation finishes (GET to /backup/actions)
  • maintains a log file with API responses and errors
  • generates customized output to standard container logs
  • if a backup is not uploaded to remote storage, it’s marked as failed and will not be used as the last backup for subsequent DIFF backups
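
For illustration, the underlying REST calls look roughly like this (a sketch assuming the default port 7171, basic auth, and a placeholder backup name; the agent’s actual requests live in ./agent/clickhouse-backup.sh):

# create a local backup
curl -u "$API_USERNAME:$API_PASSWORD" -s -X POST \
    "http://localhost:7171/backup/create?name=my-full-backup"

# check the "create backup" status before uploading
curl -u "$API_USERNAME:$API_PASSWORD" -s "http://localhost:7171/backup/status"

# upload the finished backup to remote storage
curl -u "$API_USERNAME:$API_PASSWORD" -s -X POST \
    "http://localhost:7171/backup/upload/my-full-backup"

# wait until the upload operation finishes
curl -u "$API_USERNAME:$API_PASSWORD" -s "http://localhost:7171/backup/actions"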

Important: according to the clickhouse-backup official FAQ, “incremental backups calculate increment only during executing upload or create_remote command or similar REST API request”. In other words, DIFF and FULL local backups are actually the same (clickhouse-backup list local). Clickhouse-backup creates local backups first before uploading them to remote storage.

If you list remote backups (clickhouse-backup list remote), you will notice the distinction between the two backup types. This is why the agent only issues a warning when you attempt to create a DIFF backup for the first time without any prior FULL backup.
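
To see the difference yourself, compare the two listings (run wherever the clickhouse-backup binary is available):

# local backups: DIFF and FULL entries look identical here
clickhouse-backup list local

# remote backups: this is where the full/incremental distinction shows up
clickhouse-backup list remote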

Default settings:

  • DIFF backups: every hour on Sunday through Friday, plus every hour from 0 through 20 on Saturday
  • FULL backups: every Saturday at 8:30 PM
  • Logs are rotated and compressed weekly and kept for 14 rotations before being removed
  • Clickhouse-backup API basic authentication is enabled (rlAPIuser)
  • Clickhouse server authentication is enabled (rlbackup)
  • Remote storage is ftp with authentication enabled
  • Backups to keep local: 6
  • Backups to keep remote: 336
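
For reference, the schedule above corresponds to cron entries roughly like these (an illustrative sketch; the actual tasks live in ./agent/cronfile, and the script arguments are placeholders):

# DIFF backups: every hour Sunday through Friday, and hours 0-20 on Saturday
0 * * * 0-5 /backup/config/clickhouse-backup.sh create_diff_backup <API-address> <user> <password>
0 0-20 * * 6 /backup/config/clickhouse-backup.sh create_diff_backup <API-address> <user> <password>

# FULL backup: every Saturday at 8:30 PM
30 20 * * 6 /backup/config/clickhouse-backup.sh create_full_backup <API-address> <user> <password>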

Contents

  • docker-compose.yml – describes the environment for testing the agent locally.
    It contains the following services:
    – clickhouse server (clickhouse-server:23.8-alpine)
    – clickhouse-backup (altinity/clickhouse-backup:2.4.0)
    – our clickhouse-backup-agent (ch-backup-agent)
    – ftpd_server (stilliard/pure-ftpd)
  • ./clickhouse/clickhouse-backup-config.yml – clickhouse-backup config file (a minimal sketch follows this list)
  • ./agent/Dockerfile – backup agent’s docker image
  • ./agent/ch-backup-logrotate.conf – logrotate config file
  • ./agent/clickhouse-backup.sh – script to define backup and upload steps
  • ./agent/cronfile – cron job backup and logrotate tasks
  • .github/workflows/docker-image.yml – simple GitHub Action to build the agent’s docker image on every Dockerfile change
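
As a hedged illustration, a minimal clickhouse-backup config matching the defaults above might look like this (secrets and hosts are placeholders; consult the real ./clickhouse/clickhouse-backup-config.yml):

general:
  remote_storage: ftp
  backups_to_keep_local: 6
  backups_to_keep_remote: 336
clickhouse:
  host: <clickhouse-host>
  port: 9000
  username: rlbackup
  password: <clickhouse-password>
api:
  listen: 0.0.0.0:7171
  username: rlAPIuser
  password: <api-password>
ftp:
  address: <ftp-host>:21
  username: <ftp-user>
  password: <ftp-password>
  path: <remote-path>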

Possible use cases

  • As a resource for learning Docker, Docker Compose, Bash, cron, and logrotate.
  • As a source of scripts, cron job tasks, or Docker files.
    Just grab them and you are all set.
  • As a sample of pairing ClickHouse-Backup and ClickHouse Server.

How to use

  • check out logrotate and cron settings in the agent folder
  • verify the Dockerfile in the agent folder (if docker is being used)
  • adjust the clickhouse-backup settings if necessary (./clickhouse/clickhouse-backup-config.yml); at a minimum, change the credentials, the ClickHouse host, and the remote storage
  • the clickhouse-backup API container or standalone service should have access to the ClickHouse data folders (/var/lib/clickhouse/ by default) to create backups successfully. In the case of a container, see docker-compose.yml. If your clickhouse-backup API is a Linux service, run the service on the first replica of each shard, and then update the cronfile accordingly.
  • copy cron and script files to a remote host, and then make a test run
  • in the case of using Docker, check the docker-compose.yml file and remove any unnecessary services (such as clickhouse and ftp). Afterward, run docker-compose up -d --build to get the containers started (see the example after this list)
  • use docker logs or docker compose logs to check service logs.
    Log files are also located under the /var/log/clickhouse-backup/ folder
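
If you go the Docker route, a typical local test run might look like this (the service name comes from docker-compose.yml; the exact log file names inside the container may differ):

# build the agent image and start the stack
docker-compose up -d --build

# follow the agent's container logs
docker-compose logs -f ch-backup-agent

# or inspect the log files inside the agent's container
docker-compose exec ch-backup-agent ls /var/log/clickhouse-backup/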

Since the ch-backup-agent image already comes with the clickhouse-backup application pre-installed, you can run both the API and the scripts within the agent’s container. To achieve this, create a volume mapping for the ./clickhouse/clickhouse-backup-config.yml file and then start the clickhouse-backup API server as a service.
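
A minimal sketch, assuming the config is mounted at /backup/config/clickhouse-backup-config.yml (the path is an assumption; adjust it to your volume mapping):

# start the clickhouse-backup REST API inside the agent's container
clickhouse-backup -c /backup/config/clickhouse-backup-config.yml server &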

Some excerpts

Function to create a full backup:

function create_full_backup {

    # CREATE: trigger a new local backup via the REST API
    curl -u "$API_USERNAME:$API_PASSWORD" -s -X POST \
          http://"${SERVER_ADDRESS}":7171/backup/create?name="${FULL_BACKUP_NAME}" | jq . >> "$BACKUP_LOG"
    BACKUP_NAME="${FULL_BACKUP_NAME}"
    check_backup_status

    # UPLOAD: push the finished backup to remote storage, then wait for the result
    curl -u "$API_USERNAME:$API_PASSWORD" -s -X POST \
          http://"${SERVER_ADDRESS}":7171/backup/upload/"${BACKUP_NAME}" | jq . >> "$BACKUP_LOG"
    check_upload_status
}

Function to check the upload status:

function check_upload_status {

    # check_upload_req (defined elsewhere in the script) queries the API
    # for the status of the last upload action

    # Catch "in progress" status
    while [ "$(check_upload_req | jq -r .status)" == "in progress" ]; do
        echo -e "\n${GREEN}[INFO]${OFF} ${BOLD} Upload of the ${BACKUP_NAME} is still in progress...${OFF}"
        sleep 1m
    done

    # Catch "error" status
    if [ "$(check_upload_req | jq -r .status)" == "error" ]; then
        echo -e "${RED}::::[ERROR]${OFF} ${BOLD} Couldn't upload the backup ${BACKUP_NAME}:${OFF}"
        {
            printf '\n%s\n' "UPLOAD ERROR:"
            check_upload_req | jq -r .error
            printf '%s\n' "-------------"
        } | tee -a "$BACKUP_LOG"
        return 1

    # Catch "success" status
    elif [ "$(check_upload_req | jq -r .status)" == "success" ]; then
        echo -e "\n${GREEN}[INFO]${OFF} ${BOLD} The ${BACKUP_NAME} is now the last since it's just been uploaded successfully${OFF}"
        # record the backup name so subsequent DIFF backups can reference it
        echo -n "${BACKUP_NAME}" > "${LAST_BACKUP_NAME}"
    fi
}

Useful commands

While I have tested every command below, I strongly encourage you to learn about clickhouse-backup and its FAQ before proceeding with any restore operations. Your setup may have unique configurations and requirements!

Create a FULL backup manually

/backup/config/clickhouse-backup.sh create_full_backup clickhouse-backup-API-address APIUSER APIPASSWORD

If you have a cluster, repeat the command with each shard’s API address.

Create a DIFF backup manually

/backup/config/clickhouse-backup.sh create_diff_backup clickhouse-backup-API-address APIUSER APIPASSWORD

If you have a cluster, repeat the command with each shard’s API address.

Delete ALL REMOTE backups one-liner (if you have an old ClickHouse version):

for rembackup in $(clickhouse-backup list remote | tail -n +2 | awk '{print $1}'); do 
     clickhouse-backup delete remote "$rembackup"
done

Delete ALL REMOTE backups one-liner:

for rembackup in $(clickhouse-backup list remote | awk '{print $1}'); do
    clickhouse-backup delete remote "$rembackup"
done

Delete ALL LOCAL backups one-liner:

for localbackup in $(clickhouse-backup list local | awk '{print $1}'); do
    clickhouse-backup delete local "$localbackup"
done

Add tail -n +2 if your ClickHouse version is old (it skips the first line of the list output).

Restore a single table’s data:

clickhouse-backup restore <backup name> --data --table db.table_name_shard

If you’re getting “warn <table name> skipped cause system.replicas entry already exists and replication in progress from another replica logger=clickhouse”, try running the previous command with CLICKHOUSE_CHECK_REPLICAS_BEFORE_ATTACH=0.
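
For example, prefix the same restore command with the variable:

CLICKHOUSE_CHECK_REPLICAS_BEFORE_ATTACH=0 clickhouse-backup restore <backup name> --data --table db.table_name_shard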

If you need to restore the schema, use the --schema parameter and --rm to drop existing schema objects (clickhouse-backup restore --rm --schema <backup name>), or delete the replica metadata in ZooKeeper beforehand.

Result

Screenshots from a test run (captions):

  • ClickHouse Backup agent, first run – noticed the warning? It’s not critical, just FYI.
  • Errors and warnings – a sample output with error, warning, and info messages.
  • Log files – this is how the log file appears: all you need, errors and requests’ output.
  • DIFF backup create action – the DIFF backup was created successfully, so the last backup name was updated as well.
