Create diagram as code in Python

In the previous post, we explored my custom ClickHouse backup agent, built upon the clickhouse-backup tool, logrotate, Cron and Bash scripts. I have also shared all the necessary resources for testing the agent on your local machine using Docker as well as Docker Compose or deploying it in a production environment. Let’s update the agent’s repo with some Python code.

You may be familiar with a main GitOps principle: use Git as the single source of truth; store your applications and infrastructure configurations in a Git repository along with application code. Kubernetes (yaml), Terraform (tf), Docker, Compose files, Jenkinsfile and even diagrams can be good examples of files kept in such repositories. But how to represent diagrams? As png, vsd or jpeg? Let’s pretend we’re developers and can draw diagrams using code.

The diagrams project brings this approach to life. I opted for Diagrams (mingrammer) because it’s free and built on Python and Graphviz, widely used language and tool that enable you to create various diagrams, whether it’s a flowchart or a cloud architecture. Another advantage is that the project is actively maintained and continuously developed. You can also check out other tools such as pyflowchart, mermaid, plantuml or terrastruct.

Let’s get started and draw a flowchart for the clickhouse backup agent using Diagrams (mingrammer). First, install Python (>3.7; mine is 3.11) and Graphviz (9.0.0, Windows in my env), then install diagrams module (0.23.4).

Diagrams include the following objects: node (=shapes; programming, azure, custom and others), edge (=connection lines; linkage between nodes), cluster (=group of isolated nodes) and diagram (represents your entire chart). Each object has it’s own attributes. Description of all attributes can be found at Graphviz docs. Also, check out basic examples to understand what we gonna “build”. I won’t describe every attribute. DYOR.

The first line of your code might look like this:

# import required modules
from diagrams import Diagram, Edge, Cluster, Node

Then we define attributes for each object (excerpt):

# define attributes for graphviz components
graph_attributes = {
    "fontsize": "9",
    "orientation": "portrait",
    "splines":"spline"
}

Next, we need to describe diagram object and it’s attributes (excerpt):

with Diagram(show=False, outformat="png", graph_attr=graph_attributes, direction="TB"):
    # nodes and icons
    start_end_icon = "diagram/custom-images/start-end.png"
    start = Node (label="Start", image=start_end_icon, labelloc="c", height="0.4", weight="0.45", **node_attributes)

I use general Node class with custom images which were taken from programming nodes and then optimized to my flowchart (I’ve deleted canvas and resized images). You could safely use diagrams.programming.flowchart node class instead, but be ready to play with height/width node’s attributes. Another way to add your own images as nodes is Custom node class.

We have described icons and shared nodes. Now we need to add the first group of nodes to represent the main process of the agent and flowchart (creating and uploading FULL backups):

# cluster/full backup
    with Cluster("main", graph_attr=graph_attributes):
       diff_or_full = Node (label="TYPE?", image=decision_icon, height="0.7", weight="", labelloc="c", **node_attributes )

Subroutine processes (diff backups and etc.) are clusters (excerpt):

# cluster/diff backup
    with Cluster("diff", graph_attr=graph_attributes):
      create_diff_backup = Node (label="Create DIFF", labelloc="c", height="0.5", weight="4", image=action_icon, **node_attributes)

Edges or connections between nodes are defined at the bottom (excerpt):

# Log connections
    diff_or_full - Edge(label="\n\n wrong type", tailport="e", headport="n", **edge_attributes ) - write_error 

As a result, I’ve updated the repo with diagram as code; slightly modified GitHub actions by adding a new step to “draw” diagram and check python code. When I push new commits to the repo, the diagram is created and published as an artifact with nodes (start, end, condition, action, catch, input/output), four clusters (main, diff, log, upload log) and edges between nodes.

Looks pretty good, doesn’t it?

Simple ClickHouse backup agent

Learn how to backup and restore ClickHouse data

In today’s data-driven world, managing and protecting your data is paramount. We have already learned how to backup PostgreSQL here. And now it’s our turn to discuss ClickHouse, an open-source columnar database management system, that is renowned for it’s high performance and scalability.

I will explain a Dockerized cron job designed to facilitate the backup of ClickHouse databases, whether they are located on a single host or within a clustered environment with shards and replicas.

The agent is based on the Alpine Docker image and uses the clickhouse-backup tool, which has the ability to work as a REST API service. Logrotate has been added to manage the log files produced by the backup agent.

The agent does the following:

  • creates scheduled FULL or DIFF backups (POST to /backup/create)
  • checks “create backup” action status before every upload (GET to /backup/status)
  • uploads each backup to a remote storage (POST to /backup/upload/)
  • checks and waits until upload operation finishes (GET to /backup/actions)
  • manages log file with API responses and errors
  • generates customized output to standard container logs
  • if a backup is not uploaded to remote storage, it’s marked as failed and will not be used as the last backup for subsequent DIFF backups

Important: according to the clickhouse-backup official FAQ, “incremental backups calculate increment only during executing upload or create_remote command or similar REST API request”. In other words, DIFF and FULL local backups are actually the same (clickhouse-backup list local). Clickhouse-backup creates local backups first before uploading them to remote storage.

If you list remote backups using the command (clickhouse-backup list remote), you will notice the distinction between these two backup types. This is why the agent only issues a warning when you attempt to create a DIFF backup for the first time without having any prior FULL backups

Default settings:

  • DIFF backups: every hour from Monday through Friday and Sunday, plus every hour from 0 through 20 on Saturday
  • FULL backups: every Saturday at 8.30 PM
  • Rotate and compess logs weekly, rotated 14 times before being removed
  • Clickhouse-backup API basic authentication is enabled (rlAPIuser)
  • Clickhouse server authentication is enabled (rlbackup)
  • Remote storage is ftp with authentication enabled
  • Backups to keep local: 6
  • Backups to keep remote: 336

Content

  • docker-compose.yml – describes environment to test the agent locally.
    There are the following services:
    – clickhouse server (clickhouse-server:23.8-alpine)
    – clickhouse-backup (altinity/clickhouse-backup:2.4.0)
    – our clickhouse-backup-agent (ch-backup-agent)
    – ftpd_server (stilliard/pure-ftpd)
  • ./clickhouse/clickhouse-backup-config.yml – clickhouse-backup config file
  • ./agent/Dockerfile – backup agent’s docker image
  • ./agent/ch-backup-logrotate.conf – logrotate config file
  • ./agent/clickhouse-backup.sh – script to define backup and upload steps
  • ./agent/cronfile – cron job backup and logrotate tasks
  • ./github/workflows/docker-image.yml – simple GitHub action to build agent’s docker image on every Dockerfile change

Possible use cases

  • As a resource for learning Docker, Docker Compose, Bash, cron, and logrotate.
  • As a source of scripts, cron job tasks, or Docker files.
    Just grab them and you are all set.
  • As a sample of pairing ClickHouse-Backup and ClickHouse Server.

How to use

  • check out logrotate and cron settings in the agent folder
  • verify the Dockerfile in the agent folder (if docker is being used)
  • adjust clickhouse backup settings if necessary (./clickhouse/clickhouse-backup-config.yml) Change credentials, clickhouse host and remote storage at least
  • clickhouse-backup API container or standalone service shoud have access to /var/clickhouse/ folders to create backup successfully. In case of a container, see docker-compose.yml. If your clickhouse-backup API is a Linux service, run the service on the first replica for each shard, and then update cronfile accordingly.
  • copy cron and script files to a remote host, and then make a test run
  • in the case of using Docker, please check the docker-compose.yml file and remove any unnecessary services (such as clickhouse and ftp). Afterward, run docker-compose up -d –build to get containers started
  • use docker logs or docker compose logs to check service logs.
    Log files are also located under the /var/log/clickhouse-backup/ folder

Since the ch-backup-agent image already comes with the clickhouse-backup application pre-installed, you have the option to run both the API and scripts within the agent’s container. To achieve this, you should create a volume mapping for the ./clickhouse/clickhouse-backup-config.yml file and then start the clickhouse-backup API server as a service.

Some excerpts

Function to create full backup:

function create_full_backup {

    # CREATE
    curl -u "$API_USERNAME:$API_PASSWORD" -s -X POST \
          http://"${SERVER_ADDRESS}":7171/backup/create?name="${FULL_BACKUP_NAME}" | jq . >> "$BACKUP_LOG"
    BACKUP_NAME="${FULL_BACKUP_NAME}"
    check_backup_status

    # UPLOAD
    check_upload_status
}

Function to check upload status

function check_upload_status {

        # Catch In Progress status
        while [ "$(check_upload_req | jq -r .status )" == "in progress" ]; do
            echo -e "\n${GREEN}[INFO]${OFF} ${BOLD} Upload of the ${BACKUP_NAME} is still in progress...${OFF}"
            sleep 1m
        done

        # Catch Error status
        if [ "$(check_upload_req | jq -r .status )" == "error" ]; then
            echo -e "${RED}::::[ERROR]${OFF} ${BOLD} Couldn't upload the backup ${BACKUP_NAME}:${OFF}"
            {
            printf '\n%s\n' "UPLOAD ERROR:"
            check_upload_req | jq -r .error
            printf '%s\n' "-------------"
            } | tee -a "$BACKUP_LOG"
            return 1

        # Catch Success status
        elif [ "$(check_upload_req | jq -r .status)" == "success" ]; then
            echo -e "\n${GREEN}[INFO]${OFF} ${BOLD} The ${BACKUP_NAME} is now the last since it's just been uploaded successfully${OFF}"
            touch "${LAST_BACKUP_NAME}"
            echo -n "${BACKUP_NAME}" > "${LAST_BACKUP_NAME}"
        fi
}

Useful commands

While I have tested every command below, I strongly encourage you to learn about clickhouse-backup and its FAQ before proceeding with any restore operations. Your setup may have unique configurations and requirements!

Create a FULL backup manually

/backup/config/clickhouse-backup.sh create_full_backup clickhouse-backup-API-address APIUSER APIPASSWORD

In case you have a cluster, repeat the command with different API address

Create a DIFF backup manually

/backup/config/clickhouse-backup.sh create_diff_backup clickhouse-backup-API-address APIUSER APIPASSWORD

In case you have a cluster, repeat the command with different API address

Delete ALL REMOTE backups one-liner (if you have old ClickHouse version):

for rembackup in $(clickhouse-backup list remote | tail -n +2 | awk '{print $1}'); do 
     clickhouse-backup delete remote "$rembackup"
done

Delete ALL REMOTE backups one-liner:

for rembackup in $(clickhouse-backup list remote| awk '{print $1}'); do
    clickhouse-backup delete remote "$rembackup"
done

Delete ALL LOCAL backups one-liner:

for localbackup in $(clickhouse-backup list local | awk '{print $1}'); do
    clickhouse-backup delete local "$localbackup"
done

add tail -n +2 if your ClickHouse has old version

Restore a single table data:

clickhouse-backup restore <backup name> --data --table db.table_name_shard

If you’re getting “warn <table name> skipped cause system.replicas entry already exists and replication in progress from another replica logger=clickhouse”, try to run the previous command with CLICKHOUSE_CHECK_REPLICAS_BEFORE_ATTACH=0

If you need to restore schema, use –schema parameter and –rm to drop existing schema objects: clickhouse-backup restore –rm<backup name> –schema or delete replica metadata in Zookeeper beforehand.

Result

ClickHouse Backup agent First Run
Noticed the warning? It’s not critical. Just FYI
Clickhouse Backup agent, Errors and warnings
A sample output with error, warning and info messages
ClickHouse Backup Agent Log Files
This is how the log file appears. All you need – errors and requests’ output
ClickHouse Backup agent DIFF backup create action
DIFF backup created successfully, therefore the last backup name had been updated as well