Secure CI/CD pipelines with b2sum and CLI

Ensuring file integrity is crucial in software development, especially when automating deployments
through CI/CD pipelines. One powerful tool for this purpose is b2sum, a hashing utility based on
the BLAKE2 algorithm. Let's explore how you can leverage b2sum in your command-line workflows to
enhance security and reliability.
Introduction to b2sum
b2sum is a command-line utility implementing the BLAKE2 hashing algorithm. Compared to traditional
hashing algorithms like MD5 or SHA-1, BLAKE2 offers significant advantages:
- Speed: Faster than MD5, SHA-1, SHA-2, and SHA-3 on 64-bit platforms.
- Security: Provides security similar to SHA-3, including immunity to length extension attacks and indifferentiability from a random oracle.
- Versatility: Supports both 256-bit (BLAKE2s) and 512-bit (BLAKE2b) variants.
BLAKE2 variants
BLAKE2 comes in two main variants, each optimized for different use cases:
- BLAKE2b: Optimized for 64-bit platforms, producing hash values up to 512 bits. This is the
variant implemented by the b2sum utility found in GNU coreutils, making it ideal for modern server
environments and CI/CD pipelines.
- BLAKE2s: Optimized for 32-bit platforms and constrained environments, producing hash values up to 256 bits.
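As a small illustration of these digest sizes, GNU b2sum prints a 512-bit BLAKE2b digest by default, and its -l (--length) option requests a shorter digest in multiples of 8 bits. The input string below is arbitrary, and the variable names are only for this sketch:

```shell
# Default: 512-bit BLAKE2b digest, printed as 128 hex characters
full=$(printf 'hello' | b2sum | cut -d' ' -f1)
echo "${#full} hex characters"

# 256-bit digest requested with -l (this is still BLAKE2b, not BLAKE2s)
short=$(printf 'hello' | b2sum -l 256 | cut -d' ' -f1)
echo "${#short} hex characters"
```

A shorter digest produced this way is its own hash value, not a prefix of the 512-bit one, so generate and verify with the same length.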
Installation
b2sum ships with GNU coreutils (version 8.26 and later), which comes pre-installed on most modern
Linux distributions. To check that it is available and see its version:
b2sum --version
If b2sum is not installed, you can typically install it as part of the coreutils package:
- Ubuntu/Debian:
  sudo apt-get update && sudo apt-get install coreutils
- CentOS/RHEL:
  sudo yum install coreutils
- macOS (using Homebrew):
  brew install coreutils
Note: On macOS, coreutils commands are often prefixed with g (e.g., gb2sum) to avoid conflicts with
native BSD utilities. You might need to adjust your PATH or use the prefixed command.
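One way to cope with the two possible command names in a script that runs on both Linux and macOS is to detect the binary once and reuse it. This is a minimal sketch; the B2SUM variable name is an assumption made for illustration:

```shell
# Pick whichever BLAKE2 binary is on PATH: b2sum (Linux) or gb2sum (Homebrew coreutils)
if command -v b2sum >/dev/null 2>&1; then
  B2SUM=b2sum
elif command -v gb2sum >/dev/null 2>&1; then
  B2SUM=gb2sum
else
  echo "Neither b2sum nor gb2sum found; install coreutils first." >&2
  exit 1
fi

# Confirm which implementation and version was selected
"$B2SUM" --version | head -n 1
```

Later commands can then call "$B2SUM" instead of hard-coding either name.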
Integrating b2sum into CI/CD pipelines
Integrating b2sum into your CI/CD pipeline involves generating hashes for build artifacts and
verifying them at deployment time. Here's a practical approach:
Step 1: Generate hashes
After building your artifacts (e.g., a compiled binary, a zipped archive), generate a BLAKE2b hash and save it to a file:
# Example: generate hash for my-app.tar.gz
b2sum my-app.tar.gz > my-app.tar.gz.b2
This command calculates the BLAKE2b hash of my-app.tar.gz and redirects the output (the hash and
the filename) to my-app.tar.gz.b2. Store this .b2 file securely alongside your artifact, perhaps
in an artifact repository or secure storage.
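When a build produces several artifacts, the same command can write one manifest that covers all of them. The sketch below uses a scratch directory with stand-in files; the dist directory, the file names, and the B2SUMS manifest name are assumptions for illustration:

```shell
# Work in a scratch directory so the example does not touch real files
workdir=$(mktemp -d)
cd "$workdir"

# Stand-ins for real build outputs (hypothetical file names)
mkdir -p dist
printf 'binary contents\n' > dist/my-app.tar.gz
printf 'docs contents\n' > dist/my-app-docs.zip

# Hash from inside dist/ so the manifest records bare file names, one line per artifact
(cd dist && b2sum -- *.tar.gz *.zip > B2SUMS)

cat dist/B2SUMS
```

Because the manifest stores names relative to dist/, it has to be checked from that same directory later, for example with (cd dist && b2sum -c B2SUMS).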
Step 2: Verify hashes
Before deploying or using the artifact, verify its integrity using the generated hash file:
# Example: verify the integrity of my-app.tar.gz using its hash file
b2sum -c my-app.tar.gz.b2
The -c (or --check) flag tells b2sum to read hash sums from the specified file and check them.
If the file my-app.tar.gz matches the hash stored in my-app.tar.gz.b2, b2sum will output:
my-app.tar.gz: OK
If the file has been tampered with, is corrupted, or is missing, b2sum will report an error and
exit with a non-zero status code, which can be used to halt the CI/CD pipeline.
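To see the exit status change, the following sketch hashes a stand-in file in a scratch directory, verifies it, and then simulates tampering by appending a single byte. The file contents and variable names are placeholders:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for a real archive
printf 'release build\n' > my-app.tar.gz
b2sum my-app.tar.gz > my-app.tar.gz.b2

# Untouched file: the check succeeds, so the status stays 0
ok_status=0
b2sum -c my-app.tar.gz.b2 || ok_status=$?

# Simulate tampering by appending one byte, then check again
printf 'x' >> my-app.tar.gz
tampered_status=0
b2sum -c my-app.tar.gz.b2 || tampered_status=$?

echo "clean=$ok_status tampered=$tampered_status"
```

The second check reports the file as FAILED and returns a non-zero status, which is the signal a pipeline step needs in order to stop.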
CI/CD integration example
Here's a practical example using GitHub Actions to integrate b2sum into your workflow:
name: Verify Build Artifacts
on: [push, pull_request]
jobs:
  build_and_verify:
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v4 # Use the latest major version
      - name: Set up environment
        # Add steps to set up your build environment (e.g., install Node.js, Java, etc.)
        run: echo "Setting up build environment..."
      - name: Build artifact
        run: |
          echo "Building application..."
          # Replace with your actual build commands
          mkdir -p dist
          echo "Build output" > dist/app.txt
          # Create an archive of the build output
          tar -czf artifact.tar.gz ./dist
      - name: Generate hash
        id: generate_hash # Give the step an ID to reference its output
        run: |
          b2sum artifact.tar.gz > artifact.tar.gz.b2
          echo "Generated hash file artifact.tar.gz.b2"
          # Optionally, output the hash value itself for logging
          HASH_VALUE=$(cut -d' ' -f1 artifact.tar.gz.b2)
          echo "hash_value=$HASH_VALUE" >> "$GITHUB_OUTPUT"
      - name: Verify hash
        run: |
          echo "Verifying hash for artifact.tar.gz..."
          b2sum -c artifact.tar.gz.b2
          echo "Verification successful!"
      - name: Upload artifact and hash
        uses: actions/upload-artifact@v4 # Use the latest major version
        with:
          name: build-artifact
          path: |
            artifact.tar.gz
            artifact.tar.gz.b2
          retention-days: 7 # Optional: Adjust artifact retention period
This workflow:
- Checks out the source code.
- Sets up the build environment (placeholder).
- Builds the application and creates a tar.gz archive.
- Generates a BLAKE2 hash of the archive and saves it to a .b2 file. It also outputs the hash value itself.
- Verifies the hash immediately to ensure the artifact wasn't corrupted during the process.
- Uploads both the artifact (artifact.tar.gz) and its hash file (artifact.tar.gz.b2) for later use in deployment stages.
Automating integrity checks
For more complex scenarios or reuse across different pipelines, you can create a robust shell script to handle integrity verification:
#!/bin/bash
# verify-integrity.sh: checks the integrity of a file using its corresponding .b2 hash file.
set -euo pipefail # Exit on error, undefined variable, or pipe failure
# Default to an empty string so the usage check below still runs under `set -u`
ARTIFACT_PATH="${1:-}"
# Check if artifact path is provided
if [ -z "$ARTIFACT_PATH" ]; then
  echo "Usage: $0 <path/to/artifact>" >&2
  exit 1
fi
HASH_FILE="${ARTIFACT_PATH}.b2"
# Check if artifact file exists
if [ ! -f "$ARTIFACT_PATH" ]; then
  echo "Error: Artifact file '$ARTIFACT_PATH' not found." >&2
  exit 1
fi
# Check if hash file exists
if [ ! -f "$HASH_FILE" ]; then
  echo "Error: Hash file '$HASH_FILE' not found." >&2
  exit 1
fi
echo "Verifying integrity of '$ARTIFACT_PATH' using '$HASH_FILE'..."
# The hash file records the file name used at hashing time (assumed here to be the
# bare file name), so run the check from the artifact's own directory
ARTIFACT_DIR=$(dirname -- "$ARTIFACT_PATH")
if (cd -- "$ARTIFACT_DIR" && b2sum --quiet -c "$(basename -- "$HASH_FILE")"); then
  echo "Integrity check PASSED for '$ARTIFACT_PATH'."
  exit 0
else
  echo "Integrity check FAILED for '$ARTIFACT_PATH'!" >&2
  # b2sum already prints detailed error messages when check fails
  exit 1
fi
Save this as verify-integrity.sh, make it executable with chmod +x verify-integrity.sh, and use
it in your pipeline scripts:
# Example usage in a CI/CD script after downloading the artifact and hash file
./verify-integrity.sh path/to/downloaded/my-app.tar.gz
# The script will exit with 0 on success, non-zero on failure
Error handling best practices
When implementing hash verification in your CI/CD pipeline, consider these error handling best practices:
- Fail Fast: Configure your pipeline to stop immediately if a hash verification fails. This prevents potentially corrupted or tampered artifacts from being deployed.
- Detailed Logging: Log the verification attempt, the expected hash (if easily available), and
the outcome. b2sum -c provides useful error messages on failure; ensure these are captured in your CI/CD logs.
- Notification System: Integrate with your notification system (e.g., Slack, email) to alert the relevant team immediately when an integrity check fails.
- Secure Hash Storage: Ensure the .b2 hash files are stored securely and cannot be easily tampered with. Storing them alongside artifacts in a repository that tracks versions or has immutability features is a good practice. Consider signing the hash files if extra security is needed.
- Handle Missing Files: Ensure your scripts gracefully handle cases where either the artifact or the hash file is missing, providing clear error messages.
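A few of these practices can be combined in one small wrapper. The sketch below is only an outline under stated assumptions: notify_failure is a hypothetical hook to replace with a real Slack or email call, and verify.log and the stand-in artifact are placeholder names:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in artifact and its hash file
printf 'release build\n' > artifact.tar.gz
b2sum artifact.tar.gz > artifact.tar.gz.b2

# Hypothetical notification hook: replace the body with a call to your alerting system
notify_failure() {
  echo "ALERT: integrity check failed for $1" >&2
}

# Logs the expected hash and b2sum's own messages, alerts on failure
verify_or_alert() {
  echo "Expected: $(cut -d' ' -f1 "$1")" >> verify.log
  if b2sum -c "$1" >> verify.log 2>&1; then
    return 0
  fi
  notify_failure "$1"
  return 1
}

# Fail fast: only continue when verification succeeded
verify_or_alert artifact.tar.gz.b2 && echo "Safe to deploy"
```

In a real pipeline step the last line would more likely read verify_or_alert artifact.tar.gz.b2 || exit 1, so that a failed check stops the job.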
Troubleshooting common issues
- Hash Mismatch: This is the primary failure mode, indicating the file content has changed since
the hash was generated. Causes include:
  - File corruption during transfer or storage.
  - Intentional or unintentional modification of the artifact after hashing.
  - Generating the hash on a different version of the file than the one being checked.
  - Solution: Re-download or retrieve the original artifact and hash file. If the issue persists, investigate potential corruption sources or rebuild the artifact from source.
- b2sum: command not found: The b2sum utility is not installed or not in the system's PATH within
the CI/CD execution environment.
  - Solution: Ensure the coreutils package (or equivalent) is installed in your CI/CD runner environment (e.g., Docker image, VM). See the Installation section.
- Permission Issues: The CI/CD process might lack the necessary file system permissions to read
the artifact or the hash file.
  - Solution: Verify the permissions and ownership of the files and the execution context of the b2sum command.
- Line Ending Differences: For text files, differences in line endings (CRLF vs. LF) between the
environment where the hash was generated and where it's verified can cause mismatches.
  - Solution: Ensure consistent line endings, often by configuring Git (core.autocrlf) or build tools appropriately. Hashing binary archives (like .zip or .tar.gz) usually avoids this issue.
- Incorrect Hash File Format: The .b2 file should contain the hash output exactly as produced by
b2sum (hash followed by filename). Manual editing can corrupt this format.
  - Solution: Regenerate the hash file using the standard b2sum artifact > artifact.b2 command.
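When a check fails for no obvious reason, a few quick diagnostics on the hash file itself can narrow things down. This sketch assumes the default output format with a full 512-bit digest and an ordinary file name; the artifact is a stand-in created in a scratch directory:

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf 'release build\n' > artifact.tar.gz
b2sum artifact.tar.gz > artifact.tar.gz.b2

# A well-formed default line: 128 hex characters, a space, a mode character
# (space for text, * for binary), then the file name
if grep -Eq '^[0-9a-f]{128} [ *].+$' artifact.tar.gz.b2; then
  echo "Hash file format looks valid"
fi

# Carriage returns usually mean the hash file went through a CRLF conversion
cr=$(printf '\r')
if grep -q "$cr" artifact.tar.gz.b2; then
  echo "Hash file contains CRLF line endings" >&2
fi

# --warn reports malformed lines; --strict turns them into a non-zero exit status
b2sum -c --warn --strict artifact.tar.gz.b2
```

If the format test fails or carriage returns show up, regenerating the hash file is usually quicker than repairing it by hand.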
Integration with Transloadit
Transloadit's /file/hash Robot supports multiple hashing algorithms, including BLAKE2b (b2). You
can integrate hashing directly into your file processing workflows. Here's an example Assembly
using BLAKE2:
{
  "steps": {
    ":original": {
      "robot": "/upload/handle"
    },
    "hashed": {
      "use": ":original",
      "robot": "/file/hash",
      "algorithm": "b2"
    }
  }
}
After the Assembly executes, the BLAKE2b hash value for the processed file will be available in
the Assembly result JSON, typically under the results object for the hashed Step, within the
file.meta.hash field. This allows you to generate hashes as part of automated upload and
processing pipelines.
Conclusion and best practices
Using b2sum in your CI/CD pipeline significantly enhances security and reliability by providing
strong guarantees about file integrity. By generating and verifying BLAKE2 hashes, you can detect
accidental corruption or malicious tampering of your build artifacts before they reach production.
Key best practices include:
- Generate hashes immediately after artifact creation.
- Store hash files securely alongside or linked to their corresponding artifacts.
- Automate hash verification as a mandatory step before deployment or artifact consumption.
- Use BLAKE2b (b2sum) for its speed and security advantages on modern systems.
- Implement robust error handling and notifications for verification failures.
- Consider using signed hashes or checksums stored in secure manifests for critical applications.
By implementing these practices, you build a more trustworthy and secure deployment pipeline.
At Transloadit, we leverage robust hashing algorithms like BLAKE2 within our Robot ecosystem as part of our file processing services, helping ensure file integrity for our users.