
In the last part, we explored how to configure the S3 interface and set up authentication in Ceph. With credentials in hand, it’s time to connect our legacy backup servers. We are going with Kopia for this!
This is where most administrators hit their first real-world challenge. The storage is ready, the S3 endpoint is up, but our backup tools may be older than the idea of object storage itself. Bacula, Amanda, Duplicity, or even rsync-based scripts were not designed with S3 in mind. Yet, replacing them overnight isn’t realistic. That’s why bridging these tools to Ceph S3 is so important.
If you missed the earlier parts, check out:
- Getting servers Ready for Ceph backups
- Setting up credentials for Ceph
Why Old Servers Make Perfect Ceph Candidates
Before diving into the deployment, let’s talk about why your aging hardware is actually ideal for this project.
Ceph is designed to handle hardware failures gracefully. Those older servers that might fail more frequently? Ceph expects that and plans for it. Plus, you’ll learn more about distributed systems when you occasionally need to handle node failures.
Think of it this way: Instead of buying expensive new hardware to learn Ceph, you’re getting hands-on experience with real-world scenarios where hardware isn’t perfect.
Step 1: Connect Kopia to Ceph S3
Now that Ceph S3 is ready with authentication, we’ll use Kopia as our backup client. Kopia is a modern, open source backup solution that supports snapshots, encryption, deduplication, and most importantly for us, direct S3 storage backends. Unlike many legacy tools, Kopia can talk to custom S3 endpoints without extra proxies.
Duplicity, for instance, accepts an environment variable like AWS_ENDPOINT_URL, but for ease, clarity, and direction with the deployment, we will guide you through the Kopia setup instead.
Kopia is a lightweight, fast backup tool built around deduplication, compression, and encryption. Ceph is a popular distributed object store that can scale horizontally with your storage needs. Together, they handle snapshots, backup routines, and restore operations, whether you're protecting office documents or container volumes.
Create the Repository on Ceph S3
On your backup server, install Kopia (binaries are available for Linux). Let's start with our case: we used Ubuntu, so our installation used the following steps:
# Add Kopia APT repository
curl -s https://kopia.io/signing-key | sudo gpg --dearmor -o /usr/share/keyrings/kopia-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/kopia-keyring.gpg] https://packages.kopia.io/apt/ stable main" | sudo tee /etc/apt/sources.list.d/kopia.list

# Update and install
sudo apt update
sudo apt install kopia
For RHEL-based systems things are not much different; here is the list of commands:
# Add Kopia YUM repository
sudo rpm --import https://kopia.io/signing-key
cat <<EOF | sudo tee /etc/yum.repos.d/kopia.repo
[kopia]
name=Kopia Repository
baseurl=https://packages.kopia.io/yum/
enabled=1
gpgcheck=1
gpgkey=https://kopia.io/signing-key
EOF

# Install
sudo yum install kopia
Once installed, you can initialize a repository directly into a Ceph S3 bucket:
kopia repository create s3 \
  --bucket=legacy-backups \
  --access-key=YOUR_ACCESS_KEY \
  --secret-access-key=YOUR_SECRET_KEY \
  --endpoint=rgw.yourdomain.com \
  --region=us-east-1
- bucket: The bucket name you created in Ceph (for example, legacy-backups).
- access-key / secret-access-key: The credentials you generated earlier with radosgw-admin.
- endpoint: The Ceph RGW host. Note that Kopia expects host[:port] without the https:// scheme; it connects over TLS by default.
- region: Kopia requires a region name even if Ceph doesn't enforce one. You can safely use us-east-1 or any placeholder.
Once that is done, the major part of the setup is complete, and we can test the connection using:
kopia repository connect s3 \
  --bucket=legacy-backups \
  --access-key=YOUR_ACCESS_KEY \
  --secret-access-key=YOUR_SECRET_KEY \
  --endpoint=rgw.yourdomain.com \
  --region=us-east-1
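Both create and connect prompt for a repository password, which encrypts everything Kopia stores; by default Kopia then caches the credentials locally, so later commands don't prompt again. For non-interactive runs, the password can also be supplied through Kopia's KOPIA_PASSWORD environment variable (store it somewhere safer than your shell history):

# Supply the repository password for scripted runs
export KOPIA_PASSWORD='use-a-strong-passphrase'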
With that, you are all set with your backup repository.

Now that the setup is ready, let's create our first snapshot!
kopia snapshot create /etc
This creates a backup of your /etc directory and uploads it to the S3 bucket.
You can see the list of backups with:
kopia snapshot list
This is why we love Kopia: it works almost like the AWS CLI or an rsync setup, but in a more advanced way.
So you may ask, how easy is it to restore? It's just this:
kopia snapshot restore <snapshot-id> /tmp/restore-test
This restores the given snapshot ID to /tmp/restore-test. You can then copy the files out or replace your existing ones.
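A quick sanity check after a test restore is to diff the restored copy against the source; Kopia can also verify the repository's internal consistency:

# Compare the restored copy against the live directory
diff -r /etc /tmp/restore-test

# Ask Kopia to verify snapshot consistency
kopia snapshot verify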
Best Practice
Create a dedicated bucket per backup server. This makes it easier to manage policies, clean up old data, and isolate access. Kopia repositories aren’t designed to be casually shared between servers, so separation avoids accidental corruption.
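For example, assuming the aws CLI is pointed at your RGW endpoint with the credentials from the previous part, creating per-server buckets is one call each (the bucket names here are placeholders):

# One bucket per backup server
aws --endpoint-url https://rgw.yourdomain.com \
  s3api create-bucket --bucket backups-server1

aws --endpoint-url https://rgw.yourdomain.com \
  s3api create-bucket --bucket backups-server2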

Automating Backups with Kopia
Manually running kopia snapshot create is fine for testing, but real-world backups need to run automatically and consistently. Kopia doesn't run as a background daemon by default, so we rely on system schedulers like cron or systemd timers to handle automation.
Option 1: Using Cron
Cron is simple and works across most Linux distributions. For example, to snapshot /home every night at 2 AM, edit the crontab of the backup user:
0 2 * * * kopia snapshot create /home >> /var/log/kopia-backup.log 2>&1
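Cron runs with a minimal environment, so absolute paths are safer, and an optional wrapper script makes multi-path backups and failure handling easier. A sketch (the script name and paths are placeholders):

#!/bin/sh
# /usr/local/bin/kopia-nightly.sh - hypothetical cron wrapper
# set -eu aborts on the first error, so failures show up in the log
set -eu
/usr/bin/kopia snapshot create /home /etc

Then point the crontab entry at the wrapper instead:

0 2 * * * /usr/local/bin/kopia-nightly.sh >> /var/log/kopia-backup.log 2>&1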
Option 2: Using Systemd Timers
If you prefer systemd's built-in scheduling, create a service file at /etc/systemd/system/kopia-backup.service and add:
[Unit]
Description=Kopia Backup Job

[Service]
# oneshot suits timer-triggered jobs that run to completion
Type=oneshot
ExecStart=/usr/bin/kopia snapshot create /home
User=backup
Now that the service is set up, create the timer unit at /etc/systemd/system/kopia-backup.timer:
[Unit]
Description=Run Kopia Backup Daily

[Timer]
OnCalendar=02:00
Persistent=true

[Install]
WantedBy=timers.target
Once this is done, enable and start the timer:

sudo systemctl enable --now kopia-backup.timer

and check its status using:

systemctl list-timers | grep kopia
That should take care of the backups continually. Now you can move on to retention, which you can set using:
kopia policy set /home --keep-daily=7 --keep-weekly=4 --keep-monthly=6
This keeps 7 daily, 4 weekly, and 6 monthly backups. You can play with these numbers and adjust them to your needs.
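You can confirm the effective policy at any time with:

kopia policy show /home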
As with any backup software, always keep an eye on Kopia's log output.
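Where to look depends on how you scheduled the job:

# systemd timer setup: inspect the service's journal
journalctl -u kopia-backup.service --since "24 hours ago"

# cron setup: tail the log file the crontab entry redirects to
tail -n 50 /var/log/kopia-backup.log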
Tuning Ceph S3 for Kopia Workloads
At this stage, your backups are running, but performance and efficiency depend on how Ceph is tuned. Legacy-style file backups can stress S3 storage differently depending on file sizes, frequency, and concurrency. Kopia handles much of the optimization internally, but you should still configure Ceph S3 to match your workload.

Optimize Multipart Uploads
Large files (over 5 GB) must use multipart uploads. Kopia does this automatically, but make sure Ceph RGW has multipart uploads enabled and tested. If multipart upload limits are too strict, backups may stall or fail mid-transfer.
Adjust the rgw_max_chunk_size option in Ceph's RGW configuration if you consistently handle very large files.
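As a sketch, RGW options can be changed centrally with ceph config set; the value below is purely illustrative, not a recommendation, and the RGW daemons need a restart for it to take effect:

# Raise the RGW chunk size to 8 MiB (illustrative value)
ceph config set client.rgw rgw_max_chunk_size 8388608
# Then restart your RGW daemons (via ceph orch or systemctl, depending on deployment)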
Handle Small Files Efficiently
If your datasets contain thousands of tiny files (logs, configs, reports), Kopia deduplicates them, but Ceph can still be stressed with object overhead.
Options to help:
- Group files before backup (archive into tarballs for especially noisy directories).
- Enable compression in Kopia so smaller objects are packed efficiently (see the example below).
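The compression option is a one-line policy change; zstd is one of the codecs Kopia supports:

# Enable zstd compression for everything backed up to this repository
kopia policy set --global --compression=zstd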
Enable versioning and safe cleanup in the bucket
# enable versioning on the bucket
aws --endpoint-url https://rgw.yourdomain.com \
  s3api put-bucket-versioning \
  --bucket legacy-backups \
  --versioning-configuration Status=Enabled
Plan Your Bucket Layout
Buckets are logical boundaries in Ceph. The safest approach is one bucket per repository. This avoids mixing unrelated data and makes cleanup easier if you retire a server.
- One bucket per server.
- Shared buckets are possible, but always prefix paths (server1/, server2/) to avoid accidental overwrites; Kopia's --prefix option, shown below, takes care of this.
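If you do share a bucket, Kopia can scope a repository to a sub-path with --prefix. A sketch with placeholder names:

kopia repository create s3 \
  --bucket=shared-backups \
  --prefix=server1/ \
  --access-key=YOUR_ACCESS_KEY \
  --secret-access-key=YOUR_SECRET_KEY \
  --endpoint=rgw.yourdomain.com \
  --region=us-east-1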
Leverage Lifecycle Policies
Ceph supports lifecycle policies similar to AWS S3. This allows you to automatically move or delete old objects. Pair this with Kopia’s retention policies for a two-layer cleanup system:
- Kopia prunes old snapshots.
- Ceph lifecycle transitions pruned objects to cold storage pools or deletes them entirely.
This reduces storage costs and keeps buckets lean.
You can create a lifecycle.json with simple expiration and multipart cleanup rules; adjust the day counts to your policy. One caution: a blanket expiration rule deletes objects behind Kopia's back, which can corrupt a live repository, so reserve expiration for buckets holding exports or archives and let Kopia's own retention prune active repositories. The multipart-abort rule is always safe.
{ "Rules": [ { "ID": "expire-old-snapshots", "Status": "Enabled", "Filter": { "Prefix": "" }, "Expiration": { "Days": 90 } }, { "ID": "abort-incomplete-multipart", "Status": "Enabled", "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 7 } } ] }
Now apply it to the bucket:
aws --endpoint-url https://rgw.yourdomain.com \
  s3api put-bucket-lifecycle-configuration \
  --bucket legacy-backups \
  --lifecycle-configuration file://lifecycle.json
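You can read the configuration back to confirm it applied:

aws --endpoint-url https://rgw.yourdomain.com \
  s3api get-bucket-lifecycle-configuration \
  --bucket legacy-backups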
With Kopia now tied into Ceph S3, you've completed your first full cycle of modern backups: setup, repository creation, scheduling, and finally tuning for performance. This bridges the gap between old-style backup servers and modern object storage without needing to abandon legacy infrastructure.
Ceph S3 isn’t just a storage pool; it’s the backbone for a flexible backup ecosystem. Kopia makes it practical, efficient, and secure.
In the next part, we’ll explore advanced optimizations and scaling strategies: how to handle very large deployments, integrate multiple backup servers, and keep performance steady under heavy load.
FAQ
Q: Can I run multiple Kopia repositories in the same bucket?
Not recommended. Kopia repositories expect exclusive control. Use one bucket per repository for safety.
Q: Do I need to worry about multipart uploads with Kopia?
No, Kopia handles this transparently. Just ensure Ceph’s RGW is not restricted in upload chunk size.
Q: What happens if I lose the Kopia repository password?
Backups become unreadable. Always store the password in a secure vault or password manager.
Q: Why do my backups feel slow even though Ceph is healthy?
Often it’s due to many small files. Bundle them before backup, or review Kopia compression and parallelism settings.
Q: Is it safe to back up directly as root?
Not ideal. Use a dedicated non-root backup user and grant specific read permissions. Root backups can lead to accidental overwrites and unsafe access patterns.