Building a File Sync Daemon with a Real-Time Web Panel

April 30, 2026 · rclone · Python · Flask · systemd


I run a self-hosted media server backed by PikPak cloud storage. For a long time my workflow for pulling new content down to local disk was embarrassingly manual: SSH in, fire off an rclone copy, watch the terminal, repeat. It worked, but it didn’t scale — and it definitely didn’t survive a server reboot.

So I built auto_download_from_drive: a Linux daemon that watches rclone-mounted directories and automatically downloads new files, paired with a Flask web panel for configuration and real-time progress monitoring. This post walks through what the project does, how it’s designed, and how it came together over about three and a half weeks of focused work.


The Problem

rclone can mount remote cloud storage as a local FUSE filesystem. Once mounted, directories look like ordinary folders to any program on the host. What they don’t do on their own is automatically copy new files to local storage when they appear.

My requirements were narrow but specific:

  • One-way incremental sync — copy new files from the remote mount to a local path. Don’t touch files that were already there when the daemon started.
  • Survive crashes and reboots — persistent state, no re-downloading files that already landed.
  • Multiple source/destination pairs — one daemon instance covering several different remote directories.
  • Visible progress — I want to see which files are transferring, how fast, and how many are queued.
  • No manual SSH sessions — manage everything through a web UI.

That’s a precise enough scope that none of the existing tools (Syncthing, rsync cron jobs, rclone’s own sync mode) fit cleanly. Syncthing is bidirectional. rsync cron jobs don’t give you live progress or a web UI. rclone’s sync mode doesn’t have a built-in daemon. So: build it.


Architecture Overview

The system has two components that share no IPC beyond the filesystem:

[ Caddy (TLS reverse proxy) ]
        ↓ :5000
[ web-panel.service  (Gunicorn/gevent, 1 worker) ]
        ↓ reads JSON files
[ /opt/sync/*.json  (config, state, transfers, log) ]
        ↑ writes JSON files
[ sync.service  (sync_daemon.py) ]
        ↓ rclone copy --rc ...
[ rclone-pikpak.service  (FUSE mount) ]

[ Remote cloud storage (PikPak) ]

sync_daemon.py is the workhorse. It’s a multi-threaded Python process that runs as a systemd service. It scans source paths, queues new files, and runs rclone copy workers in parallel.

web_panel/ is a Flask + Flask-SocketIO application that reads the daemon’s JSON files and streams real-time progress to the browser. It’s strictly read-only except for config.json, and writing config triggers a daemon restart automatically.

The file-only coordination model is intentional. It keeps the two processes completely independent — you can restart the web panel without touching the daemon, and the daemon runs fine with no web panel at all.


The Daemon: sync_daemon.py

Thread Model

The daemon spins up three categories of threads:

  • Main loop: periodic directory scan every scan_interval_seconds
  • Refresh timer (inline): restarts the rclone mount service every rclone_refresh_interval_seconds
  • Worker pool (N): N = max_concurrent_downloads threads consuming the download queue

The main loop runs at 1-second resolution, checking whether it’s time to scan or refresh. Both actions are gated on a pause_event that gets set during mount refresh to prevent scans from running against a stale or offline FUSE mount.
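The tick-and-gate logic can be sketched roughly like this. The interval names mirror the config keys from the post; the Scheduler class and its structure are my own invention, not the daemon's actual code:

```python
import threading
import time

# Hypothetical sketch of the 1-second scheduler tick. Only the interval
# names and pause_event come from the post; everything else is assumed.
class Scheduler:
    def __init__(self, scan_interval_seconds: float, refresh_interval_seconds: float):
        self.scan_interval = scan_interval_seconds
        self.refresh_interval = refresh_interval_seconds
        self.pause_event = threading.Event()
        self.last_scan = self.last_refresh = 0.0

    def due_actions(self, now: float) -> list[str]:
        """Return which actions should fire on this 1-second tick."""
        if self.pause_event.is_set():
            return []          # mount refresh in progress: do nothing
        actions = []
        if now - self.last_scan >= self.scan_interval:
            actions.append("scan")
            self.last_scan = now
        if now - self.last_refresh >= self.refresh_interval:
            actions.append("refresh")
            self.last_refresh = now
        return actions

sched = Scheduler(scan_interval_seconds=300, refresh_interval_seconds=1800)
```

The real loop would call due_actions once a second and dispatch; the point is that both timers share one gate, so a paused daemon neither scans nor refreshes.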

File State Machine

Every discovered file lives in one of five states:

(first seen at startup) → baseline      ← skipped entirely, no download
(newly appeared later)  → pending
pending → success       → synced
pending → failure       → failed
failed  → retries hit   → permanent_failed

The baseline state is the key design choice. On first startup, the daemon snapshots whatever is already on the mount and marks all of it baseline. Only files that appear after that snapshot are ever downloaded. This prevents the daemon from backfilling terabytes of existing content on day one.

State is persisted in sync_state.json across restarts with a backup-rotate-replace pattern. If the primary file is corrupt, the daemon falls back to .json.bak before giving up.
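A minimal sketch of that load/save path, assuming the file names from the post but inventing the helper functions:

```python
import json
import shutil
from pathlib import Path

# Sketch of the backup-rotate-replace pattern; sync_state.json and the
# .bak fallback follow the post, the helpers themselves are assumed.
def load_state(path: Path) -> dict:
    """Load state, falling back to the .bak copy if the primary is corrupt."""
    for candidate in (path, path.with_suffix(path.suffix + ".bak")):
        try:
            return json.loads(candidate.read_text(encoding="utf-8"))
        except (FileNotFoundError, json.JSONDecodeError):
            continue
    return {}  # both copies unusable: start with an empty state

def save_state(path: Path, state: dict) -> None:
    """Rotate the current file to .bak, then atomically replace the primary."""
    if path.exists():
        shutil.copy2(path, path.with_suffix(path.suffix + ".bak"))  # last known good
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)  # atomic rename on POSIX
```

A half-written primary can never shadow the last good state: the .bak rotation happens before the new write, and the write itself goes through a temp file.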

rclone RC for Progress

Each download runs as:

rclone copy --rc --rc-addr=127.0.0.1:<port> <source> <dest>

The --rc flag enables rclone’s built-in HTTP remote control API on a dedicated port. While a transfer is active, the daemon writes its RC port to active_transfers.json. The web panel reads that file and polls each port every second to get transfer speed, progress percentage, and bytes transferred.
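The panel-side polling presumably boils down to a POST against rclone's documented core/stats RC method; the helper names and field selection here are my assumptions:

```python
import json
from urllib import request

# Sketch of polling one transfer's RC endpoint. core/stats is rclone's
# documented RC call; poll_rc_stats and summarize are assumed names.
def poll_rc_stats(port: int, timeout: float = 2.0) -> dict:
    """POST to rclone's core/stats RC method and return the stats dict."""
    req = request.Request(
        f"http://127.0.0.1:{port}/core/stats",
        data=b"{}",                                  # RC calls are POSTs with a JSON body
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)

def summarize(stats: dict) -> dict:
    """Pick out the fields a progress panel cares about."""
    return {
        "speed": stats.get("speed", 0),              # bytes per second
        "bytes": stats.get("bytes", 0),              # bytes transferred so far
        "transferring": len(stats.get("transferring", [])),
    }
```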

Port allocation is coordinated with a lock and a reserved-port set, scanning the range 5572–5582. If a port is already in use (detected via a connect probe), the daemon skips it and retries with the next available one.

Safe Mount Refresh

The FUSE mount can go stale or drop connections over time. The daemon handles this by periodically restarting the rclone systemd unit. But restarting the mount mid-transfer would kill active downloads, so the refresh logic is carefully gated:

  1. Check if any downloads are active or queued — if so, skip this refresh cycle entirely.
  2. Set pause_event to prevent new downloads from starting.
  3. Double-check the counters after pausing (race condition guard).
  4. Run systemctl restart <rclone_service_name>.
  5. Probe each enabled source path until ls returns successfully (up to 120 seconds).
  6. Clear pause_event to resume normal operation.

The probe uses a subprocess ls with a 5-second timeout rather than a direct os.listdir() call. This prevents the daemon from hanging indefinitely if the FUSE mount becomes unresponsive.
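A sketch of such a probe, assuming a helper name of my own choosing:

```python
import subprocess

# Probe a mount via a killable child process instead of os.listdir(),
# which can block uninterruptibly on a dead FUSE mount.
def mount_responsive(path: str, timeout: float = 5.0) -> bool:
    """True if `ls <path>` completes successfully within the timeout."""
    try:
        result = subprocess.run(
            ["ls", path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        return False   # mount is hung: the child is killed, the daemon survives
```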

Graceful Shutdown

On SIGTERM:

  1. Set stop_event.
  2. Drain the pending queue (drop items that haven’t started yet).
  3. Wait up to 300 seconds for in-progress downloads to finish.
  4. Join worker threads.
  5. Save state.

This means a systemctl stop sync.service won’t interrupt a file that’s 90% transferred — it’ll wait for it to complete before exiting.
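The shutdown sequence might look roughly like this; stop_event and the 300-second grace period come from the post, the rest is assumed:

```python
import queue
import signal
import threading
import time

# Hypothetical sketch of the SIGTERM path.
stop_event = threading.Event()
pending: "queue.Queue[str]" = queue.Queue()

def handle_sigterm(signum, frame):
    stop_event.set()

def shutdown(workers: list[threading.Thread], grace_seconds: float = 300.0) -> int:
    """Drain queued-but-unstarted items, then wait for in-flight downloads."""
    drained = 0
    while True:
        try:
            pending.get_nowait()        # drop items that never started
            drained += 1
        except queue.Empty:
            break
    deadline = time.monotonic() + grace_seconds
    for worker in workers:
        remaining = max(0.0, deadline - time.monotonic())
        worker.join(timeout=remaining)  # in-flight transfers get to finish
    return drained

# signal handlers can only be installed from the main thread
if threading.current_thread() is threading.main_thread():
    signal.signal(signal.SIGTERM, handle_sigterm)
```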

Thread-Safe Atomic Writes

Every JSON file write uses a write-to-temp-then-os.replace() pattern. The temp file uses a <pid>.<thread_id>.tmp suffix so that concurrent writes from different worker threads don’t collide:

def _write_json_atomic(self, path: Path, payload: object) -> None:
    tmp_path = path.with_suffix(f".{os.getpid()}.{threading.get_ident()}.tmp")
    with tmp_path.open("w", encoding="utf-8") as fp:
        json.dump(payload, fp, indent=2)
    os.replace(tmp_path, path)

os.replace() is atomic on POSIX systems — a reader will always see either the old complete file or the new complete file, never a partial write.


The Web Panel

The panel is a Flask + Flask-SocketIO application served by Gunicorn with a single gevent worker. It must be a single worker: Flask-SocketIO keeps WebSocket state in-process, so multiple workers without a shared message queue would split connections across processes and break the real-time streaming.

Authentication

The panel uses a layered auth model:

  1. API key exchange: POST /api/auth with X-API-Key header returns a session cookie.
  2. Session cookie: HttpOnly, SameSite=Strict, sliding TTL (default 30 minutes). Subsequent requests use the cookie, not the API key.
  3. CSRF tokens: All unsafe HTTP methods (POST, PUT, PATCH, DELETE) require a valid X-CSRF-Token header plus matching Origin or Referer.
  4. Socket.IO auth: WebSocket connections are rejected without a valid session.
  5. Brute-force protection: Per-IP failure counting with configurable window, failure limit, and lockout duration. All configurable through .env.
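The per-IP failure counting could be sketched like this; the window, limit, and lockout knobs mirror the description above, but the class and its methods are hypothetical:

```python
import time
from collections import defaultdict, deque

# Sketch of a per-IP sliding-window lockout.
class BruteForceGuard:
    def __init__(self, window_seconds=300, max_failures=5, lockout_seconds=900):
        self.window = window_seconds
        self.limit = max_failures
        self.lockout = lockout_seconds
        self.failures = defaultdict(deque)   # ip -> timestamps of recent failures
        self.locked_until = {}               # ip -> lockout expiry time

    def record_failure(self, ip, now=None):
        now = time.monotonic() if now is None else now
        q = self.failures[ip]
        q.append(now)
        while q and q[0] < now - self.window:   # expire failures outside the window
            q.popleft()
        if len(q) >= self.limit:
            self.locked_until[ip] = now + self.lockout

    def is_locked(self, ip, now=None):
        now = time.monotonic() if now is None else now
        return self.locked_until.get(ip, 0) > now
```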

If WEB_PANEL_API_KEY is unset at startup, the application raises a RuntimeError rather than silently degrading to an open panel.

Real-Time Progress

A single gevent background task runs for the lifetime of the process:

def background_progress_monitor():
    while True:
        transfers = read_active_transfers()
        progress = get_all_transfers_progress(transfers)
        socketio.emit('progress_update', progress)
        time.sleep(1)

It reads active_transfers.json, polls each rclone RC port, and broadcasts the result to all connected Socket.IO clients. The browser displays per-transfer speed, progress bars, and a live log tail.

Config Changes Trigger Daemon Restarts

POST /api/config writes config.json and then calls systemctl restart sync.service. This means you can change max_concurrent_downloads, add a sync rule, or adjust the bandwidth limit through the web UI, and the daemon picks it up immediately — no SSH required.
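The write-then-restart step presumably looks something like this; the config path and service name come from the post, the helper names are mine:

```python
import json
import os
import subprocess
from pathlib import Path

CONFIG_PATH = Path("/opt/sync/config.json")   # install location from the post

def write_config(new_config: dict, config_path: Path) -> None:
    """Write-to-temp-then-replace, same atomic pattern as the daemon uses."""
    tmp = config_path.with_suffix(f".{os.getpid()}.tmp")
    tmp.write_text(json.dumps(new_config, indent=2), encoding="utf-8")
    os.replace(tmp, config_path)

def restart_daemon() -> None:
    # The sudoers drop-in grants the panel user exactly this one command.
    subprocess.run(["sudo", "systemctl", "restart", "sync.service"], check=True)

def apply_config(new_config: dict, config_path: Path = CONFIG_PATH) -> None:
    write_config(new_config, config_path)
    restart_daemon()
```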


Installation

Production deployment is designed for a Debian/Ubuntu Linux host:

sudo ./start.sh

The installer:

  1. Creates /opt/sync/ and copies the project files.
  2. Creates a web-panel system user.
  3. Creates a Python virtualenv and installs dependencies.
  4. Generates a random WEB_PANEL_API_KEY and writes /opt/sync/web_panel/.env.
  5. Writes and enables sync.service and web-panel.service systemd units.
  6. Writes a sudoers drop-in so the panel user can run systemctl restart sync.service without a password.

For updating an existing installation without wiping config:

sudo ./update.sh

update.sh preserves config.json and .env across the update.

The panel binds to 127.0.0.1:5000 only. Expose it through Caddy or Nginx with TLS and IP allowlisting:

panel.example.com {
    @allowed remote_ip YOUR.IP.ADDRESS
    handle @allowed {
        reverse_proxy 127.0.0.1:5000
    }
    respond 403
}

How It Came Together

The entire project was built in roughly three and a half weeks, from late March to late April 2026.

Week 1: The Core Daemon (March 28)

The first commit landed on March 28 and was already 720 lines of sync_daemon.py — the baseline scanning logic, state machine, worker pool, and rclone invocation. Starting with a nearly complete daemon before touching the web layer was the right call. It forced me to nail down the data model early: what does sync_state.json look like? How are file keys structured? What does the state machine need to express?

Later that same day I added rclone RC integration for transfer tracking. This required adding the --rc --rc-addr flags to each rclone invocation and building the port allocation logic. The early version was simple — no port conflict handling, no retry on busy ports — but it established the active_transfers.json data contract that the future web panel would depend on.

Day 2: The Web Panel (March 29)

The web panel came next, built in a single day (March 29). The initial commit was 1,912 lines: app.py, rclone_monitor.py, requirements.txt, and the full index.html template with Tailwind CSS, Socket.IO client, and tabs for configuration, status, live progress, and logs.

The same day I hardened the RC port management significantly — adding the reserved-port set, conflict detection by probing stderr output, and retry loops for startup failures. This was prompted by discovering that rapid test restarts could leave ports in TIME_WAIT, causing the next rclone process to fail to bind.

Week 2: Hardening (March 30–31)

March 30 was logistics: moving start.sh from web_panel/ to the repo root so it covered both components, and improving rule management in the UI (preserving in-progress edits when appending new rules rather than clobbering the form state).

March 31 was a security sprint. I’d been running without authentication for local testing, and I wanted to make the panel safe to expose through a reverse proxy. In one day I shipped:

  • API key → session cookie auth for all /api/* routes
  • Socket.IO authentication — connections without a valid session are rejected at the handshake level
  • CSRF protection — X-CSRF-Token + Origin/Referer checks on all unsafe methods
  • Brute-force protection — per-IP rate limiting with a sliding window and lockout, all configurable through .env
  • Strict env var validation — the app fails fast at startup if WEB_PANEL_API_KEY or WEB_PANEL_SECRET_KEY are missing

I’d written a SECURITY_AUDIT.md earlier as a checklist of things to address. Once all the items were checked off, I deleted it. It had served its purpose.

Week 3: Reliability and Observability (April 1–22)

April 1 brought runtime status tracking — a runtime_status.json file that the daemon writes continuously with active_downloads, queued_downloads, pause_requested, and service_restart_allowed. The web panel reads this to gate the “restart service” button: it’s disabled whenever downloads are active. This was a UX fix for a real footgun I’d hit during testing.

April 12 fixed a subtle thread safety bug in write_runtime_status. Multiple download workers could call the function simultaneously, each writing to the same .tmp file, with the last writer winning — potentially corrupting the output. The fix was to use a per-call unique temp filename incorporating both the PID and the thread ID.

April 22 added an architecture diagram to the README.


What I’d Do Differently

The web UI is functional, not beautiful. The Tailwind-based design works but I spent almost no time on it. The tab-based layout made sense at 400 lines of template HTML; at 1,000+ lines it starts to show its age.

The RC port range is too small. 5572–5582 is 11 ports for max_concurrent_downloads workers plus retry headroom. It works fine at the default concurrency of 3, but bump the worker count and you’ll run out. A dynamic port allocation strategy (pick any free port from the OS) would be more robust.
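The OS-assigned alternative is only a few lines; this is a sketch, not the project's code:

```python
import socket

# Let the kernel pick a free ephemeral port instead of scanning a
# fixed 11-port range; rclone would then get --rc-addr=127.0.0.1:<port>.
def free_port() -> int:
    """Bind to port 0, let the OS choose, and return the chosen port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]
```

One caveat: the port is released before rclone binds it, so a small race remains; real code should retry on bind failure.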

No metrics. The daemon emits structured JSON log lines and writes status files, but there’s no Prometheus endpoint, no Grafana dashboard. For a homelab deployment this is fine; for anything shared it would be the first thing to add.

permanent_failed has no UI-level recovery path. Files in permanent_failed state are stuck there. The only way out is to edit sync_state.json by hand. A “reset to pending” button on the web panel would be a one-afternoon addition.
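That button's backend could be as small as this sketch, assuming a state-file layout of file-key to {"status": ...} entries (the real layout may differ):

```python
import json
from pathlib import Path

# Hypothetical "reset to pending" action; the sync_state.json structure
# shown here is an assumption, not the project's documented schema.
def reset_permanent_failures(state_path: Path) -> int:
    """Flip every permanent_failed entry back to pending; return the count."""
    state = json.loads(state_path.read_text(encoding="utf-8"))
    reset = 0
    for entry in state.get("files", {}).values():
        if entry.get("status") == "permanent_failed":
            entry["status"] = "pending"
            entry["retry_count"] = 0      # fresh retry budget
            reset += 1
    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return reset
```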


Using It

If you run a similar setup — rclone-mounted cloud storage, Linux server, need automated one-way sync — the project is open source and has a one-command installer. The configuration is a single JSON file:

{
  "scan_interval_seconds": 300,
  "rclone_refresh_interval_seconds": 1800,
  "max_concurrent_downloads": 3,
  "max_retry_count": 5,
  "bandwidth_limit_mbps": 0,
  "rclone_command": "rclone",
  "rclone_service_name": "rclone-pikpak",
  "rules": [
    {
      "source_path": "/mnt/pikpak/incoming",
      "dest_path": "/data/downloads",
      "enabled": true
    }
  ]
}

The web panel runs on port 5000. Put it behind Caddy or Nginx with TLS and an IP allowlist, set a strong WEB_PANEL_API_KEY, and you have a fully managed sync setup accessible from any browser.

Source: github.com/Z1rconium/auto_download_from_drive

Author: Shayne Wong
Published: April 29, 2026
Last modified: April 30, 2026
