Technology

Auditable storage automation.

ArgosFS is organized around a conservative split: a POSIX-facing FUSE layer, an erasure-coded data plane, a copy-on-write metadata plane, and a health-aware controller that is allowed to act only when safety invariants hold.

FUSE frontend · Permission model · Stripe transform · Placement scorer · Metadata journal · Autopilot

Designed as a filesystem and a research instrument.

The project is built so experiments can inspect the exact state used by each storage decision: disk health, capacity, placement weights, workload heat, metadata snapshots, journal events, and post-action verification results.

Architecture

Four layers keep the design understandable.

Each layer has a narrow responsibility, so bugs in POSIX behavior, data encoding, placement, or automation can be isolated without treating the system as a single opaque daemon.

Frontend

FUSE request handling

The mounted interface handles lookup, getattr, setattr, create, read, write, truncate, rename, link, symlink, readdir, statfs, fsync, xattrs, and special inode types.
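A minimal sketch of what a handler in this layer looks like, written against the fusepy package (an assumption; the frontend's actual FUSE binding is not specified here). Only two of the operations listed above are shown, backed by a toy in-memory table.

# Sketch of a FUSE handler's shape using the fusepy package.
# The class and its in-memory table are hypothetical stand-ins.
import errno
import stat
import time

from fuse import FUSE, FuseOSError, Operations

class SketchFS(Operations):
    def __init__(self):
        now = time.time()
        # One root directory entry; real frontends resolve inodes.
        self.files = {"/": dict(st_mode=stat.S_IFDIR | 0o755, st_nlink=2,
                                st_ctime=now, st_mtime=now, st_atime=now)}

    def getattr(self, path, fh=None):
        if path not in self.files:
            raise FuseOSError(errno.ENOENT)
        return self.files[path]

    def readdir(self, path, fh):
        return [".", ".."] + [p[1:] for p in self.files if p != "/"]

if __name__ == "__main__":
    # Mount point is a placeholder; requires a FUSE-capable host.
    FUSE(SketchFS(), "/mnt/sketch", foreground=True)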

Semantics

POSIX-facing metadata

Mode bits, ownership, timestamps, sticky directory checks, non-UTF-8 names, symlink targets, hard links, ACL xattrs, and root metadata are preserved through import/export paths.
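A sketch of the export side of that preservation on a Linux host, handling paths as bytes so non-UTF-8 names survive. The function is illustrative, not the ArgosFS export code.

# Sketch of copying POSIX metadata during export (Linux, stdlib only).
import os

def copy_metadata(src: bytes, dst: bytes) -> None:
    st = os.lstat(src)
    os.chown(dst, st.st_uid, st.st_gid, follow_symlinks=False)
    if not os.path.islink(dst):
        # chmod cannot skip symlink resolution on Linux, so guard it;
        # mode bits include sticky/setuid/setgid.
        os.chmod(dst, st.st_mode)
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns), follow_symlinks=False)
    for name in os.listxattr(src, follow_symlinks=False):
        value = os.getxattr(src, name, follow_symlinks=False)
        os.setxattr(dst, name, value, follow_symlinks=False)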

Data plane

Stripe lifecycle

File bytes are divided into raw stripes, transformed by compression/encryption when configured, encoded into data and parity shards, and stored as disk-local shard objects.

Control plane

Health autopilot

The controller observes SMART data, latency, capacity pressure, and workload state, then proposes bounded actions such as observe, scrub, drain, self-heal, and rebalance.

Write path

From one write to recoverable shards.

A write starts as a FUSE operation, passes metadata and permission checks, then becomes a sequence of stripes. Every shard placement is scored against capacity, tier, health, locality, and failure-domain constraints.

01

Validate request

Resolve the path or inode, check permissions, update inode timing fields, and preserve byte-oriented names and xattrs without lossy UTF-8 assumptions.

02

Transform stripe

Apply the configured compression mode, optionally encrypt with an authenticated cipher, and keep enough raw-size information to reconstruct the original byte range.

03

Encode shards

Produce k data shards and m parity shards so degraded reads can recover missing pieces when the redundancy threshold is still satisfied.

04

Commit metadata

Install new block references through a metadata transaction, update disk-usage accounting and journal state, and roll back the accounting if any step of the write fails.
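The four steps above condense to roughly this shape. It is a self-contained toy: the permission test, the zlib transform, and the single XOR parity shard are deliberate stand-ins, and the real mechanisms behind each step are sketched under Core mechanisms below.

# Toy end-to-end write path: validate, transform, encode, commit.
import zlib

def xor_parity(shards: list[bytes]) -> bytes:
    parity = bytearray(len(shards[0]))
    for shard in shards:
        for i, byte in enumerate(shard):
            parity[i] ^= byte
    return bytes(parity)

def write_path(fs: dict, key: str, data: bytes, mode: int) -> None:
    if not mode & 0o200:                              # 01: validate (toy check)
        raise PermissionError(key)
    fs["used"] += len(data)                           # reserve before mutation
    try:
        stripes = []
        for off in range(0, len(data), fs["stripe"]):
            raw = data[off:off + fs["stripe"]]
            blob = zlib.compress(raw)                 # 02: transform, keep raw size
            size = -(-len(blob) // fs["k"])
            shards = [blob[i * size:(i + 1) * size].ljust(size, b"\0")
                      for i in range(fs["k"])]
            shards.append(xor_parity(shards))         # 03: toy single parity (m = 1)
            stripes.append((off, len(raw), shards))
        fs["objects"][key] = stripes                  # 04: single commit point
    except Exception:
        fs["used"] -= len(data)                       # roll back accounting
        raise

# fs = {"used": 0, "stripe": 65536, "k": 4, "objects": {}}
# write_path(fs, "inode-7/0", b"payload" * 1000, 0o644)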

Core mechanisms

The important algorithms are explicit.

The website should not hide the project behind vague claims. These are the mechanisms that make ArgosFS different from a simple FUSE demo.

Redundancy

Reed-Solomon layouts

A 4+2 layout splits each stripe into four data shards and two parity shards; any four of the six are enough to read. Reads proceed in degraded mode while that threshold holds, and repair paths reconstruct the missing shards.
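A pure-Python sketch of systematic Reed-Solomon encoding over GF(256), using a Cauchy generator so that any k of the k+m shards can reconstruct the stripe (every square submatrix of a Cauchy matrix is invertible). This shows the technique, not the ArgosFS encoder; reconstruction, which inverts a k×k submatrix, is omitted for brevity.

# Systematic Cauchy Reed-Solomon encode over GF(256); requires k + m <= 256.
GF_EXP = [0] * 512
GF_LOG = [0] * 256
x = 1
for i in range(255):                    # tables for primitive polynomial 0x11d
    GF_EXP[i] = x
    GF_LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    GF_EXP[i] = GF_EXP[i - 255]

def gf_mul(a: int, b: int) -> int:
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]

def gf_inv(a: int) -> int:
    return GF_EXP[255 - GF_LOG[a]]

def cauchy_rows(k: int, m: int) -> list[list[int]]:
    # Parity rows 1/(x + y) with disjoint x, y sets; field addition is XOR.
    xs, ys = range(m), range(m, m + k)
    return [[gf_inv(x ^ y) for y in ys] for x in xs]

def encode_stripe(data: bytes, k: int, m: int) -> list[bytes]:
    # Split the stripe into k equal shards (zero-padded), append m parity shards.
    size = -(-len(data) // k)
    shards = [bytearray(data[i * size:(i + 1) * size].ljust(size, b"\0"))
              for i in range(k)]
    parity = []
    for row in cauchy_rows(k, m):
        p = bytearray(size)
        for coeff, shard in zip(row, shards):
            for idx in range(size):
                p[idx] ^= gf_mul(coeff, shard[idx])
        parity.append(bytes(p))
    return [bytes(s) for s in shards] + parity

# encode_stripe(stripe, 4, 2) yields the 4+2 layout described above.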

Placement

Weighted rendezvous scoring

Stable hashing is combined with disk weight, class, latency, capacity headroom, NUMA hints, and failure-domain diversity so placement is deterministic but policy-aware.
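The scoring core reduces to weighted rendezvous (highest-random-weight) hashing. In this sketch the per-disk weight is assumed to already fold in class, latency, and capacity headroom; the NUMA and failure-domain terms ArgosFS layers on top are omitted.

# Weighted rendezvous hashing: standard -weight / ln(u) scoring.
import hashlib
import math

def score(shard_id: str, disk_id: str, weight: float) -> float:
    digest = hashlib.blake2b(f"{shard_id}/{disk_id}".encode(),
                             digest_size=8).digest()
    u = (int.from_bytes(digest, "big") + 0.5) / 2**64   # uniform in (0, 1)
    return -weight / math.log(u)

def place(shard_id: str, disks: dict[str, float], copies: int) -> list[str]:
    # Deterministic: a shard always ranks disks the same way, and adding
    # or removing a disk only moves the shards that disk wins or loses.
    ranked = sorted(disks, key=lambda d: score(shard_id, d, disks[d]),
                    reverse=True)
    return ranked[:copies]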

Capacity

Reservation before mutation

Writes and rebalances reserve space before committing references. Failed writes release their reservations instead of leaving stale accounting behind.
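The invariant in miniature, assuming a single per-disk byte counter: usage is charged before any reference is installed and returned on failure.

# Reserve-before-mutation, reduced to one counter and a context manager.
from contextlib import contextmanager

class Disk:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.reserved = 0

    @contextmanager
    def reserve(self, nbytes: int):
        if self.reserved + nbytes > self.capacity:
            raise OSError("ENOSPC: reservation would exceed capacity")
        self.reserved += nbytes       # charge before any reference exists
        try:
            yield
        except Exception:
            self.reserved -= nbytes   # failed write returns its reservation
            raise

# Usage: with disk.reserve(len(shard)): commit_reference(shard)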

Metadata

Copy-on-write snapshots

Primary and secondary metadata files, journal snapshots, hash checks, replay, and mismatch detection make partial writes observable and repairable.
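The copy-on-write pattern for one metadata file, sketched with stdlib calls; the file layout and hash bookkeeping here are illustrative, not the journal format.

# Write a new copy, fsync, atomically rename, and keep a hash for
# mismatch detection at replay time.
import hashlib
import json
import os

def snapshot_metadata(path: str, metadata: dict) -> str:
    blob = json.dumps(metadata, sort_keys=True).encode()
    digest = hashlib.sha256(blob).hexdigest()
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(blob)
        f.flush()
        os.fsync(f.fileno())          # durable before it becomes visible
    os.replace(tmp, path)             # atomic on POSIX: old or new, never partial
    return digest                     # recorded elsewhere for mismatch checks

def verify_snapshot(path: str, expected_digest: str) -> bool:
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest() == expected_digest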

Security

Authenticated encryption

Keys are derived with Argon2id and stripe data can be protected with XChaCha20-Poly1305. Authentication failures are treated as data integrity failures.
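A sketch of those primitives using the PyNaCl package (an assumption; the project's actual key handling is not shown here). Argon2id derives the key, and the AEAD authenticates as it encrypts, so a tampered shard fails decryption instead of returning bytes.

# Argon2id key derivation and XChaCha20-Poly1305 sealing via PyNaCl.
from nacl import bindings, pwhash, utils

def derive_key(passphrase: bytes, salt: bytes) -> bytes:
    # salt must be exactly pwhash.argon2id.SALTBYTES (16) bytes long.
    return pwhash.argon2id.kdf(32, passphrase, salt)

def seal(key: bytes, stripe: bytes, aad: bytes) -> tuple[bytes, bytes]:
    nonce = utils.random(bindings.crypto_aead_xchacha20poly1305_ietf_NPUBBYTES)
    ct = bindings.crypto_aead_xchacha20poly1305_ietf_encrypt(stripe, aad, nonce, key)
    return nonce, ct

def open_sealed(key: bytes, nonce: bytes, ct: bytes, aad: bytes) -> bytes:
    # Raises nacl.exceptions.CryptoError on any authentication failure,
    # which the text above treats as a data integrity failure.
    return bindings.crypto_aead_xchacha20poly1305_ietf_decrypt(ct, aad, nonce, key)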

Cache

RAM and L2 read acceleration

Hot reads can be served from RAM or persistent L2 cache. Cache hits refresh recency, and pruning keeps cached data bounded.
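The recency and bounding behavior in one tier, sketched as a size-bounded LRU; the RAM/L2 split is collapsed into a single in-memory map.

# Size-bounded read cache: hits refresh recency, inserts prune oldest-first.
from collections import OrderedDict

class ReadCache:
    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.used = 0
        self.entries: OrderedDict[str, bytes] = OrderedDict()

    def get(self, key: str):
        if key in self.entries:
            self.entries.move_to_end(key)    # hit refreshes recency
            return self.entries[key]
        return None

    def put(self, key: str, value: bytes) -> None:
        if key in self.entries:
            self.used -= len(self.entries.pop(key))
        self.entries[key] = value
        self.used += len(value)
        while self.used > self.max_bytes:    # pruning keeps usage bounded
            _, evicted = self.entries.popitem(last=False)
            self.used -= len(evicted)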

I/O modes

Conservative fast paths

Buffered I/O is the safe default. Direct I/O and io_uring paths are optional and fall back to buffered I/O when alignment, platform, or safety conditions are not met.
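A sketch of the fallback shape for one of those paths, assuming Linux and a 4 KiB alignment requirement; the real eligibility checks are broader.

# Try O_DIRECT only when eligible; otherwise take the buffered default.
import os

ALIGN = 4096  # typical O_DIRECT alignment requirement

def open_for_read(path: str, offset: int, length: int) -> int:
    aligned = offset % ALIGN == 0 and length % ALIGN == 0
    if aligned and hasattr(os, "O_DIRECT"):
        try:
            return os.open(path, os.O_RDONLY | os.O_DIRECT)
        except OSError:
            pass                       # filesystem or device refused O_DIRECT
    return os.open(path, os.O_RDONLY)  # buffered path is the safe default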

Rootfs

Boot-first integration

Root filesystem support is treated as a safety requirement: optimization must not compromise mountability, recovery, or emergency-mode operation.

Safety model

Automation is useful only when it can stop.

The autopilot is intentionally conservative. It can observe and explain unsafe states, but it should not optimize through them. When safety and performance conflict, the controller preserves recoverability and mountability first.

Controller gates

Before an action can run

  • Check that enough disks and shards remain available for the configured redundancy level.
  • Reject placements that would collapse failure-domain diversity or exhaust a target disk.
  • Apply cooldowns and risk memory so noisy health samples do not trigger unstable oscillation.
  • Prefer dry-run explanations and bounded maintenance budgets over large unreviewable rewrites.
  • Verify journal, fsck, and shard state after mutation; downgrade automation if verification fails.
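A sketch of how the gates above compose: every predicate must pass, and any failure leaves the controller observing instead of acting. The Disk fields and thresholds are hypothetical; post-action verification happens separately, after the mutation.

# Gate evaluation before an autopilot action is allowed to run.
from dataclasses import dataclass

@dataclass
class Disk:
    domain: str        # failure domain, e.g. a chassis or host
    healthy: bool
    free_bytes: int

def gates_pass(disks: list[Disk], targets: list[Disk],
               k: int, m: int, need_bytes: int,
               cooldown_remaining: float) -> bool:
    healthy = [d for d in disks if d.healthy]
    if len(healthy) < k + m:                           # full k+m stripe still placeable?
        return False
    if len({d.domain for d in targets}) < len(targets):
        return False                                   # would collapse domain diversity
    if any(d.free_bytes < need_bytes for d in targets):
        return False                                   # would exhaust a target disk
    if cooldown_remaining > 0:
        return False                                   # risk memory / anti-oscillation
    return True                                        # caller still verifies afterwards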

Inspectability

Every storage decision should leave evidence.

The CLI exposes health snapshots, dry-run plans, journal validation, fsck repair, Prometheus metrics, and experiment artifacts.

argosfs health ROOT --json
argosfs autopilot ROOT --dry-run --explain --json
argosfs verify-journal ROOT
argosfs fsck ROOT --repair --remove-orphans
argosfs export ROOT ./rootfs-export --preserve-metadata