Research agenda

Safety-constrained self-driving storage.

ArgosFS treats storage autonomy as a measurable control problem: observations, plans, rejected actions, mutations, and verification results all become inspectable data.

Observe

Collect SMART fields, parser coverage, latency EWMAs, capacity provenance, boot-critical classification, workload heat, and stale-measurement markers.
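As a minimal sketch of two of the signals above, a latency EWMA can carry its own stale-measurement marker. The `LatencyEwma` type, the smoothing factor, and the 60-second staleness threshold are assumptions chosen for illustration, not ArgosFS internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LatencyEwma:
    """Latency EWMA that remembers when it was last updated (hypothetical type)."""
    alpha: float = 0.2          # smoothing factor; assumed value
    max_age_s: float = 60.0     # age after which the value counts as stale; assumed
    value_ms: Optional[float] = None
    updated_at: Optional[float] = None

    def update(self, sample_ms: float, now: float) -> float:
        # The first sample seeds the average; later samples are blended in.
        if self.value_ms is None:
            self.value_ms = sample_ms
        else:
            self.value_ms = self.alpha * sample_ms + (1 - self.alpha) * self.value_ms
        self.updated_at = now
        return self.value_ms

    def is_stale(self, now: float) -> bool:
        # A disk with no measurement at all is stale, not healthy.
        return self.updated_at is None or now - self.updated_at > self.max_age_s


ewma = LatencyEwma()
ewma.update(10.0, now=0.0)
ewma.update(20.0, now=1.0)  # blends to 0.2 * 20 + 0.8 * 10
print(ewma.value_ms, ewma.is_stale(now=30.0), ewma.is_stale(now=120.0))
```

Keeping the timestamp next to the value lets later stages tell "latency is fine" apart from "latency has not been measured recently".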

Diagnose

Use risk memory and repeated evidence to distinguish unhealthy disks from disks whose telemetry is merely missing or stale, so that a single bad sample does not trigger an action.
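One way to picture risk memory is a sliding window that keeps missing telemetry separate from bad telemetry. This is a sketch under assumed names and thresholds (`RiskMemory`, a window of 5, 3 bad samples), not the project's actual diagnosis logic.

```python
from collections import deque
from typing import Deque, Optional


class RiskMemory:
    """Sliding window of recent health observations for one disk (hypothetical)."""

    def __init__(self, window: int = 5, min_bad: int = 3) -> None:
        self.min_bad = min_bad  # bad samples needed before calling a disk unhealthy
        self.samples: Deque[Optional[bool]] = deque(maxlen=window)

    def observe(self, bad: Optional[bool]) -> None:
        # True = bad reading, False = good reading, None = missing or stale telemetry.
        self.samples.append(bad)

    def diagnose(self) -> str:
        known = [s for s in self.samples if s is not None]
        if not known:
            return "unknown"    # no usable evidence is not the same as unhealthy
        if sum(known) >= self.min_bad:
            return "unhealthy"  # repeated bad evidence inside the window
        if any(known):
            return "suspect"    # some bad evidence, not yet enough to act on
        return "healthy"


mem = RiskMemory()
for sample in [True, None, None]:
    mem.observe(sample)
first = mem.diagnose()          # one bad sample plus gaps
for sample in [True, True]:
    mem.observe(sample)
second = mem.diagnose()         # three bad samples in the window
print(first, second)
```

A single bad reading only moves the disk to "suspect"; gaps in telemetry never count as evidence of failure.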

Plan

Emit dry-run records containing the selected action, rejected alternatives, safety checks, expected utility, cooldowns, and capacity constraints.
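A dry-run record of this kind could look like the sketch below. The field names, the candidate actions, and the idea of modelling cooldowns and capacity constraints as named safety checks are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Candidate:
    action: str
    expected_utility: float
    safety_checks: Dict[str, bool]  # check name -> passed


@dataclass
class DryRunRecord:
    selected: str
    expected_utility: float
    safety_checks: Dict[str, bool]
    rejected: List[Dict[str, str]] = field(default_factory=list)


def plan(candidates: List[Candidate]) -> DryRunRecord:
    """Pick the best safe candidate and record why every other one was rejected."""
    rejected: List[Dict[str, str]] = []
    safe: List[Candidate] = []
    for c in candidates:
        failed = sorted(name for name, ok in c.safety_checks.items() if not ok)
        if failed:
            rejected.append({"action": c.action, "reason": "failed: " + ", ".join(failed)})
        else:
            safe.append(c)
    if not safe:
        # Nothing passed its safety checks: the plan is to do nothing.
        return DryRunRecord("noop", 0.0, {}, rejected)
    best = max(safe, key=lambda c: c.expected_utility)
    for c in safe:
        if c is not best:
            rejected.append({"action": c.action, "reason": "lower expected utility"})
    return DryRunRecord(best.action, best.expected_utility, best.safety_checks, rejected)


record = plan([
    Candidate("rebalance", 0.8, {"cooldown_elapsed": False, "capacity_ok": True}),
    Candidate("scrub", 0.5, {"cooldown_elapsed": True, "capacity_ok": True}),
    Candidate("drain-disk", 0.3, {"cooldown_elapsed": True, "capacity_ok": True}),
])
print(json.dumps(asdict(record), sort_keys=True))
```

Here the highest-utility action is rejected because its cooldown has not elapsed, and that rejection is part of the record rather than a silent filter.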

Verify

Finish each mutation with journal validation, fsck checks, and shard verification, and trigger an adaptive mode downgrade when a post-action invariant fails.
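The verification step can be sketched as a set of named invariant checks whose failure moves the controller one step down a mode ladder. The ladder in `MODES` and the invariant names are hypothetical; the real checks would call into journal validation, fsck, and shard verification.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical mode ladder, ordered from most to least autonomous.
MODES = ["autopilot", "manual", "observe-only"]


def verify(mode: str, invariants: Dict[str, Callable[[], bool]]) -> Tuple[str, List[str]]:
    """Run every post-action invariant; downgrade one step if any of them fails."""
    failed = [name for name, check in invariants.items() if not check()]
    if not failed:
        return mode, failed
    next_index = min(MODES.index(mode) + 1, len(MODES) - 1)
    return MODES[next_index], failed


new_mode, failures = verify(
    "autopilot",
    {
        "journal_valid": lambda: True,     # stand-ins for the real checks
        "fsck_clean": lambda: True,
        "shards_verified": lambda: False,
    },
)
print(new_mode, failures)
```

All invariants run even after the first failure, so the record of what went wrong is complete before the mode changes.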

Research questions

What ArgosFS is trying to make testable.

The project targets the gap between static local filesystems and policy-rich distributed storage: a single-machine root filesystem can still make autonomous placement and repair decisions, but only under strict boot and recovery constraints.

RQ1

Can local storage become policy-driven?

Evaluate whether capacity, health, tier, and workload signals can guide placement without hiding the decision process.
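To make "without hiding the decision process" concrete, a placement score can return its per-signal terms alongside the total. The weights, the `Disk` fields, and the hot-data-prefers-SSD rule below are illustrative assumptions, not the ArgosFS placement model.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Disk:
    name: str
    free_fraction: float  # 0.0 .. 1.0
    health: float         # 0.0 (failing) .. 1.0 (healthy)
    tier: str             # "ssd" or "hdd"


def score(disk: Disk, hot: bool) -> Dict[str, float]:
    """Return every term of the score so the decision can be inspected later."""
    wanted_tier = "ssd" if hot else "hdd"
    terms = {
        "capacity": 0.4 * disk.free_fraction,
        "health": 0.4 * disk.health,
        "tier": 0.2 if disk.tier == wanted_tier else 0.0,
    }
    terms["total"] = sum(terms.values())
    return terms


def place(disks: List[Disk], hot: bool) -> Tuple[str, Dict[str, Dict[str, float]]]:
    explained = {d.name: score(d, hot) for d in disks}
    best = max(explained, key=lambda name: explained[name]["total"])
    return best, explained


disks = [Disk("nvme0", 0.5, 0.9, "ssd"), Disk("sda", 0.8, 0.9, "hdd")]
hot_choice, hot_terms = place(disks, hot=True)
cold_choice, _ = place(disks, hot=False)
print(hot_choice, cold_choice, hot_terms)
```

Returning the breakdown for every disk, not only the winner, is what allows a reviewer to see why an alternative lost.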

RQ2

Can automation remain safe for rootfs?

Study whether the controller can improve layout while preserving mountability, repairability, and emergency-mode behavior.

RQ3

Can experiments be reproducible?

Retain raw JSONL/CSV, manifests, command logs, and summaries so each figure or table can be regenerated from artifacts.
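Regenerating a table from retained raw data might look like the following sketch. The event fields `scenario` and `latency_ms` are hypothetical; the actual raw schema is not described here.

```python
import csv
import io
import json
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List


def summarize(jsonl_lines: Iterable[str]) -> str:
    """Turn raw JSONL events into a small CSV summary table."""
    by_scenario: Dict[str, List[float]] = defaultdict(list)
    for line in jsonl_lines:
        if line.strip():  # tolerate blank lines in the raw stream
            event = json.loads(line)
            by_scenario[event["scenario"]].append(float(event["latency_ms"]))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["scenario", "samples", "mean_latency_ms"])
    for scenario in sorted(by_scenario):  # sorted for a stable, diffable table
        values = by_scenario[scenario]
        writer.writerow([scenario, len(values), f"{mean(values):.2f}"])
    return out.getvalue()


raw = [
    '{"scenario": "degraded-read", "latency_ms": 4.0}',
    '{"scenario": "degraded-read", "latency_ms": 6.0}',
    '{"scenario": "clean-write", "latency_ms": 2.0}',
]
table = summarize(raw)
print(table)
```

Because the summary is a pure function of the raw lines, rerunning it on the same artifacts yields the same table.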

RQ4

Can heterogeneous disks be first-class?

Test SSD/HDD mixes, uneven capacities, tier changes, disk drain, add-disk, and degraded operation rather than assuming uniform devices.

Artifact evaluation

One command to produce reviewable artifacts.

The experiment framework writes raw event streams, processed summaries, compatibility records, manifests, and command logs, so that review depends less on screenshots or undocumented manual runs. In the commands below, the first runs the quick configuration and the second summarizes that run's raw output.

scripts/experiments/run_all.sh --quick --output paper-data/runs/ae-quick
python3 scripts/experiments/summarize_results.py \
  paper-data/runs/ae-quick/raw \
  paper-data/runs/ae-quick

Experiment families

From smoke tests to paper figures.

ArgosFS separates correctness-oriented tests from research-oriented measurements so quick CI checks and artifact evaluation can share infrastructure without pretending they have the same cost.

Recovery

Failure matrix

Clean writes, interrupted metadata commits, degraded reads, shard reconstruction, rename recovery, and orphan cleanup are exercised as explicit scenarios.

Placement

Workload shifts

Hot/cold phase changes measure whether placement converges and how much background interference the controller introduces.

Boot

QEMU rootfs matrix

Root filesystem boot, degraded boot, interrupted boot, and emergency-mode outcomes are recorded as first-class research artifacts.

Baseline

Manual and autopilot modes

ArgosFS under manual policy and under autopilot policy can be evaluated side by side, alongside documented comparisons against familiar Linux storage stacks.

Metadata

Scalability probes

Metadata size, journal behavior, snapshot costs, and import/export performance are tracked under growing file and directory populations.

Compatibility

POSIX-facing suites

Mounted smoke tests, pjdfstest-oriented checks, and explicit skipped records keep compatibility results reproducible even when external suites are absent.
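Skipped-record handling can be sketched as follows: when an external suite is not installed, the run still emits a record saying so instead of omitting the result. The record fields and the executable name passed in the example are assumptions; how ArgosFS actually locates external suites is not specified here.

```python
import json
import shutil
from typing import Dict, Optional


def compat_record(suite: str, executable: str) -> Dict[str, Optional[str]]:
    """Report whether a suite can run, emitting an explicit 'skipped' record if not."""
    path = shutil.which(executable)
    if path is None:
        return {
            "suite": suite,
            "status": "skipped",
            "reason": f"{executable} not found on PATH",
            "path": None,
        }
    return {"suite": suite, "status": "available", "reason": None, "path": path}


# The executable name is a guess for illustration; the result depends on the machine.
print(json.dumps(compat_record("pjdfstest", "pjdfstest"), sort_keys=True))
```

Two runs on machines with and without the suite then differ by a visible record, not by a missing row.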

Health

Telemetry robustness

SMART parsing, stale refreshes, missing fields, latency signals, and risk memory can be studied separately from the data-path implementation.

Artifacts

Manifested runs

Each run can include commit, kernel, mode, seed, command output, generated files, and processed summary tables.
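A manifest of this shape could be produced by a helper like the one below. The function name, the JSON layout, and the use of SHA-256 digests for generated files are assumptions for illustration; the commit string is a placeholder supplied by the caller.

```python
import hashlib
import json
import platform
import tempfile
from pathlib import Path
from typing import Dict


def build_manifest(run_dir: Path, commit: str, mode: str, seed: int) -> Dict[str, object]:
    """Describe a run directory: provenance fields plus a SHA-256 per generated file."""
    files = {}
    for path in sorted(run_dir.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            files[path.relative_to(run_dir).as_posix()] = digest
    return {
        "commit": commit,
        "kernel": platform.release(),  # kernel release of the machine running this
        "mode": mode,
        "seed": seed,
        "files": files,
    }


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    (run_dir / "raw").mkdir()
    (run_dir / "raw" / "events.jsonl").write_text('{"event": "mount"}\n')
    manifest = build_manifest(run_dir, commit="0000000", mode="manual", seed=42)
    print(json.dumps(manifest, indent=2, sort_keys=True))
```

Hashing every generated file lets a reviewer confirm that a summary table was built from exactly the raw data listed in the manifest.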

Publication angle

The central claim is not “another filesystem”.

The research contribution is the combination of rootfs-capable local storage, heterogeneous disk control, and safety-constrained autonomy.

Contribution map

What a paper can emphasize

  • A boot-first self-driving storage controller where mountability dominates optimization.
  • An explainable maintenance planner with rejected-action records and safety gates.
  • A heterogeneous placement model that treats disk capacity, tier, health, and failure domain as inputs.
  • A reproducible artifact pipeline that links raw events to summarized research claims.