OSS / remaining concerns.

So much on my mind I need to dump all my ideas so please be nice and maybe help.

by c4lliope
2025-06-27

Labeled as:
open source, linux

During the Open Source Summit this week, I'd occasionally log on during a talk to a home-lab machine, running in Maryland, on which I manage a couple of disc arrays.

I am in the middle of copying many of my files from a Synology NAS, a 6-bay spinning-disc bank, to an 8-disc enclosure that I am manually managing using ZFS.

So I'd mount the Synology over IP using NFS, which is only secure because it is on essentially an air-gapped router, held over from my prior lab. For now you'll need to imagine the lab, sorry.

(a) Spin up PenPot to easily produce diagrams again.

I logged on today, and realized that my earlier copy had finished in 6 hours, and somehow in the process had unbalanced my ZFS array:

➜ sudo zpool clear brea
[sudo] password for calliope:
cannot clear errors for brea: I/O error

Oh no, this has been popping up as long as I've been relying on ZFS, and the only real way to clear the error... is a reboot?
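A small watchdog could at least flag the pool going sideways before a long copy ends. This is a minimal sketch, not a fix for the stuck `zpool clear`: it assumes `zpool list -H -o name,health` is on the machine and prints one tab-separated line per pool. The helper names and the `tank` pool in the usage note are made up for illustration.

```python
import subprocess


def parse_pool_health(listing: str) -> dict[str, str]:
    """Turn tab-separated `name<TAB>health` lines into a {pool: health} dict."""
    pools = {}
    for line in listing.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, health = line.split("\t")
        pools[name] = health.strip()
    return pools


def unhealthy_pools(listing: str) -> list[str]:
    """Names of pools reporting any health other than ONLINE, sorted."""
    pools = parse_pool_health(listing)
    return sorted(name for name, health in pools.items() if health != "ONLINE")


def read_pool_listing() -> str:
    """Ask ZFS for pool health in scripted mode (-H: no headers, tab-separated)."""
    result = subprocess.run(
        ["zpool", "list", "-H", "-o", "name,health"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Feeding it a listing like `"brea\tSUSPENDED\ntank\tONLINE\n"` would report only `brea`; on the real machine, `unhealthy_pools(read_pool_listing())` could run from a timer and send a warning.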

This seems exceptionally silly to me, especially since the original issue is around nine years old. The coders have approached a solution and then de-prioritized it, and users keep pulling their hair out because their precious discs are forcing their computers to reboot - likely causing real domain and program disconnections.

In my case, there's a murkier problem: my computer's main OS is running on an NVMe M.2 SSD - the new shape of laptop storage, in essence. As a simple security measure in case any of the machines are unplugged and packed into a burglar's duffel bag, I use LUKS encryption to keep the disc locked each time the machine powers on, until a physical keyboard is used to key in a passcode. Only then does the computer spin up.

So, adding up these pieces:

Machine in Maryland is unable to reboot remotely.
Disc array is useless until a reboot.
I am in Colorado, around a week's drive from Maryland.
Bonus: I have basically no gas money.

So, the machine responsible for keeping this domain up now has a dead limb for the next week, until I can go massage the connection online again.

Because this has happened again, and again, and again, and because this is a damn silly problem to be causing for users who are focused on keeping the machines running 24/7, I am perhaps gonna say bye to ZFS as soon as I can.

Only, can I?

(b) choose a different RAID-capable filesystem to manage the disc array.

Normally, I'd be on board with this problem. Maybe I should hop back on board the SeaweedFS fan club? Eh, I'm opposed to freemium business models for code, and the self-healing piece seems really crucial. Although I'm surely under the 100TB pricing threshold, I'm also opposed to hiding pieces of your code in private repos.

Uh... maybe my only options for (b) are Linux RAID or BTRFS, which I've never looked at. I hope one of those is capable and simple, otherwise maybe (c) is gonna be:

(c) build a RAID filesystem in Rust, why not?

Nah, I'm sure BTRFS is gonna be easy, yeah? Yeah? ...yeah?...

In any case, I need to focus on application changes until I can go rearrange the hard discs; even the SSD is filling up quickly because of:

(d) keep pulling in case.law.

case.law has been a long-running exercise by Harvard's Caselaw Access Project, probably the most philanthropically-minded group of lawyers and librarians anyone can recommend. They spent four years managing a deal with the National Archives, to scan copies of a bunch of court decisions, which had been written up in federal reporters - the large old dusty volumes you've perhaps seen in some law libraries.

These cases are the main bulk of the business of the judicial branch, and now they are not only scanned as PDFs (the redactions are necessary and do no harm to the corpus), but also as plain text accompanied by JSON-arranged metadata.

Dang, some days machines can be inspiring. Especially if all of those cases can be packed onto my SSD properly. This is the main business of my lab for the next week, I guess.

Now, aside from the open-sourcing of case law, there is a bunch of open-source machine code to examine!

This has been a real focus of the Open Source Summit: how can you be sure that you're not drinking poison from the immense ocean of programs you're consuming?

For me, I build from the ground up. Specifically, I've made some GitHub-specific commands for copying an organization's code in one easy go, or for choosing a specific codebase to copy locally.

A bunch of my command-line access during the conference was also focused on copying good new codebases for local examination.

So, when I got to the Wireshark talk and perhaps signed up to build a new UI to peruse the captured logs, I quickly realized that my GitHub clone process is going to slip up for the Wireshark org on GitLab. Luckily, there is simple API access for GitLab also.

(e) Build Nushell commands for cloning GitLab orgs.
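Before porting to Nushell, the shape of the API calls can be sketched in Python. This is a rough sketch, not the existing GitHub commands: it assumes GitLab's `GET /api/v4/groups/:id/projects` endpoint with `per_page`, `page`, and `include_subgroups`, and the `http_url_to_repo` field on each project. The function names, the `gitlab.com` host, and the `wireshark` group path are assumptions for illustration.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def group_projects_url(host: str, group: str, page: int = 1, per_page: int = 100) -> str:
    """Build the URL for one page of a group's projects."""
    query = urlencode({"include_subgroups": "true", "per_page": per_page, "page": page})
    # Nested group paths such as "a/b" must be URL-encoded as a single segment.
    return f"https://{host}/api/v4/groups/{quote(group, safe='')}/projects?{query}"


def clone_urls(projects: list[dict]) -> list[str]:
    """Pick the HTTPS clone URL out of each project record."""
    return [project["http_url_to_repo"] for project in projects]


def list_group_clone_urls(host: str, group: str) -> list[str]:
    """Walk the pages until an empty one comes back (needs network access)."""
    urls, page = [], 1
    while True:
        with urlopen(group_projects_url(host, group, page)) as response:
            batch = json.load(response)
        if not batch:
            return urls
        urls.extend(clone_urls(batch))
        page += 1
```

Each URL from `list_group_clone_urls("gitlab.com", "wireshark")` could then be handed to `git clone`. Private groups would need a token header, which this sketch leaves out.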

So, clearly I am hoarding too many records to easily search or examine.

(f) Hook up OpenSearch I guess, Quickwit as a backup plan.

I'd like to be able to harness Tree-Sitter to query specific logical shapes among all the code, and cross-reference them to explore deeper logical connections than the simple dependency graphs that are popular among SBOM adherents.

Although, to really make use of Tree-Sitter Queries, I guess I need a JavaScript eval environment. Since I'm a language snob, I am unable to imagine seriously using Node, Bun, Deno, or similar on my lab machine. The only place I'm going to run Tree-Sitter is on the client, which means I need to somehow get all my cloned codebases to a web browser, for exploration.

Oh, libgit2 is already in WASM. I already have a branch of the Operand codebase that pulls in a sample repo from this lib's author, so the next phase is to:

(g) Run a git relay in the lab that dgaf about CORS.
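The "dgaf about CORS" part could be a thin wrapper around whichever app does the actual relaying. Here is a minimal sketch as WSGI middleware: it answers preflight requests itself and stamps permissive headers on every other response. The name `permissive_cors` and the exact header list are assumptions, and the relay app it wraps is not shown.

```python
# Permissive headers for browser-based git clients. The allowed request
# headers are a guess at what a git-over-HTTP client sends; adjust as needed.
CORS_HEADERS = [
    ("Access-Control-Allow-Origin", "*"),
    ("Access-Control-Allow-Methods", "GET, POST, OPTIONS"),
    ("Access-Control-Allow-Headers", "Content-Type, Git-Protocol"),
]


def permissive_cors(app):
    """Wrap a WSGI app so every response carries permissive CORS headers."""

    def wrapped(environ, start_response):
        # Answer CORS preflight requests directly, without touching the relay.
        if environ.get("REQUEST_METHOD") == "OPTIONS":
            start_response("204 No Content", list(CORS_HEADERS))
            return [b""]

        def start_with_cors(status, headers, exc_info=None):
            # Drop any CORS headers set upstream, then add the permissive set.
            kept = [
                (name, value)
                for name, value in headers
                if not name.lower().startswith("access-control-")
            ]
            return start_response(status, kept + CORS_HEADERS, exc_info)

        return app(environ, start_with_cors)

    return wrapped
```

Allowing every origin is only reasonable here because the relay would serve public code; anything private would need a tighter origin list.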

Then, I guess I can:

(h) manage any "grams" in share.operand.online using the git relay.
(i) rebuild gram:op using real git clone to the client!

Only, I'm getting so far ahead of myself! This domain has nearly no client-side code, anyhow!

So I should focus on how to display that code base, once it reaches the client. I clearly need to begin with the human-readable .md, or .rst pages, the ones I'm already rendering relay-side.
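Picking out those human-readable pages from a cloned codebase could look roughly like this. It is a sketch under assumptions: the function name `readable_pages` is hypothetical, README files are sorted to the front as a guess at what a reader would open first, and anything under `.git` is skipped.

```python
from pathlib import Path

READABLE_SUFFIXES = {".md", ".rst"}


def readable_pages(root) -> list[Path]:
    """List .md and .rst files under root, relative to root, READMEs first."""
    root = Path(root)
    pages = []
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if ".git" in relative.parts:
            continue  # ignore git internals
        if path.is_file() and path.suffix.lower() in READABLE_SUFFIXES:
            pages.append(relative)
    # False sorts before True, so README files land at the top of the list.
    return sorted(
        pages,
        key=lambda p: (not p.name.lower().startswith("readme"), p.as_posix().lower()),
    )
```

The same list could feed the relay-side renderer today and a client-side one later.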

I'd be thrilled to enable more dynamic README experiences, so the clear choice is:

(j) Use MDX to render all .md pages on ://operand.online.

This is going to become complex in a hurry, so I'll need to make one more choice:

(k) Decide on either Solid or Astro for a serious rebuild.

Now, while I've surely secured the future of this blog (unless I really do decide to build a RAID as I learn Rust), I need to check on my hoard of public records, to please the completionist in me.

Let's go for basically all the code there is:

(l) Maybe compile GitOxide to WASM for simpler code clones.

Let's find some cousins to keep the Case Laws company:

(m) Copy in and cross-reference all federal legislation.
(n) Did you see that GovInfo scanned in all the Statutes at Large?
(o) I guess there's probably something useful in the Federal Regulations.
(p) Oh yeah, those records collections should all become huge git codebases!

I sure hope I can find some people to help on all this. For now you'll need to email me, although we also have a chance to:

(q) compose a new baseline of online secure collaboration.

Dang, sure hope people decide to email me.

Why am I doing so much of this, when (remember?) I have no gas money to reach home?

I suppose because I succeeded in abandoning Google and Microsoft, because their unilateral legal contracting agreements scare the hell out of me.

Somehow, I'm the only one who disagrees to terms and conditions as a normal baseline; I'm no longer going to be using Zoom, sorry people. More chances for me to go breathe un-conditioned air.

Although, I'd like all of us to be able to choose options like Jitsi that are recognizably secure by being open-source - to choose programs that are happy to show off their gnarly pieces and inner couplings (Happy Pride!).

The only choice we have to accomplish this aim is to use the legal resources around us, before they're gone. Go underground and build a new base; change the rules of the game that the lawyers play, so they change the rules of the game the coders play, so they change the rules of the game the public plays.

And here's, in essence, the crucial realization:

coding is legislation, the building of a program the users agree to be bound to.

Engineers co-opted the language of the legal discipline, to describe the logical recipes they produced.

After years of handling all the crap those engineers produced, we need to decide how to reconcile the legal and the logical sides of this space.

So, here's the full backlog lingering in my head. I hardly can keep up with these, because I spend too many days on the road with no easy means of recording my aims.

Now, you can be as anxious as I am about how messy things have become around here, and how much clean-up needs to happen.

(a) Spin up PenPot to easily produce diagrams again.
(b) choose a different RAID-capable filesystem to manage the disc array.
(c) build a RAID filesystem in Rust, why not?
(d) keep pulling in case.law.
(e) Build Nushell commands for cloning GitLab orgs.
(f) Hook up OpenSearch I guess, Quickwit as a backup plan.
(g) Run a git relay in the lab that dgaf about CORS.
(h) manage any "grams" in share.operand.online using the git relay.
(i) rebuild gram:op using real git clone to the client!
(j) Use MDX to render all .md pages on ://operand.online.
(k) Decide on either Solid or Astro for a serious rebuild.
(l) Maybe compile GitOxide to WASM for simpler code clones.
(m) Copy in and cross-reference all federal legislation.
(n) Did you see that GovInfo scanned in all the Statutes at Large?
(o) I guess there's probably something useful in the Federal Regulations.
(p) Oh yeah, those records collections should all become huge git codebases!
(q) compose a new baseline of online secure collaboration.
