[research] Local LLMs (making Lemonade)
Lemonade is AMD's official open-source mechanism for running deep-learning models on AMD hardware, from the integrated GPU (on the main CPU die), to a dedicated GPU, to the Neural Processing Unit.
On NixOS, there is a module called nix-amd-ai; here is our usage for the Framework 16.
To begin, the most challenging idea is how an agent, an "LLM server", and accessories such as "tools and skills" relate to one another; Lemonade has guidance on these concepts.
Lemonade can be accessed in two modes, reflecting its design as a locally resident web app managed through a command-line program.
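Lemonade's server speaks an OpenAI-style chat API, so a sketch of talking to it from a script might look like the following. The port, endpoint path, and model name below are assumptions and placeholders, not values taken from my setup - check your own Lemonade configuration before trying this.

```python
import json
from urllib import request

# Build an OpenAI-style chat-completions payload. The model name is a
# placeholder; substitute one the local server has actually downloaded.
def build_chat_request(model, user_text):
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }

payload = build_chat_request("example-model", "Hello from the Framework 16!")
body = json.dumps(payload).encode()

# Assumed local endpoint; adjust host, port, and path to your install.
req = request.Request(
    "http://localhost:8000/api/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)
# request.urlopen(req)  # uncomment once the server is running locally
```

The request itself is left commented out so the sketch stands alone without a running server.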
Mainly, lemonade is a wrapper around the program llama-cpp. I had some issues in my logs that originated in llama-cpp, because I needed to specify, through the Lemonade web UI, that llama-cpp use the "ROCm" backend commonly recommended for AMD on Linux. The llama-cpp link includes guidance for Windows and macOS. The link to my issues also covers the crucial --ctx-size option, which I needed to increase so llama-cpp could make use of enough GPU memory (also called VRAM).
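To show where those settings land, here is a small launcher sketch. The --ctx-size and --n-gpu-layers flags are real llama-cpp server options, but the model path and layer count are placeholders, not the values from my machine.

```python
# Sketch: assembling a llama-cpp server command line with a larger
# context window and GPU offload (relevant on a ROCm build).
def build_server_argv(model_path, ctx_size=8192, gpu_layers=99):
    return [
        "llama-server",
        "--model", model_path,
        "--ctx-size", str(ctx_size),        # context length; larger values need more VRAM
        "--n-gpu-layers", str(gpu_layers),  # how many layers to offload to the GPU
    ]

argv = build_server_argv("/models/example.gguf")
# import subprocess; subprocess.run(argv)  # uncomment on a machine with llama-cpp installed
```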
Lemonade's FAQ is also a nice place to begin, as the page is organized into comprehensive sections for each of lemonade's main concerns, such as text-to-speech, model setup, behaviors, and hardware.
Direction: Embedded
I'm eager to explore Embedded Lemonade ("half & half"?) - this is made for application programmers, as I used to be, who decide to ship lemonade onto end-user machines for full-fleet deployments of specialized apps.
Embeddable lemonade brings the full scope of a machine's hardware within reach of an application engineer - no subscriptions needed.
This can be used to process sensor data in vehicles, for example - the Baltimore Node recently had a visit from a Boeing employee who was looking to process more than 1 TB per engine per hour of in-flight sensor readings.
Direction: Full-OS agency
Screenenv, from HuggingFace, promises to bring LLM reasoning to a full desktop environment - perhaps its Ubuntu & XFCE pairing can be swapped for the Niri desktop, which I find more capable.
Direction: Plugins
The Lemonade Marketplace ("stand"?) indexes many popular channels for running your LLM in coding or generative scenarios, and I'm likely to experiment with the open-source options:
Pi is a simple command-line replacement for my current agent program, OpenCode. I expect the changeover to be painless, although I may end up adding Pi to the Nix Package Index. Pi also comes with many packages, opening the lemonade stand to more purpose-designed operations.
If some mix of skills and procedures seems to fill your needs, you can perhaps encode it using tiny-agents, packaged for node-js. This package was included and recommended in AMD's announcement of the lemonade program.
Finally, I suppose I need to learn about all the MCP plugins: on the open web there are so many as to need numerous indices, plus language-specific packages (elixir, rust). These MCP layers mean you can build the core mechanisms of a program in isolation, then arrange the pieces into a more nimble, human-commandable process. For the AMD users on the channel,