Josh Weir
Archive · 19 pieces

AI cluster

Long-form essays on sovereign AI infrastructure — local-first inference, agent orchestration, voice synthesis, and the operating systems that own their stack instead of renting it.

Articles: 19 long-form pieces
Total words: 37,297 in this archive
Avg. read time: 9 minutes per piece
Updated: May 2026 (latest publish)
AI cluster · 1,849 words
Sovereign AI vs Cloud AI: a working operator's framework for choosing where inference happens
The cleanest question I get from technically-fluent founders is also the most consequential: where should our model calls actually happen? Not which model. Not which prompt framewo…
AI cluster · 1,687 words
Running production-grade local LLMs on Apple Silicon: what works in 2026
Two years ago, running a local large language model on a desktop machine was a curiosity. Today, with the right hardware and a careful model lineup, it is a viable production subst…
AI cluster · 1,876 words
Designing AI-agent workflows that actually compound: an operator's pattern library
The phrase "AI agent" has been laundered through enough conference decks that it has lost most of its operational meaning.…
AI cluster · 1,894 words
Home Assistant + local AI: replacing Big Tech voice assistants in a way that doesn't degrade
The voice assistants from the major Big Tech vendors got worse, not better, between 2022 and 2025. They became less capable, more advertising-laden, and more aggressively cloud-dep…
AI cluster · 1,835 words
MCP-style tool protocols for AI agents: the architectural shift that changes everything
Every few years a piece of infrastructure shows up that quietly redraws the lines of what's possible. Most of the time we don't notice while it's happening — the change looks like …
AI cluster · 1,823 words
GEO: how to be cited by ChatGPT, Perplexity, Claude, and Google AI Overviews in 2026
Search has split. There is still a traditional search engine returning ten blue links to a query, and that traffic is not zero. But increasingly, the answer to the query never open…
AI cluster · 1,809 words
Voice synthesis for content production: cloned voices, local TTS, and the death of cookie-cutter narration
The state of AI voice synthesis in 2026 is genuinely strange. The technology is good enough that — for many use cases — listeners cannot reliably distinguish a cloned voice from th…
AI cluster · 1,862 words
AI for commodity-trade verification: where document forensics meets institutional infrastructure
Commodity trade is full of paper that needs to be true. Letters of intent, irrevocable corporate purchase orders, soft and hard offers, refinery certificates, SGS reports, performa…
AI cluster · 1,856 words
Replacing SaaS with on-premise AI: the unit economics that make sovereign-AI inevitable
The case for sovereign AI usually gets made on values: privacy, control, sovereignty. Those arguments are real.…
AI cluster · 1,742 words
AI-assisted project feasibility for Africa: satellite data, structured research, and the verification stack
The standard objection to investing in early-stage African development projects is that the cost of pre-investment due diligence is disproportionate to the deal size.…
AI cluster · 2,139 words
The unit economics of agent orchestration
The breathless 2024 framing of AI agents has matured into something more useful and considerably duller: a loosely-coupled chain of model calls, tool invocations, and deterministic…
AI cluster · 1,894 words
The verification stack
If you have spent any time selling AI-derived work into financial institutions, government departments, defence primes, or large compliance-bound corporates, you will have noticed …
AI cluster · 2,047 words
Sovereign retrieval
Retrieval-augmented generation — putting relevant documents in front of a model so it can answer with reference to them rather than from training-data memory alone — has become the…
AI cluster · 2,086 words
Voice cloning, deepfake liability, and the consent stack
Synthetic voice has moved, in the space of about thirty months, from a research curiosity to a deployment pattern that businesses are quietly using at scale — for content productio…
AI cluster · 2,178 words
The cost of agent failure modes
The popular framing of agent systems treats failure as a rare edge case — something the retry budget catches, something the validator rejects, something the human-in-the-loop notic…
AI cluster · 2,030 words
The deferred cost of cloud AI lock-in
The decision to build an AI stack on a single closed-vendor platform looks rational on day one. The integration is smooth, the pricing is competitive, the documentation is clear, a…
AI cluster · 2,246 words
Self-hosted RAG architecture in 2026
Retrieval-augmented generation is the pattern that has graduated from clever-trick to default-architecture in approximately eighteen months.…
AI cluster · 2,227 words
AI cost discipline
The pattern is consistent across AI-native businesses I have worked with or advised. Year one, the AI bill is a rounding error and nobody pays it much attention.…
AI cluster · 2,217 words
Multi-tenant AI
The shape of B2B AI services in 2026 is converging on a pattern that creates a specific architectural problem.…
© 2026 Josh Weir · part of the Weir Digital Media network