Offline-First Is the Future of Live Event POS, and On-Device AI Is the Multiplier
Offline-first is the only POS architecture that survives a real festival, and on-device AI is what turns it from a survival mechanism into a competitive advantage. Here's how Zerobeat is building a mesh-replicated, CRDT-backed, AI-native stack for live events.
Every time we talk to an event ops team, somebody says it: "just throw Starlink at it."
It's the right answer to a different problem.
Starlink is great. We're not anti-Starlink. We've watched it cleanly handle livestreams and ticketing at outdoor venues no terrestrial provider would touch. But "throw Starlink at it" is what happens when you mistake a bandwidth problem for an architecture problem. At a festival, the POS still goes down even when the dish is fast. Once you've stood in an ops tent watching it happen, you can't un-see why.
We're building Zerobeat for that reality. Mesh-first, CRDT-replicated, on-device AI, cloud-optional. It's a bet on local-first as the default for live events, not as a fallback. Some of it ships in iOS 26. Some of it ships in Ollama. Some of it is the part nobody's gluing together yet, and that's the part we're building.
The Starlink myth
Starlink solves WAN backhaul: the connection between your venue and the rest of the internet. That's a real problem. Festivals are often in remote fields, parking lots, or beaches with no fiber, no useful cell signal, and no time to run a circuit. A dish in a road case is a legitimately good answer to "how do we get internet on this site at all."
What Starlink does not solve is everything that happens inside the venue boundary, which is exactly where festival POS systems fail.
The headline-grabbing failure at most festivals isn't that the venue lost internet. It's that the venue has internet, and the POS is still timing out, because the LAN between the AP and the POS terminal is competing with 50,000 streaming phones for airtime. A cloud round-trip on a saturated Wi-Fi network is just as broken as a cloud round-trip on no Wi-Fi at all.
Even on the WAN side, Starlink isn't a free pass. Performance depends on clear sky visibility, and the dish needs roughly 100 degrees of unobstructed view. Trees, structures, metal containers, all of which exist at every festival, can occlude the link. Capacity per cell drops in dense regions and during peak. Popular areas already see speeds fall from 150 Mbps down to the 50 to 80 Mbps range during evenings. Heavy weather degrades the signal further. None of that is fatal, but it's enough to make "we have Starlink, we're good" a hopeful statement, not an architectural one.
The deeper problem is that Starlink doesn't change what depends on the network. If your POS architecture assumes a reliable cloud round-trip for every transaction, every menu update, every dashboard query, and every AI inference, then adding faster backhaul just lets it fail more elegantly. The architecture is what has to change.
Three flavors of "offline"
Most cloud POS vendors say they support offline operation. They're not lying. They all do, in some form. But how they do it ranges from useless-at-scale to actually solving the problem.
The siloed model is what Square and Clover ship out of the box. It doesn't even pretend to keep the venue coherent. Each terminal queues its own transactions, holds its own copy of the menu, and counts its own inventory in its own local cache. None of them talk to each other. By the time the cloud comes back, the inventory count is different on every device, and the cashier metrics aren't comparable until you reconcile post-event. The transactions clear. The decisions are gone.
The next step up is the single-primary hub. Toast's "offline mode with local sync" is the most public example. One designated terminal acts as the relay, and every other device routes orders through it during an outage. It's a real improvement over fully siloed offline. But it inherits a familiar set of problems for anyone who's run a distributed system. The hub is a single point of failure. You can only have one per local network (Toast's docs are explicit about this). The state that propagates is mostly orders rather than the full venue picture. And the hub itself is a performance bottleneck at peak, which is exactly when you can't afford a slow lane.
The third option is peer mesh. No designated primary, no shared bottleneck, no per-network limit. Every terminal holds a full local replica of relevant venue state, and edits propagate to peers as they happen. This is the model we're building Zerobeat on, and it's the only one of the three that lets the venue picture survive any single device, link, or uplink dropping. It's also the one with the most interesting downstream consequences. To get there we have to talk briefly about the part that makes it actually work.
CRDTs are the part that makes mesh work
Mesh networks are easy to draw and hard to operate. Every node has its own evolving state, and when two nodes have been disconnected from each other for ten minutes and then reconnect, somebody has to resolve "you ran out of IPA over there but I sold three more over here, what's the count?" If every reconciliation needs a coordinator to break ties, you're back to a single point of failure with extra steps.
CRDTs, conflict-free replicated data types, are the part that makes this tractable. A CRDT is a data structure designed so that concurrent edits from any number of devices, in any order, provably converge on the same final state without any coordination. Take a CRDT counter, increment it on two disconnected devices, reconnect them, and the merged value is the correct total. There's no winner-takes-all, no last-writer-wins, no need for a central referee.
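To make the merge property concrete, here's a toy grow-and-shrink counter in plain Swift. It's a teaching sketch of the idea, not the production CRDT layer described next:

```swift
// Toy PN-counter: per-device increment and decrement totals, merged by max.
// Illustrative only; real CRDT libraries handle far more than counters.
struct PNCounter {
    private(set) var increments: [String: Int] = [:]
    private(set) var decrements: [String: Int] = [:]

    var value: Int {
        increments.values.reduce(0, +) - decrements.values.reduce(0, +)
    }

    mutating func add(_ n: Int, device: String) { increments[device, default: 0] += n }
    mutating func remove(_ n: Int, device: String) { decrements[device, default: 0] += n }

    // Element-wise max makes merge commutative, associative, and idempotent,
    // so replicas converge no matter how often or in what order they sync.
    func merged(with other: PNCounter) -> PNCounter {
        var result = self
        for (device, n) in other.increments {
            result.increments[device] = max(result.increments[device] ?? 0, n)
        }
        for (device, n) in other.decrements {
            result.decrements[device] = max(result.decrements[device] ?? 0, n)
        }
        return result
    }
}

// Two bars start from the same stocked count, sell independently while
// disconnected, then reconnect and merge.
var barA = PNCounter(); barA.add(24, device: "stock")
var barB = barA
barA.remove(3, device: "bar-a")   // bar A sells 3 while offline
barB.remove(5, device: "bar-b")   // bar B sells 5 while offline
print(barA.merged(with: barB).value)   // 16, with no coordinator breaking ties
```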
The CRDT layer we use is Ditto, a CRDT-based offline-first peer-to-peer database that already runs point-of-sale at Chick-fil-A and field operations at Japan Airlines, Alaska Airlines, and the U.S. Air Force. Ditto gives us the hard distributed-systems primitives in production: peer discovery, mesh replication, and conflict-free convergence. We model the POS domain on top: orders, menus, prices, inventory, staff actions, all expressed as CRDTs that Ditto then keeps in sync across every device in the venue.
What that means in practice: every terminal can independently take orders, deduct inventory, log staff actions, and edit menu state. When peers reconnect, the merge is a no-op from the operator's perspective. The venue picture is always self-consistent, even after partial-disconnection scenarios that would melt a traditional primary-replica system. Combined with the mesh's active uplink traversal (when any device in the mesh has internet, payment authorization for the rest of the mesh routes through it in real time), store-and-forward only kicks in when the entire venue is dark. And even then, the operator sees exactly which transactions are at risk.
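For the payment path specifically, the shape is roughly the following. This is a simplified sketch with hypothetical types (`MeshPeer`, `PaymentRequest`, `PaymentRouter`); none of it is Ditto or Zerobeat API:

```swift
// Sketch: authorize through any peer that currently has an uplink; queue
// locally (store-and-forward) only when the whole venue is dark.
struct PaymentRequest { let orderID: String; let amountCents: Int }

enum AuthResult {
    case approved(code: String)
    case declined
    case queuedOffline   // stored locally, flagged as at-risk, replayed later
    case unreachable     // this peer could not reach the processor
}

protocol MeshPeer {
    var hasUplink: Bool { get }
    func authorize(_ request: PaymentRequest) async -> AuthResult
}

final class PaymentRouter {
    private(set) var storeAndForwardQueue: [PaymentRequest] = []

    func submit(_ request: PaymentRequest, peers: [MeshPeer]) async -> AuthResult {
        // Real-time path: route through any peer that has internet right now.
        for peer in peers where peer.hasUplink {
            let result = await peer.authorize(request)
            if case .unreachable = result { continue }   // try the next uplink
            return result
        }
        // Entire venue is dark: store-and-forward, visible to the operator.
        storeAndForwardQueue.append(request)
        return .queuedOffline
    }
}
```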
Engineers who've worked on collaborative editors, distributed databases, or local-first apps will recognize the pattern. We're applying it to a problem domain, live event POS, where the stakes are revenue per minute, not document conflicts. We covered the payment side of this in more depth in the future of event payments. The interesting question is what unified state lets you do next.
Once decisions are local, AI can be too
Every decision worth making at an event is a question about the whole venue, not one terminal. Which zones are underperforming and where do I move staff? Are we about to run out of IPA across the festival, or just at one bar? Is cashier 47 voiding more than the team average, or is the whole shift doing it? Did the price change at 4pm actually move revenue? Answer those without a unified live picture and you're guessing.
Cloud AI, in 2026, is the way most ops dashboards try to answer those questions. It works fine in an office. It fails at events for the exact reasons cloud POS does. The LAN is saturated. The WAN is degraded. The round-trip is unpredictable. The per-token cost is hard to budget across a six-hour peak. And operationally, sending venue data offsite has its own problems. Even if Starlink is connected, even if the dish is fast, asking a hosted LLM "where should I move staff" can take ten seconds while the line you wanted to fix has already cleared.
So the AI has to run on the device. The good news, and the part that makes this post possible to write today rather than as a five-year prediction: it can.
Two things shifted at once. First, the silicon is finally there. iPhone and iPad have shipped Apple Neural Engines for years. On A17 Pro and later, the ANE runs at 35 trillion operations per second, with Core ML and the ANE delivering roughly 30 tokens per second of generation and a time-to-first-token of around 0.6 ms per prompt token on something the size of an iPhone 15 Pro. Second, the framework layer is finally there. In iOS 26, Apple shipped a Foundation Models framework that exposes the same ~3-billion-parameter on-device LLM that powers Apple Intelligence. A few lines of Swift, structured output, tool calling, guided generation, LoRA adapters for fine-tuning. No API key, no per-token bill, no internet round-trip. The model is on the iPad already.
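In practice, the call shape is small enough to live inside the POS app itself. Here's a hedged sketch, assuming the Foundation Models API as Apple has presented it; the `StockRisk` type, its fields, and the prompt are illustrative, and exact signatures may differ from the shipping SDK:

```swift
import FoundationModels

// Sketch only: guided generation against the on-device ~3B model.
// No API key, no per-token bill; the model ships with the OS.
@Generable
struct StockRisk {
    @Guide(description: "Menu item most at risk of running out")
    var item: String
    @Guide(description: "Estimated minutes until depletion at the current sell rate")
    var minutesRemaining: Int
}

func assessStock(from meshSnapshot: String) async throws -> StockRisk {
    let session = LanguageModelSession(
        instructions: "You analyze point-of-sale snapshots for a live event venue."
    )
    let response = try await session.respond(
        to: "Given this inventory and sales snapshot, which item runs out first?\n\(meshSnapshot)",
        generating: StockRisk.self
    )
    return response.content   // a typed StockRisk, not free text
}
```

Guided generation is the part that matters for a POS: the answer comes back as a typed Swift value the app can act on, not prose somebody has to parse.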
Outside the Apple stack, Ollama has become the de facto runtime for local open models. Llama 3.x, Mistral, Gemma, Qwen, and Phi all run on a laptop or a small back-of-house mini-PC with no cloud dependency. A single ruggedized M-series mini behind the generator can run a model big enough to summarize the last five minutes of an event in plain English. For workloads that need more, like fine-tuning on a venue's twelve previous events, or running a custom model alongside Apple's, there's MLX, Apple's array framework optimized for unified memory on Apple Silicon. Same hardware that runs the POS.
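On the Ollama side, integration from a terminal is just a LAN call to its documented HTTP endpoint. A sketch, with the host name and the Q4 model tag as placeholder assumptions:

```swift
import Foundation

// Sketch: ask a back-of-house Ollama box on the venue LAN for a plain-English
// summary. "ops-mini.local" and the model tag are placeholders, not real config.
struct OllamaResponse: Decodable { let response: String }

func summarizeLastFiveMinutes(eventLog: String) async throws -> String {
    var request = URLRequest(url: URL(string: "http://ops-mini.local:11434/api/generate")!)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body: [String: Any] = [
        "model": "llama3.1:8b-instruct-q4_K_M",
        "prompt": "Summarize the last five minutes of venue activity:\n\(eventLog)",
        "stream": false
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: body)
    let (data, _) = try await URLSession.shared.data(for: request)
    return try JSONDecoder().decode(OllamaResponse.self, from: data).response
}
```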
The thing that makes this practical on venue hardware is quantization. A model trained in 16-bit floating point can be compressed to 8-bit, 4-bit, or lower precision with surprisingly little quality loss. Llama 3.1 8B at its native 16-bit weights is around 16 GB. The same model at Q4 is around 4 to 5 GB. Phi-4-mini at Q4_K_M sustains 15 to 20 tokens per second on an M1 MacBook Air. These aren't theoretical numbers. They're what the open-model ecosystem has converged on in the past 18 months.
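The sizing arithmetic is simple enough to check on a napkin: weights take roughly parameters times bits per weight, and Q4 variants land around 4.5 bits per weight once quantization scales are counted in. A rough sketch (runtime overhead like the KV cache comes on top):

```swift
// Back-of-the-envelope weight footprint: parameters × bits per weight / 8 bytes.
// Ignores runtime overhead (KV cache, activations), which adds more on top.
func weightFootprintGB(parametersBillions: Double, bitsPerWeight: Double) -> Double {
    parametersBillions * bitsPerWeight / 8
}

print(weightFootprintGB(parametersBillions: 8, bitsPerWeight: 16))   // ~16 GB: native 16-bit weights
print(weightFootprintGB(parametersBillions: 8, bitsPerWeight: 4.5))  // ~4.5 GB: roughly Q4
print(weightFootprintGB(parametersBillions: 3, bitsPerWeight: 4.5))  // ~1.7 GB: a small task model
```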
The more interesting consequence is what those small footprints imply. We don't need a generalist chatbot at a festival. We need a fleet of small, sharp, purpose-built models. A demand-forecasting model trained on the last twelve events at this venue. A fraud-anomaly classifier that knows what sweethearting at a festival bar actually looks like. A menu-extraction model that turns a sponsor PDF into a configured zone in seconds. A structured-query model that maps "revenue by zone last 30 minutes" to a local query plan against the mesh state. Each is 1 to 4 billion parameters at most. Quantize them to Q4 and they're a few hundred megabytes to a couple of gigabytes each. You can pin one per task and have them all resident on the same iPad without thrashing, alongside Apple's Foundation Model, which you call for the long-tail conversational stuff.
So that's the architecture. Mesh-replicated state on every device, plus a small fleet of quantized purpose-built models running on the silicon already in the cashier's hand. The cloud is no longer in the critical path for any of it.
What this unlocks
The point of the architecture isn't the architecture. It's what becomes possible once the state and the inference both live at the edge.
Predictive staffing, local. A cashier at the East Gate doesn't have to wait for a cloud round-trip. The on-device model reads the local mesh state, the same shared view every other terminal sees, and surfaces "move 2 from Main Stage to East Gate, wait time about to break 8 minutes." The model fits in memory. The data is already on the device. There is no internet involved.
Anomaly detection, local. Sweethearting, void clusters, and discount stacking are inherently cross-terminal patterns. With mesh state, every device sees the venue-wide pattern. With a local model, every device can score it. Fraud catches itself in real time, even when the venue is offline.
Natural-language ops, local. "Show me revenue by zone for the last 30 minutes." On a cloud POS that's a query that travels to a data center, runs against a warehouse, and waits for a network already saturated by 50,000 phones. On a Foundation Models-powered POS with mesh state, that query runs on the iPad in the ops tent in under a second. No internet required.
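A hedged sketch of how that might be shaped, using the same guided-generation pattern as above; the `RevenueQuery` fields and the idea of resolving it against the terminal's local replica are illustrative, not a shipping schema:

```swift
import FoundationModels

// Sketch: turn an operator's question into a structured query that a terminal
// can resolve against the mesh state it already holds. Illustrative fields only.
@Generable
struct RevenueQuery {
    @Guide(description: "Grouping dimension, e.g. zone, terminal, or item")
    var groupBy: String
    @Guide(description: "Lookback window in minutes")
    var windowMinutes: Int
}

func parseOpsQuestion(_ question: String) async throws -> RevenueQuery {
    let session = LanguageModelSession(
        instructions: "Translate operator questions into structured revenue queries."
    )
    // "Show me revenue by zone for the last 30 minutes"
    //   -> RevenueQuery(groupBy: "zone", windowMinutes: 30),
    //      then aggregated locally against the replica, no network involved.
    return try await session.respond(to: question, generating: RevenueQuery.self).content
}
```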
Demand forecasting, local. A LoRA-tuned Foundation Model that's seen the last twelve events at this venue can project IPA depletion, popcorn runout, and bar congestion 30 minutes ahead. The model is a few hundred megabytes. The history is on the mesh. The inference happens between two card swipes.
Smart menus, local. Mid-event menu and price adjustments based on what's selling, what's not, and what's expiring no longer require a cloud call. The model recommends. The change propagates over the mesh. Every terminal updates in seconds.
The bet we're making
Cloud architectures are great when connectivity is plentiful and the workload is occasional. Events flip both. So the architecture flips too. Local is primary. Cloud is optional. AI sits where the data already is. The mesh is the substrate. Starlink is a useful upgrade to the WAN, not a strategy.
That's the whole bet. Mesh-first, so the venue's state survives any single device or any single uplink dropping. CRDT-replicated, so every terminal sees the same picture without a coordinator. On-device AI ready, because every iPad already has the silicon, and quantization makes purpose-built models small enough to ship alongside the POS app. Cloud-optional, because the cloud is a nice-to-have, not a dependency.
None of these pieces are speculative. iOS 26 is shipping. Ollama is shipping. MLX is shipping. The Apple Neural Engine has been delivering trillions of operations per second for years. CRDTs have been load-bearing in collaborative software since well before 2026. The piece that's been missing for live events isn't compute. It's a POS architecture that can hand a useful, current, venue-wide view to that compute, even when the internet is gone.
That's the part we're building. If you've stood in a festival ops tent watching a cloud dashboard freeze during peak, you already know why it matters. The terminals never stopped working. The decisions did. Getting offline right at events isn't really about clearing a card when the Wi-Fi dies. It's about knowing what to do next.
If you run live events, see how Zerobeat does it or request early access.
If you're an investor and architecture posts like this matter to your thesis, we're currently raising. Get in touch.