Poolside Releases Free Open-Weight Coding Model With July 9 Upgrade Deadline
6 hour ago / Read about 36 minute
Source:TechTimes

Poolside.ai

Poolside released its latest open-weight coding model last Thursday, July 2, and developers already running the predecessor on OpenRouter have five days to upgrade before it disappears. The San Francisco-based AI lab dropped Laguna XS 2.1 as a free download on Hugging Face and a free API tier on OpenRouter — the lowest-friction entry point Poolside has offered — while simultaneously announcing that Laguna XS.2 will retire from its API and OpenRouter on July 9, 2026.

For developers who have been evaluating whether to bring a capable agentic coding model into their local workflow without a cloud subscription or enterprise contract, the timing matters. The model is free to download and run; it handles multi-step software engineering tasks on a single GPU; and it ships under OpenMDW-1.1, a new permissive license designed specifically for AI model weights that gives enterprise legal teams cleaner standing than Apache 2.0 ever did.

How Mixture-of-Experts Makes a 33-Billion-Parameter Model Fit on One GPU

The technical story behind XS 2.1's local viability comes down to its Mixture-of-Experts (MoE) architecture. The model has 33 billion total parameters — a figure that sounds like it requires a data center — but it only activates 3 billion of those parameters per token during inference. A gating network, or router, routes each incoming token to a small subset of specialized feedforward sub-networks called "experts," leaving the rest idle. The result is that XS 2.1's per-token compute cost is comparable to a 3-billion-parameter dense model, while its total parameter count gives it the knowledge capacity of a much larger system.

This design is why the model runs on a single GPU. At INT4 quantization — the most compressed format available — the 33B model fits in roughly 16 to 20 gigabytes of VRAM, within reach of a consumer RTX 4090 or a Mac configured with 36GB of unified memory. The predecessor XS.2 used the same architecture; XS 2.1 carries over the design without structural changes.

What is new in XS 2.1 is the performance and the local inference story. On SWE-bench Multilingual, a benchmark measuring the model's ability to fix real GitHub issues across multiple programming languages, XS 2.1 climbed 5.4 percentage points from its predecessor to reach 63.1%. Terminal-style agentic tasks also show improvements.

DFlash Speculative Decoding: Why Local Inference Just Got Faster

Alongside XS 2.1, Poolside open-weighted a set of DFlash speculator models — one tuned for each quantization checkpoint (FP8, INT4, and NVFP4). These companion models implement speculative decoding, an inference acceleration technique that pairs a small, fast draft model with a larger target model.

The mechanism works like this: the DFlash draft model proposes several candidate tokens ahead, then the larger XS 2.1 model verifies them all in parallel in a single forward pass. Tokens the target model agrees with are accepted; mismatches are corrected on the fly. The verification happens in parallel rather than sequentially, which amortizes the cost of each XS 2.1 forward pass across multiple tokens. The output is mathematically identical to running XS 2.1 alone — no quality is lost. Poolside says the DFlash speculator roughly doubles tokens per second in local inference.

Developers running SGLang can activate DFlash via --speculative-algorithm DFLASH and point the draft model path to the corresponding checkpoint. Integration with vLLM is also supported.

The practical consequence is that local coding agents powered by XS 2.1 become significantly more responsive — an important criterion for the agentic, multi-step workflows the model is designed for, where a slow model quickly breaks a developer's focus.

What the Benchmarks Mean — and What They Don't

Poolside's benchmark figures deserve careful handling. All XS 2.1 benchmarking was completed using the Laude Institute's Harbor Framework, with Poolside's own agent harness, capped at 500 steps per task inside a sandboxed environment. The comparison model scores included in Poolside's benchmark tables come almost entirely from other vendors' own release materials — not from a neutral, independently-run evaluation.

The SWE-bench ecosystem has well-documented reliability problems. Independent analysis of top leaderboard entries has found that a meaningful percentage of cases labeled as "solved" by agent evaluations are semantically incorrect — passing unit tests by coincidence or by reward-hacking the evaluation harness, not by producing code that actually works. Poolside ran a post-hoc reward-hack judge on its XS 2.1 evaluation runs and reported no significant reward hacking found. That is better practice than many labs, but it is still self-reported.

The honest framing: a 63.1% SWE-bench Multilingual score is a meaningful signal that XS 2.1 is competitive in its weight class. Developers who want confirmation before committing infrastructure should benchmark the model on their own actual tasks rather than assuming the leaderboard number transfers directly.

XS 2.1 also ships in four quantization formats. BF16 delivers the highest accuracy but requires the most VRAM. FP8 is Poolside's production-optimized format, balancing accuracy and speed. NVFP4 targets NVIDIA Blackwell architecture (GB300 and similar). INT4 gives the smallest footprint for memory-constrained hardware. GGUF checkpoints for llama.cpp are planned but not yet available.

Read more: NASA Moves Roman Space Telescope Launch Up To August 30: A Billion-Galaxy Survey Arrives 8 Months Early

OpenMDW-1.1: What the License Change Actually Means

XS.2 shipped under Apache 2.0. XS 2.1 ships under OpenMDW-1.1, released by the Linux Foundation on May 28, 2026. NVIDIA simultaneously adopted OpenMDW-1.1 for its Cosmos, Isaac GR00T, Ising, and Nemotron model families.

The change is more substantive than a branding update. Apache 2.0 was designed for software code, not AI model weights. Applying it to a set of trained parameters leaves several legal questions unanswered — whether patent rights are conveyed, whether database rights apply, what happens with model-generated outputs. OpenMDW-1.1 was purpose-built to cover model weights, architecture documentation, training-related code, and generated outputs under a single legal framework. It explicitly grants copyright, patent, database, and trade secret rights, and it imposes no conditions on model outputs. For enterprise procurement and legal teams, that is a meaningfully cleaner foundation than repurposed software licensing.

The practical effect: developers and organizations can use, modify, and deploy XS 2.1 commercially without the restrictions that have slowed adoption of some custom AI licenses. The only obligation is retaining a copy of the OpenMDW-1.1 license in any redistribution of the model materials themselves.

How Poolside Fits Into the Coding AI Market in 2026

Poolside's decision to release XS 2.1 free follows a proven playbook in frontier AI: give developers a capable open-weight model, earn trust and mindshare, and convert adoption into enterprise pipeline. GitHub Copilot had surpassed 25 million total users by early 2026; Anthropic's Claude Code crossed $1 billion in annualized run rate early in its commercial life; Anysphere's Cursor was reportedly in talks for funding at valuations above $50 billion the same period. The AI coding tool market has become one of the most competitive in enterprise software.

Against those incumbents, Poolside's competitive argument rests on a specific claim: for developers who need to run a capable coding model locally — within their own infrastructure, on their own hardware, without sending proprietary code to a third-party API — no closed competitor can match an open-weight model with strong agentic performance and a clean license. On that specific axis, the argument is sound.

Poolside was founded in 2023 by Jason Warner, the former CTO of GitHub who helped build the original GitHub Copilot, alongside co-founder Eiso Kant. The company raised a $500 million Series B in October 2024, led by Bain Capital Ventures, at a $3 billion valuation — and subsequently attracted an announced commitment from NVIDIA of up to $1 billion as part of a broader funding round. That round did not close. In April 2026, DataCenterDynamics reported that the CoreWeave anchor-tenant agreement for Poolside's planned 2-gigawatt Texas data campus had collapsed, and the associated $2 billion Series C — which would have included the NVIDIA commitment — fell apart after investors raised concerns about Poolside's ability to train models competitive with Anthropic, OpenAI, and Google DeepMind at the frontier.

Poolside is now a smaller company competing in a market where the tier-1 labs have dramatically outpaced it on ARR. The XS 2.1 release is a genuine technical contribution, and the open-weight angle fills a real gap. Developers evaluating it for production use should factor in the company's current financial position alongside the model's performance characteristics.

Is Poolside Reliable Enough for Production Use?

The technical fundamentals of XS 2.1 are not in question. Poolside's Model Factory process — all training configurations version-controlled, experiments tracked by unique IDs, the entire pipeline reproducible from committed code — reflects serious infrastructure engineering. The model was trained using Reinforcement Learning from Code Execution Feedback (RLCEF), Poolside's proprietary method in which the model learns from actual code execution results rather than human preference labels, which is directly relevant to agentic coding quality. The open weights, once downloaded, are not dependent on Poolside's continued operation; GGUF and Hugging Face availability means the model can run offline and independently.

The business risk is real but addressable. Developers who rely on Poolside's API for inference should understand that the service exists at a company managing significant financial uncertainty. Developers who download and self-host the weights face no such dependency.

Poolside's paid API tier prices XS 2.1 at $0.10 per million input tokens, $0.20 per million output tokens, and $0.05 per million cache-read tokens — through Poolside's dedicated endpoint. OpenRouter's paid tier reflects slightly different pricing. The free tier on OpenRouter carries no cost for evaluation.

Upgrade Path and July 9 Deadline

Laguna XS.2 (free on OpenRouter) displays an active "Going away July 9, 2026" notice. After that date, XS.2 will no longer be available through the OpenRouter or Poolside API. XS.2 will continue to be hosted by Baseten as a dedicated deployment option for teams running it on private infrastructure.

For developers currently on the paid XS.2 API tier, the upgrade to XS 2.1 is seamless — pricing is matched, the model ID changes from poolside/laguna-xs.2 to poolside/laguna-xs-2.1, and no other integration changes are required.

For new evaluators starting fresh, Poolside recommends using its terminal-based coding agent, pool, as the primary interface. The agent is open-source, ACP (Agent Client Protocol) compatible, and the same harness Poolside uses internally for model training and evaluation. Developers can also access the model through any ACP-compatible client.

Read more: AI Agent Business Models Split Four Ways: Open-Source Infrastructure, Token Distribution, SaaS, Acquisition


Frequently Asked Questions

What is Mixture-of-Experts (MoE) and why does it matter for local deployment?

Mixture-of-Experts is a neural network design that replaces dense feedforward layers with a set of specialized "expert" sub-networks and a router that directs each token to only a few of them. Because only a small fraction of total parameters activate per token, a large MoE model costs no more compute per inference step than a much smaller dense model. Laguna XS 2.1 has 33 billion total parameters but activates 3 billion per token — about 9% of its capacity. That is what allows it to run on a single GPU while retaining the knowledge breadth of a much larger system.

How does Laguna XS 2.1 compare to Claude Code, GitHub Copilot, and Cursor?

Claude Code, GitHub Copilot, and Cursor are closed APIs or API-dependent tools — they cannot be downloaded and run on private infrastructure without sending code to a third-party server. Laguna XS 2.1's open weights can be downloaded and run entirely on a developer's own hardware, making it the relevant alternative specifically for organizations with data residency requirements, air-gapped environments, or budgets that favor compute over per-token costs at scale. On benchmark scores, XS 2.1 is competitive within the open-weight 33B class but trails the frontier closed models (Claude Mythos Preview, GPT-5 series) on absolute performance. The right comparison is against other locally-deployable open-weight coding models: Cohere's North Mini Code, Qwen3.6-35B-A3B, and similar weight-class options.

Are the SWE-bench scores independently verified?

No. All Poolside benchmark results for XS 2.1 come from Poolside's own evaluation runs, using the Laude Institute's Harbor Framework with Poolside's agent harness. Comparison model scores came from other vendors' release materials, not a neutral third-party evaluation. Poolside ran a post-hoc reward-hacking check and found no significant issues, but the numbers are self-reported. Developers should treat these as signals rather than proofs and run their own benchmarks on representative tasks before making infrastructure commitments.

What is the OpenMDW-1.1 license, and is XS 2.1 safe for commercial use?

OpenMDW-1.1, developed by the Linux Foundation with Amazon, Meta, IBM, Microsoft, and others, is a permissive license built specifically for AI model weights and related artifacts. It grants copyright, patent, database, and trade secret rights for any use — including commercial deployment, modification, and redistribution. The only obligation is retaining the license file in any redistribution of the model materials. Generated outputs carry no license obligations. NVIDIA has adopted OpenMDW-1.1 for its Cosmos, Isaac GR00T, and Nemotron model families. For teams that previously avoided open-weight models due to Apache 2.0 ambiguity around weights versus software, OpenMDW-1.1 resolves most of those concerns.