
Canonical, 23 October 2025

Introducing silicon-optimized inference snaps


Install a well-known model like DeepSeek R1 or Qwen 2.5 VL with a single command, and get the silicon-optimized AI engine automatically.

London, October 23 – Canonical today announced silicon-optimized inference snaps, a new way to deploy AI models on Ubuntu devices, with automatic selection of optimized engines, quantizations and architectures based on the specific silicon of the device. Canonical is working with a wide range of silicon providers to deliver their optimizations of well-known LLMs to developers and devices.

A single well-known model like Qwen 2.5 VL or DeepSeek R1 has many different sizes and setup configurations, each of which is optimized for specific silicon. It can be difficult for an end-user to know which model size and runtime to use on their device. Now, a single command gets you the best combination, automatically. Canonical is working with silicon partners to integrate their optimizations. As new partners publish their optimizations, the models will become more efficient on more devices.

This enables developers to integrate well-known AI capabilities seamlessly into their applications and have them run optimally across desktops, servers, and edge devices.

A snap package can dynamically load components: on installation, the snap fetches the recommended build for the host system, simplifying dependency management while improving latency. The public beta includes Intel- and Ampere®-optimized DeepSeek R1 and Qwen 2.5 VL as examples, and open sources the framework by which these are built.
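As a minimal sketch of that flow (both commands are standard snapd; how the selected engine build surfaces in the output is an assumption, since the announcement does not document it):

sudo snap install qwen-vl --beta   # fetches the build recommended for this machine's silicon
snap info qwen-vl                  # inspect the installed snap and channel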

“We are making silicon-optimized AI models available for everyone. When enabled by the user, they will be deeply integrated down to the silicon level,” said Jon Seager, VP Engineering at Canonical. “I’m excited to work with silicon partners to ensure that their silicon-optimized models ‘just work.’ Developers and end-users no longer need to worry about the complex matrix of engines, builds and quantizations. Instead, they can reliably integrate a local version of the model that is as efficient as possible and continuously improves.”

The silicon ecosystem invests heavily in performance optimizations for AI, but developer environments are complex and lack simple tools for assembling all the necessary components into a complete runtime environment. On Ubuntu, the community can now distribute their optimized stacks straight to end users. Canonical worked closely with Intel and Ampere to deliver hardware-tuned inference snaps that maximize performance.

“By working with Canonical to package and distribute large language models optimized for Ampere hardware through our AIO software, developers can simply get our recommended builds by default, already tuned for Ampere processors in their servers,” said Jeff Wittich, Chief Product Officer at Ampere. “This brings Ampere’s high performance and efficiency to end users right out of the box. Together, we’re enabling enterprises to rapidly deploy and scale their preferred AI models on Ampere systems with Ubuntu’s AI-ready ecosystem.”

“Intel optimizes for AI workloads from silicon to high-level software libraries. Until now, a developer has needed the skills and knowledge to select which model variants and optimizations may be best for their client system,” said Jim Johnson, Senior VP, GM of Client Computing Group, Intel. “Canonical’s approach to packaging and distributing AI models overcomes this challenge, enabling developers to extract the performance and cost benefits of Intel hardware with ease. One command detects the hardware and uses OpenVINO, our open source toolkit for accelerating AI inference, to deploy a recommended model variant, with recommended parameters, onto the most suitable device.”

Get started today 

Get started and run silicon-optimized models on Ubuntu with the following commands:

sudo snap install qwen-vl --beta

sudo snap install deepseek-r1 --beta

Developers can begin experimenting with the local, standard inference endpoints of these models to power AI capabilities in their end-user applications, as sketched below.
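For example, if the snap exposes an OpenAI-compatible chat completions endpoint on localhost (the port and path below are illustrative assumptions, not documented in this announcement), a first request could look like:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-r1", "messages": [{"role": "user", "content": "Hello"}]}'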

Learn more and provide feedback

About Canonical 

Canonical, the publisher of Ubuntu, provides open source security, support and services. Our portfolio covers critical systems, from the smallest devices to the largest clouds, from the kernel to containers, from databases to AI. With customers that include top tech brands, emerging startups, governments and home users, Canonical delivers trusted open source for everyone.

Learn more at https://canonical.com/
