Our Products

Find-A-Tenant

Find-A-Tenant is an AI-powered rental assistant for international students in the UK. Instead of filling out rigid search filters, users describe their ideal home in everyday language — and the AI matches them with suitable listings.
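The idea can be sketched in two stages: an LLM parses the free-text request into structured preferences, and a simple matcher filters listings against them. The sketch below covers only the second stage; the `Preferences` and `Listing` fields are illustrative assumptions, not Find-A-Tenant's actual schema.

```python
# Hypothetical sketch: once a free-text request has been parsed into
# structured preferences, matching listings is ordinary filtering.
from dataclasses import dataclass

@dataclass
class Preferences:
    max_rent: int          # monthly rent ceiling, GBP
    city: str
    min_bedrooms: int = 1

@dataclass
class Listing:
    title: str
    rent: int
    city: str
    bedrooms: int

def match(prefs: Preferences, listings: list[Listing]) -> list[Listing]:
    """Return listings satisfying the parsed preferences, cheapest first."""
    hits = [l for l in listings
            if l.city.lower() == prefs.city.lower()
            and l.rent <= prefs.max_rent
            and l.bedrooms >= prefs.min_bedrooms]
    return sorted(hits, key=lambda l: l.rent)
```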

AI QR Code

AI QR Code generates artistic QR codes using AI, with a breakthrough capability: embedding multiple independently scannable QR codes within a single cohesive artwork. The signature feature — couple QR codes — places two people in one image, each with their own scannable code linking to different content.

DeepReader

DeepReader is your AI reading companion. It automatically aggregates book reviews from multiple platforms and provides an AI partner for in-depth book discussions. Read a book and want to know what others think? DeepReader collects reviews from Douban, Bilibili, and YouTube, then lets you discuss any book with AI anytime.

BTMR Paper Extractor

BTMR Paper Extractor is an AI-powered reading assistant that transforms complex academic papers into easy-to-read structured summaries. Built as a Cerebras Hackathon project, it supports importing from arXiv links, PDF uploads, and web URLs, then automatically generates digestible summaries exportable as HTML and PDF.

ContextKeeper

ContextKeeper is a local AI assistant that lets you control your PC with voice or text commands — adjusting system settings, game parameters, and peripheral lighting. Built on NVIDIA's G-Assist platform, it runs entirely on your local RTX GPU with no cloud dependency, ensuring privacy and fast response times. It placed 4th at the NVIDIA G-Assist Hackathon and is listed on the NVIDIA Store.

SwarmX

SwarmX is a scheduler agent framework for large-scale agentic workflow clusters. Submitted to OSDI 2026 and deployed in Tencent WeChat's production environment, it addresses the critical challenge of efficiently scheduling complex AI agent workflows across hundreds of heterogeneous GPUs and millions of CPU cores.

AI4Whisky

AI4Whisky is a collaboration with the Scottish Government, the University of Edinburgh, and ICBD to help whisky distilleries calculate their carbon emissions and receive tailored reduction recommendations. Scotland's whisky industry generated £710 million in added value in 2022, and the sector aims for net-zero emissions by 2040 — but most small and medium distilleries lack the resources for proper carbon footprint assessment.

PrivAgent

PrivAgent is an efficient AI agent architecture for real-time privacy risk monitoring in sensitive environments. Developed in collaboration with the NHS (UK National Health Service) Sandbox and submitted to ACL, it solves the critical challenge of ensuring AI agents comply with privacy regulations like HIPAA and GDPR at every action — in real time, not as an afterthought.
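The "compliance at every action" idea can be illustrated with a toy guard that vets each agent action before it executes. The patterns below are stand-ins; real HIPAA/GDPR policy checks are far richer than two regexes.

```python
# Hypothetical sketch: block any agent action whose payload appears to
# leak a personal identifier. Illustrative only, not PrivAgent's policy engine.
import re

BLOCK_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),          # US SSN-like number
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),    # email address
]

def guard(action: str, payload: str) -> bool:
    """Return True if the action may proceed, False if it must be blocked."""
    return not any(p.search(payload) for p in BLOCK_PATTERNS)
```

A real-time monitor would wrap every tool call the agent makes in such a check, rather than auditing logs after the fact.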

BioVLM 8B

BioVLM 8B is a cost-efficient scientific domain vision-language model that surpasses GPT-5.2 on biological research tasks. Developed in collaboration with Harvard Medical School and Edinburgh's Roslin Institute, it uses automated rich-text data synthesis from raw PDF papers to train a domain-specialized VLM — with the entire pipeline costing less than $200.

WaferLLM

WaferLLM is the first wafer-scale LLM inference system, designed for a next-generation AI accelerator with hundreds of thousands of cores, tens of gigabytes of distributed on-chip memory, and tens of PB/s on-chip bandwidth. It introduces novel parallel strategies and kernel implementations that achieve orders-of-magnitude performance improvements over GPU-based systems.

ServerlessLLM

ServerlessLLM is a low-latency serverless inference system for large language models. Its core innovations — multi-tier checkpoint loading, live inference migration, and startup-time-optimized scheduling — have been adopted by nearly every major AI cloud provider, delivering 10–200x latency reductions over state-of-the-art serverless systems.
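The multi-tier idea can be sketched as a fallback chain: look for a checkpoint in the fastest storage tier first, then progressively slower ones. The tier names and load API below are illustrative assumptions, not ServerlessLLM's interface.

```python
# Hypothetical sketch of multi-tier checkpoint loading: return the copy
# from the fastest tier that holds it (tiers are ordered fastest first).
Tiers = list[tuple[str, dict[str, bytes]]]  # (tier name, checkpoint store)

def load_checkpoint(model: str, tiers: Tiers) -> tuple[str, bytes]:
    """Return (tier name, data) for the fastest tier containing the model."""
    for name, store in tiers:
        if model in store:
            return name, store[model]
    raise KeyError(f"checkpoint for {model} not found in any tier")
```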

MICA

MICA is the first end-to-end compiler for mesh-based AI accelerators. Submitted to OSDI 2026 and developed in collaboration with leading AI chip manufacturers, it enables automatic model adaptation for next-generation AI hardware — turning weeks of manual wafer-scale scheduling into hours of automated compilation, with generated code outperforming expert hand-tuned implementations.

ContextPilot

ContextPilot accelerates long-context LLM inference through context reuse — a new paradigm that identifies overlapping context blocks across users and conversation turns to maximize KV-cache reuse while maintaining or even improving inference quality. Developed in collaboration with Tencent.
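Identifying overlapping context blocks can be sketched with prefix-chained block hashing: hash fixed-size token blocks chained on the prefix, so two requests sharing a prefix produce identical block IDs and their KV-cache entries can be shared. The block size and hashing scheme here are illustrative assumptions, not ContextPilot's actual design.

```python
# Hypothetical sketch: chained block hashing for prefix-based KV reuse.
import hashlib

BLOCK = 4  # tokens per block (real systems use larger blocks, e.g. 16 or 32)

def block_ids(tokens: list[int]) -> list[str]:
    """Hash each complete block chained on its prefix hash."""
    ids, prev = [], ""
    for i in range(0, len(tokens) - len(tokens) % BLOCK, BLOCK):
        chunk = tokens[i:i + BLOCK]
        h = hashlib.sha256((prev + "|" + ",".join(map(str, chunk))).encode())
        prev = h.hexdigest()
        ids.append(prev)
    return ids

def reusable_blocks(a: list[int], b: list[int]) -> int:
    """Number of leading KV blocks request b can reuse from request a."""
    n = 0
    for x, y in zip(block_ids(a), block_ids(b)):
        if x != y:
            break
        n += 1
    return n
```

Because each block ID folds in the hash of everything before it, a match on block k guarantees the two requests agree on the entire prefix up to k, which is the condition for safely reusing that block's KV-cache entries.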
