opensource

5 stories

Running Transformers.js in a Chrome extension: the Manifest V3 architecture that actually works

A practical walkthrough of how to host Transformers.js models in a Chrome extension background service worker under Manifest V3, covering runtime separation, messaging contracts, model caching, and the agent tool-execution loop.

Apr 26, 2026

1 source · primary

opensource Official

Gemma 4 runs as a vision-language-action agent on an 8 GB Jetson Orin Nano Super

A step-by-step demo shows Gemma 4 handling speech input, autonomous webcam activation, and spoken output on NVIDIA's Jetson Orin Nano Super using llama.cpp, Parakeet STT, and Kokoro TTS — no keyword triggers, no hardcoded logic.

Apr 26, 2026

1 source · primary

opensource Official

Gemma 4 releases four model sizes under Apache 2.0, with the 31B ranked third among all open models

Google DeepMind's Gemma 4 family ranges from a 2B mobile model to a 31B dense model, supports 140+ languages, adds 256K context for the larger variants, and ships under a fully permissive open-source license.

Apr 26, 2026

1 source · primary

opensource Official

QIMMA validates Arabic benchmarks before running models on them — and finds systematic problems in established datasets

Researchers from TII UAE built QIMMA, the only Arabic LLM leaderboard combining quality validation, native content, and code evaluation. A two-stage pipeline of LLM scoring and human review revealed recurring quality failures across widely used Arabic benchmarks.

Apr 26, 2026

1 source · primary

opensource Official

DeepSeek-V4 cuts KV cache to 2% of standard cost to make million-token agent context practical

DeepSeek-V4 combines two new attention mechanisms with agent-specific post-training to reduce KV cache memory to roughly 2% of what a standard grouped-query-attention architecture requires, targeting long-horizon agentic workloads rather than chat.

Apr 26, 2026

1 source · primary