gemma on Two Tigers Engineering

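https://blog.twotigers.xyz/tags/gemma/

Deploying the Gemma 4 E2B Local Model with llama.cpp on an ARM VPS
https://blog.twotigers.xyz/posts/llama-cpp/
Sat, 04 Apr 2026 14:00:00 +0800

Deploying a local LLM on an ARM VPS with the llama.cpp Docker image: notes on the move from Gemma 3 to Gemma 4, performance testing, and resource monitoring.

Background: the Gemma 4 release

In March 2026, Google DeepMind released the Gemma 4 family of open-source models. Gemma 4 brings several major upgrades:

- Reasoning: every model in the family supports a configurable chain-of-thought reasoning mode
- Multimodality: text and image input are supported; the small models (E2B/E4B) additionally support audio
- MoE + Dense dual architecture: both Dense and Mixture-of-Experts (MoE) architectures are offered
- Long context: 128K tokens for the small models, 256K tokens for the large models
- Native function calling: function calling is supported, which suits agent scenarios
- Native system prompt: the system role is natively supported for the first time

Gemma 4 comes in four sizes:

| Model | Architecture | Effective parameters | Context length | Notes |
| E2B | Dense | 2.3B | 128K | Lightweight and efficient, supports audio, suited to phones and edge devices |
| E4B | Dense | 4.5B | 128K | Balanced performance, supports audio |
| 26B A4B | MoE | 3.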