[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-how-to-run-gemma-4-locally-unsloth-en":3,"article-related-how-to-run-gemma-4-locally-unsloth-en":21,"series-industry-b7998c1b-8e10-4f65-aef3-59a428f36541":64},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":11,"views":17,"created_at":18,"published_at":19,"topic_cluster_id":20},"b7998c1b-8e10-4f65-aef3-59a428f36541","how-to-run-gemma-4-locally-unsloth-en","How to Run Gemma 4 Locally","\u003Cp data-speakable=\"summary\">Run \u003Ca href=\"\u002Ftag\u002Fgoogle\">Google\u003C\u002Fa> Gemma 4 locally with Unsloth Studio or llama.cpp.\u003C\u002Fp>\u003Cp>This guide is for developers who want to run Google’s Gemma 4 models on a laptop, desktop, or workstation without relying on a hosted \u003Ca href=\"\u002Ftag\u002Fapi\">API\u003C\u002Fa>. After you follow the steps, you will have a local setup for downloading, launching, and chatting with Gemma 4, plus the settings you need for thinking mode, multimodal input, and memory planning.\u003C\u002Fp>\u003Cp>You can use either \u003Ca href=\"https:\u002F\u002Funsloth.ai\u002Fdocs\" target=\"_blank\" rel=\"noopener noreferrer\">Unsloth documentation\u003C\u002Fa> and \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Funslothai\u002Funsloth\" target=\"_blank\" rel=\"noopener noreferrer\">Unsloth on GitHub\u003C\u002Fa> for a browser-based workflow, or \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fllama.cpp\" target=\"_blank\" rel=\"noopener noreferrer\">llama.cpp on GitHub\u003C\u002Fa> for direct local \u003Ca href=\"\u002Ftag\u002Finference\">inference\u003C\u002Fa>. Gemma 4 is Apache-2.0 licensed, supports text, image, and audio on selected variants, and can run with quantized weights to fit smaller machines.\u003C\u002Fp>\u003Ch2>Before you start\u003C\u002Fh2>\u003Cul>\u003Cli>Google or Hugging Face account for model downloads.\u003C\u002Fli>\u003Cli>Local machine with macOS, Windows, Linux, or WSL.\u003C\u002Fli>\u003Cli>Node not required.\u003C\u002Fli>\u003Cli>Python 3.10+ for Unsloth Studio workflows.\u003C\u002Fli>\u003Cli>CMake 3.22+ and a C++ compiler for building llama.cpp.\u003C\u002Fli>\u003Cli>Git 2.30+ installed.\u003C\u002Fli>\u003Cli>NVIDIA GPU optional, but helpful for faster inference.\u003C\u002Fli>\u003Cli>At least 8 GB RAM for Gemma-4-12B in 4-bit, or 5 GB RAM for E2B in 4-bit.\u003C\u002Fli>\u003Cli>Hugging Face CLI or pip access for model downloads.\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>Step 1: Choose a Gemma 4 variant\u003C\u002Fh2>\u003Cp>Goal: pick the model that matches your hardware before you download anything. Gemma 4 comes in E2B, E4B, 12B Unified, 26B-A4B, and 31B, with different memory needs and tradeoffs between speed and quality.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780777065791-3eje.png\" alt=\"How to Run Gemma 4 Locally\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>Use the smallest model that still matches your task. E2B and E4B are best for laptops and \u003Ca href=\"\u002Fnews\u002Ftether-bitnet-fine-tuning-edge-devices-en\">edge devices\u003C\u002Fa>. 12B Unified is a balanced local multimodal option. 26B-A4B is the speed and quality middle ground. 31B is the strongest model if you can afford the memory.\u003C\u002Fp>\u003Cp>Verification: you should be able to state the target memory budget, such as 8 GB for 12B at 4-bit or 20 GB for 31B at 4-bit.\u003C\u002Fp>\u003Ch2>Step 2: Install Unsloth Studio\u003C\u002Fh2>\u003Cp>Goal: get a browser UI that can search, download, and run Gemma 4 locally. Unsloth Studio supports GGUF and MLX files and can auto-set inference parameters for you.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780777068701-in0h.png\" alt=\"How to Run Gemma 4 Locally\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>Install it following the Unsloth Studio guide in the docs, then launch the local server and open the UI in your browser. The workflow is: install, start the app, and sign in with the local password you create on first launch.\u003C\u002Fp>\u003Cpre>\u003Ccode>python -m pip install unsloth-studio\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>Verification: you should see the Studio UI at \u003Ccode>http:\u002F\u002F127.0.0.1:8888\u003C\u002Fcode> and be able to reach the Chat tab.\u003C\u002Fp>\u003Ch2>Step 3: Download the Gemma 4 model\u003C\u002Fh2>\u003Cp>Goal: fetch the quantized model that fits your device. In Unsloth Studio, search for Gemma 4 in the model browser and download the quant you want. In direct workflows, use Hugging Face and choose a GGUF or MLX build.\u003C\u002Fp>\u003Cp>If you are starting with local inference, use 8-bit for E2B or E4B, and Dynamic 4-bit for 12B, 26B-A4B, or 31B. If downloads stall, the source recommends checking Hugging Face Hub and XET debugging guidance.\u003C\u002Fp>\u003Cp>Verification: you should see the model file or shard list fully downloaded, with enough free memory left for runtime overhead.\u003C\u002Fp>\u003Ch2>Step 4: Run Gemma 4 with the right chat settings\u003C\u002Fh2>\u003Cp>Goal: start inference with Gemma 4’s expected prompt format and reasoning controls. Gemma 4 uses standard system, user, and assistant roles, and it can enable or disable thinking with a chat template flag.\u003C\u002Fp>\u003Cp>For llama.cpp, the source recommends \u003Ccode>llama-server\u003C\u002Fcode> when you want to disable reasoning reliably. Use the chat-template kwargs flag to turn thinking off, and keep only the final visible answer in multi-turn history.\u003C\u002Fp>\u003Cpre>\u003Ccode>llama-server -m model.gguf --chat-template-kwargs '{","Run Google Gemma 4 locally with Unsloth Studio or llama.cpp.","unsloth.ai","https:\u002F\u002Funsloth.ai\u002Fdocs\u002Fmodels\u002Fgemma-4",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780777065791-3eje.png","industry","en","8041b1f8-e409-44dc-b574-210938430234",[],0,"2026-06-06T20:17:21.697706+00:00","2026-06-06T20:17:21.691+00:00","f64ca195-5023-4a62-902d-e41d46c29c0b",{"tags":22,"relatedLang":23,"relatedPosts":27},[],{"id":15,"slug":24,"title":25,"language":26},"how-to-run-gemma-4-locally-unsloth-zh","怎麼在本機跑 Gemma 4","zh",[28,34,40,46,52,58],{"id":29,"slug":30,"title":31,"cover_image":32,"image_url":32,"created_at":33,"category":13},"04abb0e5-93b7-41e6-9f43-c7721b3ab84e","6-bullpen-notes-for-fantasy-managers-en","6 bullpen notes for fantasy managers","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780786973443-k2al.png","2026-06-06T23:02:26.50808+00:00",{"id":35,"slug":36,"title":37,"cover_image":38,"image_url":38,"created_at":39,"category":13},"6afa3e13-019b-49a8-9d91-f056dfb1598a","why-dynamic-leverage-schedules-are-sane-risk-control-en","Why dynamic leverage schedules are a sane risk control, not a trader …","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780786064039-q62h.png","2026-06-06T22:47:20.191361+00:00",{"id":41,"slug":42,"title":43,"cover_image":44,"image_url":44,"created_at":45,"category":13},"34ea1937-5d5b-44c6-8c9a-623f86d027a0","4-hail-risks-for-colorado-on-monday-en","4 hail risks for Colorado on Monday","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780783365476-wiky.png","2026-06-06T22:02:19.0312+00:00",{"id":47,"slug":48,"title":49,"cover_image":50,"image_url":50,"created_at":51,"category":13},"3efa73ac-5629-4c25-aa62-6c806fa95fdb","denver-hail-storm-downtown-dia-delay-en","Denver Hail Storm Slams Metro and DIA","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780782468657-ysxi.png","2026-06-06T21:47:24.736073+00:00",{"id":53,"slug":54,"title":55,"cover_image":56,"image_url":56,"created_at":57,"category":13},"aebd60b6-dcc5-4ccc-8a2e-a283f2254de1","5-storm-timing-cues-for-denver-this-week-en","5 storm timing cues for Denver this week","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780781571074-5raf.png","2026-06-06T21:32:21.123696+00:00",{"id":59,"slug":60,"title":61,"cover_image":62,"image_url":62,"created_at":63,"category":13},"a71ad261-e32d-44b8-ab3e-4bff5ac98055","denver-hailstorm-roads-damage-checklist-en","Denver hailstorm turns roads into a damage checklist","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780780693560-vyq8.png","2026-06-06T21:17:48.840853+00:00",[65,70,75,80,85,90,95,100,105,110],{"id":66,"slug":67,"title":68,"created_at":69},"d35a1bd9-e709-412e-a2df-392df1dc572a","ai-impact-2026-developments-market-en","AI's Impact in 2026: Key Developments and Market Shifts","2026-03-25T16:20:33.205823+00:00",{"id":71,"slug":72,"title":73,"created_at":74},"5ed27921-5fd6-492e-8c59-78393bf37710","trumps-ai-legislative-framework-en","Trump's AI Legislative Framework: What's Inside?","2026-03-25T16:22:20.005325+00:00",{"id":76,"slug":77,"title":78,"created_at":79},"e454a642-f03c-4794-b185-5f651aebbaca","nvidia-gtc-2026-key-highlights-innovations-en","NVIDIA GTC 2026: Key Highlights and Innovations","2026-03-25T16:22:47.882615+00:00",{"id":81,"slug":82,"title":83,"created_at":84},"0ebb5b16-774a-4922-945d-5f2ce1df5a6d","claude-usage-diversifies-learning-curves-en","Claude Usage Diversifies, Learning Curves Emerge","2026-03-25T16:25:50.770376+00:00",{"id":86,"slug":87,"title":88,"created_at":89},"69934e86-2fc5-4280-8223-7b917a48ace8","openclaw-ai-commoditization-concerns-en","OpenClaw's Rise Raises Concerns of AI Model Commoditization","2026-03-25T16:26:30.582047+00:00",{"id":91,"slug":92,"title":93,"created_at":94},"b4b2575b-2ac8-46b2-b90e-ab1d7c060797","google-gemini-ai-rollout-2026-en","Google's Gemini AI Rollout Extended to 2026","2026-03-25T16:28:14.808842+00:00",{"id":96,"slug":97,"title":98,"created_at":99},"6e18bc65-42ae-4ad0-b564-67d7f66b979e","meta-llama4-fabricated-results-scandal-en","Meta's Llama 4 Scandal: Fabricated AI Test Results Unveiled","2026-03-25T16:29:15.482836+00:00",{"id":101,"slug":102,"title":103,"created_at":104},"bf888e9d-08be-4f47-996c-7b24b5ab3500","accenture-mistral-ai-deployment-en","Accenture and Mistral AI Team Up for AI Deployment","2026-03-25T16:31:01.894655+00:00",{"id":106,"slug":107,"title":108,"created_at":109},"5382b536-fad2-49c6-ac85-9eb2bae49f35","mistral-ai-high-stakes-2026-en","Mistral AI: Facing High Stakes in 2026","2026-03-25T16:31:39.941974+00:00",{"id":111,"slug":112,"title":113,"created_at":114},"9da3d2d6-b669-4971-ba1d-17fdb3548ed5","cursors-meteoric-rise-pressures-en","Cursor's Meteoric Rise Faces Industry Pressures","2026-03-25T16:32:21.899217+00:00"]