[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-gemini-35-flash-pricing-benchmarks-zh":3,"article-related-gemini-35-flash-pricing-benchmarks-zh":32,"series-model-release-948a7dc4-b172-42f9-9bef-abcbbffaca18":85},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":24,"views":28,"created_at":29,"published_at":30,"topic_cluster_id":31},"948a7dc4-b172-42f9-9bef-abcbbffaca18","gemini-35-flash-pricing-benchmarks-zh","Gemini 3.5 Flash 價格與長上下文解析","\u003Cp data-speakable=\"summary\">\u003Ca href=\"\u002Ftag\u002Fgemini\">Gemini\u003C\u002Fa> 3.5 Flash 把 1048576 token 長上下文和低價 API 綁在一起，適合文件、程式碼和 \u003Ca href=\"\u002Ftag\u002Fagent\">agent\u003C\u002Fa> 工作流。\u003C\u002Fp>\u003Cp>說真的，這組數字很直接。\u003Ca href=\"https:\u002F\u002Fopenrouter.ai\u002Fgoogle\u002Fgemini-3.5-flash\" target=\"_blank\" rel=\"noopener\">Gemini 3.5 Flash\u003C\u002Fa> 在 \u003Ca href=\"https:\u002F\u002Fopenrouter.ai\" target=\"_blank\" rel=\"noopener\">OpenRouter\u003C\u002Fa> 上的輸入價是每百萬 token 1.50 美元，輸出價是 9 美元。模型發布日期是 \u003Ca href=\"\u002Fnews\u002Fbell-2026-slate-meatballs-littlest-hobo-zh\">2026\u003C\u002Fa> 年 5 月 19 日。\u003C\u002Fp>\u003Cp>它最吸睛的地方，不是名字。是 1,048,576 token 的 context window。這種長度，已經可以直接把大型文件、整個 codebase，甚至多輪對話一起塞進去。對開發者來說，這代表少切 chunk，少做土炮拼接。\u003C\u002Fp>\u003Cp>如果你在做客服、文件分析、程式碼助理，這顆模型很容易進入成本試算表。因為它不是只會喊口號。它真的把價格和容量都壓到一個能上線的範圍。\u003C\u002Fp>\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>指標\u003C\u002Fth>\u003Cth>數值\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\u003Ctr>\u003Ctd>輸入價格\u003C\u002Ftd>\u003Ctd>每百萬 token 1.50 美元\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>輸出價格\u003C\u002Ftd>\u003Ctd>每百萬 token 9 美元\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Context window\u003C\u002Ftd>\u003Ctd>1,048,576 token\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Weekly tokens\u003C\u002Ftd>\u003Ctd>525B\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>發布日期\u003C\u002Ftd>\u003Ctd>2026-05-19\u003C\u002Ftd>\u003C\u002Ftr>\u003C\u002Ftbody>\u003C\u002Ftable>\u003Ch2>Google 這次在賣什麼\u003C\u002Fh2>\u003Cp>\u003Ca href=\"\u002Ftag\u002Fgoogle\">Google\u003C\u002Fa> 把 \u003Ca href=\"https:\u002F\u002Fai.google.dev\u002Fgemini-api\u002Fdocs\u002Fmodels\u002Fgemini-3-5-flash\" target=\"_blank\" rel=\"noopener\">Gemini 3.5 Flash\u003C\u002Fa> 定位成高效率的 multimodal model。官方說法很明確，重點是 coding、reasoning，還有平行 agent loop。白話講，就是要你拿它去跑大量任務，不是只拿來聊天。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780840978961-6b9n.png\" alt=\"Gemini 3.5 Flash 價格與長上下文解析\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>它支援 text、image、video、audio、PDF。這很實用。因為很多產品現在早就不是純文字。你可能要讀會議錄音，也可能要看截圖，再順手抓 PDF 裡的規格書。這種情境下，單一模型比東拼西湊的流程好維護很多。\u003C\u002Fp>\u003Cp>另外一個重點是 thinking effort。預設是 medium，還有 minimal、low、medium、high 可選。這不是花俏設定。這是成本控制鈕。簡單任務就別硬開高檔，錢真的會燒很快。\u003C\u002Fp>\u003Cul>\u003Cli>支援輸入：text、image、video、audio、PDF\u003C\u002Fli>\u003Cli>預設 thinking effort：medium\u003C\u002Fli>\u003Cli>可調等級：minimal、low、medium、high\u003C\u002Fli>\u003Cli>官方主打：coding 與 parallel agent loops\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>價格為什麼重要\u003C\u002Fh2>\u003Cp>每百萬輸入 token 1.50 美元，這個價位很有意思。它讓長上下文任務不再那麼痛。以前很多團隊會因為成本，把文件切得很碎。結果是上下文斷掉，模型回答也跟著飄。\u003C\u002Fp>\u003Cp>輸出價每百萬 token 9 美元，明顯比輸入貴。這很正常。因為輸出通常代表模型真的在生成內容。講白了，Google 也在提醒你，別把廢話全丟出去。能先摘要，就先摘要。\u003C\u002Fp>\u003Cblockquote>“The right model is the one that gives you the best quality at the lowest cost.” — Sundar Pichai, Google I\u002FO 2024 keynote\u003C\u002Fblockquote>\u003Cp>這句話放在這裡剛好。因為 Gemini 3.5 Flash 的核心賣點，不是最貴，也不是最炫。它是在算帳。對產品團隊來說，算帳比喊口號重要得多。\u003C\u002Fp>\u003Cp>如果你在做大量請求的產品，這種價格結構很關鍵。輸入便宜，代表你可以餵更多背景資料。輸出偏貴，代表你要管好回答長度。這會直接影響你的 prompt 設計。\u003C\u002Fp>\u003Ch2>和其他模型怎麼比\u003C\u002Fh2>\u003Cp>先看同家產品。\u003Ca href=\"https:\u002F\u002Fopenrouter.ai\u002Fgoogle\u002Fgemini-3.5-pro\" target=\"_blank\" rel=\"noopener\">Gemini 3.5 Pro\u003C\u002Fa> 會更偏向高階推理。Flash 則是吞吐量和成本優先。兩者差別很像一台重型工作站，跟一台跑量機器。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780840985858-outi.png\" alt=\"Gemini 3.5 Flash 價格與長上下文解析\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>再看平台層。\u003Ca href=\"https:\u002F\u002Fopenrouter.ai\u002Fmodels\" target=\"_blank\" rel=\"noopener\">OpenRouter models\u003C\u002Fa> 把不同供應商放在一起比較。這對開發團隊很\u003Ca href=\"\u002Fnews\u002Fwhy-leverage-is-making-indias-rally-look-stronger-en-zh\">實際\u003C\u002Fa>。你不用一個個去查 API 文件，也不用在每家平台之間重寫一堆整合碼。\u003C\u002Fp>\u003Cp>1M token context 的意義也很直接。以前很多長文件任務，要先做 chunking，再做 retrieval，再做 rerank。現在有些情境可以少繞幾圈。這不代表 RAG 沒用了。只是工具鏈可以更短。\u003C\u002Fp>\u003Cul>\u003Cli>Gemini 3.5 Pro：更偏高階推理\u003C\u002Fli>\u003Cli>Gemini 3.5 Flash：更偏成本與吞吐\u003C\u002Fli>\u003Cli>OpenRouter：可集中比較供應商與價格\u003C\u002Fli>\u003Cli>1M token context：可減少 chunking 與拼接\u003C\u002Fli>\u003Cli>OpenRouter 列出的 weekly tokens：525B\u003C\u002Fli>\u003C\u002Ful>\u003Cp>525B weekly tokens 這個數字也值得看。它代表平台預期有很大的流量，不是只給 \u003Ca href=\"\u002Ftag\u002Fbenchmark\">benchmark\u003C\u002Fa> 玩家玩玩。只要模型真的能跑產品，token 消耗會很快上來。\u003C\u002Fp>\u003Ch2>開發者該怎麼看這顆模型\u003C\u002Fh2>\u003Cp>我覺得最實際的問題，不是它能不能看長文件。是它能不能穩定、便宜、反覆地看長文件。這三件事同時成立，模型才會進到 production。\u003C\u002Fp>\u003Cp>如果你做的是客服助理，長上下文可以把歷史工單一起帶進來。你做的是 \u003Ca href=\"\u002Fnews\u002F5-model-config-tips-for-claude-code-users-zh\">code\u003C\u002Fa> assistant，就能把整個 repo 片段和錯誤 log 一起丟進去。你做的是文件產品，PDF 和圖片也能一起處理。\u003C\u002Fp>\u003Cp>但別太浪漫。長上下文不等於高品質。模型可能會讀很多東西，卻抓錯重點。這種時候，benchmark 和真實工作流測試就很重要。光看官方宣傳，容易翻車。\u003C\u002Fp>\u003Cp>對台灣團隊來說，這種模型很適合拿來試高流量場景。像是內部知識庫、法務摘要、客服回覆、程式碼審查。這些場景都很吃 context，也很吃成本。\u003C\u002Fp>\u003Ch2>背景脈絡：Flash 為什麼越來越重要\u003C\u002Fh2>\u003Cp>過去大家談 LLM，常常先看最強模型。現在很多產品團隊反而先看便宜模型。原因很簡單。產品不是 demo。產品要算單位成本，也要算延遲。\u003C\u002Fp>\u003Cp>Flash 類型模型的價值，就是把夠用的能力壓進可接受的價格。這讓很多原本只能做 PoC 的功能，變成真的能上線。尤其是每天要跑幾十萬次請求的服務，差一點點單價，月底帳單就差很多。\u003C\u002Fp>\u003Cp>Google 這次把 multimodal、長上下文、agent loop、價格一起包進來，方向很清楚。它不是只想跟別人比參數。它要你真的把模型放進產品流程。\u003C\u002Fp>\u003Cp>對開發者來說，下一步很簡單。先挑一個真實任務。測 100 筆資料。看準確率、延遲、token 花費。不要只看 demo。demo 很會騙人。\u003C\u002Fp>\u003Ch2>結尾：先拿一個場景去測\u003C\u002Fh2>\u003Cp>如果你正在選模型，我會建議先拿文件摘要或 \u003Ca href=\"\u002Ftag\u002Fcode-review\">code review\u003C\u002Fa> 來試。這兩種任務最容易看出長上下文有沒有真的派上用場。\u003C\u002Fp>\u003Cp>Gemini 3.5 Flash 的重點很明白。它不是要你重新想像 AI。它是要你用比較低的成本，把更多資料丟進同一個流程裡。接下來真正要看的是，你的產品能不能把這個能力變成穩定功能，而不是一次性的展示。\u003C\u002Fp>","Gemini 3.5 Flash 主打 1048576 token 長上下文，API 價格為每百萬輸入 1.50 美元、輸出 9 美元，適合文件、程式碼與 agent 工作流。","openrouter.ai","https:\u002F\u002Fopenrouter.ai\u002Fgoogle\u002Fgemini-3.5-flash",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780840978961-6b9n.png","model-release","zh","12af5a0d-1bbf-4a50-a391-b53f8003f234",[17,18,19,20,21,22,23],"Gemini 3.5 Flash","Google Gemini","LLM pricing","long context","OpenRouter","multimodal model","API pricing",[25,26,27],"每百萬輸入 token 1.50 美元、輸出 9 美元，適合高流量產品。","1,048,576 token context window，對文件與 codebase 很有用。","支援 text、image、video、audio、PDF，適合多模態工作流。",3,"2026-06-07T14:02:29.835438+00:00","2026-06-07T14:02:29.83+00:00","0a3b4f35-7be1-430e-b708-37bdc8b5219a",{"tags":33,"relatedLang":44,"relatedPosts":48},[34,36,38,40,42],{"name":18,"slug":35},"google-gemini",{"name":21,"slug":37},"openrouter",{"name":17,"slug":39},"gemini-35-flash",{"name":19,"slug":41},"llm-pricing",{"name":20,"slug":43},"long-context",{"id":15,"slug":45,"title":46,"language":47},"gemini-35-flash-pricing-benchmarks-en","Gemini 3.5 Flash Pricing, Context, Benchmarks","en",[49,55,61,67,73,79],{"id":50,"slug":51,"title":52,"cover_image":53,"image_url":53,"created_at":54,"category":13},"5507f140-5223-4f68-ade6-30d9e5457638","gemma-4-12b-specs-benchmarks-run-locally-zh","怎麼做 Gemma 4 12B 本地部署","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780777971165-4bit.png","2026-06-06T20:32:24.857611+00:00",{"id":56,"slug":57,"title":58,"cover_image":59,"image_url":59,"created_at":60,"category":13},"ef42a437-8b06-4ff5-a135-ece7662c01f4","best-kimi-models-2026-k2-5-vs-k2-thinking-zh","2026 最佳 Kimi 模型：K2.5 對 K2 Thinking","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780770790333-x3lk.png","2026-06-06T18:32:39.410186+00:00",{"id":62,"slug":63,"title":64,"cover_image":65,"image_url":65,"created_at":66,"category":13},"fd2ad557-5c09-4758-964d-cda1c3c87a4c","kimi-k2-6-open-source-coding-agent-swarm-zh","Kimi K2.6 開源加上 Agent Swarm","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780761795960-0zg9.png","2026-06-06T16:02:21.702099+00:00",{"id":68,"slug":69,"title":70,"cover_image":71,"image_url":71,"created_at":72,"category":13},"8102ddec-e015-4294-9940-bf65553ae70d","minimax-m3-triple-capability-open-model-zh","MiniMax M3：開源三合一模型","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780756383081-hr0b.png","2026-06-06T14:32:35.396612+00:00",{"id":74,"slug":75,"title":76,"cover_image":77,"image_url":77,"created_at":78,"category":13},"409fc126-8ed2-42e3-bec3-9d114c4aca23","why-minimax-m3-matters-long-context-model-zh","為什麼 MiniMax M3 比又一個長上下文模型更重要","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780755468369-c0ia.png","2026-06-06T14:17:20.522361+00:00",{"id":80,"slug":81,"title":82,"cover_image":83,"image_url":83,"created_at":84,"category":13},"c92651ec-b626-49a2-bceb-230763733e3c","minimax-m3-engineer-workflow-agent-zh","MiniMax M3 讓工程師工作流更像代理","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780754606789-1jpm.png","2026-06-06T14:02:54.658299+00:00",[86,91,96,101,106,111,116,121,126,131],{"id":87,"slug":88,"title":89,"created_at":90},"58b64033-7eb6-49b9-9aab-01cf8ae1b2f2","nvidia-rubin-six-chips-one-ai-supercomputer-zh","NVIDIA Rubin 把六顆晶片塞進 AI 機櫃","2026-03-26T07:18:45.861277+00:00",{"id":92,"slug":93,"title":94,"created_at":95},"0dcc2c61-c2a6-480d-adb8-dd225fc68914","march-2026-ai-model-news-what-mattered-zh","2026 年 3 月 AI 模型新聞重點","2026-03-26T07:32:08.386348+00:00",{"id":97,"slug":98,"title":99,"created_at":100},"214ab08b-5ce5-4b5c-8b72-47619d8675dd","why-small-models-are-winning-on-device-ai-zh","小模型為何吃下裝置端 AI","2026-03-26T07:36:30.488966+00:00",{"id":102,"slug":103,"title":104,"created_at":105},"785624b2-0355-4b82-adc3-de5e45eecd88","midjourney-v8-faster-images-higher-costs-zh","Midjourney V8 變快了，也變貴了","2026-03-26T07:52:03.562971+00:00",{"id":107,"slug":108,"title":109,"created_at":110},"cda76b92-d209-4134-86c1-a60f5bc7b128","xiaomi-mimo-trio-agents-robots-voice-zh","小米 MiMo 三模型瞄準代理、機器人與語音","2026-03-28T03:05:08.779489+00:00",{"id":112,"slug":113,"title":114,"created_at":115},"9e1044b4-946d-47fe-9e2a-c2ee032e1164","xiaomi-mimo-v2-pro-1t-moe-agents-zh","小米 MiMo-V2-Pro 登場：1T MoE 模型","2026-03-28T03:06:19.002353+00:00",{"id":117,"slug":118,"title":119,"created_at":120},"c4b6186f-bd84-4598-997e-c6e31d543c0d","cursor-composer-2-agentic-coding-model-zh","Cursor Composer 2 走向代理式寫碼","2026-03-28T03:13:06.422716+00:00",{"id":122,"slug":123,"title":124,"created_at":125},"e112e76f-ec3b-408f-810e-e93ae21a888a","apple-siri-gemini-distilled-models-zh","Apple Siri 牽手 Gemini 的真相","2026-03-29T04:52:57.886544+00:00",{"id":127,"slug":128,"title":129,"created_at":130},"c679b51f-194a-463b-87fc-7695256ff752","mimo-v2-pro-vs-omni-vs-flash-2026-zh","MiMo V2 Pro、Omni、Flash 怎麼選","2026-04-02T01:18:43.576128+00:00",{"id":132,"slug":133,"title":134,"created_at":135},"3b988fd7-6749-4f01-ba25-c0ad7486dc31","z-ai-glm-5v-turbo-design2code-claude-zh","GLM-5V-Turbo 在 Design2Code 贏了…","2026-04-02T04:03:36.31741+00:00"]