[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-world-action-models-robotics-second-bet-zh":3,"article-related-world-action-models-robotics-second-bet-zh":34,"series-industry-0ae8c8be-03b1-4907-a3ab-c19064291db4":78},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":25,"views":30,"created_at":31,"published_at":32,"topic_cluster_id":33},"0ae8c8be-03b1-4907-a3ab-c19064291db4","world-action-models-robotics-second-bet-zh","4 World-Action Model Designs That Are Reshaping Robot Policies","\u003Cp data-speakable=\"summary\">World-action models (WAMs) are becoming the second route to robot policies, competing in parallel with vision-language-action (VLA) models.\u003C\u002Fp>\n\u003Cp>After reading through these 4 design \u003Ca href=\"\u002Fnews\u002Fmlops-roadmap-2026-turns-learning-into-delivery-zh\">routes\u003C\u002Fa>, you should be able to judge whether your team should bet first on video priors, inverse dynamics, joint prediction, or go straight to a hybrid architecture. This is not an abstract trend; the choice affects data requirements, control method, and deployment difficulty.\u003C\u002Fp>\n\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>Item\u003C\u002Fth>\u003Cth>Core idea\u003C\u002Fth>\u003Cth>Key signal\u003C\u002Fth>\u003Cth>Impact on robot policy\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\u003Ctr>\u003Ctd>1. Video-backbone WAM\u003C\u002Ftd>\u003Ctd>Use a pretrained video model as the policy backbone\u003C\u002Ftd>\u003Ctd>Strong prior on world dynamics\u003C\u002Ftd>\u003Ctd>Absorbs visual change and future states more easily\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>2. Inverse-dynamics WAM\u003C\u002Ftd>\u003Ctd>Infer actions from state transitions\u003C\u002Ftd>\u003Ctd>Learns even with few action labels\u003C\u002Ftd>\u003Ctd>Suited to learning control structure from video\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>3. Joint prediction model\u003C\u002Ftd>\u003Ctd>Predict future states and actions together\u003C\u002Ftd>\u003Ctd>Perception and control aligned together\u003C\u002Ftd>\u003Ctd>Helps long-horizon tasks and planning\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>4. Hybrid VLA-WAM stack\u003C\u002Ftd>\u003Ctd>Language understanding plus world prediction\u003C\u002Ftd>\u003Ctd>Handles semantics and physics together\u003C\u002Ftd>\u003Ctd>Closer to a deployable general-purpose policy\u003C\u002Ftd>\u003C\u002Ftr>\u003C\u002Ftbody>\u003C\u002Ftable>\n\u003Ch2>1. 
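Video backbones go first\u003C\u002Fh2>\n\u003Cp>The most direct WAM approach is to take a pretrained video model as the control backbone and \u003Ca href=\"\u002Fnews\u002Fllm-fine-tuning-turns-generic-models-into-domain-tools-zh\">fine-tune\u003C\u002Fa> it into a robot policy. The point is not to learn “how to move” from scratch, but to borrow a prior that already knows object permanence, motion trajectories, and scene changes.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782571677906-sezx.png\" alt=\"4 World-Action Model Designs That Are Reshaping Robot Policies\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\n\u003Cp>Efforts in the direction of \u003Ca href=\"https:\u002F\u002Fdeveloper.nvidia.com\u002Fblog\u002Fpretrained-to-imagine-fine-tuned-to-act-the-rise-of-world-action-models\u002F\">NVIDIA Cosmos\u003C\u002Fa>, along with DreamZero, LingBot-VA, and Cortex 2.0, all demonstrate the same thing: video pretraining may be more sample-efficient than training on robot data alone.\u003C\u002Fp>\n\u003Cul>\n  \u003Cli>Common inputs: images, video, language, latent states\u003C\u002Fli>\n  \u003Cli>Common outputs: future frames, latent features, action chunks\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch2>2. Inverse dynamics turns the problem around\u003C\u002Fh2>\n\u003Cp>Inverse dynamics works more like “inferring the process from the result”. The model looks at the current observation and a future observation, then infers the action sequence most likely to have caused that change. This makes it well suited to settings with few action labels but plenty of video.\u003C\u002Fp>\n\u003Cp>If most of your data comes from \u003Ca href=\"\u002Fnews\u002Fage-verification-surveillance-checkpoint-internet-zh\">web\u003C\u002Fa> video or recorded demonstrations rather than large volumes of robot logs, this route is especially attractive. It is also often paired with discrete action \u003Ca href=\"\u002Ftag\u002Ftoken\">tokens\u003C\u002Fa> or a latent action space, so the model can learn control structure from state transitions.\u003C\u002Fp>\n\u003Ccode>(o_t, o_{t+k}) -> a_{t:t+k-1}\u003C\u002Fcode>\n\u003Ch2>3. 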
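Joint prediction ties perception and control together\u003C\u002Fh2>\n\u003Cp>The core of joint prediction is having a single policy output both future states and actions. The benefit is direct: if the model must account for both “what will happen next” and “what to do now”, its internal representations are less likely to decouple.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782571679138-abb3.png\" alt=\"4 World-Action Model Designs That Are Reshaping Robot Policies\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\n\u003Cp>For long-horizon tasks, this is usually more stable than emitting actions alone, because the policy is not only executing but also continuously checking whether its understanding of how the environment changes stays consistent.\u003C\u002Fp>\n\u003Cul>\n  \u003Cli>Action chunks can replace single-step control\u003C\u002Fli>\n  \u003Cli>Can be paired with latent prediction or explicit future frames\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch2>4. Hybrid VLA-WAM looks most like a deployable solution\u003C\u002Fh2>\n\u003Cp>What deserves the most attention now is not necessarily pure VLA or pure WAM, but a hybrid of the two. The usual approach is to have a vision-language backbone with strong language ability handle instruction understanding, then hand off to a world model for scene evolution and action generation.\u003C\u002Fp>\n\u003Cp>This stack fits real robot needs well: it has to understand what people say and also know how the physical world will change next. When a task involves semantic alignment, contact-rich actions, and out-of-distribution situations at once, the hybrid architecture often has the most engineering value.\u003C\u002Fp>\n\u003Cul>\n  \u003Cli>Suits teams that already have VLM and video infrastructure\u003C\u002Fli>\n  \u003Cli>Suits tasks that need instruction understanding plus dynamics prediction\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch2>5. 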
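Large-scale pretraining is changing the barrier to entry\u003C\u002Fh2>\n\u003Cp>WAMs suddenly look more mainstream not only because the architectures are clever, but because pretraining scale is arriving. The source article cites \u003Ca href=\"https:\u002F\u002Fvla-foundry.github.io\u002F\">VLA Foundry\u003C\u002Fa>'s Foundry-\u003Ca href=\"\u002Ftag\u002Fllm\">LLM\u003C\u002Fa> checkpoint, which has 1.2B non-embedding parameters and was trained on 800B DCLM-Baseline-1.0 tokens; this suggests general-purpose pretraining can already lay the foundation.\u003C\u002Fp>\n\u003Cp>For WAMs this signal matters: robot data is no longer the only fuel. Once video corpora, world-model objectives, and cross-modal foundation models come in together, the model already has a stronger prior before it enters real control.\u003C\u002Fp>\n\u003Cul>\n  \u003Cli>Pretraining sources can be text, video, or multimodal data\u003C\u002Fli>\n  \u003Cli>Fine-tuning on real robots is still necessary\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch2>Which one fits you\u003C\u002Fh2>\n\u003Cp>If what you care about most is instruction following, your existing robot stack, and fast deployment, VLA remains the conservative choice; if you care more about scene dynamics and learning control from video, video-backbone WAM and inverse dynamics are more attractive.\u003C\u002Fp>\n\u003Cp>If your goal is a general-purpose policy and you do not want to choose between language and physics, hybrid VLA-WAM is the route most worth pursuing. The decision this list is really meant to help you make is to first identify whether your bottleneck lies in language, in world prediction, or in the gap between the two.\u003C\u002Fp>","Four world-action model design routes, from video priors to inverse dynamics and hybrid stacks, to help you judge where to invest in robot policy.","developer.nvidia.com","https:\u002F\u002Fdeveloper.nvidia.com\u002Fblog\u002Fpretrained-to-imagine-fine-tuned-to-act-the-rise-of-world-action-models\u002F",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782571677906-sezx.png","industry","zh","987bcfba-7789-428b-bfad-76fe040976a5",[17,18,19,20,21,22,23,24],"world-action models","robotics","VLA","VLM","inverse dynamics","video backbone","hybrid policy","robot policy design",[26,27,28,29],"Video-backbone WAMs borrow a world-dynamics prior first, which can reduce dependence on robot data.","Inverse dynamics suits learning control from video, especially when action labels are scarce.","Joint prediction ties perception and control together and is more stable on long-horizon tasks.","Hybrid VLA-WAM is closest to deployment and suits teams that need both language understanding and physical prediction.",0,"2026-06-27T14:47:28.833773+00:00","2026-06-27T14:47:28.817+00:00","fe20f6f6-432b-47bf-a410-a5f516d885ed",{"tags":35,"relatedLang":37,"relatedPosts":41},[36],{"name":18,"slug":18},{"id":15,"slug":38,"title":39,"language":40},"world-action-models-robotics-second-bet-en","World-action models are becoming robotics’ second 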
bet","en",[42,48,54,60,66,72],{"id":43,"slug":44,"title":45,"cover_image":46,"image_url":46,"created_at":47,"category":13},"32300eb0-be8b-4b0a-8f81-86a1b849bb7d","kehua-charging-stack-turns-ev-sites-into-power-hubs-zh","Kehua's charging stack turns EV sites into power hubs","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782588807437-5671.png","2026-06-27T19:32:58.46664+00:00",{"id":49,"slug":50,"title":51,"cover_image":52,"image_url":52,"created_at":53,"category":13},"f3140e56-e2fe-4b8d-92c9-14f518a7fcd4","distributed-finance-us-payments-trading-zh","U.S. finance already runs on distributed systems","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782587884552-zcew.png","2026-06-27T19:17:36.274689+00:00",{"id":55,"slug":56,"title":57,"cover_image":58,"image_url":58,"created_at":59,"category":13},"963c8999-024c-475b-941c-29c131a1b0b0","lore-binary-first-version-control-scales-zh","Lore scales large version control with a binary-first design","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782586964155-06do.png","2026-06-27T19:02:20.073226+00:00",{"id":61,"slug":62,"title":63,"cover_image":64,"image_url":64,"created_at":65,"category":13},"a46f53da-8f37-45a9-b838-82d7d0b9f433","pakistan-banks-crypto-services-letter-zh","Pakistan lets banks handle crypto services","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782585173687-1od9.png","2026-06-27T18:32:28.091491+00:00",{"id":67,"slug":68,"title":69,"cover_image":70,"image_url":70,"created_at":71,"category":13},"218528dc-b109-45ba-b6f2-6b8181b9c84d","anthropic-alibaba-claude-distillation-attack-zh","Anthropic accuses Alibaba of mass-distilling 
Claude","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782583382703-xndr.png","2026-06-27T18:02:38.014434+00:00",{"id":73,"slug":74,"title":75,"cover_image":76,"image_url":76,"created_at":77,"category":13},"bbda7f1c-1014-42cf-8a7d-5421923e3170","ai-payment-bots-strict-limits-web3-zh","AI payment agents should face strict limits, not full autonomy","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782582495923-7xn0.png","2026-06-27T17:47:22.096587+00:00",[79,84,89,94,99,104,109,114,119,124],{"id":80,"slug":81,"title":82,"created_at":83},"ee073da7-28b3-4752-a319-5a501459fb87","ai-in-2026-what-actually-matters-now-zh","What actually matters in AI in 2026","2026-03-26T07:09:12.008134+00:00",{"id":85,"slug":86,"title":87,"created_at":88},"83bd1795-8548-44c9-9a7e-de50a0923f71","trump-ai-framework-power-speech-state-preemption-zh","Trump's AI framework targets power, speech, and state authority","2026-03-26T07:12:18.695466+00:00",{"id":90,"slug":91,"title":92,"created_at":93},"ea6be18b-c903-4e54-97b7-5f7447a612e0","nvidia-gtc-2026-big-ai-announcements-zh","NVIDIA GTC 2026: key announcements unpacked","2026-03-26T07:14:26.62638+00:00",{"id":95,"slug":96,"title":97,"created_at":98},"4bcec76f-4c36-4daa-909f-54cd702f7c93","claude-users-spreading-out-and-getting-better-zh","Claude users are spreading out and getting better at using it","2026-03-26T07:22:52.325888+00:00",{"id":100,"slug":101,"title":102,"created_at":103},"bd903b15-2473-4178-9789-b7557816e535","openclaw-raises-hard-question-for-ai-models-zh","OpenClaw raises a hard question about the value of AI models","2026-03-26T07:24:54.707486+00:00",{"id":105,"slug":106,"title":107,"created_at":108},"eeac6b9e-ad9d-4831-8eec-8bba3f9bca6a","gap-google-gemini-checkout-fashion-search-zh","Gap moves checkout into Gemini","2026-03-26T07:28:23.937768+00:00",{"id":110,"slug":111,"title":112,"created_at":113},"0740e53f-605d-4d57-8601-c10beb126f3c","google-pushes-gemini-transition-to-march-2026-zh","Google 把 Gemini 轉換延到 2026 年 
3…","2026-03-26T07:30:12.825269+00:00",{"id":115,"slug":116,"title":117,"created_at":118},"e660d801-2421-4529-8fa9-86b82b066990","metas-llama-4-benchmark-scandal-gets-worse-zh","Meta's Llama 4 benchmark controversy widens","2026-03-26T07:34:21.156421+00:00",{"id":120,"slug":121,"title":122,"created_at":123},"183f9e7c-e143-40bb-a6d5-67ba84a3a8bc","accenture-mistral-ai-sovereign-enterprise-deal-zh","Accenture teams up with Mistral AI to sell sovereign AI","2026-03-26T07:38:14.818906+00:00",{"id":125,"slug":126,"title":127,"created_at":128},"191d9b1b-768a-478c-978c-dd7431a38149","mistral-ai-faces-its-hardest-year-yet-zh","Mistral AI faces its hardest year yet","2026-03-26T07:40:23.716374+00:00"]