[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"tag-multimodal-model":3},{"tag":4,"articles":11,"peer_article_count":12},{"id":5,"name":6,"slug":7,"article_count":8,"description_zh":9,"description_en":10},"e8dc5ded-c6ae-4bc2-a394-20e0ae5cd8d2","multimodal model","multimodal-model",3,"多模態模型把文字、圖片、程式碼與其他感知訊號放進同一個推理框架，讓模型能看圖寫碼、理解介面、處理文件與代理任務。這類模型的重點不只在能力提升，也在部署成本、上下文長度與工具整合方式。","Multimodal models combine text with vision, code, and other signals in one system, enabling tasks like image-grounded coding, UI understanding, document analysis, and agent workflows. Their real impact depends on capability, context length, and deployment cost.",[],4]