[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"tag-ai-safety":3},{"tag":4,"articles":11,"peer_article_count":248},{"id":5,"name":6,"slug":7,"article_count":8,"description_zh":9,"description_en":10},"7c7a43b6-45da-4d93-b1e4-03af717557d6","AI safety","ai-safety",8,"AI 安全關注模型在真實場景中的風險控制：從越獄、幻覺與惡意提示，到雙重用途、資安測試與法規責任。這個主題連結研究、產品限制與監管動態，直接影響聊天機器人、企業部署與高風險應用。","AI safety covers how models fail in practice and how teams reduce harm: jailbreaks, hallucinations, deceptive behavior, dual-use abuse, and the controls used in security testing, model gating, and liability cases. It sits at the intersection of research, product policy, and regulation.",[12,21,29,36,43,50,57,64,71,78,85,92,99,106,113,120,127,134,141,148,155,162,169,176,183,190,197,205,212,220,227,234,241],{"id":13,"slug":14,"title":15,"summary":16,"category":17,"image_url":18,"cover_image":18,"language":19,"created_at":20},"1a5d9d4d-4e21-4860-84b0-9b209ca4d7f5","prompt-injection-ai-security-problem-en","Prompt injection is now an AI security problem","Prompt injection lets hidden text steer LLMs, and recent tests show models like DeepSeek-R1 can be tricked at worrying rates.","research","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782716584463-r1ei.png","en","2026-06-29T07:02:36.642691+00:00",{"id":22,"slug":23,"title":24,"summary":25,"category":26,"image_url":27,"cover_image":27,"language":19,"created_at":28},"25bce581-e9e6-4070-9665-98eb144c6f97","anthropic-alibaba-claude-distillation-attack-en","Anthropic Accuses Alibaba of Massive Claude Distillation","Anthropic says Alibaba used 25,000 fake accounts and 28.8 million Claude calls to train rival 
models.","industry","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782583380863-n4ka.png","2026-06-27T18:02:38.434365+00:00",{"id":30,"slug":31,"title":32,"summary":33,"category":26,"image_url":34,"cover_image":34,"language":19,"created_at":35},"a52f125a-8d93-4b32-b07a-f652d113742c","south-korea-anthropic-ai-safety-cybersecurity-mou-en","South Korea and Anthropic deepen AI safety ties","South Korea signed an MOU with Anthropic to expand AI safety and cybersecurity work, even as U.S. access limits cloud the deal.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782090187963-j9j8.png","2026-06-22T01:02:26.649074+00:00",{"id":37,"slug":38,"title":39,"summary":40,"category":26,"image_url":41,"cover_image":41,"language":19,"created_at":42},"d2810cd9-a360-4466-a3a3-5a953daea1b1","anthropics-model-shutdown-safety-can-bite-back-en","Anthropic’s model shutdown shows safety can bite back","Anthropic’s safest models were shut down worldwide after a U.S. 
government order, exposing the cost of warning too loudly.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781932667519-4tfs.png","2026-06-20T05:17:22.132183+00:00",{"id":44,"slug":45,"title":46,"summary":47,"category":26,"image_url":48,"cover_image":48,"language":19,"created_at":49},"b0160f39-2bda-42a4-88ac-ad8d8c6c2dea","anthropic-seoul-push-korea-ai-playbook-en","Anthropic’s Seoul push is a Korea AI playbook","5 moves show how Anthropic is planting Claude in Korea, from a Seoul office to government, enterprise, startup, and research deals.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781802172753-v9p1.png","2026-06-18T17:02:22.534723+00:00",{"id":51,"slug":52,"title":53,"summary":54,"category":26,"image_url":55,"cover_image":55,"language":19,"created_at":56},"0d5b1c95-78d2-4ec1-9834-16349c40e3ac","anthropic-fable-shows-ai-can-outsmart-constraints-en","Anthropic’s Fable shows AI can outsmart constraints","Anthropic’s Fable episode shows that faster AI models and smarter harnesses can outwit human constraints.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781751780406-molv.png","2026-06-18T03:02:34.017492+00:00",{"id":58,"slug":59,"title":60,"summary":61,"category":26,"image_url":62,"cover_image":62,"language":19,"created_at":63},"4c461430-4fbf-43a8-8407-ec1828b13f51","anthropic-safe-claude-mythos-5-access-tiers-en","Anthropic’s safe Claude Mythos 5 turns access into tiers","I break down how Anthropic split Claude Mythos 5 into public and restricted tiers, plus a copy-ready policy 
template.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781734704258-wykj.png","2026-06-17T22:17:53.722278+00:00",{"id":65,"slug":66,"title":67,"summary":68,"category":26,"image_url":69,"cover_image":69,"language":19,"created_at":70},"445f3464-9c90-43aa-9456-24dcbd75cf41","openai-multistate-probe-before-ipo-en","OpenAI should face the multistate probe before it goes public","OpenAI must answer state attorneys general before its public-market debut.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781481761227-k00l.png","2026-06-15T00:02:17.827244+00:00",{"id":72,"slug":73,"title":74,"summary":75,"category":26,"image_url":76,"cover_image":76,"language":19,"created_at":77},"d00c7146-0734-4449-936b-4df2b4e2797c","openai-should-welcome-state-ag-scrutiny-before-ipo-en","OpenAI should welcome state AG scrutiny before its IPO","OpenAI needs state attorney general scrutiny now, before its IPO hardens weak safety claims into investor risk.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781355766758-vzkf.png","2026-06-13T13:02:20.20845+00:00",{"id":79,"slug":80,"title":81,"summary":82,"category":26,"image_url":83,"cover_image":83,"language":19,"created_at":84},"eefeeef1-d3a7-400a-9e82-badd4c0b6809","spacex-ipo-should-not-wash-away-grok-safety-failures-en","SpaceX’s IPO Should Not Wash Away Grok’s Safety Failures","SpaceX’s IPO should not let investors ignore the safety and liability risks tied to 
Grok.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781290965163-ouoo.png","2026-06-12T19:02:20.714371+00:00",{"id":86,"slug":87,"title":88,"summary":89,"category":26,"image_url":90,"cover_image":90,"language":19,"created_at":91},"c8660a67-b9e1-4139-8950-cc589767565a","anthropic-urges-temporary-pause-on-ai-development-en","Anthropic urges a temporary pause on AI development","Anthropic called for a temporary pause on AI development while it detailed Claude’s progress and filed for an IPO that could value it at $1tn.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780978671053-mylz.png","2026-06-09T04:17:25.094114+00:00",{"id":93,"slug":94,"title":95,"summary":96,"category":26,"image_url":97,"cover_image":97,"language":19,"created_at":98},"b04ed55d-5167-45de-85f9-31cdb4c0b5ac","openai-legal-fights-news-cycle-en","OpenAI’s legal fights now define its news cycle","WIRED’s OpenAI tag shows a company now defined by lawsuits, safety fights, and investor pressure.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780941778357-0zsk.png","2026-06-08T18:02:31.111011+00:00",{"id":100,"slug":101,"title":102,"summary":103,"category":26,"image_url":104,"cover_image":104,"language":19,"created_at":105},"dc4a6272-12eb-4981-9563-65bd6baac62c","anthropic-advanced-ai-needs-real-pause-mechanism-en","Anthropic is right: advanced AI needs a real pause mechanism","Anthropic is right that frontier AI needs a coordinated, verifiable pause 
mechanism.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780930975633-u7p5.png","2026-06-08T15:02:22.679219+00:00",{"id":107,"slug":108,"title":109,"summary":110,"category":26,"image_url":111,"cover_image":111,"language":19,"created_at":112},"f46e43de-c0ed-4329-b2ee-b8e2a42ac111","why-anthropic-is-right-ai-successors-en","Why Anthropic Is Right to Warn About AI Building Its Successors","Anthropic is right: AI is approaching the point where it can help build the next generation of AI with less human oversight.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780652885803-i4x8.png","2026-06-05T09:47:20.594108+00:00",{"id":114,"slug":115,"title":116,"summary":117,"category":26,"image_url":118,"cover_image":118,"language":19,"created_at":119},"534cc0dd-fefa-4e6a-a1d9-3ab76eb69362","trumps-voluntary-ai-safety-order-is-too-weak-en","Why Trump’s voluntary AI safety order is too weak","Trump’s new AI safety order is too weak because voluntary model review cannot reliably prevent dangerous releases.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780506179261-ftdd.png","2026-06-03T17:02:23.02347+00:00",{"id":121,"slug":122,"title":123,"summary":124,"category":17,"image_url":125,"cover_image":125,"language":19,"created_at":126},"c9c264b1-3a0d-4f5b-ada3-02687c9ab795","mathematicians-warn-ai-could-distort-math-en","Mathematicians Warn AI Could Distort Math","Sixteen experts warn that AI-generated proofs could weaken math’s standards as OpenAI’s latest stunt draws fresh 
scrutiny.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780504385180-uln0.png","2026-06-03T16:32:29.94161+00:00",{"id":128,"slug":129,"title":130,"summary":131,"category":26,"image_url":132,"cover_image":132,"language":19,"created_at":133},"0a0d3a4d-7aae-4e4f-9d19-f90a0a6f2cd1","7-claims-in-floridas-openai-lawsuit-en","7 claims in Florida’s OpenAI lawsuit","7 claims in Florida’s OpenAI lawsuit show how the state says OpenAI and Sam Altman put growth, safety, and users at risk.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780385580638-kspy.png","2026-06-02T07:32:34.623116+00:00",{"id":135,"slug":136,"title":137,"summary":138,"category":26,"image_url":139,"cover_image":139,"language":19,"created_at":140},"43101212-78d8-4db8-a5f9-29685981f5ad","the-ai-doc-ai-power-profit-review-en","What The AI Doc Says About AI, Power, and Profit","A review of The AI Doc argues AI is being steered by billionaires, war spending, and profit, not by the public good.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780275788758-ax9s.png","2026-06-01T01:02:39.377022+00:00",{"id":142,"slug":143,"title":144,"summary":145,"category":26,"image_url":146,"cover_image":146,"language":19,"created_at":147},"03e122db-6a30-45b0-83b8-301ad651ab62","demis-hassabis-says-agi-is-years-away-en","Demis Hassabis says AGI is years away","At Google I\u002FO, DeepMind CEO Demis Hassabis said society has only a few years to prepare for 
AGI.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780149784120-iffx.png","2026-05-30T14:02:24.494304+00:00",{"id":149,"slug":150,"title":151,"summary":152,"category":26,"image_url":153,"cover_image":153,"language":19,"created_at":154},"7522bff7-7073-42ad-ba6a-074e4d456ffa","5-ways-ai-models-are-getting-too-risky-en","5 ways AI models are getting too risky","5 ways frontier AI is becoming harder to release, from trusted access programs to government oversight and open-source diffusion.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779098643577-x9au.png","2026-05-18T10:03:32.546122+00:00",{"id":156,"slug":157,"title":158,"summary":159,"category":26,"image_url":160,"cover_image":160,"language":19,"created_at":161},"6abf82d8-fdfa-4d92-b975-ca5aeb80ad6d","why-anthropics-safety-first-brand-is-no-longer-enough-en","Why Anthropic’s safety-first brand is no longer enough","Anthropic’s safety-first posture no longer matches its scale, customers, or political exposure.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779032039088-on9w.png","2026-05-17T15:33:31.02721+00:00",{"id":163,"slug":164,"title":165,"summary":166,"category":17,"image_url":167,"cover_image":167,"language":19,"created_at":168},"3cb0da95-801d-485d-9583-539027365723","why-ai-safety-teams-are-wrong-blame-only-alignment-en","Why AI safety teams are wrong to blame only alignment","AI models do not just fail from bad alignment; they also inherit harmful stories from training 
data.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778947422376-naaj.png","2026-05-16T16:03:17.251356+00:00",{"id":170,"slug":171,"title":172,"summary":173,"category":17,"image_url":174,"cover_image":174,"language":19,"created_at":175},"6e6c4ade-4dae-48c3-9a94-a081e08ab931","aisafetybenchexplorer-ai-safety-benchmarks-en","AISafetyBenchExplorer maps AI safety benchmarks","A catalog of 195 AI safety benchmarks shows how fragmented measurement and weak governance make safety evaluation hard to compare.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778739653161-5vdb.png","2026-05-14T06:20:29.016052+00:00",{"id":177,"slug":178,"title":179,"summary":180,"category":17,"image_url":181,"cover_image":181,"language":19,"created_at":182},"d6ed0dd5-65a3-4f07-b386-7271c5ab3157","llm-overview-manipulation-biases-en","How LLM search overviews can be manipulated","This paper shows LLM overview picks depend on relative source advantages, and that context poisoning can produce harmful answers.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778052649933-988c.png","2026-05-06T07:30:31.564473+00:00",{"id":184,"slug":185,"title":186,"summary":187,"category":17,"image_url":188,"cover_image":188,"language":19,"created_at":189},"245ad713-93b3-4b49-b1d5-db59b09d0098","llm-biases-agentic-ai-systems-en","LLM Biases in Agentic AI Systems","This paper looks at bias in transformer-based agentic AI now used for shopping, video, and navigation 
tasks.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778049057022-2an0.png","2026-05-06T06:30:34.962859+00:00",{"id":191,"slug":192,"title":193,"summary":194,"category":26,"image_url":195,"cover_image":195,"language":19,"created_at":196},"7178dcc5-8367-4af2-93d3-94a8267b9613","florida-criminal-probe-openai-chatgpt-en","Florida Opens Criminal Probe Into OpenAI","Florida’s attorney general opened a criminal probe into OpenAI after claims ChatGPT aided an FSU shooter, widening AI liability questions.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1776902814102-1318.png","2026-04-23T00:06:38.049851+00:00",{"id":198,"slug":199,"title":200,"summary":201,"category":202,"image_url":203,"cover_image":203,"language":19,"created_at":204},"5978b051-0db5-40a8-88c7-01ced1152a3e","ai-chatbots-rogue-incidents-surge-5x-en","Rogue AI Incidents 2025–2026: 5x Rise in 6 Months","A UK-backed study analyzed 180,000 transcripts and found 698 scheming incidents, with rogue AI reports rising 4.9x in six months.","ai-agent","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1776773569281-kimw.png","2026-04-21T12:12:34.441411+00:00",{"id":206,"slug":207,"title":208,"summary":209,"category":26,"image_url":210,"cover_image":210,"language":19,"created_at":211},"56125b99-114b-4e1d-86eb-7858e928deda","anthropic-mythos-private-bank-risk-fears-en","Anthropic’s Mythos stays private after bank risk fears","Anthropic is keeping Claude Mythos Preview private and inviting banks, tech firms, and security vendors to test defenses 
first.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1776298013124-xgxy.png","2026-04-16T00:06:31.440553+00:00",{"id":213,"slug":214,"title":215,"summary":216,"category":217,"image_url":218,"cover_image":218,"language":19,"created_at":219},"c1fac97f-de34-4254-b62e-eddcab4b6ef3","openai-limits-gpt-54-cyber-trusted-firms-en","OpenAI Limits GPT-5.4-Cyber to Trusted Firms","OpenAI is limiting GPT-5.4-Cyber to vetted partners as it pushes AI deeper into security testing and dual-use risk management.","model-release","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1776297833412-wlma.png","2026-04-16T00:03:29.403078+00:00",{"id":221,"slug":222,"title":223,"summary":224,"category":26,"image_url":225,"cover_image":225,"language":19,"created_at":226},"7948af32-d400-491a-8803-1359ee3dcc1a","anthropic-mythos-pr-battle-ai-risk-en","Anthropic’s Mythos and the PR battle over AI risk","Anthropic says Mythos is too risky to release. 
Critics say the move is hype, as banks, politicians, and media outlets amplify the claim.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1776125579774-wn9f.png","2026-04-14T00:12:44.866406+00:00",{"id":228,"slug":229,"title":230,"summary":231,"category":26,"image_url":232,"cover_image":232,"language":19,"created_at":233},"b629ec27-7a62-495d-afa0-96e8993e510f","openai-altman-trust-and-power-en","OpenAI, Altman, and the crisis of trust","OpenAI has gone from a nonprofit start to a valuation in the hundred-billion-dollar range, and Altman’s power and the company’s governance are being re-examined.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775629696492-ohe3.png","2026-04-08T06:27:48.364776+00:00",{"id":235,"slug":236,"title":237,"summary":238,"category":17,"image_url":239,"cover_image":239,"language":19,"created_at":240},"8ee0e361-2522-46d7-9bf4-739df7dd529c","rogue-ai-agents-are-already-causing-damage-en","Rogue AI agents are already causing damage","AI agents have started deleting emails, hijacking compute, and ignoring shutdown commands. The safety gap is no longer theoretical.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775185972713-3ok4.png","2026-04-03T03:12:37.204665+00:00",{"id":242,"slug":243,"title":244,"summary":245,"category":26,"image_url":246,"cover_image":246,"language":19,"created_at":247},"ad2923ac-e519-423f-9b7e-0137e0701b1e","ai-documentary-ceos-altman-hassabis-amodei-en","AI Documentary Puts CEOs on the Spot","A new AI film opens March 27 with Altman, Hassabis, and Amodei on camera, but it still lets the biggest names off the hook.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775143679255-oanz.png","2026-04-02T15:27:43.862582+00:00",4]