[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-build-semantic-search-opensearch-vectors-en":3,"article-related-build-semantic-search-opensearch-vectors-en":31,"series-tools-cc4a6360-46f7-4cdd-b250-74e4474d0407":75},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":23,"views":27,"created_at":28,"published_at":29,"topic_cluster_id":30},"cc4a6360-46f7-4cdd-b250-74e4474d0407","build-semantic-search-opensearch-vectors-en","Build semantic search with OpenSearch vectors","\u003Cp data-speakable=\"summary\">A step-by-step guide to set up OpenSearch \u003Ca href=\"\u002Fnews\u002Fzvec-turns-local-vector-search-into-a-library-en\">vector search\u003C\u002Fa> for semantic retrieval.\u003C\u002Fp>\u003Cp>This guide is for developers who want to use \u003Ca href=\"https:\u002F\u002Fdocs.opensearch.org\u002Flatest\u002Fvector-search\u002F\">OpenSearch vector search documentation\u003C\u002Fa> and the \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fopensearch-project\u002FOpenSearch\">OpenSearch GitHub repository\u003C\u002Fa> to build semantic search with embeddings, k-NN fields, and practical query patterns.\u003C\u002Fp>\u003Cp>After you follow the steps, you will have a working OpenSearch index that stores vectors, accepts embedding data, and returns nearest-neighbor matches that you can adapt for semantic search or hybrid retrieval.\u003C\u002Fp>\u003Ch2>Before you start\u003C\u002Fh2>\u003Cul>\u003Cli>OpenSearch 2.x or later\u003C\u002Fli>\u003Cli>OpenSearch Dashboards 2.x or later, if you want to inspect queries visually\u003C\u002Fli>\u003Cli>Node.js 20+ or Python 3.10+, if you plan to generate embeddings in your app\u003C\u002Fli>\u003Cli>Docker 24+ or a running OpenSearch cluster with HTTP access\u003C\u002Fli>\u003Cli>An OpenSearch admin username and password, or an API 
key\u003C\u002Fli>\u003Cli>An embedding model or embedding API, such as OpenAI, Cohere, or Amazon Bedrock\u003C\u002Fli>\u003Cli>curl 8+ for the examples below\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>Step 1: Start an OpenSearch cluster\u003C\u002Fh2>\u003Cp>Your first outcome is a live OpenSearch endpoint that can accept vector mappings and search requests. For local development, start a single-node cluster with \u003Ca href=\"\u002Ftag\u002Fdocker\">Docker\u003C\u002Fa> so you can test vector search without provisioning infrastructure.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781714885490-g1o1.png\" alt=\"Build semantic search with OpenSearch vectors\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cpre>\u003Ccode>docker run -p 9200:9200 -p 9600:9600 \\\n  -e \"discovery.type=single-node\" \\\n  -e \"OPENSEARCH_INITIAL_ADMIN_PASSWORD=StrongPassword123!\" \\\n  opensearchproject\u002Fopensearch:latest\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>Verify the cluster by calling the root endpoint. You should get an HTTP 200 response whose JSON body includes the cluster name and version, which confirms OpenSearch is reachable.\u003C\u002Fp>\u003Ch2>Step 2: Create a vector index\u003C\u002Fh2>\u003Cp>Your second outcome is an index that can store text plus vector embeddings. Define a k-NN vector field with the dimension that matches your embedding model, such as 384, 768, or 1536.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781714884201-6iq0.png\" alt=\"Build semantic search with OpenSearch vectors\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cpre>\u003Ccode>curl -u admin:StrongPassword123! 
-X PUT \"https:\u002F\u002Flocalhost:9200\u002Farticles\" -k -H 'Content-Type: application\u002Fjson' -d '\n{\n  \"settings\": {\n    \"index\": {\n      \"knn\": true\n    }\n  },\n  \"mappings\": {\n    \"properties\": {\n      \"title\": { \"type\": \"text\" },\n      \"body\": { \"type\": \"text\" },\n      \"body_vector\": {\n        \"type\": \"knn_vector\",\n        \"dimension\": 384\n      }\n    }\n  }\n}'\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>Verify the mapping with a GET request to the index. You should see the \u003Ccode>knn_vector\u003C\u002Fcode> field and the exact dimension you configured.\u003C\u002Fp>\u003Ch2>Step 3: Generate and store embeddings\u003C\u002Fh2>\u003Cp>Your third outcome is indexed documents that contain both readable text and numeric vectors. Generate embeddings in your application, then send each document with its vector into OpenSearch. The sample vectors in this guide are shortened to four values for readability; a real request must send exactly as many values as the mapped dimension (384 here), or OpenSearch rejects it with a dimension mismatch error.\u003C\u002Fp>\u003Cpre>\u003Ccode>curl -u admin:StrongPassword123! -X POST \"https:\u002F\u002Flocalhost:9200\u002Farticles\u002F_doc\u002F1?refresh=true\" -k -H 'Content-Type: application\u002Fjson' -d '\n{\n  \"title\": \"Vector search basics\",\n  \"body\": \"OpenSearch can store embeddings for semantic retrieval.\",\n  \"body_vector\": [0.12, -0.03, 0.44, 0.08]\n}'\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>Verify ingestion with a document fetch or a search for the document ID. You should see the stored text fields and the vector field in the response.\u003C\u002Fp>\u003Ch2>Step 4: Run a nearest-neighbor query\u003C\u002Fh2>\u003Cp>Your fourth outcome is semantic retrieval based on vector similarity. Use a query vector from the same embedding model and ask OpenSearch for the nearest matches.\u003C\u002Fp>\u003Cpre>\u003Ccode>curl -u admin:StrongPassword123! 
-X GET \"https:\u002F\u002Flocalhost:9200\u002Farticles\u002F_search\" -k -H 'Content-Type: application\u002Fjson' -d '\n{\n  \"size\": 3,\n  \"query\": {\n    \"knn\": {\n      \"body_vector\": {\n        \"vector\": [0.10, -0.01, 0.40, 0.05],\n        \"k\": 3\n      }\n    }\n  }\n}'\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>Verify the result set by checking that the top hits are the most semantically similar documents, not just the ones with matching keywords.\u003C\u002Fp>\u003Ch2>Step 5: Combine text and vector signals\u003C\u002Fh2>\u003Cp>Your fifth outcome is a hybrid search path that can balance lexical matches and semantic matches. Add a text query alongside vector search when you want exact terms to influence ranking.\u003C\u002Fp>\u003Cpre>\u003Ccode>curl -u admin:StrongPassword123! -X GET \"https:\u002F\u002Flocalhost:9200\u002Farticles\u002F_search\" -k -H 'Content-Type: application\u002Fjson' -d '\n{\n  \"query\": {\n    \"bool\": {\n      \"should\": [\n        { \"match\": { \"body\": \"semantic retrieval\" } },\n        {\n          \"knn\": {\n            \"body_vector\": {\n              \"vector\": [0.10, -0.01, 0.40, 0.05],\n              \"k\": 3\n            }\n          }\n        }\n      ]\n    }\n  }\n}'\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>Verify the ranking by comparing results from text-only search and vector-only search. You should see documents that satisfy both signals rise higher in the list.\u003C\u002Fp>\u003Ch2>Common mistakes\u003C\u002Fh2>\u003Cul>\u003Cli>Using the wrong vector dimension. Fix it by matching the index mapping to the exact output size of your embedding model.\u003C\u002Fli>\u003Cli>Mixing embedding models between indexing and querying. Fix it by using the same model family and preprocessing pipeline for both sides.\u003C\u002Fli>\u003Cli>Forgetting to refresh before testing. 
Fix it by adding \u003Ccode>?refresh=true\u003C\u002Fcode> during demos or waiting for the refresh interval in production.\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>What's next\u003C\u002Fh2>\u003Cp>Once the basics work, explore semantic search, reranking, and hybrid retrieval patterns in the OpenSearch docs so you can move from a demo index to a production search pipeline with better relevance and control.\u003C\u002Fp>","A step-by-step guide to set up OpenSearch vector search for semantic retrieval.","docs.opensearch.org","https:\u002F\u002Fdocs.opensearch.org\u002Flatest\u002Fvector-search\u002F",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781714885490-g1o1.png","tools","en","f3a58146-9c8e-4358-89f3-d89d9558b629",[17,18,19,20,21,22],"OpenSearch","vector search","knn_vector","embeddings","semantic search","hybrid search",[24,25,26],"Create an OpenSearch index with a vector field that matches your embedding dimension.","Index documents with both text and embeddings so you can search by meaning.","Use nearest-neighbor and hybrid queries to combine semantic and keyword relevance.",0,"2026-06-17T16:47:37.268089+00:00","2026-06-17T16:47:37.258+00:00","dd90c2e4-15ac-4c48-98d2-d6c15129dfb1",{"tags":32,"relatedLang":34,"relatedPosts":38},[33],{"name":20,"slug":20},{"id":15,"slug":35,"title":36,"language":37},"build-semantic-search-opensearch-vectors-zh","OpenSearch 向量語意搜尋實作指南","zh",[39,45,51,57,63,69],{"id":40,"slug":41,"title":42,"cover_image":43,"image_url":43,"created_at":44,"category":13},"46e957eb-f078-4527-9f2b-e05e801998d8","zvec-turns-local-vector-search-into-a-library-en","Zvec turns local vector search into a 
library","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781714031518-cson.png","2026-06-17T16:33:24.445725+00:00",{"id":46,"slug":47,"title":48,"cover_image":49,"image_url":49,"created_at":50,"category":13},"f13eeb6f-828b-47f1-bffa-ce23f2039ede","codex-override-file-team-safety-en","Codex 的 override 文件让团队少踩坑","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781712203131-p9n8.png","2026-06-17T16:02:50.475328+00:00",{"id":52,"slug":53,"title":54,"cover_image":55,"image_url":55,"created_at":56,"category":13},"2aa4df9d-e949-45c1-98b0-af8c8c0f799b","opencode-terminal-ai-coding-loop-en","OpenCode turns terminal chat into a coding loop","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781693306666-aejr.png","2026-06-17T10:47:58.91198+00:00",{"id":58,"slug":59,"title":60,"cover_image":61,"image_url":61,"created_at":62,"category":13},"8f7dbc25-a9a2-4539-a4d1-8cd9932444e1","open-source-ai-software-infrastructure-wins-en","Open-source AI software is winning on infrastructure, not hype","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781691474026-aqqd.png","2026-06-17T10:17:27.28173+00:00",{"id":64,"slug":65,"title":66,"cover_image":67,"image_url":67,"created_at":68,"category":13},"33c54a94-00ba-4029-bd8d-67b27812d487","wazero-turns-go-wasm-into-plain-go-en","Wazero turns Go Wasm into plain Go","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781681636537-3in0.png","2026-06-17T07:33:31.022165+00:00",{"id":70,"slug":71,"title":72,"cover_image":73,"image_url":73,"created_at":74,"category":13},"e56db932-e6fe-4974-b25e-d5042045e07f","ffmpeg-webcli-browser-video-editor-en","ffmpeg-webCLI brings video editing into the 
browser","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781680690708-tcol.png","2026-06-17T07:17:40.984864+00:00",[76,81,86,91,96,101,106,111,116,121],{"id":77,"slug":78,"title":79,"created_at":80},"8008f1a9-7a00-4bad-88c9-3eedc9c6b4b1","surepath-ai-mcp-policy-controls-en","SurePath AI's New MCP Policy Controls Enhance AI Security","2026-03-26T01:26:52.222015+00:00",{"id":82,"slug":83,"title":84,"created_at":85},"27e39a8f-b65d-4f7b-a875-859e2b210156","mcp-standard-ai-tools-2026-en","MCP Standard in 2026: Integrating AI Tools","2026-03-26T01:27:43.127519+00:00",{"id":87,"slug":88,"title":89,"created_at":90},"165f9a19-c92d-46ba-b3f0-7125f662921d","rag-2026-transforming-enterprise-ai-en","How RAG in 2026 is Transforming Enterprise AI","2026-03-26T01:28:11.485236+00:00",{"id":92,"slug":93,"title":94,"created_at":95},"6a2a8e6e-b956-49d8-be12-cc47bdc132b2","mastering-ai-prompts-2026-guide-en","Mastering AI Prompts: A 2026 Guide for Developers","2026-03-26T01:29:07.835148+00:00",{"id":97,"slug":98,"title":99,"created_at":100},"3ab2c67e-4664-4c67-a013-687a2f605814","garry-tan-open-sources-claude-code-toolkit-en","Garry Tan Open-Sources a Claude Code Toolkit","2026-03-26T08:26:20.245934+00:00",{"id":102,"slug":103,"title":104,"created_at":105},"66a7cbf8-7e76-41d4-9bbf-eaca9761bf69","github-ai-projects-to-watch-in-2026-en","20 GitHub AI Projects to Watch in 2026","2026-03-26T08:28:09.752027+00:00",{"id":107,"slug":108,"title":109,"created_at":110},"9f332fda-eace-448a-a292-2283951eee71","practical-github-guide-learning-ml-2026-en","A Practical GitHub Guide to Learning ML in 2026","2026-03-27T01:16:50.125678+00:00",{"id":112,"slug":113,"title":114,"created_at":115},"1b1f637d-0f4d-42bd-974b-07b53829144d","aiml-2026-student-ai-ml-lab-repo-review-en","AIML-2026 Is a Bare-Bones Student Lab 
Repo","2026-03-27T01:21:51.661231+00:00",{"id":117,"slug":118,"title":119,"created_at":120},"6d1bf3f6-e191-4d30-b55b-8a0722fa6afe","ai-trending-github-repos-and-research-feeds-en","AI Trending Tracks Repos and Research Feeds","2026-03-27T01:31:35.709532+00:00",{"id":122,"slug":123,"title":124,"created_at":125},"010539a1-4c3a-4bd3-937a-26616422ee0d","awesome-ai-for-science-research-tools-map-en","Awesome AI for Science Is Becoming a Real Research Map","2026-03-27T01:46:50.89513+00:00"]