How do you build an AI-powered search and recommendation app, step by step? In this demo, filmed during SF TECH WEEK by a16z, Staff Developer Advocate Milen Dyankov takes you through the PineStream workshop. You'll learn how to use Pinecone, Groq, and Nuxt to implement vector search, hybrid retrieval, reranking, and RAG-style pipelines in a sample movie streaming app. Demo repo instructions, solution files, and full video in links below 👇
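For a taste of the reranking step the workshop covers, here is a minimal Python sketch using Pinecone's hosted reranking endpoint. The model choice, query, and candidate movie descriptions are illustrative assumptions, not the workshop's actual code:

```python
# Illustrative sketch: rerank first-stage search candidates with Pinecone's
# hosted reranker. Query and candidate texts are made up for this example.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Candidate movie descriptions, e.g. returned by a first-stage vector search.
candidates = [
    "A heist crew infiltrates dreams to plant an idea.",
    "A space crew travels through a wormhole to save humanity.",
    "A hacker discovers reality is a simulation.",
]

result = pc.inference.rerank(
    model="bge-reranker-v2-m3",   # one of Pinecone's hosted rerank models
    query="mind-bending sci-fi about dreams",
    documents=candidates,
    top_n=2,
)
for row in result.data:
    print(row.index, round(row.score, 3))
```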
About us
Pinecone is the leading vector database for building accurate and performant AI applications at scale in production. Pinecone's mission is to make AI knowledgeable. More than 5000 customers across various industries have shipped AI applications faster and more confidently with Pinecone's developer-friendly technology. Pinecone is based in New York and raised $138M in funding from Andreessen Horowitz, ICONIQ, Menlo Ventures, and Wing Venture Capital. For more information, visit pinecone.io.
- Website
- https://www.pinecone.io/
- Industry
- Software Development
- Company size
- 51-200 employees
- Headquarters
- New York, NY
- Type
- Privately Held
- Founded
- 2019
Locations
- New York, NY 10001, US (Primary)
- San Francisco, California, US
- Tel Aviv, IL
Employees at Pinecone
- Milen Dyankov: Developer Relations and Engineering Executive focused on empowering developers and teams. Experienced in leading enterprise projects, enhancing…
- Jenna Pederson: Developer relations @ Pinecone | Keynote speaker | Software engineer
- Andrew Naber: Fractional Marketing & Strategy Leader
- Mike Sefanov: Leading global communications, analyst relations, and various marketing streams at Pinecone
Updates
-
We've released an Agentic Quickstart that demonstrates a new development paradigm. Instead of traditional tutorials with code snippets, you work with AI coding agents that understand Pinecone best practices and implement them automatically. Coverage includes:
‣ Semantic search implementation
‣ RAG pipeline development
‣ Recommendation system architecture
The agents (Claude Code, Cursor) handle the implementation details while you focus on application-specific logic and requirements. Interested in this approach to developer education? Try it and let us know what you think. Quickstart: https://lnkd.in/gGXDaszR
-
The best way to measure the success of a database? How often you think about it, according to Delphi Co-Founder and CTO Sam Spelsberg. Less is more. That insight and a lot more in the link below, where our customers and partners Nicholas Scavone, CEO & Co-Founder of Seam AI; Dave Piskai, Head of Product at APIsec; and Sam discuss the journey of building AI agents in production at the San Francisco Yacht Club 👇
-
Heading to re:Invent? We're showcasing how Pinecone delivers low-latency performance for billion-vector workloads. Stop by our booth to see live demos and discuss how we're helping teams build faster, more reliable AI systems. Schedule a demo: https://lnkd.in/g_iWEXfH
-
The transformative power of AI isn't what you think. It's not just algorithms, data quality, or governance. It's search, especially the ability to access the right information at the right time, which is foundational for making AI agents truly useful. Traditional keyword search doesn't cut it anymore; the new AI paradigm demands deeply contextual, semantic or hybrid search geared for agents, not just humans. The underlying shift is from how people conduct search to how agents need to find and use information autonomously, and this requires a structural overhaul of traditional search systems. At TechCrunch Disrupt last week, our Founder and Chief Scientist, Edo Liberty, discussed this on stage with Senior Reporter Rebecca Szkutak. Check out the video in the comments 👇
-
Choosing between keyword-based sparse and SPLADE embeddings? Here's your decision framework. 🎯 If you need semantic understanding and can handle vocabulary mismatches, SPLADE delivers learned term expansion and interpretable results. Perfect for out-of-domain scenarios where dense models struggle. Need exact matches with high precision? Keyword-based sparse embeddings excel at exact matching for finance, medical, and legal terminology. Plus they're faster and more cost-effective for latency-sensitive applications. The key question: Does your use case prioritize semantic understanding or exact precision? Learn more: https://lnkd.in/gGhTMqh8 ➕ https://lnkd.in/gsEnXuVe
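As a rough illustration of the exact-match side of that framework, here is a minimal Python sketch that embeds a query with Pinecone's hosted sparse model and queries a sparse index. The index name and query text are assumptions for illustration:

```python
# Sketch: sparse retrieval for precision-sensitive domains (legal, medical,
# finance). Assumes a sparse index already populated with sparse vectors.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("legal-docs-sparse")  # hypothetical sparse index

# Embed the query with Pinecone's hosted sparse model.
query_emb = pc.inference.embed(
    model="pinecone-sparse-english-v0",
    inputs=["statute of limitations for breach of contract"],
    parameters={"input_type": "query"},
)[0]

# Query using only the sparse vector; exact-term matches score highest.
results = index.query(
    sparse_vector={
        "indices": query_emb.sparse_indices,
        "values": query_emb.sparse_values,
    },
    top_k=5,
    include_metadata=True,
)
print(results)
```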
-
Pinecone reposted this
Here's what I learned building my first n8n workflow template with Pinecone ⤵️
🔹 Using the built-in Pinecone Vector Store node gives you flexibility if you need to control all aspects of chunking, embedding, and the various parts of retrieval
🔹 But if you don't need that much flexibility, Pinecone Assistant can manage that for you
🔹 Using Pinecone Assistant's MCP server (vs the /chat or /context API directly) will give you access to future Assistant functionality as it's added, and there's less to configure
🔹 n8n is a super slick way to hook Assistant up to your Google Drive docs and have them synced anytime a file is added or changed
If the complexity of rolling your own RAG pipeline is not for you, or you just want to spin something up quickly, grab the template in the comments.
--
👋 Follow me (Jenna Pederson) for more AI, cloud, and tech content
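For context on the direct-API path the post contrasts with the MCP server, here is a minimal Python sketch of calling Assistant's chat API. It assumes the pinecone package with the Assistant plugin installed; the assistant name and question are hypothetical:

```python
# Sketch: asking a Pinecone Assistant a question over its ingested files
# (e.g., docs synced from Google Drive by the n8n workflow). The assistant
# name and prompt are placeholders.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
assistant = pc.assistant.Assistant(assistant_name="drive-docs-assistant")

response = assistant.chat(
    messages=[{"role": "user", "content": "Summarize our Q3 planning doc."}]
)
# Response shape per the Assistant docs: the answer lives under "message".
print(response["message"]["content"])
```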
-
Pinecone reposted this
Critical workloads like recommendation, search, and AI systems require fast, accurate, and scalable search across knowledge bases. These systems all demand very different things from a vector database. Recommenders need high throughput. Semantic search needs to search a namespace with billions of vectors. Agentic apps need millions of small namespaces that become searchable on demand. Pinecone uses a unique slab architecture to handle all of these workloads without trade-offs. When I started at Pinecone, I worked with Perry Krug to draw up an animated diagram to explain how slab works. I never got around to getting it done, but Lea Wang-Tomic finished the diagram and wrote up a really nice post that explains it all. Slab is really cool for three main reasons:
- Writes are logged durably and immediately in memory, then written to object storage as immutable files called slabs. Writes are never blocked by reindexing or ongoing queries.
- Multi-level compaction runs continuously in the background, merging smaller slabs into larger, more efficiently indexed slabs and preventing queries from scanning thousands of tiny files. Because slabs are distributed, the system scales elastically without resharding.
- Reads instantly fan out to search in memory and across all existing data slabs. Reads always return the freshest results, because writes happen in parallel, and the most accurate results, because each slab is adaptively indexed based on its data size.
Slab has a ton of other unique designs. Give the blog post a read; link in the comments. If you have questions or feedback, drop a comment.
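To make the write/compact/read flow concrete, here is a toy Python model of the slab pattern described above (in-memory writes, immutable slab files, background compaction, fan-out reads). It is a sketch of the general idea only, not Pinecone's actual implementation:

```python
# Toy model of a slab-style store. Everything here is simplified for
# illustration; real slabs live in object storage and are adaptively indexed.
import itertools

class SlabStore:
    def __init__(self, memtable_limit=4, compaction_fanin=4):
        self.memtable = []            # fresh writes, searchable immediately
        self.slabs = []               # immutable slab "files"
        self.memtable_limit = memtable_limit
        self.compaction_fanin = compaction_fanin

    def write(self, record):
        # Writes land in memory first; they are never blocked by compaction.
        self.memtable.append(record)
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Seal the memtable into an immutable slab.
        self.slabs.append(tuple(self.memtable))
        self.memtable = []
        self._maybe_compact()

    def _maybe_compact(self):
        # Merge many small slabs into one larger slab so reads never have
        # to scan thousands of tiny files.
        small = [s for s in self.slabs if len(s) < self.memtable_limit * 2]
        if len(small) >= self.compaction_fanin:
            for s in small:
                self.slabs.remove(s)
            self.slabs.append(tuple(itertools.chain.from_iterable(small)))

    def query(self, predicate):
        # Reads fan out to the memtable and every slab, so results always
        # include the freshest writes.
        hits = [r for r in self.memtable if predicate(r)]
        for slab in self.slabs:
            hits.extend(r for r in slab if predicate(r))
        return hits

store = SlabStore()
for i in range(20):
    store.write({"id": i})
print(len(store.slabs), store.query(lambda r: r["id"] % 7 == 0))
```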
-
Building with embeddings? Storage limits might surprise you. Our team recently helped a customer debug what looked like a simple embedding + metadata issue. Turns out, it revealed something critical about how embedding workflows scale, and how easy it is to hit unexpected walls. Our article breaks down:
→ How Pinecone integrated inference simplifies your stack by combining embedding generation and storage in one API call
→ Why metadata payload size matters more than you think
→ The exact approach we used to solve the bottleneck
→ When to choose Hosted Inference vs. self-managed models
Whether you're shipping RAG systems, semantic search, or embedding-driven apps, this post will save you from scaling issues that only appear in production. Read the full breakdown: https://lnkd.in/gx2KftsM
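As a rough sketch of what "one API call" looks like with integrated inference, here is a minimal Python example. The index name, namespace, and text field name are assumptions that depend on how the index was created:

```python
# Sketch: upsert raw text and search with raw text against an index created
# with a hosted embedding model; Pinecone embeds server-side in both calls.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
# Hypothetical index configured with integrated embedding; the text field
# below must match the index's field_map.
index = pc.Index("docs-integrated")

# One call both embeds and stores each record; keep metadata payloads small.
index.upsert_records(
    "example-namespace",
    [
        {"_id": "doc-1", "chunk_text": "Metadata payload limits apply per record."},
        {"_id": "doc-2", "chunk_text": "Integrated inference embeds and stores text in one call."},
    ],
)

# Search with raw text; the query embedding is generated server-side.
results = index.search(
    namespace="example-namespace",
    query={"inputs": {"text": "why does metadata payload size matter?"}, "top_k": 2},
)
print(results)
```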
-
Scale to Billions. Respond in Milliseconds. Come see the proof at re:Invent 2025. We'll be at Booth #534 demonstrating how Pinecone handles billion-scale vector workloads without sacrificing speed. Live benchmarks, real architecture discussions, and answers to your toughest scaling questions. Book a demo: https://lnkd.in/g_iWEXfH