Full-text search in Pinecone is here. We built a bird search app to show off the search capabilities: 2,079 North American bird articles, full-text keyword search, and Gemini Embedding 2 for cross-modal search. Type "tall pink wading bird" and it finds birds by their photos. Type "Ammospiza maritima mirabilis" and it finds the Cape Sable seaside sparrow, whose Latin name gives nothing away. Check out the blog post on boolean logic, phrase matching, boosting, regex, autocomplete, and how to mix lexical precision with semantic ranking in the same query. The GitHub repo for the bird search app is linked in the comments. 🕊️ https://lnkd.in/g3a8cuDa
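The idea of mixing lexical precision with semantic ranking in one query can be sketched as a simple score fusion. This is an illustrative sketch in plain Python, not Pinecone's actual query implementation: the `fuse_scores` helper, the example documents, and the `alpha` weighting are all assumptions for illustration.

```python
def fuse_scores(lexical, semantic, alpha=0.3):
    """Rank documents by a convex combination of min-max-normalized
    lexical (e.g. BM25-style) and semantic (e.g. cosine) scores.
    alpha controls the weight given to the lexical side."""
    def normalize(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    lex, sem = normalize(lexical), normalize(semantic)
    docs = set(lex) | set(sem)
    # Missing scores default to 0.0, so a doc found by only one
    # retriever still participates in the combined ranking.
    return sorted(
        ((alpha * lex.get(d, 0.0) + (1 - alpha) * sem.get(d, 0.0), d)
         for d in docs),
        reverse=True,
    )

# Made-up scores for a "tall pink wading bird" query.
lexical = {"roseate_spoonbill": 1.2, "flamingo": 3.1, "heron": 0.4}
semantic = {"roseate_spoonbill": 0.92, "flamingo": 0.88, "heron": 0.35}
ranking = fuse_scores(lexical, semantic, alpha=0.3)
print([doc for _, doc in ranking])
```

Here the semantic side keeps near-synonyms competitive while the exact keyword match still dominates; tuning `alpha` trades one off against the other.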
About us
Pinecone is the leading vector database for building accurate and performant AI applications at scale in production. Pinecone's mission is to make AI knowledgeable. More than 9,000 customers across various industries have shipped AI applications faster and more confidently with Pinecone's developer-friendly technology. Pinecone is based in New York and has raised $138M in funding from Andreessen Horowitz, ICONIQ, Menlo Ventures, and Wing Venture Capital. For more information, visit pinecone.io.
- Website: https://www.pinecone.io/
- Industry: Software Development
- Company size: 51-200 employees
- Headquarters: New York, NY
- Type: Privately Held
- Founded: 2019
Locations
- New York, NY 10001, US (Primary)
- San Francisco, California, US
- Tel Aviv, IL
Updates
-
Pinecone reposted this
Can we compress large KV caches in general? Surprisingly, the answer is yes. New paper with Alex Andoni and Eldar Kleiner proves that all KV caches admit small coresets. https://lnkd.in/e5Zry9Cv

tl;dr

The Memory Bottleneck: Most modern large language models rely on the attention mechanism to manage knowledge and context. They store previously seen tokens in memory as vector arrays called a KV cache. As context windows grow, this KV cache becomes massive: it eats up expensive GPU memory and significantly slows down text generation, creating a major bottleneck for long-context AI.

Compression with coresets: A coreset, in general, is a highly representative subset of your data. The quality of a coreset is measured by two opposing objectives. First, it needs to be as small as possible: smaller coresets consume less memory and compute. Second, it needs to be accurate: using the coreset, you expect to get almost the same result as using the whole dataset. The way to connect the two is to fix an acceptable error tolerance and then find the smallest coreset that achieves it. Coreset optimality is measured by the relationship between the error tolerance and the coreset size.

This Result: The paper shows that every KV cache admits a small coreset (a subset of keys and values) such that the attention vector computed only on the coreset is provably close to the attention computed on the entire KV store, and this holds simultaneously for all queries whose norm is bounded. The achieved coreset size improves on previously known bounds and (almost) matches the coreset lower bound.

Why You Should Care: If you're an AI enthusiast, you know that scaling long-context memory is the current frontier of AI. This tells AI researchers and engineers two important facts. One, they don't necessarily need to invent entirely new attention architectures to solve the memory problem. Two, compression by pruning is mathematically viable and provably effective. This also partly explains the empirical success of recent context pruning techniques.
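The pruning idea above can be illustrated with a toy example: computing single-query softmax attention over a subset of the KV cache and comparing it to the full computation. This naive "keep the top-scoring keys" selection is only a sketch and is not the paper's coreset construction; all vectors and the subset size here are made up for illustration.

```python
import math

def attention(q, keys, values):
    """Single-query softmax attention over lists of key/value vectors."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)  # subtract max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    weights = [w / z for w in weights]
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

q = [1.0, 0.0]
keys = [[5.0, 0.0], [4.0, 0.1], [-3.0, 1.0], [-4.0, 2.0]]
values = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

full = attention(q, keys, values)

# Keep only the 2 keys with the highest dot product against q
# (a crude stand-in for a coreset of the KV cache).
top = sorted(range(len(keys)),
             key=lambda i: sum(a * b for a, b in zip(q, keys[i])),
             reverse=True)[:2]
pruned = attention(q, [keys[i] for i in top], [values[i] for i in top])

err = max(abs(a - b) for a, b in zip(full, pruned))
print(f"max coordinate error: {err:.4f}")
```

Because softmax concentrates weight on the highest-scoring keys, dropping the low-scoring half of this tiny cache barely moves the output; the paper's contribution is making that kind of guarantee hold provably, for all bounded-norm queries at once.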
-
Pinecone reposted this
There have been times while building something when I wanted to fully test an idea but paused because the cost made me hesitate. This is why working on the Builder tier in Pinecone Assistant felt meaningful to me. It's built for that early stage where you're still exploring what something could become.

In that stage, you want to move fast: try ideas, break things, refine them, keep momentum. Not constantly second-guess whether you can afford to continue experimenting or whether you'll hit limits too soon. That hesitation can slow people down or make them hold back on things they'd otherwise try.

Builder tier is our attempt to make that stage simpler: a flat monthly price with clear limits, so you can keep experimenting and building with more predictability while you're figuring things out. I'm really happy to have contributed to something that helps developers stay in that flow and focus on what they're actually trying to create. #Pinecone https://lnkd.in/dB2AGP-S
-
$20/month. Builder: Pinecone's new plan for teams that have outgrown the free tier but aren't running production workloads at scale yet. The prototype-to-production gap has always been an awkward place to be. Builder is built for it. → https://lnkd.in/gYPmaSwF
-
Join us in LA for our meetup on agents!
The last AI Agents Happy Hour was so fun, Julian and I are going to do it again. This time, co-hosted by Pinecone! We want to see what you're building and learn from your experience. Blow my mind, and I'll buy you a beer. Wednesday, May 28th at Gulp in Playa Vista. Tell your agent to put it on your calendar. https://luma.com/tvwl28gz
-
Congratulations to Ilia Feldgun, our new software engineer, who comes to us with a wealth of experience in Cloud and IT. We are excited to have you on the team! Interested in joining our talented team at Pinecone? Check out our opportunities today: https://lnkd.in/d4R_7sN
-
Pinecone reposted this
We had a huge launch week at Pinecone last week! Among the many things we announced, my favs are Full Text Search and the new $20 a month flat Builder tier! A bonus: a new skill, just for full text search: install here! https://lnkd.in/gGMYUsj3 https://lnkd.in/gxaWjFn3 #search #pinecone #agentskills
-
We built an internal AI agent called AskData to answer questions across our data warehouse, Slack, Clay, Gong, and other sources. The vision was great, but the reality was a brute-force nightmare: it took up to two minutes, burned 40,000 tokens, and hit only 68% accuracy just to piece together an answer.

Why was our own agent struggling? We realized we were running agents on systems designed for human beings. Traditional databases hand you a stack of documents and expect you to provide the reasoning. Agents don't have that context, so they brute-force their way through the data, guessing the context inside an expensive LLM prompt.

We couldn't keep pushing raw data into the LLM. We needed to bring the context closer to the data itself, which is one reason we built our knowledge engine, Pinecone Nexus. When we moved AskData onto Nexus, the transformation was immediate:
📉 Tokens: 40,000 → 2,000 (95% reduction)
⏱️ Latency: 2 minutes → under 500 ms
🎯 Accuracy: 68% → well over 90%

If your agents are slow and expensive, the model probably isn't the problem. Our CEO, Ash Ashutosh, sat down with Andreessen Horowitz General Partner Peter Levine for the AI + a16z podcast to tell this story and explain how Pinecone Nexus is rewriting the AI stack. Listen to the full deep dive into Pinecone Nexus and the future of knowledge infrastructure for agents.
👂 Spotify: https://lnkd.in/gquQ_szT
👂 Apple: https://lnkd.in/g4bVZzGb
-
Two new regions, available now:
- AWS eu-central-1 (Germany)
- AWS ap-southeast-1 (Singapore)
See the release note: https://lnkd.in/gjV793rZ
-
Pinecone reposted this
Yesterday, Pinecone had the opportunity to be part of Teradata's Autonomous Intelligence World Tour event, announcing our partnership and upcoming integration in the AI Studio of their new Autonomous Knowledge Platform. What stood out to us about this launch and Teradata's strategy is the company we're in, including Unstructured, WisdomAI, and Karini AI. Teradata isn't trying to build everything itself; rather, it's making deliberate bets on best-of-breed partners and integrating them tightly. Looking forward to what we help customers build together!