
Episode 95

Pinecone is Democratizing Access to Vector Database Technology

Jack Pertschuk, former Principal Engineer at Pinecone, discusses what vector databases are and why they matter for AI and search applications. He also discusses the challenges of communicating the value of this technology when the problem isn’t widely understood.

Transcript

Speaker: you’re listening to a new episode of The Brave Technologist, and this one features Jack Pertschuk, who’s a principal engineer at Pinecone, the market leader in vector databases. As the company’s founding engineer and lead for algorithms, applied research, and platform, Jack is responsible for the roadmap and execution across research and engineering for search index efficiency and accuracy.

Prior to Pinecone, Jack was founder of SidekickQA and creator of the NBoost open source neural ranking engine. He’s an active member of the Rust and information retrieval research communities, and is passionate about solving problems at the intersection of ML and systems. In this episode, we discussed the power and potential of vector databases, the difficulties in promoting a solution when the problem isn’t widely understood, why he is going all in on hybrid retrieval and search, along with advice for teams adopting vector search for the first time.

And now for this week’s episode of the Brave Technologist.

[00:01:00] Jack, welcome to the Brave Technologist. How are you doing today? Doing excellent.

Speaker 3: Thanks for having me. Luke,

Speaker: just to kind of set the table a little bit, I know you’re one of Pinecone’s founding engineers. Can you help our audience understand a little bit, what does Pinecone do and, and what kind of area are you guys like focused on?

Yeah, for sure. So, um.

Speaker 3: Pinecone set about making the sort of cutting edge neural search and database technology for building ML-powered and AI-powered recommender systems, search systems. Like, think about, you know, powering like a Facebook news feed or showing you similar related items when you check out in your Amazon cart, right?

The underlying technology to do this at large scale was previously only accessible to large tech companies. And when we founded Pinecone in 2019, we wanted to basically democratize it and allow anyone to easily build systems based on this fundamental underlying KNN or vector search, vector database [00:02:00] functionality.

Speaker: Cool. What kind of first drew you into this, into this work at the intersection of, like, ML and these systems? So,

Speaker 3: I actually got into entrepreneurship in undergrad my senior year at Cornell. There was a really strong entrepreneurship community, and I started competing in all these contests with some of my friends.

And we started winning, and we just, like, were building, I mean, the pitch competitions. So really it’s a lot more about, you know, having an idea for a company and being able to sell it, compared to actually building it. But this was enough to sort of whet my appetite for, you know, what might be possible with ML technology, database technology.

We got into the pharmaceutical space. We started building tools for searching and indexing data in pharma M&A. So, you know, there’s tons and tons of clinical trial data, massive volume, that, you know, legal assistants and due diligence people spend hours and hours and hours sorting through.

And so we built an ML-based platform for analyzing and searching that data. And as part of that effort, I, you know, came into contact with this neural search [00:03:00] AI technology and deployed it for the first time. But it was really hard. What existed was some open source libraries like Faiss, HNSW, a few things you may have heard of, but to deploy them, you know, me and my buddies, hacking away in his mom’s basement actually, just spent hours and hours spinning up infrastructure.

You hit a lot of, like, out-of-memory errors, for the systems people, because these are very memory-intensive algorithms, and, and it was just a huge pain. You needed to be almost an expert in vector search to build an application based on this. And so then when I came into touch with the Pinecone founders shortly after my graduation, I understood the mission and I understood the value of what they were trying to build.

Speaker: Awesome. So I mean, it sounds like, really, total startup right. Mindset of like, okay. You know, kind of getting the pitch part down, but also a really interesting mix from what I’m hearing. And this is like, I can kind of relate too, working on the production side where you’re hitting these software or hardware limits too, you know mm-hmm.

Kind of gotta get a balance of your hardware chops along with what you’re trying to do with the software. What was one of the biggest early challenges that you guys had [00:04:00] when you were starting Pinecone? Yeah, I mean,

Speaker 3: the biggest challenge was just no one knew about this technology outside of, you know, a very insular sort of data science, ML, big tech community.

Right. It’s, it’s sort of, it’s easy to explain, right? You just wanna find similar content in, like, high-dimensional space. But for folks without a math background, I think understanding the value of it, and why you need, like, a database for this, and, and the sort of applications you can build with it, is difficult.

And it requires like a lot of actual knowledge. You have to train an embedding model to produce these high dimensional representations. And then you have to index them, and then you have to build an application on top of that for end users that actually adds value, right? Like a search application or a recommender or some sort of user product.

And so, having all of these pieces together, there were only a few companies that really did this and pieced it all together. Like, you think, you know, Etsy I think had a big team, Pinterest, like these companies where their core value is really about the [00:05:00] search and the recommender, understood this technology, but other folks just didn’t have the background and context to think about what sort of systems they could build with a vector database. And so, you know, we struggled to sort of communicate that to customers, educate people, and that was a big part of our success as sort of the thought leader in vector databases.

Even just the, the term vector database, it took us so long, so many iterations. We called it neural search at first, and then we thought, oh hey, you know, it’s really more like a database. But then we started calling it a vector database and people were confused ‘cause they thought it was vectorized instructions, which is a different,

right, right, another systems concept. It’s a, you know, vector’s a very overloaded term. We tried calling it an embedding database. We just had all of this sort of like A/B testing around, you know, how do we even talk about this thing? But once we landed on vector database, and people started, you know, using ChatGPT, understanding what AI was and what the value of it was, everything clicked.

And we educated people about how they can use vector [00:06:00] databases to build, you know, a wide variety of AI powered applications.

Speaker: I imagine too, you said you guys started up in, what, 2019? That feels, in this space, like ages ago. Right? And, and like even within the past, I mean, think about the past year, right?

This is the first time in, what, a decade that Google query volume’s actually going down. I would imagine that AI is actually kind of making a dent into how people are searching for things, right? Yeah. Like I would imagine that communicating that, it’s an ongoing thing, right?

Like, how do you communicate the value, but also how is the product-market fit changing, right? Because behaviors are actually starting to change. How much are you guys thinking about that, and, and how much is that impacting what you guys are doing?

Speaker 3: Well, I think there’s, there’s a lot of discussion about, you know, RAG, retrieval-augmented generation.

This is the concept that was really buzzy a year or two ago, and essentially, you know, models had much more limited context windows, where you could only put one or two PDF documents, probably, in the context of the model at that time. So you needed to do an initial search or retrieval [00:07:00] stage to actually find the most relevant content to, to help augment your query or your GPT session, right?

And so, you know, model context windows expanded quite a lot in the last year, you know, million-token context windows. And so folks are increasingly relying a bit more on just putting everything in the, in the model context, but it’s actually not a good approach, right?

we’ve shown in the research that we’ve done, the research that other folks have done, that it’s still much more effective to find just the relevant content, give that to the model. It’s also, of course, way cheaper. You’re not paying for all these tokens and just like stuffing them into the model.

And so, there’s a lot of nuance here. These things kind of go, it’s like a pendulum, it swings back and forth. Everyone’s like, oh, you, you need a whole retrieval stack. Oh, just put everything in the model context window. Really, you know, the truth can often be somewhere in between. But in general, we still view, and I think most, you know, serious people in the space still view, retrieval as a critical part of the, the modern AI stack.

And folks are using more [00:08:00] agentic retrieval methods as well in certain circumstances, where you go, you know, maybe grep for like a specific pattern in your data set and then you give it to the model; if you didn’t find what you want, you go back and grep for something else, and so on. And there’s popular AI tools that, that use this pattern.

But in general it’s much slower and, you know, for, for different applications, a hybrid of both these approaches can be effective. And that’s one of the research areas we’re focusing on here at Pine Cone. And, we have some upcoming products that are gonna, address this, hybrid retrieval scenario where you have both a vector database and you have something more agentic running.

to actually, iteratively look for the content that you wanna find. Mm-hmm. And then, you know, if you don’t find it, go back and sort of look again.
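To make the retrieval-augmented flow described above concrete, here is a minimal Python sketch, assuming a toy in-memory corpus; `retrieve` (naive term overlap) and `call_llm` (a stub) are hypothetical stand-ins for a real vector search and a real LLM API call, so only the retrieve-then-augment shape is meant to carry over.

```python
# Minimal RAG sketch: retrieve only the relevant passages, then put just
# those into the model's context instead of the whole corpus.

def retrieve(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    """Stand-in retrieval: rank passages by naive term overlap with the query."""
    q_terms = set(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda p: len(q_terms & set(p.lower().split())),
                    reverse=True)
    return ranked[:top_k]

def call_llm(prompt: str) -> str:
    """Placeholder for a real model call (e.g. an HTTP request to an LLM API)."""
    return f"<answer grounded in {prompt.count('PASSAGE')} retrieved passages>"

corpus = [
    "PASSAGE: the auth service rotates API keys every 24 hours",
    "PASSAGE: billing invoices are generated on the 1st of the month",
    "PASSAGE: expired API keys return HTTP 401 from the gateway",
]
question = "why am I getting 401 errors from the API gateway?"
context = "\n".join(retrieve(question, corpus))          # only the relevant bits
answer = call_llm(f"Context:\n{context}\n\nQuestion: {question}")
print(answer)
```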

Speaker: So continuously, like, taking small little passes at it, right, and refining on that and kind of getting, Exactly. Versus, versus like eating the whole watermelon in one bite or whatever, you know, which is what it felt like.

I mean, I feel like it’s gotta be an interesting part of this too, where you see these companies with their products and they come out, and, and [00:09:00] everybody kind of picks these arbitrary cost price points for this stuff, and then all of a sudden you start seeing all these, like, $200 a month, $300 a month, right?

Like where you’re like, these costs, it seems like a hard thing to kind of, like, predict what the cost of these things would be, right? And, and how fast the hardware would, you know, iteratively improve, right, and, and become more efficient and cheaper and all that stuff.

I mean, how much on the behavior side are you all seeing things change with use? Or are you seeing it at all? Is it much more like just communicating with developers right now, or are you guys seeing, like, actual changes in behavior from users with this stuff?

Speaker 3: You know, I think end users at the end, like ultimately just want something to work.

Right. They ask a question. They ask for, you know, they’re like, okay, modify my backend to fix this API authentication issue. Right? And so a developer, they don’t care how you’re doing the search, whether it’s a vector database or whatever, they just want, you know, to get the relevant content, the documentation for the API backend, the files that handle authentication, send that to the LLM.

You know, run the, agentic process, edit the [00:10:00] files, find the bug, fix the bug, so on and so forth, right? So, for the end user, in an ideal world, they don’t have to think about retrieval at all. Right? Mm-hmm. The vector database is almost invisible. It’s just something that, magically gets the context that is needed.

And we talk, you know, this term context engineering is getting a little bit more popular, right? Yeah, you’re doing context

Speaker: everywhere. It’s like the new buzzword, right?

Speaker 3: Context is king, right? Context is king. Yeah. And I think, application developers are increasingly realizing that having the right context is, the key to having a good, AI user experience with accurate and grounded answers.

Speaker: And so you kind of touched on it a little bit, like hybrid search, right? And then this cascading retrieval. Can we walk the audience through this, just a flow for this, so they can kind of understand it?

Because I feel like it’s something that’s pretty important for people to understand more. And I feel like I’m feeling it when I use things more, right? But I wonder if you might give a little bit of a breakout of what that means.

Speaker 3: Yeah, so, hybrid and cascading retrieval are sort of different approaches for combining [00:11:00] multiple search methods to, to merge a result set that contains all possible information that you might need for a given, you know, question or interaction with an LLM.

Right. And so what that means is that the semantic meaning and representation of text or a PDF or an image or a document is very well captured by the dense embedding, right? We talk about vector embeddings, right? And with LLMs, they’re essentially latent representations from the inner layers of the model, right?

And so you can think of it as like a snapshot of the model’s brain. And, and so what you do is you, you just sort of take these snapshots and you, you put them in Pinecone or a vector database, and then you, you search through them, right? And these snapshots of the, of the model’s brain represent what actually is going on in that piece of content, right?

If it’s a PDF, an image, or whatever. And so this is a very powerful way to find everything that is conceptually similar to a given, you know, piece of text, or a query. And so this is the dense retrieval component of cascading retrieval, and this is what you, you primarily [00:12:00] use a vector database for.

But sometimes these searches can miss specific context, right? So let’s say you have, like, a name of a function or a file that the model has never seen before. Right? Hmm. It won’t necessarily map cleanly onto a specific, you know, area of the brain. The, the model doesn’t actually know how to perceive or understand this ‘cause it’s something

private, specific to your workspace, specific to your problem. And so it can be helpful to also do an exact text search, like full text matching in a traditional style, with that, mm-hmm, you know, file name or this thing, right? So this is called, like, a sparse retrieval, or a full text retrieval,

or lexical search. What you do for a hybrid retrieval or cascading, the reason we call it cascading, is at the first level you do the, the dense, like I was talking about, with the, you know, the latent representations. And then you also do a sparse retrieval, that’s a full text or exact matching. And then, if something was missed in the dense retrieval or the sparse retrieval, it’s fine.

At the end of the day, you merge them, right? You cascade them up into a second layer pipeline. You merge results, you run them through a [00:13:00] model again. The model tells you, okay, this result from the dense result set is interesting, this result from the sparse result set is interesting, and, and actually that way you don’t miss anything.

You’re maximizing recall is what we call it in the first stage, which means you want to get, you’re, you’re casting a wide net. You wanna get everything that could possibly be interesting. And then in the second stage of cascading retrieval, you maximize for precision. Precision means then you go through and you drop the results that are not interesting, right?

But it’s easier to do that because now you’ve filtered the whole data set, which might be millions, hundreds of millions, billions if it’s web scale, right? You filter that down to a smaller candidate set, and then you can maximize for precision, just weed out everything that’s not actually relevant, and then you have a final result set where you’re guaranteed that you didn’t miss anything.

But also you don’t have anything in there that’s not relevant. You give that to the model. It minimizes the amount of tokens you have to put in, maximizes the speed, and you’re, you know, guaranteed that that is the most relevant content for the model to answer the given query.
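To make the two cascading stages above concrete, here is a minimal Python sketch, assuming a toy in-memory corpus and a hash-based `embed` stand-in for a real embedding model; none of these names come from Pinecone’s API. The first stage casts a wide net (dense plus sparse recall), the second stage reranks the merged candidates for precision.

```python
# Minimal sketch of cascading (hybrid) retrieval over toy in-memory data.
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Hypothetical embedding: hash words into a fixed-size vector (toy only)."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

corpus = [
    "how to configure API authentication tokens",
    "handler_auth.py validates bearer tokens on every request",
    "release notes for the billing service",
]
doc_vecs = np.stack([embed(d) for d in corpus])

def dense_search(query: str, k: int = 2) -> list[int]:
    """First stage, recall-oriented: cosine similarity over dense embeddings."""
    scores = doc_vecs @ embed(query)
    return list(np.argsort(-scores)[:k])

def sparse_search(query: str, k: int = 2) -> list[int]:
    """First stage, recall-oriented: exact term overlap (stand-in for BM25 / full text)."""
    q_terms = set(query.lower().split())
    scores = np.array([len(q_terms & set(d.lower().split())) for d in corpus])
    return list(np.argsort(-scores)[:k])

def rerank(query: str, candidates: list[int], k: int = 2) -> list[int]:
    """Second stage, precision-oriented: rescore the merged candidate set.
    A real system would use a cross-encoder or LLM judge here."""
    q_terms = set(query.lower().split())
    scored = sorted(candidates,
                    key=lambda i: len(q_terms & set(corpus[i].lower().split())),
                    reverse=True)
    return scored[:k]

query = "where is handler_auth.py authentication handled"
merged = list(dict.fromkeys(dense_search(query) + sparse_search(query)))  # cast a wide net
final = rerank(query, merged)                                             # then tighten it
print([corpus[i] for i in final])
```

The exact file name `handler_auth.py` is the kind of token the sparse stage catches even when the dense stage misses it, which is the point made in the conversation above.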

Speaker: It’s fantastic and it all just happens [00:14:00] like super fast.

Exactly. At the end you didn’t even realize, it’s so crazy. Like, it’s so powerful. And I’d love to touch on something a little bit too, ‘cause I don’t have the opportunity to talk to folks like yourself in this detail too often. You mentioned there’s something the model hasn’t

seen before, right? Yeah. What’s the rate at which new things are adapted and, and kind of, like, incorporated into the model’s knowledge, right? So like, let’s say that I’m giving the model something it hasn’t seen before. Is it learning from that one instance, or is there a certain amount of similar things that have to happen before it learns something new?

I think folks might just be kind of curious a little bit how that works. Yeah, so, so that’s a great question. I

Speaker 3: mean, actually I think at this point, my understanding, based on what most of the big labs do, is actually, it’s, it’s a much more offline, static process. So you have, especially for, for pre-training, right,

you have a pre-training run that probably, you know, might take a month to actually run the, the full pre-training on. And these are usually, you know, multi-petabyte, many, many petabyte data sets that are, that are being run through the pre-training phase. And so the model’s knowledge actually just captures a moment in time.

That is usually like months prior. So GPT-5, for example, you know, I don’t know, I’m not even sure when that pre-training happened. And I’m sure that they incrementally update the pre-training a bit, but the snapshots that are running in production today for, for GPT-5 and stuff,

I would be shocked if they, if they, you know, have any sort of innate knowledge that’s more than a month old, like in the actual model weights. And now folks are augmenting those models. And, and I don’t know the exact details of that, but this is actually where RAG comes into play a lot.

Right. Where, okay, the model has basically all the knowledge that was available to it a month ago when we did the pre-training run, right? Mm-hmm. And the model weights, it’s sort of like, it’s like it’s snapshotted there. But then you can obviously, you know, apply some extra information and search to the interactions. Mm-hmm. So when you go ask GPT-5 about what happened today, or like, so on and so forth, it has tools that it has access to that allow it to access the information that’s [00:16:00] more recent. Right? And this is again with retrieval, search, RAG, where the, the point of it is, you know, you don’t always want to have to update the model weights ‘cause it’s expensive and it’s hard.

Right. So rather than, than trying to cram all the knowledge, getting the, the model to try to memorize tons and tons of specific details and information you actually just want the model to be very good at reasoning. Mm-hmm. And have some world knowledge, like some innate basic knowledge about everything.

And then you wanna augment the interaction and give the model all the extra knowledge it needs at query time, inference time, so that it can answer, you know, the question and add the extra context. And, and that’s sort of the point of RAG. Yeah,

Speaker: it’s awesome. I mean, you know, you totally see it too, as people using this stuff.

they used to timestamp this stuff and say like, look I’m only this relevant. Right? Like, yeah. You know, like, and then nobody wants to see that. especially I feel like a lot of this kind of correlates with those people using these prompts more for, you know, everyday search.

But I, I am kind of curious too, you know, there’s a lot of [00:17:00] concerns from, like, content creators, publishers and stuff around this. Do you see tooling and, and databases like what you all are doing being integrated more by publishers and, and content creators to get more functionality and utility to what they’re doing?

Or, you know, is that kind of a separate bucket for you guys?

Speaker 3: So, yeah, I mean, we don’t directly provide public search APIs or anything that would, you know, cover that sort of proprietary IP and content. And we’re not collecting data and training models on it.

So, you know, we haven’t directly dealt with any of those issues. I know, you know, there’s been a few lawsuits and, you know, a lot of talk about it. I think that in general, we, we’re gonna have to answer a lot of questions about, you know, who owns intellectual property in the AI era.

Because yeah, it’s frankly just, like, it’s too hard to attribute and know where everything came from, because it’s, you know, at the end of the day, it’s just, it’s math, right? It’s like big [00:18:00] matrices. And you know, there’s a lot of work around mechanistic interpretability that, that folks are doing to sort of understand what are these models thinking, how do they work?

You know, it answered in this way, you know, this content came out, where did it come from? Why, why did it, why did it come, come out? How do I, like, surgically edit that out? Right? And I know there’s a lot of research around that, but, but yeah, in terms of retrieval, it’s a bit of a, a different problem space.

Speaker: Yeah, yeah. No, no, no worries. I’m just always, always kind of curious. And we touched on this too a little bit. Like, all these retrieval processes are happening, and there’s like a lot of reasoning involved, right? It obviously feels like this stuff is getting better and better, more and more accurate, right?

How do you guys measure when something’s good enough? I mean, I’m sure there are many ways to measure when something’s good enough, but like, you know, for the layperson, right? Or, or somebody that’s using this, how, how do you explain that to them?

Speaker 3: That’s a great question.

I mean, in some senses that is the most important thing for building ML and AI systems, right? Evaluation. Mm-hmm. And we have, we have a bunch of rigorous internal [00:19:00] tooling for evaluation, and we’ve flirted a bit with, you know, productizing it and launching it. And I think that there’s, we’ve worked with some customers one off to sort of help them, you know, evaluate their retrieval system. We have scientists that will, will embed with customers and help them understand, if the search system isn’t returning the results that they want, then why. Knowing when it’s good enough, I think,

most folks have an external business KPI that they’re looking at, right? Mm-hmm. So if you’re building a search system, you’re looking at click-through rate. Are people finding the content that they actually want? If you’re building a recommender system, like, you know, are, are you converting people to adding stuff to their cart, right?

Are people buying more? You know, if you’re building a social media platform, it’s engagement, right? Like, mm-hmm, are people staying? They’re, they’re clicking, they’re watching the videos, right? So there’s not a one-size-fits-all sort of evaluation for a vector database or retrieval application. In terms of LLMs,

and this is where, I think, for folks there is kind of, either it gets the answer right or it doesn’t, right? There’s a clear way to evaluate it, right? And so we have data sets that are synthetically generated for, you know, specific applications like finance, right?

And there, there’s, you know, data sets out there, like FinanceBench is one, and a few others that are, you know, human curated, where you have a set of questions, you have a set of documents, and then you have a rubric, essentially, right? Mm-hmm. And, and this is one of the things that a lot of folks in AI are, are investing in heavily right now.

Mercor, right, is, is very, you know. They just passed, I think, 500 million in revenue or something crazy, just based on building these data sets, hiring people hourly to grade, design rubrics, you know, generate data to evaluate LLMs. Hmm. And so, once you have that data, you kind of look at two things, right?

If an LLM got the answer to a question wrong, did it not have the relevant context? Right? And again, this comes back to sort of context engineering. Where, if it didn’t have the relevant context, then you have a problem in your retrieval stack. That means maybe you [00:21:00] need an approach like hybrid retrieval.

You’re, you’re missing results, so you need to add lexical search, or, you know, your dense model is not performing like it should. And so then you, you go in, and there are common metrics in information retrieval, which is the field of research. You look at MRR, mean reciprocal rank, which is, is a way of measuring essentially how far up in your search results the, you know, correct, most relevant one is, right?

Most relevant one, right? and these sort of metrics people look at and you might say, oh, your, your MRR is really low, or Your, your recall is really low. You’re actually, you’re missing this document. And that’s why the LLM got it wrong. and that’s one of the reasons people choose Pine Cone ‘cause we also optimize for recall within this approximate nearest neighbor space, when you’re searching the dents and beddings, it’s too expensive to do, you know, a full scan of your data set right at at runtime.

It’s gonna take a really long time. So we build indexes to optimize that. But those indexes are approximate, so they’re approximate nearest neighbors. It means, you know, you might miss maybe 5% or less of results, but oftentimes those [00:22:00] 5% are not even gonna be necessarily the ones that are at the top.

And so what we optimize for in building Pinecone, in these algorithms, is, okay, are you getting close to a hundred percent of the context results that you need to give the LLM to answer the question properly, right? And then if the LLM has all the context, your retrieval stack is actually performing as expected.

The metrics, the MRR, the recall, you know, you’re not missing any documents in, in your search. You’re giving the LLM what it needs, but it’s still answering it wrong. Okay, then it’s a reasoning problem, right? Mm-hmm. Then it’s an LLM problem, where, you know, you need to do more RL on the model, or train it, fine-tune it in some way to improve the accuracy.
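A minimal sketch of the two metrics mentioned here, recall@k and MRR (mean reciprocal rank), assuming you already have ranked result lists per query and a hand-labeled set of relevant document IDs (a hypothetical evaluation set):

```python
def recall_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

def mrr(all_ranked: list[list[str]], all_relevant: list[set[str]]) -> float:
    """Mean reciprocal rank: average of 1 / (rank of the first relevant result)."""
    total = 0.0
    for ranked_ids, relevant_ids in zip(all_ranked, all_relevant):
        for rank, doc_id in enumerate(ranked_ids, start=1):
            if doc_id in relevant_ids:
                total += 1.0 / rank
                break
    return total / len(all_ranked) if all_ranked else 0.0

# Toy example: two queries, each with its ranked results and relevant set.
ranked = [["d3", "d1", "d7"], ["d2", "d9", "d4"]]
relevant = [{"d1"}, {"d4"}]
print(recall_at_k(ranked[0], relevant[0], k=3))  # 1.0 -> nothing missed in the top 3
print(mrr(ranked, relevant))                     # (1/2 + 1/3) / 2 ≈ 0.42
```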

Speaker: It’s wild, man. It’s almost like you’re kind of putting, like, mini SERPs in everything, like, you know. Because I’m thinking about it as you’re talking, I’m kind of thinking about how summarizations have evolved too, to where, like, you know, initially just kind of breaking down

a few bullets of something, to now you’re seeing these richer and richer data and text and images and, and all [00:23:00] this stuff kind of coming together. And at the same time you’ve got agentic, which is like a buzzword in itself, right? Where it’s like, it could mean so many different things to different people, right?

And different businesses, right? Where it’s like, gosh, if you’re moving off of click-through rates or something, well, what happens if a gen AI product, you know, insertion or something just makes it irrelevant to even need to advertise the thing anymore, right? It’s just so wild.

Like how wild west the space kind of is right now, but also just like how robust what you all are doing actually is, and, and how it just feels natural to the end user. Or even if it misses, I can just follow up with a quick, you know, no, lemme steer you this way a little bit, you know, and, and then get, get that follow-up.

It’s really interesting. What innovations that are happening in vector search today are most interesting from your point of view?

Speaker 3: I think hybrid retrieval is probably the area where, you know, the most gains are being made. ‘Cause a lot of folks originally built sort of dense retrieval applications, and they realized that it, you know, wasn’t exactly returning all [00:24:00] the results that were needed. Like I, you know, mentioned before, you’re sort of missing out on those benefits of cascading or hybrid retrieval.

And so, you know, actually building hybrid indexes with a full sparse component and a full dense component, and merging those results in an optimal way at sort of instant latency, is actually hard. That’s like a, a really non-trivial problem. You know, because you don’t necessarily know, you know, how many dense results, how many sparse results. These are sort of like two different indexes, and you, you retrieve them and then, you know, for merging there are a lot of different algorithms for that.

These are sort of like two different indexes and you, you retrieve them and then, you know, merging there. a lot of different algorithms for that. and so, you know, one area I would like, there’s been a little bit of research in this, and, and this is an area I’d like to, to invest in more is training a model for hybrid retrieval in and of itself, right?

So right now you have models for sparse, or you can use like TF-IDF, BM25, like, traditional full text matching. You have models for, for dense, right? And there are sort of approaches like SPLADE, which produces like word-piece tokenized sparse embeddings, which are kind of, somewhere in between, essentially, like, sparse and dense.

And these models have been shown to, to be really effective. And they, and they do sort of document expansion and summarization as part of the indexing process, query expansion as well. These are techniques that are being researched and are very promising.

So I think like, you know, being able to train one model that produces both a sparse and dense embedding, is something that folks haven’t really been doing in production yet. And, and I think that would be really powerful because then you sort of automatically get like a, a proper merging of these results.

You get the best of both worlds, and you don’t have to kind of, right now if you’re doing hybrid, you have to, like, tweak some parameters, right, and sort of use your heuristic or guess. You’re like, oh, I’m gonna give like 0.2 weight to my dense and 0.7 to the sparse, or something like this, right?

And we see folks, we call it alpha, right? This, like, blend parameter. And so, you know, if you could train a model that actually just sort of had that alpha baked in and optimized for, and like learned that sort of way to blend the sparse [00:26:00] and dense results in hybrid retrieval, that would be a really powerful step forward.

So, you know, I, I wanna see, I hope to see some folks publish on that soon.
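For reference, a minimal sketch of the alpha blend described above, assuming dense and sparse scores have already been normalized to comparable ranges; the score values and the alpha here are made up for illustration, and the learned model Jack describes would effectively replace the hand-picked alpha.

```python
def hybrid_scores(dense: dict[str, float],
                  sparse: dict[str, float],
                  alpha: float = 0.5) -> dict[str, float]:
    """Convex combination per document: alpha * dense + (1 - alpha) * sparse."""
    doc_ids = set(dense) | set(sparse)
    return {d: alpha * dense.get(d, 0.0) + (1 - alpha) * sparse.get(d, 0.0)
            for d in doc_ids}

dense_scores = {"doc_a": 0.91, "doc_b": 0.40}   # semantic similarity (hypothetical)
sparse_scores = {"doc_b": 0.80, "doc_c": 0.35}  # lexical / BM25-style, pre-normalized
blended = hybrid_scores(dense_scores, sparse_scores, alpha=0.2)  # weight toward sparse
for doc_id, score in sorted(blended.items(), key=lambda kv: -kv[1]):
    print(doc_id, round(score, 3))
```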

Speaker: No, that’s awesome. No, that’s super interesting. And also, kind of given your background too, like, this kind of coming from startup and also from the engineering side.

Is there any advice that you’d give teams that are adopting vector search for the first time? You know, maybe they, maybe they specialized in something else initially. Because this is, this is something where, like, I don’t know, I, I’ve been in startups too, like, been dealing with this stuff for a long time.

Like, you rarely see something new that has this kind of C-level, Fortune 1000 mandate to integrate, and also being this powerful. And like, I don’t know, are there any pointers that you’d give folks that might be considering adopting this for the first time?

Speaker 3: Yeah. I mean, I think if you don’t adopt it, your competitors will, right. Good point. So, so, you know, it’s kind of do or die, right? We’ve seen a lot of companies get really meaningful bumps in revenue from, you know, going whole hog into [00:27:00] AI. I think Databricks just had like a, a banner year. There’s clearly ROI as, like, a digital-native SaaS or database or, or, or any company that has a product that could benefit from AI.

You know, you are gonna get ROI if you deploy it properly, right? Mm-hmm. Like, we’ve seen that. Mm-hmm. And it doesn’t mean it’s easy, right? Mm-hmm. It doesn’t mean it’s easy at all. It doesn’t mean it’s intuitive. Evals are key, key, key, right? Come up with a metric that you wanna optimize for, what is the problem, and be realistic.

AI is not a panacea, right? This is not a, a silver bullet. This is not gonna cure revenue problems forever, like magic. You can’t just wave a wand. And so you gotta be realistic. But if you, you know, if you have metrics, specific metrics that you want to, to push in a specific way to measure the impact, and a realistic

assessment, a hypothesis, that the AI technology, the retrieval technology, will push those metrics in a meaningful way, you can be very successful and you can add a [00:28:00] lot of ROI compared to your competitors.

Speaker: That’s awesome. Is there anything we didn’t cover that you want our audience to know about, what you’re doing or, or stuff in general?

Speaker 3: We’re only at the beginning. That’s what I’ll say. This is awesome. This is the tip of the iceberg. You know, when we started Pinecone in 2019, it was really, really early. It, it was arguably too early. Right. But you’re, you’re too early until you’re too late. Mm-hmm. And, and I think that’s one word of wisdom

I, I’ll give to people. You know, you might feel like the AI models aren’t there yet. You might feel like, oh, the way I was doing it before is fine. But the technology will only get better, and if you just stick with your legacy methodologies, you’re gonna get left behind.

Mm-hmm.

Speaker: Great note to end on, man. Uh, finally, uh, where can people check out what you guys are doing at Pinecone, or, if you’re publishing stuff or are active on social, where can people find you?

Speaker 3: Yeah, for sure. You can follow me, Jack Pertschuk, on Twitter.

First name, Jack, [00:29:00] then P-E-R-T-S-C-H-U-K, with an underscore in between. And you know, Pinecone is constantly putting out new blogs. We have a, a Discord community that’s just launched. And so, you know, join, come hang out, ask us questions. We love to chat. And stay tuned, because as I said, it, it is just the beginning.

Speaker: Awesome, Jack. Thank you so much for being so gracious with your time. I, I really enjoyed this conversation. I’d love to have you guys back on to kind of check back in on things as, as we move from the beginning to, you know, down the road, right? But thanks. This has been like really awesome.

Really appreciate you making the time to come on today. Sweet. Likewise. Thanks Luke. Alright, cool. Thanks. Bye.

Speaker 2: Thanks for listening to the Brave Technologist Podcast. To never miss an episode, make sure you hit follow in your podcast app. If you haven’t already made the switch to the Brave Browser, you can download it for free today at brave.com and start using Brave Search, which enables you to search the web privately. [00:30:00]

Brave also shields you from the ads, trackers, and other creepy stuff following you across the web.

Show Notes

In this episode of The Brave Technologist Podcast, we discuss:

  • Advice for teams looking to adopt vector search technology
  • The importance of hybrid and cascading retrieval methods
  • The significance of context engineering and evaluation metrics in AI systems
  • How Pinecone is democratizing vector database technology for AI and machine learning applications

Guest List

The amazing cast and crew:

  • Jack Pertschuk - Former Principal Engineer

    Jack Pertschuk is a former Principal Engineer with Pinecone, the market leader in vector databases. As the company’s founding engineer and lead for algorithms, applied research, and platform, Jack was responsible for the roadmap and execution across research and engineering for search index efficiency and accuracy. Prior to Pinecone, Jack was a founder of SidekickQA and creator of the NBoost open source neural ranking engine. He is an active member of the Rust and Information Retrieval research communities, and is passionate about solving problems at the intersection of ML and systems.

About the Show

Shedding light on the opportunities and challenges of emerging tech. To make it digestible, less scary, and more approachable for all!
Join us as we embark on a mission to demystify artificial intelligence, challenge the status quo, and empower everyday people to embrace the digital revolution. Whether you’re a tech enthusiast, a curious mind, or an industry professional, this podcast invites you to join the conversation and explore the future of AI together.