Episode 7

Building Your Own LLM

Greg Diamos, co-founder of Lamini, shares how their discovery of the scaling laws recipe led to the rapid evolution of large language models and inspired Lamini’s product offering. He also discusses his message for policymakers, including what we should be worried about and what pitfalls we should work to avoid.

Transcript

[00:00:00] Luke: From privacy concerns to limitless potential, AI is rapidly impacting our evolving society. In this new season of the Brave Technologist podcast, we’re demystifying artificial intelligence, challenging the status quo, and empowering everyday people to embrace the digital revolution. I’m your host, Luke Malks, VP of Business Operations at Brave Software.

[00:00:21] Makers of the privacy-respecting Brave browser and search engine, now powering AI with the Brave Search API. You’re listening to a new episode of The Brave Technologist, and this one features Greg Diamos, the co-founder of Lamini, the enterprise LLM platform for building and owning LLMs. He is also the co-founder of MLPerf, the industry-standard benchmark for deep learning performance.

[00:00:42] Greg holds a PhD from Georgia Tech focusing on high performance computing, and in this episode, we explore some topics you’ll learn a lot from: why you should build your own LLM, how scaling laws have driven the evolution from GPT-1 through 2, 3, 4, and so on, and why this is the best time for software engineers to [00:01:00] enter the space.

[00:01:01] And now for this week’s episode of The Brave Technologist. Hey, Greg, welcome to The Brave Technologist podcast. How are you doing today?

[00:01:12] Greg: Doing great. Great to be here, Luke.

[00:01:14] Luke: What’s your involvement in AI? Kind of, what are you building, and what got you into it?

[00:01:19] Greg: I am currently co-founding Lamini, which is an enterprise platform to help companies, like any business, fine-tune their own large language models, really take something like ChatGPT and make it their own.

[00:01:29] And we’re providing a platform that enables companies to do this really easily. So if you’re a software engineer and you just want to be able to build something quickly and deploy something quickly, we’re a platform that makes that easier.

[00:01:40] Luke: So when you say making it their own, is it around kind of like, a lot of these data sets seem like they make the web kind of look the same to everybody, right?

[00:01:47] Like, can you drill down into that a little bit? I’m just kind of curious, my own curiosity on that.

[00:01:51] Greg: Yeah, definitely. What does it look like to make it your own? Really, we’ve found, especially with enterprises, that they’ve accumulated a huge amount of data over time, so [00:02:00] especially companies that have been in business for a while.

[00:02:01] You have operational data. You’ve probably taken it and put it maybe into a relational database, maybe a data warehouse, maybe a data lake. You’ve accumulated a lot of institutional knowledge in this data set. What we’ve found, actually, is that language models are a really interesting technology.

[00:02:14] A while ago, we actually discovered something really special about language models. So we discovered a kind of technology that’s called scaling laws. And essentially, what scaling laws are, just to kind of boil it down, is a recipe that turns data into intelligence. So if you’ve played around with, for example, ChatGPT, that model has basically downloaded all of the information that’s available on the internet.

[00:02:34] Like, imagine a web crawler; it’s crawled the entire internet. It took all of that data, and then, through this scaling law recipe, basically converted it into ChatGPT, which is this highly intelligent, very knowledgeable system able to do things like reasoning. It turns out that technology is just very mature.

[00:02:50] It’s very reliable. So if you’re any company and you have, you know, significant data. Maybe one of my favorite examples was a company that makes adhesives. They make, you know, [00:03:00] some of the best adhesives, and they understand exactly which materials bond to each other. You can take all of that knowledge, all that institutional knowledge, all of your patents, all the things you’ve invented, and just embed that knowledge in this intelligent agent, which is just much more accessible, much easier to use, much more flexible, you know, than a giant pile of documents in a data center.

[00:03:18] So that’s really what we do. We provide a computational engine that allows, you know, companies to do that easily rather than having to build the entire thing from scratch.

[00:03:25] Luke: You guys must be busy.

[00:03:30] Greg: We’ve been doing this for a while. We were doing this even before the launch of ChatGPT, but I think we knew it was going to work. We’d seen it working. It was very popular amongst, I would say, AI researchers, but it just wasn’t very popular amongst casual users, or amongst software engineers, or amongst other enterprises.

[00:03:45] And just after ChatGPT, everyone was like, wow, it actually does work. And I can see that it works, and I can use it, and I want it. So basically, my entire life these days is just onboarding people.

[00:03:58] Luke: Wow. That’s awesome. [00:04:00] I mean, because it does seem like we’re kind of in that phase now where it’s like, okay, here are the parlor tricks and what this thing can kind of do, at a presentation-style level, and people talking about it everywhere.

[00:04:10] But now it’s like, okay, it’s got to get everywhere, right? Like, and you’ve got, you know, businesses at enterprise scale kind of trying to figure out, how can we work this into our systems, and where can we find those kinds of efficiencies, right? Like, is that kind of what you guys were seeing too, from your end of it?

[00:04:24] Greg: Yeah. Many people come to us and say, I have 40 different use cases that we’re currently exploring internally. You know, we have a few that we’ve already deployed. It’s really easy to use. Like, you know, it’s so much easier than the last generation of machine learning, like deep learning, because with deep learning, you had to get, like, the entire, you know, infrastructure stack, and you had to do, like, the whole MLOps lifecycle, and you had to do, like, data labeling, like supervised learning.

[00:04:45] And it’s just really time-intensive to do anything. And with a large language model, you plug in an API, and you write a little prompt, and then you deploy it. And, you know, any software engineering team that we’ve seen can do that in less than a week. So it just [00:05:00] leads to this explosion of creativity.

[00:05:02] Luke: There are different ways to build LLMs, like prompt engineering and fine-tuning. Like, why should companies or people that are playing with this stuff use them?

[00:05:09] Greg: I think you should use all of the above. I think we’ve found a flow that works pretty well, you know, as people are building on language models. So usually, you know, the starting point, as I was saying, is you just wire it up.

[00:05:20] Just connect it to an API. It’s just an endpoint, you know; query it. You have maybe a chat interface, maybe an agent. You want it to generate some content, you want it to call an API for you. You know, you can just form a request and submit the request to an existing model. The existing models, you know, out of the box, the foundation models, are pretty good.
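The "wire it up" step Greg describes can be sketched in a few lines of Python. The endpoint URL, model name, and response shape below are illustrative placeholders, not any particular vendor’s API:

```python
import json
import urllib.request

# Hypothetical endpoint, for illustration only.
API_URL = "https://api.example.com/v1/completions"

def build_request(prompt: str, model: str = "example-model") -> dict:
    """Package a prompt into the kind of JSON body most LLM APIs expect."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,  # lower temperature -> more deterministic output
    }

def query_llm(prompt: str) -> str:
    """Send the request and return the generated text (network call)."""
    body = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # Show the payload that would be sent, without making a network call.
    payload = build_request("Summarize our Q3 adhesive test results.")
    print(json.dumps(payload, indent=2))
```

The point is how little there is: a request body, an endpoint, and a response field, which is why a team can deploy a first version in under a week.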

[00:05:39] So the response that you’ll get back is going to be, you know, reasonably good. And then from there, you know, it is machine learning, right? So an important thing about machine learning is it’s not always right. You know, machine learning, as the mathematicians say, is built on a foundation of statistics, so it’s going to be right statistically.

[00:05:54] Like, you might be able to say, oh, the system will get 90 percent of the [00:06:00] answers right. But it just doesn’t make sense at all to ever say it will get 100 percent, for sure. And then another way of understanding this that I’ve found useful is that it’s a learning system. So it’s like a toddler, you know? Like, I don’t know.

[00:06:10] I have, like, a little kid. He’s, like, five years old, you know. You can ask him to do things, you know; I can ask him to go put his shoes away. It doesn’t always happen, you know, but he’s going to try. Every day, he’s getting a little better.

[00:06:20] Luke: I’ve got a five and a seven year old. I can relate. It’s, yeah, a lot of repetition training them on it.

[00:06:26] And I mean, so much of this, that’s what’s interesting about this. Like, it’s never complete. Like, it’s never, it evolves just like people do with learning inputs and all that stuff. I can imagine the variety of things that you see in your day to day, right? Like working with these enterprises and on these different use cases.

[00:06:42] Like, is there one feature that you aren’t seeing that you would love to see, or that you’d love to see get more attention, that folks that are listening might be interested to hear about?

[00:06:52] Greg: Oh, sure. Yeah, a feature that we’re just not seeing enough of, or that should be getting more attention. I mean, I do think the process of teaching the model is still [00:07:00] somewhat of a black art. Like, just like, you know, it starts out as a toddler; eventually you want it to, you know,

[00:07:04] graduate from high school and go to college and maybe get a PhD or something, you know. Like, it would be nice if it actually continued learning in that way. I think that’s the piece that, you know, we see a lot of people struggle with. You know, it’s easy to do a little bit of learning, but only a little bit of learning.

[00:07:20] Usually, the tools that people use are prompt engineering. So prompt engineering is writing, like, you know, a human-level, paragraph-length English description of what you want the model to do. It’s really funny. You can also ask the models to behave like particular people, so you can say, don’t answer it like a five-year-old; answer it like a Harvard law graduate, you know, who has 10 years of case law experience. You write a prompt.

[00:07:43] And that’s pretty easy. Like, and then you maybe iterate on it a little bit, you debug it, you know, you ask it some questions, you try and, you know, evaluate it. Like, maybe you put it in front of users, you get some feedback. The prompt engineering thing is usually pretty easy. And that’s what we see people doing first to teach the model.
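The persona trick Greg mentions is just string construction before the request goes out. A minimal sketch; the template wording and helper name here are invented for illustration:

```python
def persona_prompt(persona: str, task: str) -> str:
    """Prepend a role description so the model answers 'in character'."""
    return (
        f"You are {persona}. "
        "Answer in that voice, citing relevant experience where helpful.\n\n"
        f"Task: {task}"
    )

# The example from the conversation: a Harvard law graduate persona.
prompt = persona_prompt(
    "a Harvard law graduate with 10 years of case law experience",
    "Review this licensing clause for ambiguity.",
)
print(prompt)
```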

[00:07:59] But then [00:08:00] it usually runs out of steam. We’ve seen people spend, like, four hours writing out this page of a prompt. And on one hand, the model starts being a little forgetful. Like, even if the prompt gets really long, it’s kind of just a lot for the model to digest all at once.

[00:08:13] So it doesn’t really take all of it to heart. But then also you’re like, do you really want me to write 10 pages, you know, or like 50? Sure, sure. Yeah. So we see people kind of running into roadblocks with that. They’ll often add what’s called RAG on top of that, retrieval-augmented generation. Like, well, maybe the model isn’t answering a question correctly because it just doesn’t know what the fact is. For example, in the news right now, Sam Altman just got ousted from OpenAI as the CEO. Probably ChatGPT doesn’t even know about that. But with a retrieval system, you could do a search query, you could pull that news article, and you could feed that into the model.
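The retrieval step can be sketched end to end with a toy scorer. Real systems use embedding similarity and a vector index, but the shape of RAG, retrieve first, then stuff the results into the prompt, is the same:

```python
def score(query: str, doc: str) -> int:
    """Toy relevance score: count of query words appearing in the document."""
    words = set(query.lower().split())
    return sum(1 for w in doc.lower().split() if w in words)

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents that best match the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def rag_prompt(query: str, docs: list[str]) -> str:
    """Feed retrieved context to the model alongside the question."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using the context."

# Two toy documents; only one is relevant to the query.
docs = [
    "Sam Altman was ousted as CEO of OpenAI.",
    "Adhesive bonding depends on surface energy.",
]
print(rag_prompt("Who was ousted from OpenAI?", docs))
```

This is exactly the "pull that news article and feed it into the model" flow: the model never needs to have the fact in its weights, only in its prompt.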

[00:08:54] Luke: Yeah, I mean, that’s one thing too. I mean, like, just from a business perspective, like, you know, we have an index, and we’ve been plugging this in to get more of [00:09:00] those kind of near-real-time human results, right? But, like, it definitely seems like if you’re just using one set for everything, you’re basically either getting

a little too much of everywhere, or you’re not getting enough info, too. You hit that limit that you’re talking about, right? Whether that’s, like, time-sensitive, because it only goes back to a certain date, or it’s just not as relevant, right, like, to what your focus is, the thing you’re trying to find out, right? Like, it seems like there’s a ton of road to be taken there by folks in the space. Is that fair to say?

[00:09:31] Greg: Yeah, definitely.

[00:09:31] It’s almost like, right now we have tools that are really easy to use, but then they tap out really fast. They tap out, you know, within, like, a week, or, you know, just a little bit of playing around with them. And so it’s hard to go deep. It’s hard to continue to fix problems. If you’re deploying something in production, you get a lot of errors; you get a lot of potential mistakes the model is making.

[00:09:50] It’s like, what do I do with all of that? So traditionally, we have seen fine-tuning, and then, you know, also more advanced technologies like domain adaptation, and also training from scratch. Like, these are the [00:10:00] things that are in the realm of, like, OpenAI would do that, Anthropic would do that, Meta would do that.

[00:10:05] They’re not very accessible. So, like, if you’re, you know, not one of those companies, you don’t have a big existing machine learning engineering team. We see, on one hand, it’s missing infrastructure, and on the other hand, it’s really error-prone. It’s really easy to mess up the training recipe and just totally wipe out the model, and it just doesn’t work at all.
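The "mess up the training recipe and wipe out the model" failure mode has a one-variable analogue: gradient descent with a learning rate that is too large diverges instead of converging. This toy is not fine-tuning itself, just a minimal demonstration of the instability Greg is pointing at:

```python
def train(lr: float, steps: int = 50) -> float:
    """Fit w to minimize (w - 3)^2 with plain gradient descent.

    Returns the final loss. A well-chosen learning rate converges;
    a too-large one overshoots on every step and the error explodes.
    """
    w = 0.0
    for _ in range(steps):
        grad = 2 * (w - 3)   # d/dw of (w - 3)^2
        w -= lr * grad
    return (w - 3) ** 2

good = train(lr=0.1)   # converges toward zero loss
bad = train(lr=1.1)    # each step amplifies the error instead
print(f"loss with lr=0.1: {good:.2e}, loss with lr=1.1: {bad:.2e}")
```

With billions of parameters instead of one, the same kind of mis-set hyperparameter can destroy a model that took weeks to train, which is why these recipes stayed in the hands of a few labs.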

[00:10:26] Luke: Is this where you see kind of things like open source playing a big role in helping to accelerate stuff?

[00:10:30] Greg: Yeah, I think the acceleration from open source has been great. I love what Meta has been doing, like releasing Llama, releasing a good starting point. So at least people don’t have to buy a 10,000-GPU cluster to, you know, even play at all.

[00:10:44] They can start from, you know, Meta’s cluster, start with Llama. I really love the innovation in open source. Like, one of the technologies I really like is called instruction fine-tuning. This is the technology that actually turned GPT-3 into ChatGPT. Oh, okay. [00:11:00] The joke I like to make about it is that GPT-3

[00:11:02] was fun for researchers, but it used to basically, like, complete the sentence. So if you ask it a question, it’ll expand on the question, or ask another question, instead of answering the question.
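Instruction fine-tuning works by retraining a base model on (instruction, response) pairs rendered into a fixed template, so it learns to answer rather than continue the text. The template below is illustrative; real projects each define their own:

```python
# A base model trained on raw text just continues whatever it sees.
# Instruction tuning retrains it on examples shaped like this, teaching
# it that an instruction should be followed by a response.
def format_example(instruction: str, response: str) -> str:
    return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"

pair = format_example(
    "What is the capital of France?",
    "The capital of France is Paris.",
)
print(pair)
```

Thousands of such pairs, used as fine-tuning data, are what flipped "expand on the question" behavior into "answer the question" behavior.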

[00:11:16] Luke: Just coming from the startup space, right? Like, and especially because, you know, a lot of my background is both in ad tech and the Web3 side of things, where it’s like, easy to use has been the biggest hurdle, right? Like, but with AI, it seems like, okay, we’ve got the inverse of that, right? Like, you’ve got some things that are actually easy to use. Building in that complexity, and enriching that on the back end to make it more useful, is a different set of problems, but a cool set of problems too.

[00:11:40] I think from an engineering or product development kind of point of view, the iteration and the acceleration of this is moving at a really fast tempo. Where do you see this having the biggest kind of fundamental changes in an everyday person’s life or work life in the next year or a few?

[00:11:56] Greg: Yeah, I think it’s just gonna become pervasive. It’s gonna be adopted everywhere, [00:12:00] and it’s gonna keep getting smarter. And I just want to go back a little bit in time to tell a quick story of why I think that. Okay, so rewind back to, like, 2015. We’re basically deploying deep learning systems at Baidu, like, you know, the internet search engine in China. We actually invented,

[00:12:17] or discovered, I actually think of it as more like a scientific discovery, what are now called language model scaling laws. I mentioned this previously, but language model scaling laws, again, they give you this simple recipe. It’s like, if I just follow this very simple recipe, I know how to make the language model keep getting better.

[00:12:33] And we’ve been following that recipe since then. So it’s basically that thing. People in technology might think of an analogy like Moore’s law. I think of it as almost like Moore’s law: there is a very predictable, very repeatable process that we can use to make this better, and we can quantify how much it’s going to get better and how much effort we have to put in, and so we can make projections and plan around it.

[00:12:55] I think that’s going on right now. Like, that applies to language models. That’s how we [00:13:00] got from, like, GPT-1 to GPT-2 to GPT-3 to ChatGPT and GPT-4. The thing that I think is so fascinating about it is that it’s not just engineering or serendipity or something. There’s actually, like, a statistical law that’s driving it.
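The "statistical law" here is usually written as a power law: loss falls predictably as a power of model size (and, in the published fits, similarly for data and compute). A sketch, with constants in the rough ballpark of published scaling-law papers but intended only as illustration:

```python
# Scaling laws fit loss as a power law in parameter count N:
#     L(N) = (N_c / N) ** alpha
# The constants below are illustrative placeholders, not exact fits.
def predicted_loss(n_params: float, n_c: float = 8.8e13, alpha: float = 0.076) -> float:
    return (n_c / n_params) ** alpha

for n in (1e8, 1e9, 1e10, 1e11):
    print(f"{n:.0e} params -> predicted loss {predicted_loss(n):.3f}")
```

The practical power of the formula is exactly what Greg describes: given a target loss, you can solve for the model size and budget needed to reach it, before training anything.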

[00:13:14] And there’s nothing about that law that we understand that says it’s going to stop. So we can just keep doing the same thing and all the models are just going to keep getting smarter. One of the things I just think is like absolutely wild about it is that it’s not just that they get better at the existing tasks.

[00:13:31] You also see the emergence of new tasks. So, like, back in 2015, you know, the models kind of struggled with spellcheck; you know, they were kind of moving on to grammar. Sure, sure. I had a researcher at Baidu, this was in 2018. He basically had one of these models when it was first learning how to write code.

[00:13:49] And so he showed me this example. He’s like, oh, the language model, it finally wrote a Fibonacci function. I’ve been asking it to write a Fibonacci function every day for the last [00:14:00] six months, and it finally got it right. And he was so happy about that. At the time, I was kind of like, come back when it actually produces something useful. But in hindsight, it was the emergence of a new capability.
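For reference, the function the researcher kept asking for is only a few lines of Python, which is what made the six months of failed attempts so striking:

```python
def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number (0, 1, 1, 2, 3, 5, ...)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print([fibonacci(i) for i in range(10)])  # → [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```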

[00:14:16] Luke: Yeah. And I know, like, the minute I see an engineer or a researcher getting that excited about something, you know that it’s just a matter of time before it starts to pick up steam and find that right fit. So interesting about right now, I mean, like, to go back to the scaling laws, right? Like, what I’m seeing right now is kind of a mix between, you’ve got these large language models that you’re sourcing into, but also companies are starting to play around with more local models. I mean, like, coming from a browser, right?

[00:14:42] I mean, like, you kind of have this super tight corpus of just a user’s personal information all in one spot, right? Like, which is kind of a rich thing. But, like, balancing these things out and kind of augmenting them together in a learning system is super interesting. And I’m just super excited to see where that [00:15:00] goes too.

[00:15:00] But I think that it’s mind-blowing, like, the stuff could just keep going and going. It’s basically at the rate at which we can improve compute, right? Like, is that kind of what our only limitation is?

[00:15:08] Greg: Yeah, it’s the rate at which we can improve compute. But then sometimes people say, hey, but we’re hitting the end of Moore’s law.

[00:15:14] So what is going to happen? I just don’t believe it, because I think it’s not just how much faster can you make this adder circuit or something, or how much faster can you make this, like, Pentium-architecture CPU. It’s actually a completely different type of application. And so, since we’ve reoriented around that new type of application, there are so many other optimizations that have become

[00:15:38] possible that just didn’t apply at all. Like, if I’m trying to build a computer to run the web browser, except for the language model part of the web browser, you know, the, like, rendering on the screen, or the processing of requests, like doing the network stack, that kind of thing, it actually is pretty hard to eke out performance for those things that we’ve known about for a long time. But this is a completely new workload.

[00:15:57] So, like, as soon as we switched, you [00:16:00] immediately saw not 20 percent kinds of performance improvements, but, like, 20x or 30x. One technology that we worked on at Baidu was called the tensor core. We had a partnership with Nvidia. We kind of worked on what’s now called the tensor core.

[00:16:14] That one change was an 8x improvement in efficiency. Wild.

[00:16:20] Luke: And then there’s this whole kind of depth to it too, right? The learning element, right? Like, where it’s like, okay, you know, the functions are the functions, but also, like, what’s coming out, the quality, and where that person is sitting, and, you know, all this stuff.

[00:16:31] It’s just so early, it feels like, right? Like, where it’s like, okay, I know it gets better for English pretty quick, right? Like, but even, I mean, we’re kind of seeing this too, where it’s like, you’ve got a global product, and then you release something that’s heavily trained on the English language, right? And you’re like, well, when’s German going to get better?

[00:16:48] And you’re like, well, just keep using it over time.

[00:16:51] Greg: I mean, yeah, right. Exactly. So I think it will get better. I think you’re exactly right. I mean, I think because most of the data [00:17:00] is in English, the models are best in English. It’s also, like, so computationally intensive to train these models, even though we might have more data

[00:17:10] that might cover other languages. Like, if you’re someone who’s building a language model, you have this hard choice of, do I really want to train on a bunch of German data that might take a lot more time, like, a lot more resources, and the model might have to be even bigger? Yeah, so actually, we’ve seen one thing.

[00:17:23] I’m not sure how pervasive this is going to be, but I’ve actually seen governments, like, basically entire countries, after they realize, like, oh, the models are better in English, say, maybe we should build our own models. Wow. Models that are centered on our users and, like, you know, trained on all of their data.

[00:17:39] That makes sense. I’ve actually seen that. I’m not kidding. Like, I’ve talked to governments who are thinking about doing this.

[00:17:45] Luke: That’s, it’s wild. I mean, even us, when we’ve done, like, pretty naive stuff in the browser, trying to kind of train off local data for things like matching advertising,

[00:17:54] right. And the leap is huge when you go from one language to another, and there’s so much context that’s [00:18:00] different, right? Like, you’re like, okay, even translating the ads is one thing, but, like, also the context in which someone’s browsing; there’s a lot of nuance there that you’re going to get lost on.

[00:18:08] And it’s not really that easy, but it’s super fascinating. Yeah. So, we talked about kind of scaling, right? Like, infinitely scaling and getting more pervasive with these things. There’s a lot of talk around, like, oh, 10x engineering and things like this, where these are power tools. Like, how much of that, from what you’re seeing, is real versus just kind of a parlor trick or whatever? Are you seeing that actually happening when you’re working with enterprises?

[00:18:30] Greg: I do. Yeah, I’ve used language models actually in the past. Actually, it was really funny: they were kind of terrible, but because I’m, like, a practitioner, I would still try and force myself to use them. So one of the first and weirdest uses, it was around the time GPT-2 was out. So we had some, like, GPT-2-class models.

[00:18:48] I was working at AI Fund with Andrew Ng; I worked with Andrew Ng for a long time. And we were basically just writing Thanksgiving cards. Actually, this reminds me, it’s an appropriate time of year, getting close to Thanksgiving. We were writing Thanksgiving cards to [00:19:00] all of our employees. And then I was thinking, like, you know, I should try the language model, right?

[00:19:05] Like, it’s kind of bad, but it’s kind of okay. Some people liked it. I think the really hardcore machine learning engineers kind of liked my GPT-2-written Thanksgiving cards. A lot of them were kind of like, really, Greg? Like, really?

[00:19:23] Luke: It’s an experiment. You know, it’s good. It’s good. There’s an appreciation there.

[00:19:28] Greg: Now, I think it’s like, I use it all the time. I code a lot. Okay, I code, like, a ton. I don’t really use Stack Overflow anymore. I feel kind of bad for Stack Overflow, but, like, yeah. When I’m in my code, I’m like, okay, here I need, I don’t remember what this API does, right? Because I’d need to read the docs. I’d need to go and find the website of, like, the docs for this. Instead of, like, all right, can I open up Stack Overflow?

[00:19:49] I’m going to try and craft a search query, and we’ll flip through a bunch of questions, like, try and find the upvoted answer. I basically use Copilot, and I just put in a comment, and I’m like, what does this API do, or fill out this [00:20:00] API, you know, with these parameters, and it just does it. Yeah, that saves me a ton of time.

[00:20:04] I love it.

[00:20:06] Luke: That’s awesome. I know there’s a lot of, like, AGI kind of doomerism and whatnot, but, like, I’m always wondering, like, what’s the stuff that you’re actually kind of worried or concerned about, or think is kind of flying unnoticed?

[00:20:20] Greg: I’m pretty scared, honestly, okay? But I’m not scared about, I’m not scared about evil killer robots, okay?

[00:20:26] Well, sorry, it’s an aside, but Terminator is one of my favorite movies. I’m just not rooting for the Terminators, okay? I just think it’s cool. Okay? Arnold, right? Okay. Anyways, I’m actually pretty afraid right now that most companies don’t realize the value of their data. And one thing I’m kind of worried about is basically people who are insiders.

[00:20:47] If you have a ton of information, like, a ton of knowledge about how this works, and a ton of money, basically cornering the entire market on intelligence. Yeah. And I think that’s absolutely terrifying, [00:21:00] that a small group of people in the world could basically build the best model in the world that has scraped all of the data from everyone, and they have the keys to it, and they can do whatever they want with it. I’m a humanist. Like, I kind of just don’t like that situation, regardless of who you are, regardless of how responsible you are.

[00:21:19] To me, it sounds like a dictatorship, of, like, any one person or any small number of people who have that much power and influence. So I’m deathly afraid of that right now, because I think that a bunch of insiders, like, people like me who have seen how this stuff works, know that that’s a move.

[00:21:34] And they’re going for it. They’re going for it hard. I’m really worried that, basically, the U.S. government doesn’t see that happening. So I think, you know, they could play into it. They could help enable it. So, you know, if anyone’s listening, if you’re a policymaker, you should be super skeptical of anybody like me coming to you and saying, like, I need to have the keys to this thing.

[00:21:57] Would you give me the keys to, like, the [00:22:00] nuclear bombs? Right. I think it just doesn’t make sense as a society to end up in that state. Another thing: if you’re a CEO, you know, or, like, C-suite in a company that has data and has proprietary technology, I would also be deathly afraid, because imagine if all of your data gets sucked up into one of these models.

[00:22:18] Right. It’ll basically wipe out your entire competitive advantage.

[00:22:21] Luke: It’s like training OPSEC to your employees, right? Where you’re like, okay, like, don’t click on phishing emails, don’t input our sensitive stuff here, right? Like, have real control around what you’re inputting into this, or your IT team’s, like, making sure. I don’t know.

[00:22:34] There’s, like, the safety-layer stuff. When you see things like these executive orders, or people talking about regulatory capture, like, how much does that miss the mark from your point of view? Because I think you framed the concern really well. When you do see these government agencies kind of starting to take positions, is it that they’re just missing the boat?

[00:22:52] Or do you think it’s in the realm of...?

[00:22:54] Greg: I think there’s a real issue there, which is that it is a powerful technology, and so you don’t [00:23:00] want it to fall into the wrong hands. You don’t want it to fall into the hands of, essentially, people who are misaligned with the government. So imagine it’s like a defense problem.

[00:23:09] Like, it could realistically be a defense problem. You know, you also see, from a criminal activity perspective, like, all right, you could see this being used by spammers. You could see this used by black-hat people trying to exploit other systems. You could see it being used, you know, to essentially exploit other people.

[00:23:24] I do think there is a caveat there that people aren’t really talking about, which is in order to run any of this stuff, it’s very computationally intensive. So I’m actually, like, not that worried about somebody with no resources just trying to do something nefarious, because the people who might think of doing something like that, like, it’s just unlikely, at least from my perspective, that they’d have a lot of computational resources.

[00:23:52] Again, like, I think the bigger, more realistic concern is either, like, another nation, you know, somebody who has a lot of resources that [00:24:00] is capable of doing that and is just misaligned, or, you know, again, someone who’s basically a capitalist, basically greedy, who sees an opportunity to create a monopoly, to create a very one-sided, very dictatorial monopoly that cuts out regular people, but also, you know, I think importantly, many other businesses and the opportunity for competition.

[00:24:22] Luke: Yeah, I mean, this is kind of like a recurring theme that we’re hearing, like, whether I’m talking to academics or business owners or whatever. It’s just like, the thing a lot of people are concerned about is just accessibility too. It’s like, if you’re cutting people off, if your country is slowing this down or whatever, you’re blocking off access.

[00:24:39] People are really going to fall behind, like, pretty quickly. And yeah, no, it’s super fascinating stuff. And I mean, just to take it to a lighter place: I think, you know, people listening might be technical people, developers, engineers that haven’t done a ton with AI but, like, are interested.

[00:24:56] From your point of view, for somebody that’s already kind of technically literate [00:25:00] or just trying to kind of jump into AI, like, what are some of those resources, if you could go back in time, or things that are here now that weren’t there, you know, back when you got into this stuff, that you’d recommend people take a look at? Or any suggestions around tooling or resources or whatever?

[00:25:15] Greg: Yeah, we have a class on DeepLearning.AI. I think, actually, the DeepLearning.AI short courses are pretty good. So they cover a number of different topics: what is an LLM, how do you do prompt engineering, how do you do RAG. We have one on how you do fine-tuning. Those are pretty easy, pretty accessible.

[00:25:30] I also just want to note, for software engineers: this is the best time to enter, and actually we need you. One of the things I’ve really found is that what blocks deployment of any of these LLM features is not the model. I mean, it can be the model, but in addition to the model, it’s really the integration. A real product like GitHub Copilot is not just a model, right?

[00:25:54] It needs to be integrated into an IDE. You have to be able to serve it in production, [00:26:00] you have to be able to scale it, you have to have reliability, you have to have a data system behind it. All that stuff doesn’t just happen; it needs software engineers to build all of those things.

[00:26:08] And so when I look at most projects, most companies thinking, okay, I have this great idea, I really want to deploy this in my product right now: the model’s there, but we’re missing all the software. We need you, we need you to build it.
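The “reliability” piece of the integration work Greg describes can be as mundane as wrapping every model call in retries with backoff. A minimal sketch, where the hypothetical `flaky_complete` function stands in for any LLM API call (it is our invention, not a Lamini or OpenAI API):

```python
import time

def with_retries(call, attempts: int = 3, base_delay: float = 0.01):
    """Retry a flaky function with exponential backoff; re-raise on final failure."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))  # back off before retrying

# A stand-in for an LLM API call that times out twice before succeeding.
calls = {"n": 0}
def flaky_complete() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("model endpoint timed out")
    return "completion text"

print(with_retries(flaky_complete))  # prints "completion text" after two retried failures
```

Timeouts, rate-limit handling, and request logging follow the same shape, which is why shipping an LLM feature ends up being mostly ordinary software engineering.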

[00:26:23] Luke: That’s awesome.

[00:26:24] It’s a good kind of problem, I think, but also a tricky one. That’s where these new areas are super interesting to me: you have the startup fundamentals, survival mode, breaking through, beating the odds, but in these leading-edge areas you’re also pioneering and breaking new ground. The stuff’s ready; it’s just a matter of getting the talent and putting the Lego blocks together in the right way. And there are already products that are easy to use, so it seems inevitable that things are just going to continue to accelerate, from my point of view.[00:27:00]

[00:27:00] But yeah, it’s fascinating. I know you mentioned Terminator. Any other favorite movies, shows, or books that got you focused on this stuff early on, or that you like to look back on?

[00:27:11] Greg: Yeah, well, I mean, the way I got into this actually is strangely, when I was a little kid, I was like 10 years old.

[00:27:17] My parents, for various reasons, just decided it would be a great idea to move out in the middle of the desert in Arizona, like not in a big city, but like really out in the middle of nowhere. And I was just like bored out of my mind. And the only thing I had were a bunch of machines, like old mainframes that my mom brought home from IBM.

[00:27:36] She was a database administrator at IBM. She just threw them in our garage, and I would try to compile old games. I’d play Zork, you know, text adventure games. It was the only thing that kept me sane, I think, for 10 years.

[00:27:51] Luke: I bet. There’s not a lot to do, right? Especially as a kid. It’s awesome.

[00:27:56] Greg: Yeah, that’s kind of what got me into it.

[00:27:59] Luke: [00:28:00] Yeah, I love hearing that stuff too, because everybody’s got a different path, right? Those little things, whether it’s hacking around on the phone or playing with old mainframes in the desert. It’s pretty cool. Is there anything we didn’t cover that you want folks to know about, or anything interesting you guys have going on? I know you mentioned the learning center; we’ll link to that, because I think it’d be super interesting for people to take a look at. But is there anything else you want to touch on while we’re here?

[00:28:25] Greg: Oh yeah, I did just want to note one other thing. One problem we see everyone running into whenever they’re trying to do anything with LLMs is: where am I going to get my compute? These models are so computationally intensive. I do just want to plug AMD. Right now everybody runs almost everything on top of NVIDIA, because it just works, but people are probably not paying attention: AMD actually works right now. Our entire stack runs out of the box on AMD. AMD has the MI300, and the biggest difference is that it’s actually great at fine-tuning. In addition to that, it’s [00:29:00] 10x cheaper, and it’s available today.

[00:29:03] So anyone who is blocked on scaling should seriously take a look at that. We’re deploying everything on AMD because of that.

[00:29:10] Luke: That’s great. People will take that alpha for sure, and I’m sure they’ll run with it. Thank you for your time and for sharing all your insight here today. It was super interesting.

[00:29:19] Greg: Yeah, it was great chatting, Luke.

[00:29:21] Luke: Where can people find you online, if they want to follow you?

[00:29:24] Greg: Oh, sure. I’m on Twitter and LinkedIn. Feel free to follow us.

[00:29:26] We’re moving pretty fast, doing roughly weekly releases, so for any new features or anything interesting, just check there for updates. You’ll be able to see what we’re doing.

[00:29:35] Luke: Awesome, man. Thanks, Greg. Have a good one. Really appreciate you coming on.

[00:29:39] Greg: All right. Thanks, Luke. Take care.

[00:29:41] Luke: Thanks for listening to the Brave Technologist podcast. To never miss an episode, make sure you hit follow in your podcast app. If you haven’t already made the switch to the Brave browser, you can download it for free today at brave.com and start using Brave Search, which enables you to search the web privately.

[00:29:56] Brave also shields you from the ads, trackers, and other creepy [00:30:00] stuff following you across the web.

Show Notes

In this episode of The Brave Technologist Podcast, we discuss:

  • Why you should build your own LLM
  • How large language models have impacted ChatGPT’s evolution
  • Why this is the best time for software engineers to enter the space

Guest List

The amazing cast and crew:

  • Greg Diamos - Co-Founder of Lamini

    Greg Diamos is a co-founder of Lamini, the enterprise LLM platform for building and owning LLMs. He is also a co-founder of MLPerf™, the industry standard benchmark for deep learning performance. Greg was a founding engineer at Baidu’s Silicon Valley AI Lab (SVAIL), where he co-invented the first deep learning speech and language model (which was deployed in production to billions of users). At Baidu, he discovered the Scaling Laws motivating LLMs. His team members from SVAIL went on to build the most useful LLMs today, including OpenAI’s ChatGPT, Llama 2 at Meta, Claude 2 at Anthropic, PaLM at Google, and NVIDIA’s Megatron. Before Baidu, Greg was a CUDA Architect at NVIDIA. Greg holds a PhD from Georgia Tech focusing on high performance computing.

About the Show

Shedding light on the opportunities and challenges of emerging tech. To make it digestible, less scary, and more approachable for all!
Join us as we embark on a mission to demystify artificial intelligence, challenge the status quo, and empower everyday people to embrace the digital revolution. Whether you’re a tech enthusiast, a curious mind, or an industry professional, this podcast invites you to join the conversation and explore the future of AI together.