
Episode 29

How Mistral AI Strikes the Balance Between Openness and Profitability

Sophia Yang, Head of Developer Relations at Mistral AI, discusses why they don’t open source everything, and how they strike a balance between openness and growing their business. She also discusses how Mistral AI approaches developer relations and community building to advance AI.

Transcript

[00:00:00] Luke: From privacy concerns to limitless potential, AI is rapidly impacting our evolving society. In this new season of the Brave Technologist podcast, we’re demystifying artificial intelligence, challenging the status quo, and empowering everyday people to embrace the digital revolution. I’m your host, Luke Mulks, VP of Business Operations at Brave Software, makers of the privacy-respecting Brave browser and search engine, now powering AI with the Brave Search API.

[00:00:29] You’re listening to a new episode of the Brave Technologist, and this one features Sophia Yang, who is the Head of Developer Relations at Mistral AI. There, she leads developer education, developer ecosystem partnerships, and community engagement. In this episode, we discussed the goal and vision of Mistral AI, why they don’t open source everything, how they strike a balance between openness and growing their business, and how Brave and Mistral are currently integrated. And now for this week’s episode of the Brave Technologist. [00:01:00] Sophia, welcome to the Brave Technologist podcast.

[00:01:02] How are you doing today?

[00:01:03] Sophia: Pretty good. Thanks so much for having me.

[00:01:05] Luke: Yeah, thanks for coming. There are a few things we’ve launched, whether partnerships or feature releases, where you just see this wave of really great feedback from the community. And when we announced that we had Mistral in our Leo integration, that was one of those where, all of a sudden, researchers and other people that are really into AI, people we don’t normally hear from, came out of the woodwork and said good things about it.

[00:01:28] But one thing that I’ve kind of gotten some feedback on, and that I think would be great for people to understand, is: what is the goal, kind of the vision, of Mistral AI? What’s different about what you all are doing versus what other companies in the space are doing?

[00:01:43] Sophia: Yeah, great question. So our mission is to make AI ubiquitous. Our goal is to build the best AI models; as you all know, we’re trying to make the models open, efficient, portable, helpful, and trustworthy. So more specifically, we try to make AI [00:02:00] more accessible and beneficial for various industries and individuals.

[00:02:05] We want to be, and we are, and we’ll continue to be, at the front and center of many AI applications. We also commit to building highly specialized models, such as the code model, Codestral, which we recently released; it’s specifically for coding, trained on more than 80 programming languages. And we also try to empower builders and developers like you guys with the ability to customize our models according to your specific data, use case, and business needs.

[00:02:38] Luke: That’s great. And on the specialization part, you said making it more accessible to other areas; are there certain areas that you all are really focusing on? I mean, what does that mean as far as approaching these models from a specialization point of view? Is it more around resourcing, or around doing extra research, or how does that work?

[00:02:57] Maybe folks might find that interesting.

[00:02:59] Sophia: There are two [00:03:00] parts. One is model customization. Like I mentioned, we build specialized models; the code model is one of the models we just released. Second is builder customization. We try to empower our builders. So as an example, last week we released our fine-tuning API so everyone can fine-tune Mistral

[00:03:18] models with our API. And on fine-tuning, I think two weeks before that, we actually released our fine-tuning codebase. So if you’re using our model locally and you want to fine-tune it, a lot of people are asking what the best way to fine-tune Mistral models is, then you can use our codebase right away.

[00:03:37] You might need some GPUs, some local power, if you want to use the codebase, but we offer you different options, right? You can use our API, or you can just fine-tune on your own with the code.
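As a rough illustration of what "fine-tuning with your own data" looks like in practice, here is a sketch of preparing a chat-style training file. The JSONL-of-messages layout is a common convention for instruction fine-tuning; the exact schema expected by Mistral's fine-tuning API and the mistral-finetune codebase may differ, so treat the field names below as assumptions and check the official docs.

```python
import json
import tempfile

# Each line is one training conversation: a list of chat messages ending with
# the assistant reply the model should learn to produce.
examples = [
    {"messages": [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
    ]},
    {"messages": [
        {"role": "user", "content": "Translate 'hello' to French."},
        {"role": "assistant", "content": "'Hello' in French is 'bonjour'."},
    ]},
]

# Write one JSON object per line (JSONL), the usual shape for training files.
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
    path = f.name

# Sanity-check: every line parses back and ends with an assistant turn.
with open(path) as f:
    rows = [json.loads(line) for line in f]
print(len(rows), rows[0]["messages"][-1]["role"])  # 2 assistant
```

A file in this shape would then be uploaded to the fine-tuning API or passed to the local fine-tuning code, per whichever of the two options Sophia describes.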

[00:03:50] Luke: You mentioned the new fine-tuning API and how you’re working with developers to support them.

[00:03:56] How do you see, like, developer relations helping AI evolve? Is it [00:04:00] around providing more well-rounded support for areas like GPUs and the other challenges facing startups, beyond just the software perspective? Can you give us a little bit of how you see developer relations around AI evolving?

[00:04:15] Sophia: Yeah, I mean, to me, developer relations plays a huge role in the evolution of AI, and we try to build a strong relationship with our community. Developer relations is actually all about our community, and luckily we have a really nice community here. We try to help people collaborate with each other and innovate. Several things we do here: for example, we invest heavily in developer education. We work to create a very supportive and inclusive community. We have organized several successful and pretty big hackathons, with participants across industries and interests.

[00:04:53] Just a few weeks ago, we hosted a hackathon right here in Paris. It attracted people from as far as [00:05:00] London, Japan, Singapore. Yeah, we had over 700 signups, and unfortunately we couldn’t host them all. We selected like two to three hundred people to attend the hackathon, and we were really impressed with our community.

[00:05:14] And we try to support as much as we can, like during our hackathon, we provide API support, GPU support, and we have other sponsors providing different tools as well.

[00:05:23] Luke: Are you guys learning from a lot of these hackathons and these events? Like, is it helping to inform kind of how Mistral approaches building the software and, is that relationship with the developers, like pretty key at this stage, you think?

[00:05:35] This just sounds like amazing demand. I mean, 700-plus, you know, is huge for a hackathon. I mean, that’s fantastic.

[00:05:41] Sophia: Yeah, we’re definitely impressed by what people are building, like what people are looking for. We see a lot of vision-related models built on top of Mistral, and they build really cool applications, like helping vision-impaired people navigate the world.

[00:05:56] So we see, like, the trend of what people [00:06:00] are really looking for, and also, like, the agentic, different types of applications. How Mistral models are used in real life is crucial for us to know. They always provide the best feedback.

[00:06:12] Luke: Awesome. And is there any area where you’re seeing, like, the biggest amount of demand?

[00:06:17] I mean, you mentioned helping people who might be vision-impaired navigate, but are there other areas? Certain industries that are really standing out, as far as, oh, out of 700, we saw the vast majority coming from this one place? Or is it pretty spread across a lot of different areas?

[00:06:33] Sophia: I would say people have various use cases, because our hackathon is more open-book: we don’t have specific industry requirements, like you have to be finance or you have to be healthcare. Even though we might have a more specific hackathon later this year. Yeah, because it’s an open-book hackathon,

[00:06:51] we see applications just from all over the place.

[00:06:54] Luke: Oh, that’s awesome. Just changing gears a little bit: two of Mistral’s models use this MoE [00:07:00] architecture. Can you briefly explain what MoE is, and kind of how this is different?

[00:07:05] Sophia: Yeah, that’s a great question. It’s one of the powers we have at Mistral, and now everyone is using it.

[00:07:12] So Mixture of Experts (MoE) is the underlying architecture of Mixtral 8x7B and 8x22B. So at a very high level, you can imagine you have multiple experts, for example eight experts, to help you process your text. So instead of seeing all of them, or maybe just one expert, we actually choose the top experts for you for each token.

[00:07:37] So in a more technical sense, mixture of experts incorporates different layers into the transformer block. You may already know about the transformer block: each transformer block consists of a feed-forward layer and a multi-head attention layer. So traditionally, every input actually goes through the same feed-forward layer.

[00:07:58] So everything goes through the [00:08:00] same thing. What if you want to add capacity to the model? We can actually duplicate the feed-forward layer n times. So instead of just one, you have many different feed-forward layers. But then how can we decide which input token goes to which feed-forward layer?

[00:08:17] So the solution is to use a router to map each input token to the top-k feed-forward layers. So in our model, k equals two, which means that each token sees two experts out of eight. So as a result of MoE, even though our Mixtral 8x22B model has a total of 141 billion parameters, only 39 billion parameters are active,

[00:08:44] which means our model is highly efficient, with faster inference and improved model quality at a lower computing cost. So this is great. And a lot of people have found that Mixtral models actually perform really well compared to models at the same [00:09:00] active parameter size, or even a lot higher parameter sizes.
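Sophia’s description (a router scoring each token and sending it to the top k of n feed-forward experts) can be sketched in a toy, dependency-free form. This is an illustrative reimplementation of the general top-k MoE idea, not Mistral’s actual code; all weights here are random and the dimensions are tiny.

```python
import math
import random

random.seed(0)

n_experts, top_k, d_model = 8, 2, 8  # 8 experts; each token is routed to 2

def linear(x, w):
    """Apply a weight matrix (list of rows) to vector x."""
    return [sum(xi * wi for xi, wi in zip(x, row)) for row in w]

def softmax(xs):
    m = max(xs)
    es = [math.exp(v - m) for v in xs]
    s = sum(es)
    return [e / s for e in es]

# Router: produces one score per expert for a given token.
router = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(n_experts)]
# Each expert is a toy stand-in for a feed-forward block (d_model x d_model).
experts = [[[random.gauss(0, 1) for _ in range(d_model)] for _ in range(d_model)]
           for _ in range(n_experts)]

def moe_layer(token):
    scores = linear(token, router)                        # router logits
    chosen = sorted(range(n_experts), key=lambda e: scores[e])[-top_k:]
    gates = softmax([scores[e] for e in chosen])          # mixing weights
    out = [0.0] * d_model
    for g, e in zip(gates, chosen):
        h = linear(token, experts[e])                     # expert feed-forward
        out = [o + g * hi for o, hi in zip(out, h)]
    return out, chosen

token = [random.gauss(0, 1) for _ in range(d_model)]
out, chosen = moe_layer(token)
# Only top_k of n_experts did any work for this token, which mirrors why a
# 141B-total-parameter model can have only ~39B parameters active per token.
print(len(out), len(chosen))
```

The key design point is that expert parameters scale model capacity, but per-token compute scales only with the k experts the router actually selects.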

[00:09:04] Luke: That’s fantastic. So when you all talk about, you know, the specialization, I would imagine this is a key part of it, having those different experts, assuming I’m understanding this correctly: having that variety tackling the input and then providing the output that best matches the input, right?

[00:09:20] That sounds great. Mistral is open source. Why is open source important to LLMs, and important to you and to you all at Mistral? And do you think that people have a good understanding of how open source and AI are working together?

[00:09:34] Sophia: Yeah, that’s a great question. I mean, open-source AI has been at the core of Mistral since day one.

[00:09:41] We open sourced our first model, Mistral 7B; it’s open-weight. So we have always been advocates for openness in AI. And we believe that openness is not just a catalyst for innovation, but also very essential for transparency, accountability, and trustworthiness [00:10:00] in this AI world. Currently we offer four open-weight models.

[00:10:03] We have Mistral 7B, Mixtral 8x7B, Mixtral 8x22B, and Codestral, the coding model. And we have also open-sourced various codebases, such as mistral-common for tokenizers, mistral-inference for running our open-weight models, and mistral-finetune for fine-tuning. So basically we provide both the open-weight models and all the code you would need to run and customize our models.

[00:10:27] So, like, I guess one of the common questions we get from the community is: why don’t we open source all of our models? Specifically, Mistral Large is a really, really powerful model. It’s our flagship model, with amazing advanced reasoning, language, math, and coding abilities. And when we released Mistral

[00:10:47] Large, a lot of people from the community were like, why don’t you open source it? Yeah. But it’s really hard. Just in my opinion, I guess we need to strike a balance between our commitment to openness [00:11:00] and our responsibility to grow a business. And also, Mistral

[00:11:05] Large is a much larger model. It would be very challenging for our community and developers to use it effectively if we open-weighted the model, just because it’s a lot bigger. It would need a lot more computing power for individuals; it might be hard. But for enterprises who are interested in getting the raw weights for our commercial models, we do offer options to get our raw weights.

[00:11:30] And we have seen great success with enterprises using our commercial models in-house.

[00:11:35] Luke: That’s super helpful. I think there is a lot of that balance-striking to be made, especially with all of the hype around AI, the compute costs and resourcing and all of these things. 700-plus developers coming to a hackathon is no small thing.

[00:11:47] But if you start to think about what that would cost if the largest things were open, and, you know, there are unknowns, I mean, it really might cripple the ability to do a lot of other things, I would imagine, if you had to make those kinds of pivots. And [00:12:00] it definitely seems like we’re at a point now where making sure that the right things are available and accessible matters, and for some specialized cases, too.

[00:12:08] It’s not necessarily needed to have these huge models, right. I mean, that’s one of the things that’s jumping out to me that sounds really interesting, and it’s kind of an area I’d like to dig in a little bit more with you on here, but I’m hearing all sorts of different things. I’m hearing specialization.

[00:12:21] I’m hearing local models. I’m hearing, you know, different types of use cases and all these great things. And I think, at least for me, as somebody that’s working at an open-source company, where a lot of the tooling is open source too, you don’t get a lot of that ubiquity, right?

[00:12:36] Unless you have a really robust open-source community that’s working with you closely on developing these things and kind of finding fit for them in the market. So that’s fantastic. But, you know, Brave has Mistral integrated into our Leo AI tool in the browser. Would you mind sharing a little bit more detail about how Brave and Mistral are integrated, just for folks that might [00:13:00] be unaware and are hearing about this for the first time? One of the benefits of Brave is we have users with all kinds of different skill sets; maybe they’re a developer in one area, but they haven’t really been following the AI stuff in others.

[00:13:10] So like, I’d love for them to get a little more detail on that specific area, too.

[00:13:14] Sophia: Yeah, absolutely. So we are definitely committed to open source, and we are super glad that Brave is also committed to open source, so this collaboration is kind of natural for us. And I know the browser assistant, Brave Leo, is powered by

[00:13:31] Mixtral 8x7B. And I know the Code LLM feature in Brave Search is also powered by Mixtral 8x7B, to generate coding answers during searches. So I’m just really glad to see Brave is using Mistral models, specifically Mixtral 8x7B, in different areas, including the assistant and also in search.

[00:13:57] So, yeah, I’m glad we’re [00:14:00] both committed to open models and the open-source community, and I’m really looking forward to seeing more integrations with our models in the Brave platform.

[00:14:08] Luke: Yeah, yeah, we are too. I mean, I think one of those great things is about our kind of position in the market. And one thing that I’m excited to hear, you know, that we have Mistral in both of these areas is one around the search answer engine, right?

[00:14:19] I mean, you look at how the markets are approaching these different areas: they’re going to have power users that may be using something like a chatbot for a lot of their workflow. But one of the other great things about this, and why we’re excited to have you all in this answer engine, is that everybody uses search, right?

[00:14:37] Like everybody: your neighbor, your friends, your family. If we really want ubiquity with this stuff, having it meet people where they are is really important. So I hope that touch point, where we have your models working with our search engine, gives us the ability to kind of have this real-time element for these things.

[00:14:55] I really hope it’s kind of helping all sides learn, too, about how to, you [00:15:00] know, improve and kind of keep that edge. Because it’s one of those things where, I mean, I’ve been at Brave for a long time, and seeing this answer engine come out, it’s like, well, okay.

[00:15:08] There’s something really cool here; people are really getting useful things from this, especially with even basic things like recipes, right? Where normally you might go to a webpage that’s bombarded with all these things and you can’t find the ingredients or whatever.

[00:15:22] And here’s this beautiful, clean, you know, output for that. And the fact that Mistral is helping to power that, I think, is fantastic. And I think both sides are going to learn a lot from that, just from what we’ve been seeing in early feedback. Zooming out a little bit: when you look at the current state of AI, with large and small players, is specialization kind of this key area that you think a lot of the attention is going to go into over the next couple of years? Or where do you see movement, and what things are most appealing to you, maybe personally, but also, you know, what’s Mistral looking at as far as areas to continue to put a lot of focus into?

[00:15:54] Sophia: I think exactly like you said, I think there will be a lot more customizations [00:16:00] and consumer applications powered by AI. But the interactions with AI would probably be different from what we see today. Like right now it’s chatbot, it’s like everything feels, we’re doing very similar things.

[00:16:14] I don’t know if that will change. I think it will probably change. On the other side, I think everyone should have the power to be an AI builder. I feel like we are trying to push that movement: everyone is going to be a builder, either directly or indirectly, with the increasing availability of tools and resources and all the models available for you in different toolings.

[00:16:39] So building AI models will become more and more accessible to a wider range of people. So not just developers, but everyone, in everyday use cases, can think about how to build models. Yeah. And on the model side, I think continued improvements in the performance, efficiency, and customization of models are bound to happen.

[00:17:00] It’s just my opinion. That’s what I was thinking.

[00:17:02] Luke: No, that’s great. We started seeing this early on too: you have these really huge tech players that have a bit of an advantage from a lot of different angles, especially when you look at things like how an operating system, or a browser, or these different areas can become kind of rich inputs, and also areas to get

[00:17:23] additional info from. And I think what I really love seeing about Mistral, and this approach that you’ve outlined here today, is the focus on builders and the focus on, you know, the codebases. This is what we’ve seen a lot of excitement around too, because one thing we’ve noticed at Brave is that if you can go to the people that are building, you start to get a real network effect among the people who go to those people to find out what’s going on.

[00:17:47] Not only that, you really can see what happens when people take these to the limits of creativity and use them in different ways. And I think there’s a lot of buzzword bingo around 10x developers, or 10x coders, or [00:18:00] whatever, you know, and I think they’re not gonna get there if they don’t have the tooling.

[00:18:02] You know, having open-source tooling that is specialized enough for them to actually start to work with, and that’s not just from one of these mega, huge incumbent players, I think is really important. And it’s really exciting to see that you all have been moving so quickly, working with such a broad array of developers across the world. Looking at the future,

[00:18:22] are you optimistic about where things are going? You know, let’s say over the next five years. Are you seeing areas where it’s hype, or are you seeing more of an “okay, this is definitely here to stay and we’re going to see this everywhere” kind of approach? Because you have an interesting look into this, right?

[00:18:36] Like, as one that’s working with the teams. Just kind of curious about your thoughts on it.

[00:18:39] Sophia: Yeah, that’s another great question. I think we’re still at the very beginning of the AI era, even though it feels like such a hype right now; I don’t think it’s going to go down. I feel extremely optimistic about the future of AI, and

[00:18:53] I believe we’ll see significant progress in the next five, ten years, but probably even less than that, probably a lot sooner. [00:19:00] I think a lot more applications will be powered by AI. Like I said, the 10x developers: everyone will be a 10x developer with the help of AI. And so everyone will be building; there will be a lot more consumer applications.

[00:19:12] And again, how you interact with AI will be seamlessly integrated into your everyday life without you even thinking about it.

[00:19:20] Luke: That’s great. With our audience including a lot of people that may be building software, and with you being in developer relations: if someone’s looking to build and wants to look into Mistral, are there resources that you’d recommend they go to? Both for Mistral’s documentation, maybe some areas where they could do quick starts, and also, even more broadly, any other resources

[00:19:40] you might recommend? I mean, it’s going to be ubiquitous, right? You’re going to have to get devs from all different kinds of walks of life. What might help them out if they’re starting out?

[00:19:51] Sophia: Yeah, so we have le Chat, which is our chat interface where you can ask any questions.

[00:19:56] It will provide answers. You can use Brave’s search engine [00:20:00] powered by Mistral, or Brave’s assistant tool powered by Mistral. And if you are a builder, if you’re serious about building and want to use our API, we have great documentation, written by our team, actually. So if you go to docs.mistral.ai,

[00:20:14] you can find all the documentation. You can go to the getting started guide and then go through the capabilities, the different things you can do. We also have different guides, and a community cookbook repo on GitHub, at github.com/mistralai/cookbook. We have a lot of integrations with third-party tools; we like our AI ecosystem.

[00:20:36] So if you want to build with different tools, feel free to check out our cookbook’s different integrations and see whichever you like as an example to start with.
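For builders going straight from those docs to a first request, here is a minimal sketch of assembling a chat-completions payload. The endpoint path, field names, and the model id "open-mixtral-8x7b" are assumptions based on the OpenAI-style schema Mistral's platform documents; verify them against docs.mistral.ai before relying on this. No request is actually sent here.

```python
import json

# Assumed endpoint; check docs.mistral.ai for the current path.
API_URL = "https://api.mistral.ai/v1/chat/completions"

payload = {
    "model": "open-mixtral-8x7b",  # assumed model id; pick one from the docs
    "messages": [
        {"role": "user", "content": "Write a haiku about open-source AI."},
    ],
    "temperature": 0.7,
}

headers = {
    "Authorization": "Bearer YOUR_API_KEY",  # placeholder, not a real key
    "Content-Type": "application/json",
}

# Serialize and round-trip to confirm the payload is valid JSON; actually
# sending it would look like `requests.post(API_URL, headers=headers, data=body)`.
body = json.dumps(payload)
decoded = json.loads(body)
print(decoded["model"], len(decoded["messages"]))  # open-mixtral-8x7b 1
```

The cookbook repo Sophia mentions shows fuller end-to-end examples with various third-party tools built on the same request shape.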

[00:20:47] Luke: Fantastic. No, that’s helpful. We’ll be sure to link as many of those as we can in the show description, so people can have an easy time jumping off there and finding them.

[00:20:54] How about you personally? Are there any folks that you follow in the space that you’d [00:21:00] recommend people tune into? I mean, I know the information moves so fast; it’s almost like, you know, okay, books or whatever. But are there voices, or shows, or anything you could recommend for people who might be wanting to

[00:21:10] learn more, both on the broader view and the technical side?

[00:21:14] Sophia: Oh, there are so many names, but, yeah, my Twitter feed is all AI-related things: AI news, AI papers. Sometimes it’s more efficient to read Twitter than the actual papers, when the authors just write down the abstract and the story of the paper.

[00:21:32] So it’s like pretty nice.

[00:21:35] Luke: To that end, where can people follow you on Twitter, I guess?

[00:21:39] Sophia: Yeah, if you search my name on Twitter, hopefully it will show up. Yeah, I’ll also provide you a link so you can link it in the description. Thank you.

[00:21:47] Luke: Yeah, yeah. Fantastic. You’ve been really gracious with your time today, and I really appreciate it, especially, you know, you being in Paris.

[00:21:53] Is there anything we didn’t cover today that you really, would love our audience to know about that we might not have [00:22:00] chatted about today?

[00:22:01] Sophia: Thanks for the opportunity. I do want to talk about our fine tuning hackathon that’s happening right now.

[00:22:06] Luke: Oh, great.

[00:22:06] Sophia: Yeah, we released our fine tuning API last week, and we decided to launch a virtual fine tuning hackathon at the same time.

[00:22:13] So the hackathon runs through the end of June, so there’s still time for people to submit, and we’re really looking forward to seeing people’s projects and different creations. Yeah, we wanted to do a virtual hackathon because we heard so much feedback from various people all over the world that they want to participate in Mistral

[00:22:33] hackathons, but previously we only did them in San Francisco and Paris, and a lot of people were feeling like they were missing out. So here’s your chance to participate. Yeah, please feel free to join us.

[00:22:44] Luke: Fantastic. We hope people will join that. That sounds really, really excellent. Well, I really appreciate you coming on today.

You’ve been really great. I feel like folks will leave here with a much better understanding of Mistral, and also how Mistral and Brave are working together. We’d love to have you back, too, as you all have new things that [00:23:00] you’re releasing, or other hackathons you’d love to get the word out about.

[00:23:02] But yeah, thank you so much for your time today and, we’ll talk soon.

[00:23:06] Sophia: Thanks so much, Luke.

[00:23:06] Luke: All right. Take care. Thanks for listening to the Brave Technologist podcast. To never miss an episode, make sure you hit follow in your podcast app. If you haven’t already made the switch to the Brave browser, you can download it for free today at brave.com

[00:23:19] and start using Brave Search, which enables you to search the web privately. Brave also shields you from the ads, trackers, and other creepy stuff following you across the web.

Show Notes

In this episode of The Brave Technologist Podcast, we discuss:

  • The goal and vision of Mistral AI
  • Why Mistral doesn’t open source all their models
  • How Brave and Mistral are currently integrated, enhancing Brave’s Leo AI tools and search engine
  • Ways Mistral is investing in community through large-scale hackathons
  • Why specialized models are the future of innovation in AI

Guest List

The amazing cast and crew:

  • Sophia Yang - Head of Developer Relations at Mistral AI

    Sophia Yang is the Head of Developer Relations at Mistral AI, where she leads developer education, developer ecosystem partnerships, and community engagement. She is passionate about the AI and open-source communities, and she is committed to empowering their growth and learning. She holds an M.S. in Computer Science, an M.S. in Statistics, and a Ph.D. in Educational Psychology from The University of Texas at Austin.

About the Show

Shedding light on the opportunities and challenges of emerging tech. To make it digestible, less scary, and more approachable for all!
Join us as we embark on a mission to demystify artificial intelligence, challenge the status quo, and empower everyday people to embrace the digital revolution. Whether you’re a tech enthusiast, a curious mind, or an industry professional, this podcast invites you to join the conversation and explore the future of AI together.