Thursday, November 7, 2024
Mistral AI’s CEO on Microsoft and Europe’s AI Ecosystem

Over the past year, Paris-based Mistral AI—one of the TIME100 Most Influential Companies of 2024—has rapidly risen as a homegrown European AI champion, earning the praise of French President Emmanuel Macron. The startup has released six AI language models that can answer questions, produce code, and carry out basic reasoning.

In June the company said it had raised $645 million in a funding round that reportedly values the company at more than $6 billion. That followed an announcement in February that Mistral had struck a deal with Microsoft to make its models available to the U.S. tech giant’s customers in exchange for access to Microsoft’s computational resources.

Mistral’s co-founder and CEO Arthur Mensch has been vocal in debates over the E.U.’s landmark AI law, arguing that rather than regulating general-purpose AI models like Mistral’s, lawmakers should focus on regulating how others use those models. He also opposes limitations on AI developers freely sharing their creations. “I don’t see any risk associated with open sourcing models,” he says. “I only see benefits.”

TIME spoke with Mensch in May about attracting scarce AI talent, how Mistral plans to turn a profit, and what’s missing from Europe’s AI ecosystem.

This interview has been condensed and edited for clarity.

Florian Bressand, chief business officer at Mistral, told CNBC a few months ago that more than half of the team that developed Llama now work for Mistral. How have you managed to draw so many talented researchers away from well-resourced companies like Meta?

At first, we hired our friends. We could do it because we made some meaningful contribution to the field, and so people knew that it was interesting to work with us. Later on, starting in December, we started to hire people that we knew less. That owed to the strategy we follow, to push the field in a more open direction. That’s the mission that is talking to a lot of scientists, that for similar reasons as we do, liked the way it was before when communication and information was circulating in the free way.

There are so few people around the world who have trained the sorts of AI systems that Mistral does. I know France has a thriving AI scene, but do you think you managed to hire some significant proportion of the people who know how to do this—perhaps even all of them?

Not all of them. There’s a couple of them, friends of ours, at Google, at OpenAI, a few of them remain at Meta. But for sure, we attracted, let’s say, 15 people that knew how to train these models. It’s always hard to estimate the size of the pool, but I think it’s probably been maybe 10% of the people that knew how to work on these things at the time.

Mistral has been fundraising. What are you spending the money on?

We’re spending the money on mostly compute. It’s an industry which is structured differently than software, in the sense that the investment that you need to do at the beginning to get the scientific team running, to get models that are on the frontier of technology, is quite significant. Today, we are still running on our seed compute, but finally getting access to the compute that we raised money for in December.

Executives at almost all the other foundation model companies have talked about how they expect to spend $100 billion on compute in the coming years. Do you have similar expectations?

What we’ve shown is that we burn a little north of 25 million [euros] in 12 months, and that has brought us to where we are today, with distribution which is very wide across the world, with models that are on the frontier of performance and efficiency. Our thesis is that we can be much more capital efficient, and that the kind of technology we’re building is effectively capital intensive, but with good ideas [it] can be made with less spending than our competitors. We have shown it to be true 2023-2024, and we expect it to remain true 2024-2025. With obviously the fact that we will be spending more. But we will still be spending a fraction of what our competitors are spending.

Are you profitable at the moment?

Of course not. The investment that we have to make is quite significant. The investment that we did and the revenue that we have, is actually not completely decorrelated, unlike others. So it’s not the case [that Mistral is profitable], but it’s also not expected from a 12-month-old startup to be profitable.

What’s the plan for turning a profit? What is your business model?

Our business model is to build frontier models and to bring them to developers. We’re building a developer platform that enables [developers] to customize AI models, to make AI applications, that are differentiated, that are in their hands—in the sense that they can deploy the technology where they want to deploy it, so potentially not using public cloud services, that enable them to customize the models much more than what they can do today with general-purpose models behind closed opaque APIs [application programming interfaces]. And finally, we are also focusing a lot on model efficiency, so enabling for certain reasoning capacity, making the models as fast as possible, as cheap as possible. 

This is the product that we are building: the developer platform that we host ourselves, and then serve through APIs and managed services, but that we also deploy with our customers that want to have full control over the technology, so that we give them access to the software, and then we disappear from the loop. So that gives them sovereign control over the data they use in their applications, for instance.

Is it fair to say that your plan is to make AI models that almost match those of your competitors, at a lower cost to you and your customers, and that are more openly available? Or do you hope to match your competitors’ most advanced models, or “frontier models,” in terms of absolute capabilities?

We also intend to compete on the frontier. There’s some phenomenon of saturation of performance that has enabled us to catch up, and we intend to continue catching up, and eventually to be as competitive as the others. But effectively the business model we have is a business model that the others do not have. We are much more open about sharing and customizing and deploying our technology in places where we stop having control.

Recently, your most capable models did move behind an API, whereas you started with all of your models being open. Why did you change your approach?

This is not something that we changed on. We always had the intention to have leading models in the open source world, but also have some premium features that could only be accessed through monetized services.

We have a very significant part of our offer which is open source, and that enables developers to adopt the technology and to build whatever they need to build with it. And then eventually, when you want to move the workloads that they’ve built into production, or if you want to make them better, more efficient, better managed with lower maintenance costs, then these developers come and use our platforms, use potentially our optimized models to have an increase in performance and speed in reasoning capacities.

It’s always going to stay that way. The open source path is very important to us. We are building the developer platform on top of it, which is obviously going to be monetized because we do need to have a valid business model. But we expect to bring extra value to developers that are using our open-source models.

You have often argued that Europe can’t be dependent on American AI companies and needs a homegrown AI champion. Mistral is one of the most prominent European AI companies but has a partnership with a U.S.-based “hyperscaler,” Microsoft, to access the computational power it needs. Does Mistral’s dependency on Microsoft for compute limit its ability to play the role of sovereign AI champion?

We have four cloud providers. We are cloud independent by design, and that has been our strategy from day one. Our models are available through Microsoft Azure, but also through [Amazon Web Services], and also through [Google Cloud Platform]. We use the three of them as cloud providers. We also use different cloud providers—CoreWeave, in particular—for training. We have built our stack of technology and our distribution channels to build the independence that we believe our customers need.

Should Europe try to build its own sovereign compute infrastructure in addition to having AI labs based in Europe?

I think it would be beneficial for the ecosystem. But Europe isn’t some actor that takes independent decisions and builds something out of thin air. There’s an ecosystem aspect to it, of how do you make sure that effectively some of the infrastructure for compute is available in Europe. 

This is super important for our customers, because some of them are European customers, and they do want to have some form of sovereignty over the cloud infrastructure they use. In that respect, there is already some availability, and our inference, our platform is actually deployed in Europe. But there could be some improvement. It’s not Europe that decides it. It’s an ecosystem that needs to realize that there are some needs that could be addressed. We would love to have some European cloud partners in the near future.

Cedric O, former Minister of State for Digital Affairs of France and one of your co-founders, has warned that the E.U. AI Act could “kill” Mistral. The Act passed, but the codes of conduct for general purpose AI models are yet to be written. What should those look like?

Generally, the AI Act is something that is very workable with, in the sense that the constraints that we have are constraints that we already meet. We already document the way we use our models, the way we evaluate our models, and this has become a requirement for frontier models. So this is fine.

There’s some discussions to be had on the aspect of transparency for training data sets, which is something that we’d really love to enable, but which is something that needs to be measured against trade secrets. A lot of our [intellectual property] is also in the way we treat the data and select the data. This is also the IP of others. As a small company, we are defensive about our IP, because this is the only thing we have. And so in that perspective, we are confident that we can find a way for things to be acceptable for every side. 

We’ve been asked to participate in the technical specifications and to give our input. We also expect that Europe should, in all independence, make its choice to enable the development of the ecosystem and also to enable everybody to be happy.

There are a lot of sound bites out there from executives at your competitors about how AI is going to change the world over the next five or 10 years and the things that they’re worried about, the various transformational things that they think might happen as a result of the development of more powerful AI systems. Do you have predictions for how AI might change the world?

We make a powerful technology, but I think there’s a tendency of assuming that this powerful technology is solving every problem. At Mistral, we’re very focused on making sure that our technology brings productivity gains, brings reasoning capacity to certain verticals, to certain fields that will have societal benefits.

Everything that humanity has been building are tools, and we’re bringing a new tool that brings new abstract capabilities. So in a sense, you can see it as a more abstract programming language. It’s been 50 years since we programmed in computer-understandable language. Today, we are able to create systems by talking to them in English or in French, or in any language. That brings a new abstraction to workers, to developers, and that obviously changes the way we’re going to work in the next 10 years. 

I think if we do things properly and we ensure that this tool is in the hands of everyone—and this is really why we created Mistral—we can ensure that this brings improvement to everybody’s lives—across the world, across the spectrum of socio-economic status. The way to do it is, for us, to first enable strongly differentiated applications in useful sectors like health care, like education. It’s also very important to ensure that the population is trained and has access to the technology, and to enable access to this technology—providing the technology in a more open way than the others, is a way to accelerate it. It’s not sufficient in the sense that the political decision makers also have to create enablement programs to accelerate access to the internet in parts of the world where it’s not accessible. But I think what we’re building has a positive effect in enabling the population to access that new tool, which is generative AI.

Is there any scenario you can imagine in the future, where you’ve developed an AI model, or you’re in the process of developing one, and you notice certain capabilities. Would there ever be a scenario where you decided it’s better not to open source the model and to keep it behind an API, or not even deploy it behind an API?

Not in the foreseeable future. The models we build have predictable capabilities. The only way that we found collectively to govern software and the way it is used in a collective way is through open source. This has been true for cybersecurity. This has been true for operating systems. And so the safest technologies today are the open source technologies. 

In a sense, AI is not changing anything about software. It’s only a more abstract way of defining software. So I don’t see any risk associated with open sourcing models. I only see benefits. This is a neutral tool that can be used to do anything. We haven’t banned the C [computer programming] language, because you could build malware with the C language. There is nothing different about the models that we release. And so it remains important to control the quality of applications that are put on the market. But the technology that is used to make these applications is not the one thing that can be regulated.
