GitHub CEO Thomas Dohmke says the AI industry needs competition to thrive

Dohmke says navigating Microsoft-OpenAI isn’t as complicated as it seems, and open source is still king.

A stylized portrait of Thomas Dohmke. Photo illustration by The Verge / Photo by GitHub

Today, I’m talking with Thomas Dohmke, the CEO of GitHub. GitHub is the platform for managing code — everyone from solo open-source hobbyists to the biggest companies in the world relies on GitHub to maintain their code and manage changes. But it’s been owned by Microsoft since 2018, which makes this a perfect Decoder episode since I have a lot of questions about that structure.

Thomas and I talked a lot about how independent GitHub really is inside of Microsoft, especially now that Microsoft is all in on AI, and GitHub Copilot, which helps people write code, is one of the biggest AI product success stories that exists right now. How much of GitHub’s AI roadmap is tied to Microsoft’s AI roadmap? How do resources get moved around? And since GitHub is used by all sorts of companies for all sorts of things, how does Thomas keep them all feeling secure that Microsoft isn’t just trying to pull them toward services it prefers, like Azure or OpenAI?

Thomas had some surprising answers for all of this. Like any good Microsoft executive in the Satya Nadella era, he told me that the company’s strength is in working well with partners. But he also insisted that tech isn’t a zero-sum game and that one company winning doesn’t mean another has to lose. You’ll hear him tell me that he enjoys competition and that if there were only one option — just OpenAI or Meta’s Llama, for example — to him, that would be like a sport “with just one team in the league.”

Of course, I also asked Thomas about AI and whether our current AI systems can live up to all this hype. He’s got a front-row seat, after all: not only can he see what people are using Copilot for but he can also see what people are building across GitHub. I think his perspective here is pretty refreshing. It’s clear there’s still a long way to go.

Okay, GitHub CEO Thomas Dohmke. Here we go.

This transcript has been lightly edited for length and clarity.

Thomas Dohmke, you are the CEO of GitHub. Welcome to Decoder.

Thank you so much for having me. I’m a big fan of the show.

I appreciate that. There is a lot to talk about. There are a lot of Decoder questions to answer about how GitHub works in Microsoft and Microsoft works in the industry. GitHub is everywhere in the industry.

Let’s start at the very beginning. Some people in the Decoder audience are intimately familiar with GitHub. They probably live in it every day. For another part of the audience, it’s a bit of an enigma. Just explain quickly what GitHub is and what it’s for.

GitHub is where most of the world’s developers are building the software of today and tomorrow. It started as a place to store your source code in a version control system called Git, which is where the name comes from. Git was invented by the Linux kernel team in 2005, about two years before GitHub was founded in 2007.

Today, it has not only become the place where people store their open-source code, but it’s also used by 90 percent of the Fortune 100. Really, every big and small company in the world is storing their private source code and collaborating together. That’s what I think GitHub is all about.

Do people actually code in GitHub? Is it just version control in a repository? There’s some blurriness there, especially with some of the news you have today.

It used to be just repositories. That’s how it started, and it’s actually fun to go to the Wayback Machine and look at the first GitHub homepage and how Chris [Wanstrath], Tom [Preston-Werner], and P.J. [Hyett], the founders, basically described all the changes. The front page was like a change log, effectively.

In the meantime, we also have issues where you can describe your work — bug reports or feature requests. Planning and tracking is what we call that area. We have something called GitHub Actions, which lets you automate a lot of the workflows, and we have GitHub Codespaces, which is a whole dev environment in the cloud.

So you don’t even need a laptop anymore. You can just open that in the browser on your smartphone or iPad and have VS Code, a popular IDE, right in the browser, and you can start coding right there without ever having to install all the dependencies, libraries, and toolchains. It’s just an environment that you can leverage and then submit code back to GitHub.

How many people are coding in a browser on their iPhones on GitHub?

More than you would think. Obviously, it’s not the main way of writing software, but you can imagine a scenario where somebody pings you and says, “Hey, can you quickly review my pull request?” which is a way developers collaborate. For example, I make a change to the code base and send you a pull request, and then you review it and say, “Yeah, this looks good. I approve it” and then deploy it to the system.

That definitely happens. People use the browser and the GitHub mobile app on the bus on their commute to work or back from work to quickly review what I’ve done — and correct a small typo or maybe do a bug fix or an update or something like that — and then click approve and it goes from there.

In fact, at GitHub, we use GitHub to build GitHub. For example, when one of my employees wants access to Salesforce, they have to send a pull request against an entitlements file, and then, depending on where they sit in the organization, I might be the approver. I often do that on my phone. So it’s code not in the sense of, “I’m writing a lot of code,” but it’s definitely code in the spirit of, “I have a file with a diff and I compare the two sides against each other and say, ‘Okay, this looks good. Let me approve this.’”

Wait, you manage enterprise approvals in code in GitHub, as opposed to some horrible enterprise software?

We do.

Honestly, I feel like that might be better compared to the horrible enterprise software that most people have to use, but that is astonishing.

We have a blog post on this. It’s called Entitlements, and it’s basically a repo that has a file with all of our usernames on GitHub. Almost everybody identifies with their GitHub handle, so I’m @ashtom, and often, we speak about each other with our handles and not with our real names, and then those files have the user handles in them.

Once you do that, you have all the benefits of software processes. You can run test cases and see if the file is properly formatted. You can see where that person sits in the org chart and who needs to be the approver. You can check automatically how to give that access and then ultimately give it.
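To make that concrete, here is a minimal sketch of the kind of check that could run on such a file in CI. The file layout, field names, and path are hypothetical, not GitHub’s actual Entitlements schema; the point is that an access file becomes something you can test like any other code:

```python
# Hypothetical entitlements check, as might run in CI on every pull request.
# The YAML layout and field names are illustrative, not GitHub's real schema.
import yaml  # pip install pyyaml

REQUIRED_FIELDS = {"system", "owners", "members"}

def validate(path: str) -> list[str]:
    """Return a list of problems found in an entitlements file."""
    errors = []
    with open(path) as f:
        data = yaml.safe_load(f) or {}
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")
    for handle in data.get("members", []):
        if not handle.startswith("@"):
            errors.append(f"not a GitHub handle: {handle!r}")
    return errors

if __name__ == "__main__":
    problems = validate("entitlements/salesforce.yaml")  # hypothetical path
    if problems:
        raise SystemExit("\n".join(problems))
    print("entitlements file OK; ready for human review")
```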

In many ways, it’s much easier to do that. Especially if you’re a developer already, you know how to modify a file and send a pull request. But, yeah, we have our sales team do that. We have our legal and HR teams do that.

In fact, our legal team, for the longest time, has managed our terms of service and privacy policy in a public GitHub repository, so everybody can see what changes we’re making. It’s utterly transparent, sometimes in a good way and sometimes in less good ways. People debate why we’re making these changes to cookies and other things. For legal texts, it’s good to have a diff in the same way that you want to have one for code.

I ask a lot of enterprise software CEOs to come on the show, and they often say no because they know I’ll ask them if they use their own software. It seems like you’ve passed that test with flying colors.

GitHub is expanding over time. It started as version control, this software as a service to do version control with Git. Now, you can actually code in GitHub. You can apparently run an entire large enterprise inside of GitHub.

Where do you want that scope to end? Do you want it to become something like VS Code that developers log in to and do all of their work in? Are there boundaries or stuff you don’t want to do?

All of the work, I think, never actually works in software development. Think about the plethora of tools that developers use: an operating system; a container solution like Docker or Kubernetes; a programming language and all the tools that come with it, like the compiler, the debugger, and the profiler; and the frameworks. And of course, a lot of the open source is coming from GitHub, but it’s not provided by GitHub. It’s stored as open source on GitHub, and you find the readme and you consume that project. And then, as you go through what we call the developer life cycle, on the tail end there is monitoring, data collection, telemetry, exception tracking, and policies, making sure that all data is stored within a data governance framework, all the way to security scanning. There’s never a world where one vendor will offer all of that.

So we see ourselves as one planet, a big planet in a large universe of software development tools, and it has always been important for GitHub to have APIs and webhooks and integration points for these partners to actually build that end-to-end workflow that developers want and give them the choice.

Whether you’re in the Python ecosystem and you want to use PyPI and VS Code, or whether you’re in the Java ecosystem and you want to use JetBrains and Maven and other tools like that, GitHub is there for you to help you collaborate as a team.

We see ourselves as the center of collaboration. You could say it’s the creator network or the social network of coding. For some time, our tagline on the homepage was social coding, and it’s a very special creator network because most creator networks finance themselves through advertising and things like that, and you create communities around the creator with comments and features that help you engage with the community.

In GitHub, it’s still code, and I don’t think anyone would want us to put banner ads on GitHub, even if that would provide a revenue cut to the owner of the open-source project. So we are also constantly evolving our thinking on that.

This is going to bring us inevitably to AI. GitHub has a lot of AI in it now. GitHub Copilot is a massively successful product. You have some news — you’ve announced something called GitHub Models, which I want to talk about — but I just want to stay on that vision of GitHub as a social platform or creator network.

Most other creator networks don’t launch tools that let you make the thing that the creators are making as a first-class citizen to the creators themselves. Instagram is not making an AI photo tool that lets you publish photos that build explicitly on the photos that Instagram influencers have published and then presenting them in those AI photos in a first-class way. That would be a weird thing for Instagram to do.

But that is more or less exactly what GitHub is allowing people to do. Copilot lets you generate code using AI and then you can present that code right back to the GitHub audience. Do you see that as being an odd dynamic, a new dynamic? Is that going the way you want it to?

It’s a good question. If I think back to the origins of GitHub, while we allowed you to store source code, in some ways, that always spurred creation. Once you have a file, especially in a public open-source repository, that allowed somebody else to fork and modify it.

There was some type of creation there, in the way that you’re taking something that exists and you’re allowed to copy it into your namespace and then modify it. Nobody forces you to say, “When you fork my repo, send me back your changes.” You can just keep them for yourself. And we had an editing view, obviously, within the UI, a very basic HTML text box for the longest time.

When we started working on Copilot four years ago, back then, this was GPT-3. ChatGPT wasn’t even on the horizon. Generative AI was an insider topic in the tech industry, but it certainly wasn’t a top news topic that was reported on every single day. In fact, in all my customer conversations, we spent five minutes on AI and then 55 minutes on DevOps, the developer life cycle, Agile development, those kinds of things.

But I think the original motivation was the same as GitHub’s, which is, how can we make developers more productive? How can we make them more collaborative, and ultimately, how can we increase their happiness? While we were very internally motivated by just making our own developers faster, we are always running out of time to implement all the ideas we have.

If I look at my backlog, we have a huge repository of issues that somebody has filed over the last 15 years. There are some from 2015 and 2016. They are great ideas that we just didn’t get to yet, and I’m running out of time faster than GitHub is running out of ideas of all the things we could do to make the platform better.

So the idea here was, how can we make developers more productive? How can we make our own developers more productive so they can implement things a little bit faster so we get to the future that we envisioned sooner?

When you think about that life cycle of the developer, so much of what we have traditionally thought of as software engineering involves talking to other people, asking questions, searching for answers. I have a lot of engineer friends who say they spend half of their time just looking for the code they need to implement and then the other half trying to implement it.

That’s gone away in some capacity with AI. Platforms like Stack Overflow, which were a huge social community for developers, are seeing usage rates drop. You see that in other places as well. Do you see that as the natural outcome of AI, or do you see a way to bring that social innovation back to the forefront?

I think the first thing that comes to mind is that there’s truly a democratizing effect of having your Copilot within your editor, and you can just get started.

It’s easy to see that when you look over the shoulders of kids trying to build a game, which many kids nowadays do at age six or seven as they grow up with mobile phones. You observe, in any restaurant around the world, that scenario of a family with a three-year-old holding an iPhone or an Android phone and watching a video. Soon enough, they’re into Minecraft and other games, and soon enough thereafter, they want to create because that’s what we do as humans. And then, how do we get them started?

Stack Overflow is great, and I don’t think Stack Overflow will go away, but you have to know that it even exists. Who tells you that as a six-year-old when you live in a household where the parents are not computer scientists themselves?

I think Copilot will become ubiquitous enough, and now I use Copilot as the category term, whether it’s ChatGPT or other products. You can just say, “Hey, I want to make a game” — a pong game or snake game or something easy to start with — and it gives you an answer. And it already links you back to where some of that answer came from.

And so the social network gets a new feeder where you can learn more about the answer if it doesn’t solve your problem already. But I think we are going to see more of that in those chat interfaces.

Actually, just a couple of minutes ago, I was on a call where somebody had an example. If your mom goes to Photoshop today and wants to replace a gray sky with a blue sky, that’s probably hard because figuring out how the user interface of Photoshop works, if you’re not a pro, is incredibly complicated.

If you can just say, “Hey, replace a gray sky with a blue sky,” whether it’s a prompt that you’re typing or actually literally speaking to a computer like Scotty in Star Trek, it’s going to open up a whole new world of creation.

And then, typically, you create something to share with others. That’s how humans interact. I think it’s actually changing how the creator economy works, but it’ll open this up to so many more people. And if I bring that back to coding, this morning, I woke up with an idea, and then I realized, “Well, I have this podcast today and I have the customer meetings and I have all the other things in my role as CEO, so I don’t have time to start a new project.”

What if I could go to Copilot and say, “Hey, I want to build this app to track the weather. Here’s an API I can use,” and I iterate on this in an hour and a half to build something as quickly as building a Lego set. I think that’s the true change that we’re going to see.

If you pull that thread out all the way, maybe you don’t need to know how to code at all. You’re just instructing the computer to do some task or produce some application that can do some tasks and you just evaluate the end result. Is that the endpoint for you, that people use GitHub who don’t know how to code at all?

That endpoint already exists. There are low-code / no-code tools like Retool or Microsoft Power Platform.

But they don’t have a natural language interface where you’re like, “Make me an app that changes the color of the sky.” We’re not quite there yet, but we could be very soon.

Well, the Power Platform does. I haven’t checked Retool recently, but I would be surprised if they’re not working on that at least as an assistant to get started. But I think the way this will work is that you have a spectrum of knowledge. And you can probably build a webpage without knowing anything about HTML and CSS, as you can in Squarespace and many other tools and could do for the last 20 years or so.

But code still exists as the underlying deterministic language. Human language is incredibly nondeterministic: I can say something, and you can say the same thing, and we mean two different things. Code is deterministic, and code, effectively, is just an abstraction layer on top of the processor and the operating system that runs your machine. The processor itself, the CPU or the GPU, runs machine language, an instruction set, and code is just the next layer up. Now, we’re moving higher, but that doesn’t mean those layers went away when we invented programming languages and replaced assembly, and before that punch cards, with code. Those still exist. I think it depends on what you’re working on, whether you’re going down the abstraction stack or whether you’re staying at the higher level.

The professional developers will know both layers, I think. The professional developer will have to know code. They will have to understand the laws of scaling and the intricacies of programming languages, security vulnerabilities, those kinds of things. And they’re going to leverage natural language to get the job done faster, to write boilerplate, to write test cases, those kinds of things.

So I think it’s going to be a mix of these things, and we are going to sit on that spectrum and move back and forth. And that makes the technology so powerful because if you are a learner and today maybe you are in an IT role and you’re only working with a no-code, low-code tool, you now have the same user interface and natural language to move up that stack and ultimately become a pro code developer.

That brings me to the news you announced recently, which is GitHub Models, which allows people to play with various AI models right inside of GitHub. Explain what that is exactly, because it feels like you’re describing something that leads you right to, “You’re going to play with AI models directly in GitHub.” 

What has changed over the last couple of years is that, now, models themselves have become a building block for software. It used to be code both in the front and the back end. Before that, we didn’t even have a back end. You would just build an app that runs on a PC or, before that, on a Commodore 64 or an Atari that didn’t have a back end because there wasn’t really internet at that time.

We moved from building all of this yourself to using open-source libraries as building blocks in your application. In the last few years, we have increasingly talked about the full-stack developer who is able to build back-end code and front-end code and all the things in the middle, deploy to the cloud, manage the operations of that cloud service, and be on call all the time.

Now, what has changed is we add models to that picture, and most modern applications that are being worked on right now have some form of AI integration, whether it’s a simple chatbot or it’s using a model to predict anomalies and whatnot.

For a while now, we have been thinking, “Okay, so GitHub offers the code and offers the open-source projects, but we’re missing the model as a building block.” We are adding these with GitHub Models in partnership with Azure AI, and we’re starting with a bunch of models, including those from OpenAI and Microsoft, of course, but also from Meta, Mistral, Cohere, and a couple of other partners.

It’s a nice mix of open weights or open models, and some of them are also open source, but that is a debate in itself: what do you call these models where the weights are open and the source code is not? And of course, there are commercial models like GPT-4o Mini, which was just recently released.

It allows you, on GitHub with your GitHub account, to play with these models. You can send prompts and get a response. You can ask about Shakespeare and about coding. And then you can change the parameters that are sent to the model during inference, like how long your context window is, or how high you want the temperature to be and how nondeterministic you want the answer to be. You can start experimenting with these different models. You can find one, bring it into your editor, into your codespace, and prototype an application, and you don’t have to sign up for another account. You don’t have to worry about paying inference costs while you’re doing that. You can keep it all within your GitHub workflow.
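As a rough illustration of that workflow, here is the kind of snippet this enables, based on the Azure AI Inference SDK and the inference endpoint GitHub documented at launch; treat the endpoint, model name, and parameter values as assumptions that may have changed since:

```python
# Hedged sketch: calling a GitHub Models endpoint with a personal access
# token. Endpoint URL, model name, and parameters reflect launch-era docs
# and may have changed; verify against GitHub's current documentation.
import os

from azure.ai.inference import ChatCompletionsClient  # pip install azure-ai-inference
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://models.inference.ai.azure.com",      # assumed endpoint
    credential=AzureKeyCredential(os.environ["GITHUB_TOKEN"]),  # your PAT
)

response = client.complete(
    model="gpt-4o-mini",   # any model from the GitHub Models catalog
    temperature=0.7,       # raise for a less deterministic answer
    max_tokens=400,        # cap on the response length
    messages=[
        SystemMessage(content="You are a helpful coding assistant."),
        UserMessage(content="Write a haiku about version control."),
    ],
)
print(response.choices[0].message.content)
```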

Is GitHub paying for the inference costs as part of the subscription you pay to GitHub?

We offer the playground for free with certain entitlements, so a certain number of tokens that you can send per day. Beyond that, you can sign up for an Azure subscription and pay for the overages. Of course, when you want to move to production, you definitely want to remove from the source code the GitHub token that is tied to your personal account. In a larger organization, you obviously don’t want that because the employee might leave the team or leave the company, so you want to move to a more productionized version where a key or token is stored within a key vault system, and inference runs against that key and not against a personal token.
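A sketch of that production-hardening step, assuming Azure Key Vault; the vault URL and secret name are hypothetical placeholders:

```python
# Hedged sketch: in production, fetch the inference key from a key vault
# instead of embedding a personal GitHub token in source. The vault URL
# and secret name are hypothetical placeholders.
from azure.identity import DefaultAzureCredential   # pip install azure-identity
from azure.keyvault.secrets import SecretClient     # pip install azure-keyvault-secrets

vault = SecretClient(
    vault_url="https://my-team-vault.vault.azure.net",  # hypothetical vault
    credential=DefaultAzureCredential(),                # managed identity in prod
)
inference_key = vault.get_secret("models-inference-key").value  # hypothetical name

# `inference_key` now replaces the personal token from the earlier playground
# snippet, so inference runs against a team-owned key, not an individual's.
```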

When you think about what models you can make available to people, there are some open-source models or open-ish models like the ones from Meta, which have open weights but maybe not open-source code. Then there are obviously Microsoft’s models. Then there are models from Microsoft’s partners like OpenAI. Is there a limit? Does Microsoft have a point of view on what models GitHub can offer and what models GitHub points people to? I imagine Microsoft would like everyone to use their models and run everything on Azure, but that’s not the reality of GitHub today.

I think Microsoft wants everybody to use the best model to build applications that ultimately are hopefully deployed on our cloud and stored on GitHub. As a platform company that is almost 50 years old, we want to offer a choice. Next spring, our 50th birthday is coming up. We have always offered that choice. Every time you report on a Surface launch, there are often also a number of partners that announce laptops under their brand with a similar feature set.

In the model space, we think about that similarly. We want to offer the best models, and we are starting with 20 or so top models with this launch, and then we’ll see what the reaction and feedback is and if people want to add their own models to the list, if they want to fine-tune these models, what the actual usage is. I think that’s a very interesting question. We, at GitHub, love to move fast, to bring things out there, and then work with the community to figure out what the next best thing that we can build is that actually solves that use case.

There’s a big debate right now in the AI world about open versus closed. I think it’s right next to a debate that we have to actually start building some applications to make money. There’s another debate about running it in the cloud versus running it locally. There’s a lot going on. Where do you see that shaking out? As you build GitHub, you probably have to make some longer-term decisions that predict how development will go. To architect GitHub correctly, you have to say, “Okay, in two years, a lot of applications will be built this way, maybe using open-source models, maybe everyone’s going to use OpenAI as API, or whatever it may be.” The debate is raging. How do you see the trends going right now? 

One interesting statistic I can share with you is that, in the last year, over 100,000 open-source AI projects have been started on GitHub. I can’t track this for closed source because, obviously, we would not look into private repositories. But 100,000 open-source AI repositories have been started in the last year alone, and that’s up by an order of magnitude from what we saw before ChatGPT. As such, I’d say the quantity will absolutely be in the open-source space, as it has been in software for the last two decades.

Open source has won. There’s no question anymore that the most successful software companies all use open source in their stack. They’re running mostly Linux on the server and in containers. They’re running the Python ecosystem or the JavaScript/TypeScript ecosystem or the Ruby ecosystem. All of these ecosystems have large ranges of open-source libraries, and whether you start a new project in a large company or you’re a startup, you’re pulling in all these things. A new React app has a thousand or so dependencies just from starting out.

I think if you just look at where open source has gone, I would predict the open-source models or the open-weights models will play a very important role in democratizing access to software development. It is so easy to get started and not worry about inference costs or license costs. The other pole of this is the commercial models that try to be the best models on the planet at any given point in time. They offer a different value, which is that you can get the best model, but you have to pay a vendor or a cloud provider to run inference on these models, and you don’t get access to the weights or get to see what happens in those models. I think those two polarities will continue to exist, and nothing really in tech is a zero-sum game.

In our heads, we like to think about everything like a sports competition, where our favorite team, phone, operating system, or cloud provider should win. But then a new season starts with mobile phones — often in the fall, when Apple launches a new iPhone — and then there are the tech conferences that determine the rhythm of model launches. The new season starts and the competition starts anew. I think that’s actually fun because you wouldn’t want to watch your favorite sport with just one team in the league or in the championship. You want different teams competing against each other, and you want to see how they can play the infinite game. In the season, they play the finite game — they want to win the season — but in the long run, they play the infinite game. They want to have a legacy. They want to play Minecraft as much as they play Super Mario.

It is interesting to think of OpenAI as Minecraft and Llama as Mario. I’m not sure where that metaphor goes, but I’ll leave it for the audience. It’s something. Or maybe it would be the other way around. I think Llama would be Minecraft because it’s more open world.

But inside of that, Meta’s claim is that Llama right now is as functional as the closed-source frontier models. It has matched the performance. It has matched the capabilities. If that’s true, then to be closed and paid rather than open and free, you have to be much better. You have to deliver some massive amount of additional value. Just based on what you’re seeing in the developer ecosystem, do you think that’s going to play out?

The Llama model isn’t free in the sense that you still have to deploy it to GPUs and run inference, and that’s where most of the cost comes from for OpenAI’s models today as well. If you look at GPT-4o Mini, the inference costs are now so small compared to just a few years ago on GPT-4, or even before that on 3.5 and 3, that you really have to look at inference cost as the differentiator, not license cost in the sense that you have to pay OpenAI an additional license on top of that. I think the model will be commoditized in the sense that the chips in our laptops are commoditized. It doesn’t mean that Nvidia isn’t a great business. It clearly is, especially in the last year, but it doesn’t matter as much to the consumer what chip is running in their laptop.

I mean, I buy a new iPhone every year, and there are certainly people in the tech industry that do want the latest chip and latest feature, but the majority of consumers and enterprise users do not actually care about that compute layer at the bottom in the same way that they don’t care whether you’re running a SaaS product on a certain CPU type, a certain VM type, or whether you’re using a Kubernetes cluster. That’s a tech question and maybe an operating margin question for the provider more so than a question for the user of the product. While the benchmarks are getting close between those two models, from our perspective, the GPT line still has an advantage. That’s why we’re using it in Copilot. I have the freedom to move to a different model. My management at Microsoft is definitely encouraging me to look into all the opportunities to provide the best product to my customers.

To keep going with my metaphor, in the same way that we have laptops with Intel chips and with AMD chips and now with Arm chips and the customer decides which laptop they want based on different things like battery life, I think there will be commoditization, but there’s also differentiation between the different models. It will come down to the typical questions: How good is it? How much does inference cost? How many GPUs do I need? How fast is it? How long is the token window? Do I actually have a mature, responsible AI pipeline around that model, and does it fit my scenario?

You mentioned that you have the freedom to choose models in addition to letting people build on these models. You obviously have deployed a significant AI application in GitHub Copilot. When you evaluate its performance, its cost versus its value versus the switching cost of another model, how often do you sit and think that through? Are you set with it now in GPT, or is this something you’re evaluating constantly?

We’re doing it constantly. In fact, we are doing it on GPT-4o Mini, which, at the time of this recording, had just launched, and we are looking at how it compares to GPT-3.5 Turbo, which is the model we’re using behind auto-completion. If you look at Copilot today, as it is deployed to over 77,000 organizations and more than 1.8 million paid users, it’s multiple models that run for multiple scenarios. We have 3.5 Turbo for auto-completion because we need low latency and a fast response time with a decent amount of accuracy. As you’re typing in your editor and you see the proposal, is it for what you’re typing now or for whatever you typed a minute ago? And if you look at how long it took the original GPT-4 to write a whole response, streaming was a genius user interface design because it obscured how long it actually took to get the full response.

With auto-completion, you can’t have that. It needs to show you the whole thing relatively quickly because, otherwise, you’re faster and you keep typing the code that you wanted to type. So we are using a fast, smallish model for auto-completion. In Chat, we have a mix of GPT-4 Turbo, and 4o has actually rolled out in the meantime. And then, for newer scenarios like Copilot Workspace, we have been on 4o for a while, and we have compared 4o to other models to see where we get the best returns in terms of code generated and changes made to the code base to solve the problem that Copilot Workspace tries to solve. So we are comparing, within the same model generation, newer releases that we’re getting from OpenAI, and we’re also comparing these models against other open-weights, open-source, and private models that are accessible to us through Azure.
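Paraphrasing that setup in code: a toy routing table, with the model assignments taken from the conversation and the structure itself invented for illustration, not GitHub’s implementation:

```python
# Toy illustration of per-scenario model routing, as described above.
# Model assignments come from the conversation; the routing structure
# is hypothetical.
MODEL_BY_SCENARIO = {
    "autocomplete": "gpt-3.5-turbo",  # low latency beats raw accuracy here
    "chat":         "gpt-4o",         # richer answers, larger latency budget
    "workspace":    "gpt-4o",         # idea-to-implementation flows
}

def pick_model(scenario: str) -> str:
    """Return the model serving a given Copilot scenario."""
    try:
        return MODEL_BY_SCENARIO[scenario]
    except KeyError:
        raise ValueError(f"unknown scenario: {scenario!r}") from None

print(pick_model("autocomplete"))  # -> gpt-3.5-turbo
```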

You have a lot of decisions to make. There are a lot of things swirling. Obviously, there’s Microsoft to manage as well. What’s your framework for making decisions?

I have two frameworks that we closely follow at GitHub. One is what we call the DRI, the directly responsible individual. The first question is, who’s the DRI? And if we don’t have one, we should. We have one person in the company that runs the project. If a decision needs to be made, ideally, the DRI can make the decision by consulting all the stakeholders, or they can bring the decision to the leadership team and me to discuss.

The other framework I like is “view, voice, vote, veto,” which basically is deciding who in the group actually has what rights in the discussion. Can they have a view? Can they have a voice? Do they have a vote, or do they have a veto? As different decisions need to be made, you have the difference of these roles.

Obviously, within the large framework of Microsoft, I often have a voice. While in the framework of GitHub, I often have a veto. Well, I hope at least I have one. I definitely have a vote. But honestly, I often don’t want to have a voice. I’d like to view things because I’m interested to just browse through GitHub issues and GitHub discussions where the company is discussing things. But when engineers are talking about the ups and downs of using React, as an example, I’d love to read all that stuff because it helps me understand what’s happening and also tune it out to a certain degree. But I don’t need to raise my voice or even have a vote on that. I have a strong engineering leadership team and a strong set of distinguished engineers and principal engineers that can make those decisions and will be accountable for them within the DRI framework.

What I’d like to tell my leadership team is to give me options and give me a set of choices I can make and tell me what the pros and cons are. But also, and this maybe is a bit of my German DNA, I often ask questions. What about the options that are not here? What are you not telling me? What are we missing? What am I not seeing in these options? I think it’s actually more important to think about what’s not presented and what we’re not discussing, even if it’s just picking between A and B.

Lastly, I’d say, let’s be real, many CEOs and many leaders leverage experience or intuition to make decisions. Many small decisions can be made without a document, without data. I love to be data-driven and look at data, especially when it comes to things like determining pricing or determining model updates, as we talked about earlier, and whether 5 percent is enough, but many decisions are just a question of intuition. Like the tagline for our conference, that’s certainly a discussion, but then we decide on that based on taste and intuition. 

You’re not A/B testing 40 shades of blue?

No. The reality is that you don’t get to do an A/B test on most decisions. Your life doesn’t have A/B tests. The price point that we set for Copilot, we are kind of stuck with that until we make a decision to change it. But you don’t really want to sell at $19 to some set of customers and a different price point to other customers, minus discounting obviously. That doesn’t really work. When we made the decision to launch Copilot and then put considerable resources within the company into Copilot, it also meant we removed funding from other projects that we could also have done. The reality is that resource constraint is true of even the largest companies. In fact, I think the biggest weakness of the largest companies is that they’re so big, they think they can do everything. The truth is, they’re still resource-constrained, and they still have to say “no” way more often than they can say “yes.”

That’s the thing that I remind myself almost every day: that saying “no” is much more important than saying “yes.” Especially in this age of AI, it means that while we invested in all these AI topics like Copilot and Copilot Workspace and Models, we also made the conscious decision to leave things behind. 

You mentioned that you’re thinking about models as commodities like chips, like AMD chips versus Arm chips. Have you architected your various systems so that if you wanted to make a big model switch to Mistral or something, you could? Would that be very costly? Would it be easy? 

The costly part is the evaluation test suite and the meta prompt or the system prompt. And you can imagine in Copilot, as it sits in the editor, there are a lot of these system prompts for different scenarios. There are different system prompts for summarizing a pull request versus one that auto-completes text or one that helps you with debugging an error, which Copilot does in the IDE. These suites of prompts are very specific today to different models. As we move into the next year or two, I think that’s going to become a competitive differentiator for companies to be able to plug and play different models while keeping the prompt suite relatively stable.

Today, we’re not in that place, and there is a lot of work that goes into adjusting these prompts and running the offline evaluation. I think almost any Copilot or Copilot-like system runs some form of A/B testing where, once they have a new model and have done their offline eval and their responsible AI red teaming and all of those kinds of things, they actually roll it out to 1 percent, 5 percent, 10 percent of the population. And they look at metrics, like I mentioned before. They look at acceptance rates. We see whether this new population is getting better results or worse results than with the old model. Only if we have that confidence level do we go to 100 percent. I think that will enable us, hopefully in the near-term future, to move to new model generations faster than we can today.
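The staged rollout he describes boils down to deterministic cohort bucketing plus a metric gate. A minimal sketch, with the bucketing scheme and thresholds invented for illustration:

```python
# Hedged sketch of a staged model rollout: deterministically bucket users
# into the new-model cohort, then gate expansion on an online metric such
# as suggestion acceptance rate. All numbers here are illustrative.
import hashlib

def in_new_model_cohort(user_id: str, rollout_percent: float) -> bool:
    """Stable assignment: the same user always lands in the same cohort."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 10_000
    return bucket < rollout_percent * 100  # e.g. 5.0 -> buckets 0..499

def safe_to_expand(old_acceptance: float, new_acceptance: float) -> bool:
    """Only widen the rollout if the new model is not measurably worse."""
    return new_acceptance >= old_acceptance - 0.005  # 0.5-point tolerance

# Example: a 5 percent rollout, then check the gate before going to 10.
print(in_new_model_cohort("octocat", 5.0))
print(safe_to_expand(old_acceptance=0.301, new_acceptance=0.307))  # True
```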

If one of your engineers came to you with an argument to switch to another model, what would the winning argument be? Would it be 5 percent more efficient, 10 percent less cost? Where would the metric be where you’d say, “Okay, it’s time to switch.”

Five percent sounds pretty good. Ten percent also sounds pretty good.

But it’s on that order, right? For a lot of things, it’s a lot of cost for a 5 percent gain. But you’re saying 5 percent would be a winning argument?

I think the nuance there is that we check in offline eval for C and C++ and C# and JavaScript and TypeScript and Python and Ruby and Go and Rust, and so far, I haven’t seen a model update, even within the GPT line, where all the languages are better across the board from the start. Some are better and some are worse. We look at different types of metrics. Obviously, a successful build is one of them: does the code actually build and pass the test suite? But also, how many lines of code did you get compared to the previous model or the competing model? If that number of lines goes down, the question becomes: is that better because it’s using a smarter way of writing the same code, or an open-source library, or did it get worse, as in it breaks the build and doesn’t actually create the right output anymore?
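A skeleton of that kind of offline eval, using the language list from his answer; the model call and build step are stubs, since the real harness is internal to GitHub:

```python
# Skeleton of a per-language offline eval, per the metrics described above:
# does generated code build and pass tests, and how long is it? The
# generate/build functions are stubs; GitHub's real harness is internal.
LANGUAGES = ["C", "C++", "C#", "JavaScript", "TypeScript",
             "Python", "Ruby", "Go", "Rust"]

def generate(model: str, task: str, language: str) -> str:
    """Stub: call the candidate model and return generated code."""
    raise NotImplementedError  # replace with a real inference call

def builds_and_passes(code: str, language: str) -> bool:
    """Stub: compile the code and run the task's test suite."""
    raise NotImplementedError  # replace with a real build/test step

def evaluate(model: str, tasks: list[str]) -> dict[str, dict]:
    results = {}
    for language in LANGUAGES:
        passed, lines = 0, 0
        for task in tasks:
            code = generate(model, task, language)
            passed += builds_and_passes(code, language)
            lines += len(code.splitlines())
        results[language] = {
            "pass_rate": passed / len(tasks),  # successful builds
            "avg_lines": lines / len(tasks),   # shorter isn't always better
        }
    return results  # compare per language: some regress while others improve
```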

If somebody comes to me, one of my engineers or data scientists, and says, “This model is better across the board and we are saving half the GPUs,” that seems like a pretty good deal. I would certainly go into a deeper evaluation process and try to figure out if it’s worth rolling out to the handful of regions where we have deployed the model, because we are running in different Azure regions with clusters of GPUs to have low latency. So a European Copilot user is connecting to a GPU cluster in France and Switzerland and the UK and Sweden, I think. If they’re in Asia, they have a GPU cluster in Japan, but then India is probably closer to the European cluster, so they’re going that way around the world. And then we have different ones in the US, and we’re expanding almost every month to a new region to get more scale.

Switching the model has switching costs across all of these clusters. And then we come back to the A/B testing question: how do you do that so you have enough confidence that the offline evaluation is matched in the online evaluation, where people work with real code and not with synthetic scenarios? The way I like to think about this: in web services, ever since the cloud became a thing, 99.9 percent or more uptime is the gold standard. Anything less than that, and you’re going to be on Hacker News or The Verge all the time, with stories saying that startup XYZ or big company XYZ is down again and is preventing everybody from getting to work. We have seen that both with GitHub and with other collaboration tools like Slack or Teams. If Slack is down on a Monday morning, everybody is like, “Well, I guess I’m off work today.”

In the model world, that still plays a role because your model has to have 99.99-whatever uptime. But if the model quality, the response quality, dips, you have to monitor that, and you almost have to run through the exact same process with your site reliability engineering team to say, “Okay, something is going wrong. What is it?” And then it turns out the stack did an operating system update on Patch Tuesday, or maybe a network router changed. Oftentimes, when we deploy GitHub in a new data center, the big question is, “Can the network bandwidth actually support our load, given the scale of GitHub as a social network?” All of these things now play a role, not only in model uptime but also in model output. And that’s where all of these questions come into play before we make the decision to say, “Okay, we are ready to move to the latest GPT model or the competing model.”

I just want to point out, you started with “5 percent sounds pretty good,” and you ended with “saving half the GPUs,” so it feels like the numbers are maybe a little bit higher than 5 percent.

GitHub is part of Microsoft. The acquisition was made several years ago. You’re a new CEO of GitHub within Microsoft. You were at Microsoft before. How is that structured now? How does GitHub work inside of Microsoft?

I’m coming up on 10 years at Microsoft in December, which I wouldn’t have believed when I started, given that I came through a small acquisition myself, a small startup called HockeyApp that got acquired in late 2014. I joined GitHub six years ago and then became the CEO three years ago. Today, GitHub is very much structured within Microsoft as it was when we acquired it in 2018. I was actually on the deal team, working with Nat Friedman and others to get the deal done, and I joined GitHub that way.

We are a limited integration company, as Microsoft calls it. We have adopted some of the Microsoft processes. Our employees get stock grants from Microsoft and vest that stock very much like Microsoft employees do. My manager is the president of the developer division, Julia Liuson, who also has all the Microsoft developer tools like Visual Studio Code, Visual Studio, and .NET, and some of the Azure services that are near to developer workflows, like Redis and API management and whatnot. She reports to Scott Guthrie, who runs the cloud and AI division. That way, we are very much aligned with what the cloud is doing and also what the Azure AI platform team is doing, which we partnered with on the GitHub Models launch that we talked about earlier.

As the CEO of GitHub, I have a leadership team across the whole range of functions: an engineering leader, a product leader, a COO, a chief people officer, a chief finance officer, a chief of staff. We are working together as a company, not as a functional Microsoft organization. As such, I’m operating much closer to a CEO than a typical Microsoft engineering leader. And I think that’s a lot of fun. That gives me a lot of energy, and it gives me a lot of motivation so we can fully focus on GitHub and making GitHub bigger.

Our goal, our winning aspiration, is to get to 1 billion developers on this planet. Hopefully they also all have a GitHub account, but more so the goal is to enable about 10 percent of the population, by the time we achieve that goal, to start coding, just as they learn to draw an image or start playing the guitar. Literacy at 100 percent is, hopefully, our aspiration as humans. I think coding should go in the same direction. Everybody should be able to code and explore their creativity. 

Coming back to your Microsoft question, we obviously benefit a lot from the mothership, including the partnership with OpenAI and the power of the cloud and having GPUs available in different regions, and the responsible AI stack and whatnot. At the same time, we get to focus on what makes GitHub unique in the industry.

You’ve said Copilot accounts for more than 40 percent of GitHub’s revenue growth this year. Is Copilot revenue positive? Is it still a cost for you? Is it just helping you acquire customers?

The earnings call script shared that, in the last year, 40 percent of the revenue growth came from Copilot, and the run rate is now $2 billion. Run rate obviously is forward-looking, so those are slightly different metrics. We’re really happy about the Copilot growth and where this is going. And [Microsoft CEO] Satya [Nadella] keeps sharing the number of organizations that have adopted Copilot. I think what has been remarkable is that it’s not only the cloud-native companies, the startups, the Silicon Valley core that have adopted Copilot. It’s really the largest companies in the world.

But just running Copilot for you, is that a cost center, or is that actually profitable? Because that’s really the conversation across all of AI right now. Are we actually using this to make products to make money?

We’re very excited about where Copilot is today and where this is helping the GitHub business to go.

I did my best.

You’ve been running Copilot. You have a lot of feedback from your users. What are the biggest weaknesses in Copilot that you want to address?

I think the biggest weakness for a product like Copilot showed early on in this generative AI journey. We announced the first version of Copilot, the preview, in June 2021. That was a year and a half before ChatGPT came. And we did [general access] in June 2022, still almost half a year before ChatGPT. And then ChatGPT came and changed everything. Until that point, we thought that chat was not a scenario that worked well enough for coding. Clearly, we were wrong on that. And clearly then, quickly, we moved to add Chat to the Copilot portfolio and make it great for developer scenarios within the IDE, within the editor, because that allows people to have all the context that’s available.

The power of Copilot has always been that it knows what’s in your file. So, when it suggests code, it actually has the variable names and it knows what open-source frameworks you’re using. It actually looks at adjacent tabs. So, when you ask questions to explain code, it not only looks at the lines of code you highlighted but it also looks at the context. If you copy and paste stuff into a generic chat agent, you have to collect that context yourself or give it to the tool in the prompt. It shows one of the weaknesses, which is that the world is moving fast, and you have to be really agile.

We don’t know what the next big thing in AI is, in the same way that you would’ve had a hard time predicting in 1994 that Amazon would become the big tech company, the member of The Magnificent Seven, that it is today. It took them a decade or so to actually turn their first profit. So it’s hard to predict what’s coming next. Especially in this AI race, I think our biggest weakness is that we already have a large product in market with a large installed base, where then moving fast is a challenge in itself.

We have the benefit of that installed base helping us to grow market share and a tight feedback loop, but at the same time, every time we want to experiment, we have to balance between that experimentation and breaking things and keeping the current customer set happy, both actually on the technical side but also how we invest in the engineers, the product managers, the designers that we have.

Microsoft has a lot of CEOs under Satya Nadella, who is the CEO of Microsoft. When they hire someone like Mustafa Suleyman and make him the CEO of AI, do you have to take a meeting? What was that like? “Hey, I already have one of the biggest AI applications in the world in GitHub Copilot. Can you help?” Describe that first meeting, that conversation.

The first time I met him was at the TED conference in Vancouver because he had a talk and I had a talk and we ran into each other backstage. That was, I think, about a month after it was announced that he was joining Microsoft. Obviously, the first couple of weeks in a large company like Microsoft are always stressful, and many people want to meet you. So I left him alone. We ran into each other and shook hands and exchanged a couple of intro sentences. Then, in the meantime, we’ve met both in the senior leadership meeting under Satya, at the SLT meeting every Friday, talking mostly about AI topics. I’ve also met with him and his team to talk about similar questions that you asked about earlier: How do we get more agile on models? How do we move faster on being flexible on the next model generation? What can we learn from the Microsoft Copilot?

Now, as you know, the GitHub Copilot was the first one that we ever built, and as such, there has been a continuous learning loop across all of Microsoft. Since the very early days of GitHub Copilot, there has been a monthly Copilot meeting with 100-plus people across Azure, across the Bing team, across Kevin Scott’s CTO organization, that have been in the loop of what we were doing in terms of building the Copilot, deploying the Copilot, commercializing the product, but also what they are doing and how we can leverage the stack.

I think the most fascinating thing is that, with all the Copilots, it’s the first time, at least in my time at Microsoft, that everybody started from the early days on a common stack: the Azure AI platform, or Azure AI Services, as it’s sold to third parties. So it’s not like we built our own stack and Bing built their own stack and then somebody came and said, “Well, we should really standardize on a new stack,” and then everybody else in the future starts with that new stack but all the old-timers say, “Wow, that’s way too much effort to move to that new stack.”

You’re just describing Windows right now. I just want to be very clear.

You said that, not I. [Laughs] But very early on, we identified that we needed an Azure AI platform. So that team under Scott Guthrie started building that in parallel to Copilot. Before we went and made Copilot generally available in June 2022, we were already on that stack. We were already benefiting from responsible AI. My team is doing red teaming and collaborating closely with Sarah Bird’s team that runs the responsible AI team in the platform. But we are mostly relying on their technology, and we collaborate very closely. I think that’s the new way of working at Microsoft that we have benefited from greatly, even though we are independent and limitedly integrated. 

Is there a set of things you would want to do that run counter to Microsoft’s priorities? Are there things that you would not be able to do? 

I don’t know.

I’ll just give you an example. There’s no way you’re going to go use one of Google’s models to run Copilot. That seems totally out of bounds, unless it isn’t, in which case that would be huge breaking news.

Well, I’d say we haven’t had that discussion because, so far, we haven’t seen the business case for that. At the end of the day, we’re running GitHub as a business that contributes to Microsoft’s earnings reports and the overall success of the business. As I mentioned earlier, we’re turning 50 next year and playing the infinite game.

But the reason I’m asking is, you’re a limited integration company inside of Microsoft. GitHub did start as an independent company. It has a different relationship to the developer ecosystem than even Azure does. Azure is a big, important part of the developer ecosystem, but Azure exists in a much more competitive environment than GitHub, which people think of almost as a utility. It’s there. You can use it. Everyone uses it for everything. Particularly in the open-source community, it is a focal point of a lot of things.

It doesn’t seem to have the commercial aspect that something like Azure might, but it’s still a business, and sometimes its priorities and the needs of its users might run against Microsoft’s desires. I’m just trying to suss out where that is and how you manage that tension.

If I can make a successful business case where I can show that we can generate revenue, healthy cost margins, and ultimately profit margins in the long run, I think anything is possible. I would say never say never, whether it’s Google or AWS or any of the chip providers. I don’t think there’s a mantra that says we couldn’t do that. I think it’s a much bigger question: can I do it in such a way that we are still achieving our business goals as GitHub and as Microsoft?

And as such, while I’m the CEO of GitHub, obviously, I’m an executive at Microsoft, and we need to have that “One Microsoft” thinking in the grand scheme of things to grow the overall business. We are all tied to the mothership, whether it’s Ryan [Roslansky] at LinkedIn and the game studios, Mustafa in AI, or Thomas in GitHub. We’re part of Microsoft, and we’re working with Satya and the SLT very closely to make Microsoft successful. But I don’t think it is against Microsoft’s DNA to partner. I think the classic example is Apple, where they have been on and off.

Yeah. [Laughs] No tension in that relationship at all.

On and off. There have been winters and summers, I guess, in that relationship. But these days, my iPhone is full of Microsoft apps, and I’m having this podcast on a Mac, and I use a Mac day in and day out. In fact, when I joined Microsoft in December 2014, Microsoft bought me a new Mac. My startup had Macs, and it was at the time already, under Satya, very natural to say, “Well, if you want to work on a Mac and that makes you more productive, we’re totally down. We’re not forcing you to use a Windows PC.”

I think that anything is possible as long as it aligns with our strategy. Where do we want to go with GitHub? What products do we want to build? The Models launch is actually a perfect example. We do have Meta’s model in there, and it’s easy to argue that Llama is a competitor to Phi-3 and GPT-4. And we have Mistral in there, with the latest Mistral Large model as well. So I think we are open to being the platform provider that is both competing and partnering with, sometimes, the same company.

I want to end by talking about not just AI broadly but the communities on GitHub and how they feel about it. Let me ask you a question I’ve been asking every AI leader lately. There’s a lot of burden being placed on LLM technology. It came out. It had the moment. There’s tons and tons of hype. Everyone has bought as many H100s as they can. Jensen Huang’s doing great at Nvidia.

It’s not yet clear to me that LLMs can do all of the things that people say they can do. Obviously they can run Copilot. You have built one successful application at scale that people really like. You also have a view of what everyone else is building because you’re in GitHub. Do you think LLMs can actually do the things that people want them to do?

They can do a limited set of tasks. And I think, as you define those tasks in a very clear box of what that is, what you want the LLM to achieve, like auto-completion in Copilot as a scenario, they can be very successful. The reason we started with the auto-completion was not that we didn’t have the idea of chat and we didn’t have the idea of explaining code or building an agent that does it all. It was that the model didn’t do any of those scenarios at a sufficient success rate.

Developers have very high expectations. If you deliver a product that serves 60 percent of scenarios, you're not going to be successful, because your reputation is going to dive down really fast, whether it's on social media or in our own community forums. I think those scenarios have expanded over the last four years, from auto-completion to chat to test generation to helping you plan out an idea, create a spec, and then implement that code. That's what we are doing in Workspace, which takes you from an idea to implementation without ever leaving GitHub, with the AI helping every step of the way.

But what’s important is that there are points in that flow where the human needs to come in and look at the plan and say, “Yeah, that’s actually what I wanted.” I like to think about it in the same way that I think about the relationships that we have with our coworkers. How often do you, at The Verge, give a task to somebody and then ask yourself, how specific do I have to get? And how long do I want to go until I need to check in with them and see if they are on the path that I had in my head?

I hear that comparison a lot, but I have to be honest with you, I never give a task to one of my colleagues at The Verge and assume that they will just make up bullshit at scale. That’s not how that goes. And with LLMs, the thing that they do is hallucinate. And sometimes they hallucinate in the correct direction and sometimes they don’t. It’s unclear to me whether they are actually reasoning or just appearing to.

There are a lot of things we want these systems to do, and I’m curious if you think the technology can actually get to the endpoint, because it requires them to be different than they are today in some meaningful way.

We believe, at GitHub, that the human will be at the center. That's why we call the thing Copilot: we believe there has to be a pilot. Now, that doesn't mean the Copilot doesn't fly the plane at times; they do in real life. And there are going to be scenarios where a large language model is scoped enough in its task, to fix a security vulnerability, for example. We have that already in public preview with what we call AutoFix, which takes a vulnerability and actually writes the fix for it.

But then there is still that moment where the pilot has to come back and say, "Yeah, that's actually the fix that I want to merge into my repository." I don't think we are anywhere close to the pilot being replaced by an AI tool. From a security perspective alone, there is also a risk that companies are probably not willing to accept anytime soon: AIs working together, merging code, and pushing it into the cloud with no human involved. Purely from a nation-state actor perspective, or a bad actor perspective, that's a risk vector that nobody wants to take. There needs to be a human in the loop to make sure that what's deployed is actually secure code and not introducing vulnerabilities or viruses.

I think it's a question, really, of how big a task you can trust the LLM with such that it results in a productivity improvement. You can easily now use an AI agent to change the background color of a webpage, and it takes three hours of work when you could have done it in three minutes yourself. That's not the dishwasher. That's just a waste of compute resources and, ultimately, energy. I think we're going to see progress, and we are going to see better agents and better Copilots in the near- and long-term future, but I don't think we are anywhere near being able to replace the human with an AI, even for the more complex tasks. And we're not even talking about giving the AI a task like building the next GitHub. I don't think that's even in the next decade.
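(A concrete aside on keeping that human in the loop: GitHub's branch protection rules can require an approving human review before anything merges, AI-authored or not. Below is a minimal sketch using GitHub's REST API via Python's requests library; the organization, repository, and token are placeholders, and this is one possible setup rather than anything Dohmke prescribes.)

```python
# Sketch: enforce "a human signs off" on every merge to main by
# requiring at least one approving pull request review.
# example-org, example-repo, and the token are placeholders.
import requests

resp = requests.put(
    "https://api.github.com/repos/example-org/example-repo"
    "/branches/main/protection",
    headers={
        "Authorization": "Bearer <your-token-here>",
        "Accept": "application/vnd.github+json",
    },
    json={
        # Require one approving review on every pull request.
        "required_pull_request_reviews": {
            "required_approving_review_count": 1,
        },
        # The API requires these fields to be present, even if unused.
        "required_status_checks": None,
        "enforce_admins": True,
        "restrictions": None,
    },
)
resp.raise_for_status()
print("Branch protection enabled:", resp.json()["url"])
```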

Right. We’ll have you back a decade from now and see if there’s a GitHub AGI.

There's a reason I asked, "Can LLMs do it?" If the answer is that they can, that they can take all of the weight we're putting on them, then maybe some of the costs along the way are worth it. If they can't, maybe those costs aren't worth it. And I specifically mean costs like how people feel about AI. There's a community of coders out there who are very unhappy that GitHub has trained on their work in various GitHub repositories and built Copilot.

If we think LLMs are going to get to the finish line, maybe it’s worth it. Maybe that pain is worth it. If it’s not going to get there, we’ve just pissed off a bunch of customers. How do you evaluate that? I see creatives across every field, whether it’s coding, whether it’s art, whether it’s movies, who are really upset that these AI systems are being trained on their work. Maybe they’re legally upset, maybe they’re morally upset, whatever it is. And then the outputs might not be worth it yet.

How do you think about those customers specifically and then the bigger problem of training and how that makes people feel generally?

First of all, I think the outputs are definitely worth it already. We've seen significant productivity gains for developers. One such statistic comes from a case study we did with 100 developers, 50 with Copilot and 50 without, and the group with Copilot was 55 percent faster. We see similar statistics from competitors and customers confirming that, both in the short and long term, developers are seeing significant productivity gains. We see it even in the later part of the developer life cycle: more successful builds and more deployments to the cloud from the team using Copilot versus the team without it.

More important, though, is that we see very clear feedback in surveys, our own surveys and customer surveys, with developers saying they're happier, more satisfied, and more fulfilled now that they no longer have to do all the repetitive tasks. I think that's where the dishwasher analogy works really well. It's easier for them to onboard to a new project.

One of the biggest challenges for a developer today, whether that's in open source or in a company, is onboarding to a new project. Whether you are joining a team or just picking up somebody else's work to make a bug fix, navigating that code base is incredibly hard because you don't know what the person thought when they wrote it, while the AI can somewhat reliably figure that out and help you navigate that code base. And you reason with it together. You ask questions, and sometimes it gives you a wrong answer. That's okay, too, because a human programmer does that as well. So I think the value is proven.

But that said, and I think this is the second piece, we do need to work as an industry with the people raising these concerns to figure out the right model for the open-source foundations and the open-source maintainers, those who have spent most of their private lives maintaining that small library that supports half the internet. How do we put them in a place where they also see the benefits of AI? How do we help them understand both our legal position and our human position on why we believe training the models on that code is the right thing for society?

It's a complicated question. I'm not saying I have all the answers, but I can tell you that, at GitHub, we have always been committed to working with the open-source community, to working with regulators, to fighting for the rights of open-source maintainers with the European Commission, and ultimately now, to giving GitHub away for free for every open-source project. We're not asking whether it's really open source, or open weights, or public but without an open-source license. We are giving you free repos, free issues, free Actions, free Codespaces, and now free models with GitHub Models. We've been engaging with the community through things like GitHub Sponsors, an integration with Patreon, and other programs that enable maintainers to build a creator economy around their community.

I've noticed that you've changed certain language already. You're evolving. Even with the launch of GitHub Models, I read your blog post, and it's very clear. You have a sentence that stands all by itself: "No prompts or outputs in GitHub Models will be shared with model providers, nor used to train or improve the models."

That feels important to say now. It’s right there. You can read it. Is that something you had to learn that you needed to say, that this was a concern that people would have? Because in the rush to AI, what you might call the ChatGPT moment, I feel like no one knew they needed to say that, and that has caused all these problems. And now it’s very clear that people care a lot about where their data goes.

Yes, it's important to get out of the tech bubble. What is obvious to the people working on the product is often not obvious to the customers. As the customer base grows, more people ask those questions. So I think it is incredibly important. In fact, it's just as important as it was with the cloud, or with systems like Exchange and Gmail, to say, "Hey, if you're deploying your application on our cloud, we are obviously not looking at your source code and using that source code to make other products better or sharing that source code with other people deploying on the cloud."

The same is true for models. People see these models as a compute layer and, as such, they want to use that and send something, compute it, and get it back and not implicitly give anyone access to that data to make the model or the compute layer, if you will, better. I think that continues to be a cornerstone of Microsoft’s strategy. We have this line that every employee learns: Microsoft runs on trust. We believe that if we lose that trust, earning it back is incredibly hard. We have gone through moments in my career at Microsoft, and certainly in Microsoft’s 50 years, where a lot of that trust was lost, and it took a while to get it back.

I think the model providers themselves have enough data and will find ways to get access to data without us sharing it with them, and certainly not without the approval of the customer. There's one caveat to this that is somewhat orthogonal but easily intermingled with that question: there's an increasing demand from customers who want to fine-tune a model on their own data. What that means is taking their source code in the GitHub scenario, or other data in other scenarios, and changing the parameters of the model, changing the weights, through a tuning process.

Now, they have a customized version of that model that is a combination of the public model, the one that OpenAI or Meta has released, and their own data, where the parameters were changed. Obviously, that model needs to stay within the private tenant of that customer unless the customer decides to make it public through their own API. A common scenario you can imagine is companies that have their own programming languages. SAP has ABAP [Advanced Business Application Programming], so they want a model that speaks ABAP, so that everybody who wants to use an SAP Copilot to write ABAP can do so with a fine-tuned model that SAP has provided. Those scenarios obviously exist. And there, it is fine to tune on the customer data, because the customer wants to do that.
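(For readers who want to see what the fine-tuning Dohmke describes looks like mechanically, here is a minimal sketch using the Hugging Face transformers and datasets libraries: take a public base model, continue training it on private text, and save the customized weights locally. The base model, the ABAP-flavored snippets, and the hyperparameters are all placeholders, not anything GitHub or SAP actually uses.)

```python
# Minimal fine-tuning sketch: continue training a public base model
# on a customer's private code, producing a customized copy of the
# weights that stays inside the customer's own tenant.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "gpt2"  # stand-in for a public code model such as Llama
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Pretend these snippets come from the customer's private repositories.
snippets = [
    "WRITE 'Hello from ABAP'.",        # placeholder training text
    "DATA lv_total TYPE i VALUE 42.",
]
dataset = Dataset.from_dict({"text": snippets}).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="tuned-model", num_train_epochs=1),
    train_dataset=dataset,
    # The causal-LM collator copies input_ids into labels for us.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()       # updates the public weights on the private data
trainer.save_model()  # the tuned copy never leaves the customer
```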

I feel like I learned a lot about SAP and how its software is built just now. [Laughs]

They’re not too far from here.

Thomas, you’ve given us so much time. What’s next for GitHub and Copilot? What should people be looking for?

I think if you look at where we have gone over the last year or so, we have extended Copilot into different parts of the developer life cycle. We originally announced it as Copilot X: Copilot coming to other parts of the workflow, not just auto-completion, not just chat, but actually bringing it into everything that developers do, because we believe there's a lot of value there. A very simple feature that we launched last year is summarizing the pull request. When you have done all your changes to the code and you submit that for review, you no longer have to write the description yourself. You can use Copilot to write that description for you. Now, you're saying, "Well, that's trivial. You can do that yourself. You're not saving that much time."

But the truth is, if you're coming out of a three-hour coding session and you have to write up all the things you did during that time, you will have incredible confirmation bias about what you believe you did versus what you actually did. You're only remembering the changes that you thought were important, not the ones you maybe made accidentally or made while trying out how things worked. When Copilot looks at the changes, it just plainly writes down what it sees. You get a very detailed write-up. You can obviously customize it to be shorter or longer, but it also describes stuff that you may have changed inadvertently, so you're saving a lot of time by avoiding the iteration later in the cycle.
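(Copilot's actual pipeline is not public, but the general technique, asking a language model to describe a diff, can be sketched in a few lines. The sketch below assumes the openai Python client and an illustrative model name and prompt; it is not GitHub's implementation.)

```python
# Rough sketch: generate a pull request description from a branch
# diff with a language model. Model name and prompt are assumptions.
import subprocess

from openai import OpenAI

# Collect everything that changed on this branch relative to main.
diff = subprocess.run(
    ["git", "diff", "main...HEAD"],
    capture_output=True, text=True, check=True,
).stdout

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system",
         "content": "Write a factual pull request description. List "
                    "every change in the diff, including incidental "
                    "ones the author may have forgotten about."},
        {"role": "user", "content": diff},
    ],
)
print(resp.choices[0].message.content)
```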

We're bringing Copilot into all parts of the developer workflow. We are building what we call Copilot Workspace, the AI-native development workflow, which is really cool because it allows you to take an idea and bring it into code with the help of a Copilot. It's not adding Copilot to your editor; it's reinventing the whole developer workflow from scratch. You type in an idea, and it looks at that idea and the existing code base and writes your plan. You can look at that plan and say, "Well, that isn't actually what I wanted." If you think about the dynamic today between engineering and product management, you often have either overspecified or underspecified issues, and then the product manager has to go back to the engineering team and say, "Well, that isn't actually what I wanted," or the engineers go back to the product manager and say, "This isn't specific enough."

Having AI in that planning piece is already a win for both sides. In fact, we have seen product managers saying, “Now, I can implement the thing myself. At least I can try what that does to the code base and see how long it’ll take.” 

[Laughs] I feel like you’ve really ratcheted up the temperature on the PM / engineer dynamic right there.

I have chief product officer friends who are literally saying, "I found the fun in coding again with the help of Copilot." Whether you're a CEO or a chief product officer, most of your day is spent in email and meetings and customer calls and podcasts. And then, when you have an hour on Sunday, spending it in a productive way is incredibly hard because you have to get back into your environment. Whether that's building model train houses or writing code, it's much the same, because you have to prepare your workspace again. With something like Copilot, it actually is much easier, because you can open your project where you left it. You can ask Copilot, "How do I do this?" You don't have to start navigating that complex world of open-source libraries and models. So we are building the AI-native developer workflow, and we actually think this is going to be incredibly empowering, both for developers working on their private projects and for open-source maintainers.

If you look at an open-source project today and want to make a change, your biggest challenge is going to be figuring out the places where you have to make those changes. And how do you not piss off the maintainers by creating a pull request that is incomplete, or that doesn't follow their coding standards, or that doesn't follow the way they want to collaborate with each other? At the end of the day, the open-source communities define how they want to collaborate. And that's totally cool. Every company defines its culture, and every open-source project defines its culture. The contributors coming in, especially those early in their careers, often have this anxiety in their heads: what if I file my first pull request and the reaction is not "Oh, this is so great, Thomas, that you sent that to us," but "Go back and learn how to code"?

This doesn't happen often, but I think most people have that anxiety in their heads, and they wait forever until they feel ready to contribute. I think Copilot will lower that barrier to entry. And one last thing: I'm from Germany. I grew up with German as my first language. I learned Russian, and then English, and I will probably always have an accent when speaking English. Most kids on this planet do not speak English at age six. There's a large population that does speak English, but a lot of kids do not, while open source and technology are predominantly in English. For them, the barrier to entry is going way down, and it will allow them to explore their creativity before learning a second language, before becoming fluent in that second language, before having the confidence of "I can type a feature request against the Linux kernel and say, 'I want this, I want this. And here's the code I've already implemented. What do you think?'" That is going to completely change the dynamic on this planet.

It feels like we’re going to have to have you back very soon to see how all of these projects are going. Thomas, thank you so much for being on Decoder.

Thank you so much. It was super fun.
