GitHub Models gives developers new power to experiment with Gen AI

Credit: Image generated by VentureBeat with Stable Diffusion 3

GitHub is no stranger to the world of AI for development, but to date, it hasn’t been as easy as it could be for developers to try out new gen AI models. That’s starting to change today.

GitHub is launching a new effort called GitHub Models in a bid to provide an easier onramp for enterprise developers to try out and build applications with gen AI. GitHub is an early pioneer in the use of gen AI, particularly with its GitHub Copilot service. With GitHub Copilot, developers get code-completion and suggestion capabilities to build applications. GitHub Copilot is currently powered by a single model that GitHub has carefully curated and evaluated. GitHub Models, on the other hand, is a new initiative that provides developers with direct access to a wider range of AI models, including Meta’s Llama 3.1, OpenAI’s GPT-4o, Mistral Large 2, AI21’s Jamba-Instruct and Microsoft’s Phi-3, as well as models from Cohere.

The goal with the new service is to allow developers to experiment with and integrate gen AI models into their own applications, beyond just code completion.

“Every single app that is probably going to be created in the coming months and years is going to have intelligence attached to it as well,” Mario Rodriguez, senior vice president of product at GitHub, told VentureBeat. “It’s no longer enough for you to have an application, you’re going to have to have an application that is powered by intelligence.”

Reducing AI friction for developers

A key focus of the GitHub Models initiative is to reduce the friction developers face when trying to experiment with and integrate AI models into their applications. Rodriguez noted that previously, developers had to jump between many sites and create multiple accounts just to play with different models.

Rodriguez said that it was previously impossible for GitHub’s users to easily explore and access a broad array of gen AI models using just a GitHub identity. For developers who use GitHub, that identity provides access to an array of capabilities and makes it easier to develop code.

“We just wanted to make it extremely simple, you know, AI is not a fad, it’s here to stay,” Rodriguez said. “So we just have to get that friction to be zero, if we want to continue to have that market grow.”

The GitHub Models initiative aims to reduce AI friction for developers by providing a centralized catalog of AI models that developers can access and experiment with directly within the GitHub platform, using their existing GitHub identity.

GitHub Models provides a developer path to enterprise AI deployment

While reducing friction to help developers try out and experiment with gen AI models is a core goal of GitHub Models, it’s not the only one.

GitHub is also providing a path for its users to easily move from experimentation to production deployment of AI-powered applications. That path leads to Microsoft’s Azure. GitHub is, of course, part of Microsoft, so it’s not surprising that’s the direction.

The way it works is that users first experiment with the AI models in the GitHub Models playground to evaluate their capabilities and performance. From there, a developer transitions to a GitHub Codespace or VS Code developer environment and accesses an Azure SDK (software development kit) to obtain the necessary tokens and API keys to connect to the Azure platform.
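To make the experiment-to-code transition concrete, here is a minimal, hypothetical sketch of what calling a hosted model from such an environment could look like. The endpoint URL, payload shape (OpenAI-style chat completions), and use of a GitHub token as the credential are assumptions for illustration, not confirmed details of the GitHub Models API; consult the official documentation for the real values.

```python
# Hypothetical sketch: sending a chat-completions request to a hosted model
# endpoint, authenticated with a developer token. Endpoint and payload shape
# are assumptions, not the documented GitHub Models API.
import json
import os
import urllib.request

ENDPOINT = "https://example-models-endpoint.invalid/chat/completions"  # placeholder URL


def build_request(model: str, prompt: str, token: str) -> urllib.request.Request:
    """Build an authenticated POST request asking one model for a completion."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # e.g. a token tied to the developer's identity
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    token = os.environ.get("MODEL_API_TOKEN", "demo-token")  # hypothetical env var
    req = build_request("gpt-4o", "Summarize this repo's README.", token)
    print(req.get_method(), req.get_header("Content-type"))
    # Actually sending it would require a valid token and real endpoint:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp)["choices"][0]["message"]["content"])
```

The point of the sketch is that swapping one model for another (say, Llama 3.1 for GPT-4o) is a one-string change once the request plumbing is in place, which is the kind of low-friction experimentation the initiative is aiming for.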

Experimentation is the key to overcoming enterprise AI challenges

The path to enterprise AI deployment is also about overcoming challenges.

Rodriguez identified three key challenges that developers face when working with AI models: latency, quality of responses and cost. Part of the goal with GitHub Models is to help developers navigate these challenges by providing an environment for testing and comparison.
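A side-by-side comparison along those three axes can be sketched in a few lines. The `call_model` function below is a hypothetical stand-in for a real API call, and the per-character prices are made up; the harness just shows the shape of the comparison a developer might run in a testing environment.

```python
# Hypothetical harness comparing models on latency, output size, and
# estimated cost. call_model and the prices are illustrative stand-ins.
import time


def call_model(name: str, prompt: str) -> str:
    # Stand-in for a real model API call.
    return f"[{name}] response to: {prompt}"


def compare(models, prompt, price_per_1k_chars):
    """Time each model on the same prompt and estimate cost from output length."""
    results = {}
    for name in models:
        start = time.perf_counter()
        text = call_model(name, prompt)
        latency = time.perf_counter() - start
        cost = len(text) / 1000 * price_per_1k_chars.get(name, 0.0)
        results[name] = {"latency_s": latency, "chars": len(text), "est_cost": cost}
    return results


if __name__ == "__main__":
    out = compare(
        ["gpt-4o", "llama-3.1"],
        "Summarize the benefits of code review.",
        {"gpt-4o": 0.005, "llama-3.1": 0.001},  # made-up prices
    )
    for name, metrics in out.items():
        print(name, metrics)
```

Quality, the third axis Rodriguez names, is harder to automate and is where the offline and online evaluation he mentions below comes in.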

While industry benchmarks for various gen AI models are useful, Rodriguez noted that they do not tell the full story. 

“You really have to rely on your offline evaluation and online evaluation to make the best decision,” he said.