How to Build Local AI on PC Using OpenAI gpt-oss-20b in VS Code

Want to Build Powerful AI on Your Own PC? Microsoft’s Free AI Toolkit for VS Code Lets You Develop With OpenAI’s gpt-oss-20b Locally, Privately, and at No Cost.

You can now build and run powerful Artificial Intelligence (AI) applications right on your personal computer. Microsoft has provided a clear guide showing how developers can use OpenAI’s gpt-oss-20b model locally with the AI Toolkit in Visual Studio (VS) Code. This approach allows you to work with advanced AI without needing to connect to the cloud, offering more privacy and control.

The gpt-oss-20b model is a game-changer because it has strong reasoning skills but can run on regular consumer hardware. This is perfect for projects that need to work offline or on devices at the edge of a network, like smart cameras or sensors.

Understanding the gpt-oss-20b Model

OpenAI recently released gpt-oss-20b and the larger gpt-oss-120b as open-weight models that anyone can download and use. Both are built with a “mixture-of-experts” design, which helps them run efficiently. Here’s what makes the smaller gpt-oss-20b so useful:

  • Low Memory Needs: It requires only 16GB of GPU memory, which is common in many gaming or development computers.
  • Large Context Window: It can remember and process up to 128,000 tokens of information at once, allowing for more complex conversations and tasks.
  • Free to Use: The model is released under an Apache 2.0 license, meaning you can use, change, and build upon it for free, even for commercial products.
  • Powerful Capabilities: Despite its smaller size, it performs very well on tasks that require reasoning, problem-solving, and using tools.

Your Toolkit for Local AI Development

The AI Toolkit for Visual Studio Code is a free extension that brings all the necessary tools for AI development into one place. It helps you manage the entire process, from downloading and testing models to building them into your applications. With this toolkit, you can deploy, test, and use the gpt-oss-20b model without relying on external cloud APIs, which can be costly and complicated.

How to Get Started with Local Deployment

Setting up the gpt-oss-20b model on your machine is a straightforward process with the AI Toolkit.

System Requirements

Before you start, make sure your computer meets these requirements:

  • GPU: 16GB or more VRAM
  • Software: Visual Studio Code with AI Toolkit extension
  • Operating system: Windows, macOS, or Linux

Method 1: Direct Deployment

  1. Install Visual Studio Code if you don’t have it already
  2. Add the AI Toolkit extension from the marketplace
  3. Open the Command Palette with Ctrl+Shift+P and open the AI Toolkit’s Model Catalog
  4. Find gpt-oss-20b in the catalog and click “Add Model”
  5. Wait for the download to finish; this typically takes 15-30 minutes depending on your connection
  6. Check deployment status in the AI Toolkit’s model management interface

Method 2: Using Ollama

You can also use Ollama for more flexibility:

  1. Install Ollama on your computer
  2. Run the command: ollama run gpt-oss
  3. Add to AI Toolkit through the Resources section

This method gives you API access and works with different development frameworks.

Added Flexibility with Ollama

For developers who prefer working with the GGUF model format, the AI Toolkit also supports Ollama. This lets you run gpt-oss-20b through Ollama’s local server while still managing it within the toolkit: install Ollama, pull the gpt-oss model, and then add it to your resources in the AI Toolkit.
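
Because Ollama exposes an OpenAI-compatible API on localhost, you can also call the local model from ordinary application code. The sketch below is a minimal example under a few assumptions: Ollama is running on its default port (11434), the model has been pulled as shown above, and the openai Python package is installed. The gpt-oss:20b tag is an assumption, so substitute whatever name ollama list reports.

```python
# Minimal sketch: calling gpt-oss-20b through Ollama's OpenAI-compatible endpoint.
# Assumes Ollama is running locally on its default port and the model has been
# pulled with `ollama run gpt-oss`; adjust the model tag to match `ollama list`.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's local OpenAI-compatible API
    api_key="ollama",                      # placeholder; Ollama ignores the key
)

response = client.chat.completions.create(
    model="gpt-oss:20b",  # assumed tag; use whatever name Ollama shows
    messages=[
        {"role": "system", "content": "You are a concise coding assistant."},
        {"role": "user", "content": "Write a Python function that reverses a string."},
    ],
)

print(response.choices[0].message.content)
```

The same pattern works with any framework or SDK that can point at an OpenAI-style endpoint, which is what makes this method so flexible.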

Testing and Building with Your New AI

Once the model is deployed, the AI Toolkit provides features to help you test its capabilities and build applications.

The Playground

The toolkit includes a “Playground” where you can test prompts and compare models. In the Playground you can:

  • Compare different models side-by-side
  • Test programming tasks like creating HTML5 games
  • Evaluate gpt-oss-20b against other local models like Qwen3-Coder

For example, you can test the prompt “Creating an HTML5 Tetris application” to see how well the model generates working code.
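
If you want to reproduce a Playground-style comparison outside the editor, a small script can send the same prompt to two local models and print both answers. This is a rough sketch under the same Ollama assumptions as above; both model tags are assumptions and should match whatever ollama list reports.

```python
# Rough sketch: send the same coding prompt to two local models via Ollama's
# OpenAI-compatible API and print both answers for a manual side-by-side check.
# Model tags are assumptions; substitute the names shown by `ollama list`.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

PROMPT = "Create an HTML5 Tetris application in a single index.html file."

for model in ("gpt-oss:20b", "qwen3-coder"):  # assumed tags
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
    )
    print(f"\n===== {model} =====\n")
    print(reply.choices[0].message.content)
```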

Agent Builder

For more advanced projects, the AI Toolkit’s Agent Builder is a visual tool for creating AI agents powered by gpt-oss-20b. These agents can combine your local model with other services to perform complex tasks. The Agent Builder helps you:

  • Build agent applications quickly
  • Combine multiple services using the Model Context Protocol (MCP)
  • Create prototypes for business applications

This feature makes it easy to experiment with AI agents without complex coding.
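
To give an agent something concrete to call, you can expose your own tool over MCP and connect it as a tool source. Below is a minimal sketch assuming the official MCP Python SDK (pip install mcp); the server name and the check_stock tool are hypothetical placeholders, not anything shipped with the AI Toolkit.

```python
# inventory_server.py - minimal sketch of an MCP tool server (hypothetical example).
# Assumes the official MCP Python SDK is installed: pip install mcp
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("inventory-demo")  # hypothetical server name

@mcp.tool()
def check_stock(sku: str) -> str:
    """Return a canned stock level for a product SKU (stand-in for a real lookup)."""
    return f"SKU {sku}: 42 units in stock"

if __name__ == "__main__":
    mcp.run()  # serves over stdio, the transport MCP clients commonly attach to
```

An agent built around gpt-oss-20b could then call check_stock whenever a prompt needs inventory data, keeping both the model and the tool entirely on your machine.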

Performance and Optimization

The gpt-oss-20b model performs well on consumer hardware:

  • Laptops with 16GB RAM: 15-25 tokens per second
  • Apple Silicon Macs: 20-30 tokens per second with Metal optimization
  • High-end smartphones: 8-12 tokens per second with quantization

For better throughput, you can use techniques such as memory-mapped weight loading and torch.compile() optimization, as sketched below.
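
As a rough illustration of those techniques, the sketch below loads the open weights with Hugging Face transformers (which streams memory-mapped safetensors checkpoints when low_cpu_mem_usage is set) and wraps the model in torch.compile(). It assumes a GPU with roughly 16GB of VRAM and recent torch/transformers releases that support the gpt-oss checkpoints; it illustrates the optimization ideas rather than the AI Toolkit’s own pipeline.

```python
# Rough sketch: loading gpt-oss-20b with Hugging Face transformers and applying
# torch.compile(). Assumes a GPU with ~16GB VRAM and recent torch/transformers;
# shown for illustration, not as the AI Toolkit's internal deployment path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "openai/gpt-oss-20b"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",      # keep the checkpoint's native precision
    device_map="auto",       # place weights on the available GPU(s)
    low_cpu_mem_usage=True,  # stream memory-mapped weights instead of full CPU copies
)

model = torch.compile(model)  # JIT-compile the forward pass for faster decoding

inputs = tokenizer("Explain memory mapping in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```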

Real-World Applications

Developers are using gpt-oss-20b for various applications:

  • Edge computing solutions that work without internet
  • Privacy-focused AI assistants
  • Local development environments for testing
  • Offline AI applications for mobile devices

Security and Safety

OpenAI has evaluated gpt-oss-20b across multiple safety domains, including biological, chemical, and cybersecurity risks. The model includes safety measures designed to prevent misuse while maintaining high performance.

Future Updates

Microsoft plans to add CPU-only deployment in future releases. Currently, GPU acceleration is required for optimal performance.

By bringing powerful, open-weight models like gpt-oss-20b into a simple development environment, Microsoft is making it easier for anyone to experiment with AI. This local-first approach gives developers more freedom and control, helping them innovate faster without the frustrating costs and limitations of cloud-based services.