
CLI tool for running and managing local large language models with REST API and client library support
Ollama is a CLI tool that allows developers to run large language models locally on their machines. It provides a simple command-line interface for downloading, running, and managing models like Gemma 3, with commands like ollama run gemma3 to start interactive chat sessions. The tool includes a built-in REST API server that listens on port 11434, enabling integration with external applications through HTTP requests.
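For example, a quick way to exercise both the CLI and the REST API (assuming the gemma3 model name and default port 11434 described above) is:

# start an interactive chat session
ollama run gemma3
# query the local REST API over HTTP
curl http://localhost:11434/api/generate -d '{"model": "gemma3", "prompt": "Why is the sky blue?", "stream": false}'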
The tool supports multiple installation methods across macOS, Windows, and Linux, with official Docker images available. Ollama provides client libraries for Python and JavaScript, allowing developers to integrate local LLM capabilities into their applications programmatically. It can also launch integrations with development tools such as Claude Code, Codex, and GitHub Copilot CLI through commands like ollama launch claude.
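As a sketch of the programmatic path, the client libraries are distributed as ordinary packages (the package name ollama on PyPI and npm is assumed here):

# Python client library
pip install ollama
# JavaScript client library
npm install ollama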
Ollama uses llama.cpp as its backend and maintains a model library accessible through ollama.com/library. The tool has spawned a large ecosystem of community integrations including web interfaces, desktop applications, mobile clients, and IDE extensions. It targets developers who need local AI capabilities without relying on cloud services, offering both interactive chat functionality and programmatic API access for building AI-powered applications.
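Managing models from the library follows the same CLI pattern; gemma3 below is just an illustrative model name:

# download a model from ollama.com/library
ollama pull gemma3
# list locally installed models
ollama list
# remove a model that is no longer needed
ollama rm gemma3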
# via Shell (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh
# via PowerShell (Windows)
irm https://ollama.com/install.ps1 | iex
# via Docker
docker pull ollama/ollama
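When using the Docker image, the server still needs to be started and exposed on port 11434; the volume and container names below follow the image's documented defaults but are only one possible setup:

# run the server in a container, persisting models in a named volume
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
# run a model inside the container
docker exec -it ollama ollama run gemma3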