Ollama: The easiest way to run local and cloud LLMs


I’ve tried many approaches to run local LLMs, and Ollama is hands-down the easiest way I’ve found to get started and actually use local models productively.

Installing Ollama is a one-line command:

curl -fsSL https://ollama.com/install.sh | sh

Running a model is also a one-liner:

ollama run gemma3
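
Two related commands are worth knowing: ollama pull downloads a model without starting a chat, and ollama list shows what is already on disk:

ollama pull gemma3
ollama list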

Ollama also runs as a local server (a systemd service on Linux) that can be managed with:

sudo systemctl start ollama
sudo systemctl stop ollama
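
The server listens on http://localhost:11434 and exposes a simple REST API. As a minimal sketch, assuming the gemma3 model from above is already pulled, you can query it directly with curl:

curl http://localhost:11434/api/generate -d '{"model": "gemma3", "prompt": "Why is the sky blue?"}'

The response streams back as newline-delimited JSON, which makes it easy to script against.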

https://docs.ollama.com/quickstart

Open WebUI

While you can use Ollama from the command line, I prefer the Open WebUI interface. It’s essentially a ChatGPT-style interface, but fully self-hosted, fully local, and beautifully designed.

If you have non-technical family members or want a quick chat interface, Open WebUI is perfect. It remembers conversation history, handles formatting beautifully, and feels like using a cloud service, except everything stays on your own hardware.

It can be easily run in a Docker container:

docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
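
With host networking, Open WebUI serves its interface on port 8080 by default, and OLLAMA_BASE_URL points it at the local Ollama server. To check that the container came up cleanly (using the container name above):

docker logs -f open-webui

Then open http://localhost:8080 in a browser and create the local admin account on first launch.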

https://github.com/open-webui/open-webui

Home Assistant integration

Here’s where Ollama gets really interesting for home automators: Home Assistant has an official Ollama integration. I’ve set up my Home Assistant to use an Ollama model as a conversation agent, as part of the voice assistant that controls the devices in my home.
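
One practical note: if Home Assistant runs on a different machine, Ollama has to listen on more than just localhost. On a systemd install this is typically done by overriding the service environment; a rough sketch (adjust to your own network and firewall):

sudo systemctl edit ollama
# in the override, under [Service], add: Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl restart ollama

The integration itself then only needs the server URL, e.g. http://<ollama-host>:11434.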

https://www.home-assistant.io/integrations/ollama/

Local LLMs are improving rapidly

My current favorite is GLM-4.7-flash. On my mini PC with an integrated GPU, it generates amazing responses in about a minute. Models that were unusable six months ago are now competitive with far more expensive options.

Cloud LLMs with Ollama

Here’s what I didn’t realize: Ollama added cloud hosting in September, with the exact same interface as local models. Same API, same CLI, same simplicity, just running on their infrastructure. They offer:

  • Free tier: Great for evaluation and light use
  • $20/month tier: Reasonable for regular personal use
  • $100/month tier: For heavier workloads and teams

If you have a Raspberry Pi (or any modest device), the cloud option means you can run advanced models that would never fit in 2GB of RAM. The same ollama run command works across both local and cloud.
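
Cloud models are just model tags with a cloud suffix, so the workflow doesn’t change. A quick sketch, assuming you have signed in with an ollama.com account (the model tag is one of the cloud-hosted options at the time of writing):

ollama signin
ollama run gpt-oss:120b-cloud

Everything built on top of Ollama, including Open WebUI and the Home Assistant integration above, works the same way regardless of where the model runs.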

Enterprise-grade privacy

Unlike many cloud LLM services, Ollama emphasizes enterprise-grade privacy: your data isn’t used for training, and they offer compliance features that many other services simply don’t.

For me, as a telecommunications engineer working with sensitive systems, this is a major advantage over alternatives like OpenAI or Anthropic.

Launching AI coding agents

Ollama can launch coding agents such as Claude Code, Codex, or OpenCode. For example:

ollama launch claude --model qwen3-coder-next:cloud

Each of these agents works with any local or cloud model.
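
The same command works with a local model; the model tag below is just an example, substitute whatever you have pulled:

ollama launch claude --model qwen3-coder

Switching an agent between a local and a cloud model is only a matter of changing the --model flag.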

OpenClaw integration

OpenClaw is the newest sensation in the AI agent world: https://openclaw.ai/

OpenClaw does not include a setup assistant for Ollama, and the config file is a bit tricky, but you can simply run:

ollama launch openclaw --model kimi-k2.5:cloud

Ollama generates the OpenClaw configuration and starts it.

I’m running OpenClaw in a Docker environment without access to my personal data. Be careful: there are plenty of security concerns with agents like this.
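
A rough sketch of that isolation idea (image, volume name, and workflow are placeholders, not a hardened setup): run the agent inside a throwaway container that only mounts a dedicated work volume, with host networking so it can still reach the local Ollama server.

docker run -it --rm --network=host -v openclaw-work:/work -w /work ubuntu:24.04 bash
# inside the container: install the Ollama CLI and run the ollama launch command above

Afterwards, docker volume rm openclaw-work throws away anything the agent wrote.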

The bottom line

Ollama is the easiest, most flexible, and most privacy-respecting way to run local and cloud LLMs today. The fact that it offers cloud options without forcing you to use them is amazing.

The downside: you don’t get access to the top closed models such as OpenAI’s GPT or Anthropic’s Opus.