HomeLab AI
2025-10-13
K3s HomeLab LLM Basics
By now, most readers of this blog are likely aware of how easy it is to download and run an LLM locally on your computer using, for example, Ollama, LM Studio or GPT4All. While doing so absolutely has its uses, I am more interested in a setup that is always on and accessible on a local network for multiple users to take advantage of. I am also interested in seeing what LLM models I can get to work on my K3s lab environment, running on old hardware. Basically edge inference on a local cloud, or perhaps cloud inference on edge hardware, depending on your world view.
To achieve this I chose to try out Open WebUI. Open WebUI is an open source project that provides a platform for deploying multiple LLM runners and inference engines, such as Ollama and OpenAI-compatible APIs, enriches inference with both local RAG (Retrieval Augmented Generation) and web search, and puts a web-based interface for user interaction on top.
My goal here is simply to get Open WebUI working with any LLM model my cluster can manage. Best practices will be ignored and shortcuts taken.
As my Kubernetes lab does not yet have anything installed on it beyond what a default K3s installation provides, the first thing needed is a package manager for Kubernetes. Actually, we could probably make do without one, but the Open WebUI installation guide for Kubernetes uses Helm and I have intended to manage the lab with it anyway, so let's not complicate things.
Install Helm (on the K3s Ubuntu Server in my case):
$ curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
$ chmod 700 get_helm.sh
$ ./get_helm.sh
Install an application repository for Helm to use:
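To confirm that the Helm client ended up on the PATH, a quick version check is enough:
# Verify the Helm client installation
helm version --short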
helm repo add bitnami https://charts.bitnami.com/bitnami
Apparently Open WebUI is not in the Bitnami repository, which I chose above since it is popular and commonly used, so we have to add a separate Helm repo for it.
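If you want to double-check that for yourself (a sanity check, nothing the installation depends on), Helm can search the repos it knows about:
# Should print "No results found", confirming the chart is not in the Bitnami repo
helm search repo bitnami/open-webui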
Add the Open WebUI repo, which by default includes Ollama, to Helm:
helm repo add open-webui https://open-webui.github.io/helm-charts
helm repo update
Install Open WebUI and its Ollama dependency using Helm:
helm install openwebui open-webui/open-webui --kubeconfig /etc/rancher/k3s/k3s.yaml
Note that the parameter "--kubeconfig /etc/rancher/k3s/k3s.yaml" is only needed because I run a K3s cluster, which does not use the same default kubeconfig location as a standard Kubernetes installation.
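An alternative, if you do not want to repeat that flag on every command, is to point the KUBECONFIG environment variable at the K3s kubeconfig for the current shell session:
# Let Helm and kubectl find the K3s kubeconfig without the --kubeconfig flag
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml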
Check what got installed:
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
open-webui-0 1/1 Running 0 13h 10.42.1.14 m1a1-kub1 <none> <none>
open-webui-ollama-7df4994cdf-jswrr 1/1 Running 0 13h 10.42.0.21 ubuntu-server-hp-prodesk <none> <none>
open-webui-pipelines-6ff4555794-mcfc2 1/1 Running 0 13h 10.42.1.15 m1a1-kub1 <none> <none>
Most Open WebUI guides will now tell you that you are done and can access it locally on your cluster node using http://localhost:8080, or to do a port forward to access it from outside the cluster. That is not the case in my setup.
Note in the above that the pod for Open WebUI is in my setup running on one node (an old Xiaomi Mi A1 phone) and the pod for Ollama is running on another. Installation guides for Open WebUI, and its default Helm configuration, generally seem to assume that you run everything on one and the same node and access it from that node's host, or that you are accustomed to configuring an ingress controller to expose services.
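The service side tells the same story. Listing the services shows both open-webui and open-webui-ollama with TYPE ClusterIP, which is only reachable from inside the cluster network:
# Both services are of type ClusterIP by default, so nothing is exposed externally
kubectl get svc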
Best practice would definitely be to configure the ingress controller (Traefik is included in a default K3s installation), but since this is not a permanent setup but rather quick and dirty testing, I will not.
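For anyone who does want the permanent route, a minimal Ingress for the bundled Traefik could look something like the sketch below. The hostname is made up and the service port is an assumption on my part, so verify both against kubectl get svc open-webui before applying anything:
# Rough sketch only: expose the open-webui service through the bundled Traefik
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: open-webui
spec:
  ingressClassName: traefik
  rules:
    - host: openwebui.lab.local   # hypothetical hostname, adjust to your network
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: open-webui
                port:
                  number: 80      # assumed service port, check with kubectl get svc
EOF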
Change Open WebUI to use node IP and port instead of its default ClusterIP and port:
helm upgrade openwebui open-webui/open-webui --kubeconfig /etc/rancher/k3s/k3s.yaml --set service.type=NodePort
Since I did not define which port to use for the NodePort, one has been assigned automatically, and we need to find out which:
export NODE_PORT=$(kubectl get -n default -o jsonpath="{.spec.ports[0].nodePort}" services open-webui)
echo $NODE_PORT
32752
This allows us to reach the Open WebUI interface using http://[node IP]:32752. Yay!
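Since a NodePort is opened on every node in the cluster, any node's IP should work. A quick curl from another machine on the LAN confirms that something answers:
# Expect HTTP response headers back rather than a connection timeout
curl -I http://[node IP]:32752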
Unfortunately, once I had set up the initial administrator account in Open WebUI, I realized I could not access Ollama, and therefore could not search for and download models. Basically the same issue as with Open WebUI: the default Helm chart configuration for Ollama uses ClusterIP.
Changing values using helm upgrade resets any value not supplied on the command line back to the chart defaults, so we need to set NodePort for both Ollama and Open WebUI at the same time.
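If the list of overrides keeps growing, a less error-prone option is to keep them in a small values file and pass it with -f; the two keys below are the same ones the --set flags use, so this is just a sketch of the tidier variant:
# Keep the overrides in a file so a future helm upgrade does not silently drop them
cat <<EOF > openwebui-values.yaml
service:
  type: NodePort
ollama:
  service:
    type: NodePort
EOF
helm upgrade openwebui open-webui/open-webui --kubeconfig /etc/rancher/k3s/k3s.yaml -f openwebui-values.yaml
For this quick and dirty test I just stack the --set flags instead: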
helm upgrade openwebui open-webui/open-webui --kubeconfig /etc/rancher/k3s/k3s.yaml --set service.type=NodePort,ollama.service.type=NodePort
The Helm chart for Ollama does have a NodePort defined, but to verify, run:
export NODE_PORT=$(sudo kubectl get -n default -o jsonpath="{.spec.ports[0].nodePort}" services open-webui-ollama)
echo $NODE_PORT
31434
Update the settings in Open WebUI to access Ollama using http://[node IP hosting open-webui-ollama pod]:31434.
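Before pointing Open WebUI at it, it does not hurt to check that the Ollama API actually answers on that port. Ollama's /api/version endpoint works as a simple ping, and since it is a NodePort any node IP will do:
# Ollama should reply with a small JSON blob containing its version
curl http://[node IP]:31434/api/version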
We can now download models and test what LLM models this measly cluster can actually run!
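Models are normally pulled from the Open WebUI model settings, but the same thing can be done from the command line by exec-ing into the Ollama deployment, which can be handy if a large download makes the web UI time out. A sketch, assuming the deployment is named open-webui-ollama like the pod prefix above:
# Pull a model inside the Ollama pod and list what is available afterwards
kubectl exec -it deploy/open-webui-ollama -- ollama pull mistral:7b
kubectl exec -it deploy/open-webui-ollama -- ollama list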
First I tried mistral:7b, because it is relatively small and I had picked up somewhere that the Mistral models are supposed to handle many languages well. mistral:7b worked reasonably well on my cluster, if a little slowly, but using my native language Swedish with it was a major disappointment, with many spelling and grammar errors in its replies.
Then I tried OpenEuroLLM-Swedish, a fine-tuned version of gemma3:12b which is supposed to reply well in Swedish. Without a GPU in my cluster, however, 12B models appear to be too big; I kept getting timeouts trying to use it.
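To see whether it is memory or CPU that gives out, resource usage can be watched while a prompt is running; K3s ships with metrics-server, so kubectl top works out of the box as a rough diagnostic:
# Watch node and pod resource usage while the model is answering
kubectl top nodes
kubectl top pods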
Next I will try viking-7b, which seems to be more in the size range my hardware can manage.