How to install a local LLM using Ollama
Before we build any AI frameworks or demos, we need to get a local LLM running.
Why do I need to install a local LLM?
For any AI development you are going to need an LLM (Large Language Model) to chat with. You can use ChatGPT for this, but you will have to pay for it, and you will also be sending your requests out over the internet, which can be a security concern.
Using https://ollama.com you can install your own local instance of an LLM, so any questions you ask it are handled locally. It is also free. The only disadvantage is that you need to provide a server to run it on, but it can easily run on your laptop; you just need to be careful about the size of the LLM you install and how much compute power it needs (more on that later).
For now I am doing this on an Ubuntu Server.
How to install a local LLM using Ollama for free
1. Create a Python virtual environment for the LLM
python3 -m venv llm_env
source llm_env/bin/activate
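Note that if you open a new shell session later you will need to re-activate the environment, and deactivate drops you back to the system Python. Ollama itself does not need this environment; it is just ready for the Python-based tooling we build on top of it later.
source llm_env/bin/activate
deactivate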
2. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
The server I am running on is a cloud instance without a GPU, so Ollama will be running on the CPU.
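On Linux the install script also sets Ollama up as a background service, so if you want to check it is running you can query it (this assumes your server uses systemd, as Ubuntu does):
systemctl status ollama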

3. Confirm Installation
Type ollama list to see which models you have installed, which at this point will be none!
If the command runs without an error, Ollama is installed correctly.
ollama list

4. Check Ollama help
To see all commands you can type ollama help
ollama help

5. Pull a model
The next step is to download a large language model. This is where you have to weigh how big a model you want to pull against how much compute power you have.
To view the current list of models you can go here: https://ollama.com/library?sort=popular
For this demo I am going to pull llama3.2:3b, which has 3 billion parameters and requires around 2GB of disk space.
To pull a model just enter the command ollama pull <name of your model>.
ollama pull llama3.2:3b
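Once the pull finishes, running ollama list again should show the model. If it turns out to be too big for your machine, ollama rm will delete it so you can pull a smaller one (the model name here is just the one from this demo):
ollama list
ollama rm llama3.2:3b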

6. Run your model
You now need to start your LLM by running the model you pulled.
To run your model you just type ollama run followed by the name of your model. I am going to check the installed models with ollama list and then run it, as shown below.
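Following the same pattern as the earlier steps, with the model pulled in step 5 that looks like this:
ollama list
ollama run llama3.2:3b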
I can then ask it a question. If you get a response, your local LLM is working!

As you can see from the output, this is working. I now have a local LLM that I can ask questions of and get responses from locally, without going out to the internet.
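Ollama also exposes a local HTTP API, which is what frameworks and tools typically use to talk to it rather than the interactive prompt. As a quick sketch (assuming the default port 11434 and the llama3.2:3b model from this demo), you can query it with curl:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2:3b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'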
In the next post we are going to build an MCP server so I can connect my LLM to it and then talk to my network!
