Ollama

Ollama is a lightweight, extensible framework designed for building and running large language models (LLMs) on local machines. It provides a command-line interface (CLI) that facilitates model management, customization, and interaction. Here’s a comprehensive guide to using Ollama, including essential commands and examples.
Author

Benedict Thekkel

1. Installation

  • For Linux:

    Open your terminal and execute:

    curl -fsSL https://ollama.com/install.sh | sh

    This command downloads and installs Ollama on your system.

2. System Requirements

  • Operating System: macOS or Linux
  • Memory (RAM): Minimum 8GB; 16GB or more recommended
  • Storage: At least 10GB of free space
  • Processor: Modern CPU from the last 5 years

3. Basic CLI Commands

  • Start the Ollama Server:

    To run Ollama without the desktop application:

    ollama serve
  • Download a Model:

    To download a specific model:

    ollama pull <model-name>

    Replace <model-name> with the desired model’s name, e.g., llama3.2.

  • List Downloaded Models:

    To view all models available on your system:

    ollama list
  • Run a Model:

    To start a model and enter an interactive session:

    ollama run <model-name>

    For example:

    ollama run llama3.2
  • Stop a Running Model:

    To stop a specific model:

    ollama stop <model-name>
  • Remove a Model:

    To delete a model from your system:

    ollama rm <model-name>
  • Display Model Information:

    To view details about a specific model:

    ollama show <model-name>
  • List Running Models:

    To see which models are currently active:

    ollama ps
  • Access Help:

    For a list of available commands and their descriptions:

    ollama help

4. Model Customization

Ollama allows users to customize models using a Modelfile. This file specifies the base model and any modifications, such as system prompts or parameters.

  • Create a Modelfile:

    Create a file named Modelfile with the following content:

    FROM llama3.2
    
    SYSTEM "You are an AI assistant specializing in environmental science. Answer all questions with a focus on sustainability."
    
    PARAMETER temperature 0.7
  • Build the Custom Model:

    Use the ollama create command to build the model:

    ollama create my_custom_model -f ./Modelfile
  • Run the Custom Model:

    Start the customized model:

    ollama run my_custom_model

5. Using Ollama with Files

  • Summarize Text from a File:

    To summarize the content of input.txt:

    ollama run llama3.2 "Summarize the content of this file." < input.txt
  • Save Model Responses to a File:

    To save the model’s response to output.txt:

    ollama run llama3.2 "Explain the concept of quantum computing." > output.txt

6. Common Use Cases

  • Text Generation:

    • Content Creation:

      Generate an article on a specific topic:

      ollama run llama3.2 "Write a short article on the benefits of renewable energy." > article.txt
    • Question Answering:

      Answer specific queries:

      ollama run llama3.2 "What are the latest advancements in artificial intelligence?"
  • Data Analysis:

    • Sentiment Analysis:

      Analyze the sentiment of a given text:

      ollama run llama3.2 "Determine the sentiment of this review: 'The product exceeded my expectations.'"

7. Python Integration

Ollama Python Library

The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with Ollama.

Prerequisites

  • Ollama should be installed and running
  • Pull a model to use with the library: ollama pull <model> e.g. ollama pull llama3.2
    • See Ollama.com for more information on the models available.

Install

pip install ollama

Usage

from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(model='llama3.2', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
print(response['message']['content'])
# or access fields directly from the response object
print(response.message.content)

See _types.py for more information on the response types.

Streaming responses

Response streaming can be enabled by setting stream=True.

from ollama import chat

stream = chat(
    model='llama3.2',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)

for chunk in stream:
  print(chunk['message']['content'], end='', flush=True)

Custom client

A custom client can be created by instantiating Client or AsyncClient from ollama.

All extra keyword arguments are passed into the httpx.Client.

from ollama import Client
client = Client(
  host='http://localhost:11434',
  headers={'x-some-header': 'some-value'}
)
response = client.chat(model='llama3.2', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])

Async client

The AsyncClient class is used to make asynchronous requests. It can be configured with the same fields as the Client class.

import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  response = await AsyncClient().chat(model='llama3.2', messages=[message])

asyncio.run(chat())

Setting stream=True modifies functions to return a Python asynchronous generator:

import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  async for part in await AsyncClient().chat(model='llama3.2', messages=[message], stream=True):
    print(part['message']['content'], end='', flush=True)

asyncio.run(chat())

API

The Ollama Python library’s API is designed around the Ollama REST API

Chat

ollama.chat(model='llama3.2', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}])

Generate

ollama.generate(model='llama3.2', prompt='Why is the sky blue?')

List

ollama.list()

Show

ollama.show('llama3.2')

Create

ollama.create(model='example', from_='llama3.2', system="You are Mario from Super Mario Bros.")

Copy

ollama.copy('llama3.2', 'user/llama3.2')

Delete

ollama.delete('llama3.2')

Pull

ollama.pull('llama3.2')

Push

ollama.push('user/llama3.2')

Embed

ollama.embed(model='llama3.2', input='The sky is blue because of rayleigh scattering')

Embed (batch)

ollama.embed(model='llama3.2', input=['The sky is blue because of rayleigh scattering', 'Grass is green because of chlorophyll'])

Ps

ollama.ps()

Errors

Errors are raised if requests return an error status or if an error is detected while streaming.

model = 'does-not-yet-exist'

try:
  ollama.chat(model)
except ollama.ResponseError as e:
  print('Error:', e.error)
  if e.status_code == 404:
    ollama.pull(model)
import ollama
from IPython.display import Markdown, display
response = ollama.generate(model='deepseek-r1:7b', prompt='write a haiku about time')
display(Markdown(response['response']))

Alright, so I need to write a haiku about time. First off, what’s a haiku? It’s a traditional Japanese poem with three lines. The structure is usually 5 syllables in the first line, 7 in the second, and 5 again in the third. So it’s got that 5-7-5 pattern.

Okay, time-related themes can be pretty broad—ages, seasons passing, moments slipping away, etc. I want to capture something meaningful or introspective because haikus often evoke emotions or deep thoughts.

Maybe I can focus on the passage of time. Like how time moves on without us noticing. Or perhaps something more personal, like aging. But maybe keeping it a bit more universal would be better for a haiku about time itself.

Let me think about imagery associated with time. Clocks, watches, sun setting, seasons changing… Hmm, or maybe something more abstract like the ticking of a clock as an metaphor for life’s flow.

Wait, another angle: the fleeting nature of time, how it takes away memories and opportunities. That could add some poignancy to the haiku.

So putting that together: first line about time moving on, second line showing something being taken by time, third line reflecting on its impact or wish for more time.

Let me try drafting some lines.

First line: “Time’s ticks tick on,” – that gives a rhythmic feel and shows progression. But maybe make it more poetic: “The clock ever moves.”

Second line: Something about what’s lost to time. Maybe the things we hold dear now, like a child’s laughter or a pet’s presence. Or perhaps natural elements. “A child’s laughter fades.”

Third line: Reflecting on time’s impact. “Time’s touch brings loss.”

Wait, that’s three lines but not quite hitting the 5-7-5 syllable structure. Let me check.

“The clock ever moves.” – That’s 5 syllables. “A child’s laughter fades.” – That’s 8, which is too long for the second line. Hmm.

Maybe adjust the second line to be more concise. “A fleeting memory fades” – that’s longer but still not fitting. Alternatively, “A moment slips away,” which is 7 syllables: 5-7-?

Third line needs to fit back into 5. Maybe “Time takes what’s near.”

Putting it together:

The clock ever moves. A moment slips away. Time takes what’s near.

But that might not flow well. Let me try another approach, focusing more on the passage of time and its impact.

First line: “Time passes without end,” – 5 syllables. Second line: “Mornings shift to sunsets,” – 7 syllables. Third line: “Yet moments slip through.”

But maybe that’s too straightforward. Maybe add a bit more depth or metaphor.

Alternatively, using imagery like shadows for time moving:

“A shadow falls each day,” – 5. “The hours grow long and slow,” – 7. “Beware the time you waste.” – 6 syllables, close but not quite.

Hmm, maybe tweak it again. “A shadow tugs at night” (5), “Sunlight fades as days pass” (7), “Time’s hand is ever cold.” (6). Still a bit off.

Perhaps another angle: focusing on aging or growing older:

“Eyes see, hearts grow old,” – 5. “The sand in time’s hourglass flows,” – 8 again. Not good.

Alternatively, “The leaves fall silent,” – 6 syllables. Hmm.

Wait, maybe think about the inner experience of time—how it affects us as we age:

“Time whispers through our ears,” – 7 syllables. “Beneath our feet, shadows stretch,” – 8. “Young once, now gray and wise.” – 7. Still not matching.

I might need to adjust my approach. Maybe start with a metaphor for time, like something breaking or fading:

“The ink runs dry on yesterday’s thought” – 10 syllables. Too long.

Or “A river flows forever,” but needs context about what’s being carried by the river—perhaps memories or things lost.

“A river carries time away,” – 7. “But in its current, old notes remain.” – 8.

Not matching either.

Alternatively, think of time as a thief:

“Thief of moments, ever present.” – But syllables? Thief:3, of:1, etc. Maybe “Theft of moments speeds by.”

Wait, maybe I’m overcomplicating it. Let’s go back to the original structure and try to fit something meaningful.

First line: “Time moves on” (5) Second line: “Echoes fade away” (7) Third line: “Weep for what’s past” (6) – close but not 5.

Or adjust third line to “Weep for time’s way.”

Wait, perhaps:

  1. Time ever flows,
  2. Moments fade into the past,
  3. A reminder of fleeting days.

That gives 5-7-5 syllables and reflects on the passage of time with a somber tone.

Yes, that might work. Each line captures part of the essence—time moving, moments passing, leaving behind memories.

Time ever flows,
Moments fade into the past,
A reminder of fleeting days.

Back to top