Ollama

Ollama is a lightweight, extensible framework designed for building and running large language models (LLMs) on local machines. It provides a command-line interface (CLI) that facilitates model management, customization, and interaction. Here’s a comprehensive guide to using Ollama, including essential commands and examples.
Author

Benedict Thekkel

1. Installation

  • For Linux:

    Open your terminal and execute:

    curl -fsSL https://ollama.com/install.sh | sh

    This command downloads and installs Ollama on your system.

2. System Requirements

  • Operating System: macOS or Linux
  • Memory (RAM): Minimum 8GB; 16GB or more recommended
  • Storage: At least 10GB of free space
  • Processor: Modern CPU from the last 5 years

3. Basic CLI Commands

  • Start the Ollama Server:

    To run Ollama without the desktop application (the server listens on http://localhost:11434 by default; a Python sketch of calling it directly follows this list):

    ollama serve
  • Download a Model:

    To download a specific model:

    ollama pull <model-name>

    Replace <model-name> with the desired model’s name, e.g., llama3.2.

  • List Downloaded Models:

    To view all models available on your system:

    ollama list
  • Run a Model:

    To start a model and enter an interactive session:

    ollama run <model-name>

    For example:

    ollama run llama3.2
  • Stop a Running Model:

    To stop a specific model:

    ollama stop <model-name>
  • Remove a Model:

    To delete a model from your system:

    ollama rm <model-name>
  • Display Model Information:

    To view details about a specific model:

    ollama show <model-name>
  • List Running Models:

    To see which models are currently active:

    ollama ps
  • Access Help:

    For a list of available commands and their descriptions:

    ollama help
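
The ollama serve command above starts an HTTP server on http://localhost:11434, which the CLI and the Python library (section 7) both talk to. A minimal sketch of calling the generate endpoint directly from the Python standard library, assuming the server is running and llama3.2 has been pulled:

import json
import urllib.request

# Call the local Ollama server's generate endpoint directly (no extra
# dependencies); assumes the default host/port and a pulled llama3.2 model.
payload = {'model': 'llama3.2', 'prompt': 'Why is the sky blue?', 'stream': False}
req = urllib.request.Request(
    'http://localhost:11434/api/generate',
    data=json.dumps(payload).encode('utf-8'),
    headers={'Content-Type': 'application/json'},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())['response'])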

4. Model Customization

Ollama allows users to customize models using a Modelfile. This file specifies the base model and any modifications, such as system prompts or parameters.

  • Create a Modelfile:

    Create a file named Modelfile with the following content:

    FROM llama3.2
    
    SYSTEM "You are an AI assistant specializing in environmental science. Answer all questions with a focus on sustainability."
    
    PARAMETER temperature 0.7
  • Build the Custom Model:

    Use the ollama create command to build the model:

    ollama create my_custom_model -f ./Modelfile
  • Run the Custom Model:

    Start the customized model:

    ollama run my_custom_model
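
The same customization can be done from Python using the library's create function (see the API reference in section 7). A minimal sketch, assuming llama3.2 is already pulled; mapping the Modelfile's PARAMETER line onto a parameters argument is an assumption about the create signature:

import ollama

# Build the custom model from Python; from_ and system mirror the FROM and
# SYSTEM lines of the Modelfile. The parameters argument is assumed to map
# to PARAMETER entries (here, temperature 0.7).
ollama.create(
    model='my_custom_model',
    from_='llama3.2',
    system='You are an AI assistant specializing in environmental science. '
           'Answer all questions with a focus on sustainability.',
    parameters={'temperature': 0.7},
)

# Query the newly created model.
response = ollama.chat(
    model='my_custom_model',
    messages=[{'role': 'user', 'content': 'How can cities reduce energy use?'}],
)
print(response.message.content)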

5. Using Ollama with Files

  • Summarize Text from a File:

    To summarize the content of input.txt:

    ollama run llama3.2 "Summarize the content of this file." < input.txt
  • Save Model Responses to a File:

    To save the model’s response to output.txt:

    ollama run llama3.2 "Explain the concept of quantum computing." > output.txt
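
The same file workflows can be scripted with the Python library (section 7). A minimal sketch, assuming input.txt exists in the working directory and llama3.2 is pulled:

from pathlib import Path
from ollama import chat

# Read the file, ask the model to summarize it, and save the reply.
text = Path('input.txt').read_text()
response = chat(
    model='llama3.2',
    messages=[{'role': 'user',
               'content': f'Summarize the content of this file:\n\n{text}'}],
)
Path('output.txt').write_text(response.message.content)
print(response.message.content)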

6. Common Use Cases

  • Text Generation:

    • Content Creation:

      Generate an article on a specific topic:

      ollama run llama3.2 "Write a short article on the benefits of renewable energy." > article.txt
    • Question Answering:

      Answer specific queries:

      ollama run llama3.2 "What are the latest advancements in artificial intelligence?"
  • Data Analysis:

    • Sentiment Analysis:

      Analyze the sentiment of a given text:

      ollama run llama3.2 "Determine the sentiment of this review: 'The product exceeded my expectations.'"
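
Prompts like these can also be looped over many inputs with the Python library's generate function (section 7). A minimal sketch with two hypothetical reviews, assuming llama3.2 is pulled:

import ollama

# Classify the sentiment of each (hypothetical) review with a one-word answer.
reviews = [
    'The product exceeded my expectations.',
    'Shipping took weeks and the box arrived damaged.',
]
for review in reviews:
    result = ollama.generate(
        model='llama3.2',
        prompt=('Answer with one word (positive, negative, or neutral). '
                f"Sentiment of this review: '{review}'"),
    )
    print(review, '->', result['response'].strip())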

7. Python Integration

Ollama Python Library

The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with Ollama.

Prerequisites

  • Ollama should be installed and running
  • Pull a model to use with the library: ollama pull <model> e.g. ollama pull llama3.2
    • See Ollama.com for more information on the models available.

Install

pip install ollama

Usage

from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(model='llama3.2', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
print(response['message']['content'])
# or access fields directly from the response object
print(response.message.content)

See _types.py for more information on the response types.

Streaming responses

Response streaming can be enabled by setting stream=True.

from ollama import chat

stream = chat(
    model='llama3.2',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)

for chunk in stream:
  print(chunk['message']['content'], end='', flush=True)

Custom client

A custom client can be created by instantiating Client or AsyncClient from ollama.

All extra keyword arguments are passed into the httpx.Client.

from ollama import Client
client = Client(
  host='http://localhost:11434',
  headers={'x-some-header': 'some-value'}
)
response = client.chat(model='llama3.2', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])

Async client

The AsyncClient class is used to make asynchronous requests. It can be configured with the same fields as the Client class.

import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  response = await AsyncClient().chat(model='llama3.2', messages=[message])
  print(response.message.content)

asyncio.run(chat())

Setting stream=True modifies functions to return a Python asynchronous generator:

import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  async for part in await AsyncClient().chat(model='llama3.2', messages=[message], stream=True):
    print(part['message']['content'], end='', flush=True)

asyncio.run(chat())

API

The Ollama Python library’s API is designed around the Ollama REST API.

Chat

ollama.chat(model='llama3.2', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}])

Generate

ollama.generate(model='llama3.2', prompt='Why is the sky blue?')

List

ollama.list()

Show

ollama.show('llama3.2')

Create

ollama.create(model='example', from_='llama3.2', system="You are Mario from Super Mario Bros.")

Copy

ollama.copy('llama3.2', 'user/llama3.2')

Delete

ollama.delete('llama3.2')

Pull

ollama.pull('llama3.2')

Push

ollama.push('user/llama3.2')

Embed

ollama.embed(model='llama3.2', input='The sky is blue because of rayleigh scattering')

Embed (batch)

ollama.embed(model='llama3.2', input=['The sky is blue because of rayleigh scattering', 'Grass is green because of chlorophyll'])
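
The returned vectors can be compared directly, for example with cosine similarity. A minimal sketch over the batch call above, assuming the response exposes an embeddings list of float vectors as in the REST API:

import math
import ollama

res = ollama.embed(
    model='llama3.2',
    input=['The sky is blue because of rayleigh scattering',
           'Grass is green because of chlorophyll'],
)
# Assumption: the response carries an 'embeddings' list, one vector per input.
a, b = res['embeddings']
dot = sum(x * y for x, y in zip(a, b))
norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
print(f'cosine similarity: {dot / norm:.3f}')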

Ps

ollama.ps()

Errors

Errors are raised if requests return an error status or if an error is detected while streaming.

import ollama

model = 'does-not-yet-exist'

try:
  ollama.chat(model)
except ollama.ResponseError as e:
  print('Error:', e.error)
  if e.status_code == 404:
    ollama.pull(model)

Notebook example

The following notebook cell uses ollama.generate to ask a small local model (deepseek-r1:1.5b) how to create a Django project and renders the reply as Markdown; the model’s raw output is reproduced below.

import ollama
from IPython.display import Markdown, display

response = ollama.generate(model='deepseek-r1:1.5b', prompt='How to create a django project')
display(Markdown(response['response']))

Okay, so I need to figure out how to create a Django project. Hmm, where do I even start? I’ve heard about Django before, but I’m not entirely sure how it all works. Let me think.

First, I guess I should check if Django is available on my system. I can try running the command python --m dj or maybe python-dotnet install. That sounds familiar; I remember seeing that before. So, if the command succeeds, that means I have a Django project set up and ready to go.

Now, what’s the structure of a typical Django project? I think there are directories named after different things. Oh right! There are app-related directories like auth, models, backends, and so on. Each of these has its own purpose. The main one is Auth, which contains all the settings related to authentication.

Let me list out what I see in that directory: - auth/__init__.py: That’s where most of the basic settings for auth are defined, right? I remember something about middleware like authentiative and maybe auth_token. There are options for using TSL instances or custom tokens. - auth/settings.py: This is probably where all the environment variables go. Variables like DEFAULT_AUTO_FIELD might be set here because I think that determines how fields are stored, either as Python objects or as strings. I’ve heard about models being serialized to either strings or objects, so maybe this file handles that. - auth/keys.py: That must be the keys provider for auth settings. It’s responsible for generating encryption keys necessary for authentication. - backends and other app-related directories: Those probably handle the database backends like SQLite, MySQL, etc.

I’m a bit confused about what each of these files does exactly. Let me try to break it down. The auth/__init__.py is where I set up the default middleware for authentication. So if I want to use TSL instances instead of the built-in ones, that’s probably in there. But how do I enable it? Oh right, you have to import the settings into other files.

Then there’s settings.py, which defines environment variables like TSL secp256r1 or something similar. That seems important because if your app is using a specific elliptic curve, you need to set that variable. But sometimes I don’t want to use an external one; maybe just the default? How does that work?

I’ve also heard about autowiring. Oh yeah, in Django, when you create models, you can define them autowireable by setting autowireable=True. That way, the model gets a Model instance automatically, which might make things simpler if I’m not doing all the setup manually.

What about middleware for login and logout? There’s something like authMiddleware, but how do I enable that in settings.py? Maybe it’s an environment variable or another setting. Or perhaps it’s handled through the app’s autowiring configuration?

I think I should make sure my project has exactly one database backend, which could be a SQLite or MySQL instance. So I need to set this up in the db directory. How do I do that? Do I create a separate file, maybe db.py, and configure it there?

Another thing is using user accounts for authentication. I’ve heard about using user accounts like auth_USER with an .Authentication alias or something similar. That way, the settings are more accessible through commands.

I’m also wondering how to handle security tokens. If my app uses TSL instances, I need to define a custom token file and make sure it’s in the right place. Maybe that’s part of the auth/settings.py.

Let me think about dependencies. Django comes with some models and backends, but maybe for production use, I need a third-party database like SQLite or MySQL. How do I handle both development and production environments? Oh right, in production, you might need to configure your database separately with environment variables.

I should probably set up my project structure first. Create an app.py file where all the models will be defined. Then create separate files for each app’s settings if needed. But how about the dependencies between apps? I think it’s okay as long as they have their own database backends.

Wait, what about if I don’t want to use a custom auth instance or TSL? Maybe I can just use the default built-in one by setting environment variables like DEFAULT_AUTO_FIELD in settings.py. That might be simpler for new projects.

I’m also curious about how to manage my Django project during development versus production. For development, maybe it’s easier to keep things simple and not include all the configuration files. But for a well-optimized app, I should probably have all necessary parts set up and use environment variables appropriately.

Let me try to outline the steps I need to take: 1. Install Django using python-dotnet install if needed. 2. Create an app directory. 3. Set up the app-specific directories with their respective files, handling auth settings, database backends, etc. 4. Ensure that all necessary dependencies are installed in development or production separately.

Wait, but when I run python-dotnet install, does it automatically create the Django project in my directory? Or do I need to explicitly set up some things before that?

Also, how do I access an app if I’m running this on multiple machines? Maybe through a shared repository so others can load it without rebuilding everything. That might be beyond my current task.

I should probably start by creating the app and setting up its settings. Then move on to installing dependencies in development. But for production, I need separate configuration files or maybe use environment variables directly.

Hmm, another thing: what about custom backends that are not part of Django’s default list? I think it’s allowed as long as they fit within the core requirements.

I’m also a bit fuzzy on how Django handles user accounts and authentication middleware. Maybe if I want to create an API endpoint with a different authentication method, I need to set specific variables or configurations there.

Let me try to write down what I know so far: - Start by creating an app directory. - Navigate into auth and create the necessary files for auth settings. - Define environment variables like TSL parameters if needed. - Set up autowiring for models. - Configure database backends in separate files. - Manage dependencies appropriately during development vs production.

I think that’s a good start. Now, I should probably try to sketch out an example project structure and see what each file does. Maybe look at existing Django examples online to see how they handle the app setup.

Oh, and there are some best practices in Django, like using relative imports when defining models and having clean configuration files. That might help keep my code organized as I build the project.

I guess the next step after setting up the basic structure is to write the auth/__init__.py file with the necessary settings. Then move on from there. For now, that’s where I should focus to get a working base for Django.

To create a Django project, follow these organized steps:

  1. Install Dependencies: Use python-dotnet install to set up Django in your current directory.

  2. Create an App Directory:

    • Navigate into the auth subdirectory.
    • Create and run __init__.py, which sets default settings for auth middleware, including enabling authentication-related settings.
  3. Set Up Database Backends (if needed): In separate files like db.py, configure SQLite or MySQL with environment variables if required.

  4. Define Environment Variables:

    • Ensure custom TSL parameters or .Authentication alias are set in production environments.
    • Avoid duplicating common values in production to prevent conflicts.
  5. Use Autowiring for Models: Define models autowireable by setting autowireable=True so they fetch instances from Python automatically.

  6. Access Settings and Database:

    • For development, access settings via the console or CLI.
    • Access database connections using environment variables in production.
  7. Install Environment Variables (for Production): Use separate configuration files if your app requires specific database setups for production.

  8. Manage Dependencies: In development, install necessary dependencies with pip3 install .... For production, use a distinct repository or configure separately.

By following these steps, you can create a Django project that starts with setting up core components and gradually adds more features as needed.
