Quick and Easy Guide to Installing Meta Llama 3.1 405B, 70B, 8B Language Models with Ollama, Docker, and OpenWebUI

Posted on July 28, 2024

I will show you how quick and easy it is to install Llama 3.1 405B, 70B, 8B, or another language model on your computer or VM using Ollama, Docker, and OpenWebUI. It is so simple that even a grandmother or grandfather could do it. This is private AI, not cloud-based: all data stays on your own computer. OpenWebUI is a web front end with an interface similar to OpenAI's ChatGPT.

I will demonstrate the installation on Windows 11, but it can be set up just as quickly on Apple Silicon Macs or various Linux distributions. My instructions for Linux will come later.

You can view the language model size by going to Ollama > Models and selecting, for example, Llama 3.1. Each model variant is listed with its download size.

For example, the 405B model shows 231GB. This means the 405B language model needs 231GB of GPU VRAM or system RAM to run. Keep in mind that Windows and any other software on your computer also need RAM. To run a model successfully on the GPU, you need at least as much VRAM as the model requires. If your GPU has less VRAM than the model needs, the model will automatically fall back to the computer/server/VM's RAM and CPU instead of the GPU. If there is not enough RAM either, the computer/server/VM can become unstable, because the language model will consume all the RAM and may crash. So make sure you have enough resources before starting.
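If you have an NVIDIA GPU, you can check how much VRAM it has before downloading a large model. The query below uses the standard NVIDIA driver tool (this assumes nvidia-smi is on your PATH, which the regular NVIDIA driver installation normally takes care of):

nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv

Compare the reported memory with the model size shown on the Ollama model page, and leave some headroom for Windows and your other applications.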

Small language models like 8B are fine to run on the CPU, but using a GPU makes them lightning fast. It’s really cool, and I highly recommend experimenting with it.
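To see where a loaded model actually ended up, recent Ollama versions include a ps command that lists the models currently in memory along with their size and how the load is split between GPU and CPU (the exact column layout may vary by version):

ollama ps

If the model shows as running mostly on the CPU, that is a sign your GPU VRAM was not large enough for it.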

Here’s how it works: the language model is first loaded into VRAM/RAM, and only then does it start responding. For larger language models, this loading can take some time.
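If you want to see how long the loading and the responses actually take, ollama run has a --verbose flag that prints timing statistics (load duration, prompt and response token rates) after each answer. For example, with the 8B model tag as it is listed on the Ollama site:

ollama run llama3.1:8b --verbose

The first question after starting the model includes the load time; later questions are usually answered faster because the model is already in memory.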

The most important factor is the amount of GPU VRAM. Next is the amount of RAM in the server/PC/VM, and then the CPU speed, but not so much the number of cores. I did not notice a significant difference between a 120-core and a 16-core machine; however, the processor's clock speed does matter.

Watch the video to see how I set this up quickly, and follow along on your own computer.

Quick Steps:

  • Download Ollama and install it on your computer. https://ollama.com/
  • Close Ollama
  • Set up environment variables for Ollama, specifying where you want the language models to be stored (see the command-line example after this list).
    • Advanced system settings
    • Environment Variables
    • Variable name: OLLAMA_MODELS
    • Variable value: Drive/Folder location
  • Start Ollama
  • Install Docker on your computer. https://www.docker.com
  • Launch a browser and go to the OpenWebUI website. https://openwebui.com/
  • Install OpenWebUI in your Docker setup.
    • Open Terminal (administrator rights)
    • Copy and paste the command
    • docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
  • Launch a browser and go to http://localhost:3000/
  • Create a user in OpenWebUI.
  • Install language models via OpenWebUI.
  • Install language models via Terminal
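For the environment variable step above, you can also set OLLAMA_MODELS from a PowerShell window instead of clicking through the dialogs. A minimal sketch, assuming a hypothetical D:\OllamaModels folder as the storage location:

setx OLLAMA_MODELS "D:\OllamaModels"

setx writes a user-level environment variable, so close and restart Ollama afterwards so it picks up the new location.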

Note: You can also install language models via the terminal, but sometimes they may not be visible in OpenWebUI.
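If you go the terminal route, a minimal example, using the model tags as they are listed on the Ollama site:

ollama pull llama3.1:8b
ollama run llama3.1:8b

ollama pull only downloads the model into the folder set with OLLAMA_MODELS, while ollama run loads it and opens an interactive prompt in the terminal. After a pull, the model should also show up in OpenWebUI's model list; if it does not, refreshing the page or restarting the open-webui container may help.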

