Large Language Models

1

Welcome to c/llm – a place to discuss large language models (lemmy.world)

submitted 3 months ago* (last edited 3 months ago) by [email protected] to c/[email protected]

0 comments fedilink

Rules

Please tag [not libre software] and [never on-device] services as such (those not green in the License column here).
Be useful to others

Resources

github.com/ollama/ollama
github.com/open-webui/open-webui
github.com/Aider-AI/aider
wikipedia.org/wiki/List_of_large_language_models

2

1

How to install Open WebUI on Arch Linux (Windows guide coming soon) (lemmy.world)

submitted 2 months ago* (last edited 2 months ago) by [email protected] to c/[email protected]

0 comments fedilink

Open WebUI lets you download and run large language models (LLMs) on your device using Ollama.

Install Ollama

See this guide: https://lemmy.world/post/27013201

Install Docker (recommended Open WebUI installation method)

Open Console, type the following command and press return. This may ask for your password but not show you typing it.

sudo pacman -S docker

Enable the Docker service [on-device and runs in the background] to start with your device and start it now.

sudo systemctl enable --now docker

Allow your current user to use Docker.

sudo usermod -aG docker $(whoami)

Log out and log in again, for the previous command to take effect.

Install Open WebUI on Docker

Check whether your device has an NVIDIA GPU.
Use only one of the following commands.

Your device has an NVIDIA GPU:

docker run -d -p 3000:8080 --gpus all -e WEBUI_AUTH=False --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda

Your device has no NVIDIA GPU:

docker run -d -p 3000:8080 -e WEBUI_AUTH=False --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Configure Ollama access

Edit the Ollama service file. This uses the text editor set in the $SYSTEMD_EDITOR environment variable.

sudo systemctl edit ollama.service

Add the following, save and exit.

[Service]
Environment="OLLAMA_HOST=0.0.0.0"

Restart the Ollama service.

sudo systemctl restart ollama

Get automatic updates for Open WebUI (not models, Ollama or Docker)

Create a new service file to get updates using Watchtower once everytime Docker starts.

sudoedit /etc/systemd/system/watchtower-open-webui.service

Add the following, save and exit.

[Unit]
Description=Watchtower Open WebUI
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
ExecStart=/usr/bin/docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui
RemainAfterExit=true

[Install]
WantedBy=multi-user.target

Enable this new service to start with your device and start it now.

sudo systemctl enable --now watchtower-open-webui

(Optional) Get updates at regular intervals after Docker has started.

docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui

Use Open WebUI

Open localhost:3000 in a web browser.

3

1

llm tools - where to begin? (lemmy.world)

submitted 2 months ago by [email protected] to c/[email protected]

0 comments fedilink

I'm running ollama with llama3.2:1b smollm, all-minilm, moondream, and more. I am able to integrate it with coder/code-server, vscode, vscodium, page assist, cli, and also created a discord ai user.

I'm an infrastructure and automation guy, not a developer so much. Although my field is technically devops.

Now, I hear that some llms have "tools." How do I use them? How do I find a list of tools for a model?

I don't think I can simply prompt "Hi llama3.2, list your tools." Is this part of prompt engineering?

What, do you take a model and retrain it or something?

Anybody able to point me in the right direction?

4

1

A2A protocol (lemmy.world)

submitted 3 months ago by [email protected] to c/[email protected]

0 comments fedilink

Did any of you already took a look at the A2A protocol page on github?

5

1

How to use GPUs over multiple computers for local AI? (lemmy.world)

submitted 3 months ago by [email protected] to c/[email protected]

0 comments fedilink

cross-posted from: https://lemmy.dbzer0.com/post/41844010

The problem is simple: consumer motherboards don't have that many PCIe slots, and consumer CPUs don't have enough lanes to run 3+ GPUs at full PCIe gen 3 or gen 4 speeds.

My idea was to buy 3-4 computers for cheap, slot a GPU into each of them and use 4 of them in tandem. I imagine this will require some sort of agent running on each node which will be connected through a 10Gbe network. I can get a 10Gbe network running for this project.

Does Ollama or any other local AI project support this? Getting a server motherboard with CPU is going to get expensive very quickly, but this would be a great alternative.

Thanks

6

1

browser-use: Enable AI to control your browser (github.com)

submitted 3 months ago by [email protected] to c/[email protected]

0 comments fedilink

7

1

Ollama not using AMD GPU on Arch Linux [Fixed] (lemmy.world)

submitted 3 months ago* (last edited 3 months ago) by [email protected] to c/[email protected]

0 comments fedilink

This is an update to a previous post found at https://lemmy.world/post/27013201

Ollama uses the AMD ROCm library which works well with many AMD GPUs not listed as compatible by forcing an LLVM target.

The original Ollama documentation is wrong as the following can not be set for individual GPUs, only all or none, as shown at github.com/ollama/ollama/issues/8473

AMD GPU issue fix

Check your GPU is not already listed as compatibility at github.com/ollama/ollama/blob/main/docs/gpu.md#linux-support
Edit the Ollama service file. This uses the text editor set in the $SYSTEMD_EDITOR environment variable.

sudo systemctl edit ollama.service

Add the following, save and exit. You can try different versions as shown at github.com/ollama/ollama/blob/main/docs/gpu.md#overrides-on-linux

[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=10.3.0"

Restart the Ollama service.

sudo systemctl restart ollama

8

1

How to install Ollama on Arch Linux (lemmy.world)

submitted 3 months ago* (last edited 2 months ago) by [email protected] to c/[email protected]

0 comments fedilink

Ollama lets you download and run large language models (LLMs) on your device.

Install Ollama on Arch Linux

Check whether your device has an AMD GPU, NVIDIA GPU, or no GPU. A GPU is recommended but not required.
Open Console, type only one of the following commands and press return. This may ask for your password but not show you typing it.

sudo pacman -S ollama-rocm    # for AMD GPU
sudo pacman -S ollama-cuda    # for NVIDIA GPU
sudo pacman -S ollama         # for no GPU (for CPU)

Enable the Ollama service [on-device and runs in the background] to start with your device and start it now.

sudo systemctl enable --now ollama

Test Ollama alone

Open localhost:11434 in a web browser and you should see Ollama is running. This shows Ollama is installed and its service is running.
Run ollama run deepseek-r1 in a console and ollama ps in another, to download and run the DeepSeek R1 model while seeing whether Ollama is using your slow CPU or fast GPU.

AMD GPU issue fix

https://lemmy.world/post/27088416

Use with Open WebUI

See this guide: https://lemmy.world/post/28493612

9

1

Llama 3.1 Community License is not a free software license (www.fsf.org)

submitted 3 months ago by [email protected] to c/[email protected]

0 comments fedilink

10

1

happy non-engineers will slowly transition into sad underpaid engineers (cdn.fosstodon.org)

submitted 11 months ago by [email protected] to c/[email protected]

0 comments fedilink

11

1

Zuck's new Llama is a beast (lemmy.world)

submitted 11 months ago by [email protected] to c/[email protected]

0 comments fedilink

y2u.be/aVvkUuskmLY

Llama 3.1 (405b) seems 👍. It and Claude 3.5 sonnet are my go-to large language models. I use chat.lmsys.org. Openai may be scrambling now to release Chatgpt 5?

12

1

Marques Brownlee's latest vid is kinda unneeded (y2u.be)

submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]

0 comments fedilink

I'm an avid Marques fan, but for me, he didn't have to make that vid. It was just a set of comparisons. No new info. No interesting discussion. Instead he should've just shared that Wired podcast episode on his X.

I wonder if Apple is making their own large language model (llm) and it'll be released this year or next year. Or are they still musing re the cost-benefit analysis? If they think that an Apple llm won't earn that much profit, they may not make 1.

13

1

What If Someone Steals GPT-4 (LLM data)? | Asianometry [CC] (18:23) (www.youtube.com)

submitted 2 years ago by [email protected] to c/[email protected]

0 comments fedilink

14

1

DALL-E 3 Release (openai.com)

submitted 2 years ago by [email protected] to c/[email protected]

0 comments fedilink

15

1

Vicuna v1.5 Has Been Released! (lemmy.world)

submitted 2 years ago by [email protected] to c/[email protected]

0 comments fedilink

Click Here to be Taken to the Megathread!

from [email protected]

Vicuna v1.5 Has Been Released!

Shoutout to [email protected] for catching this in an earlier post.

Given Vicuna was a widely appreciated member of the original Llama series, it'll be exciting to see this model evolve and adapt with fresh datasets and new training and fine-tuning approaches.

Feel free using this megathread to chat about Vicuna and any of your experiences with Vicuna v1.5!

Starting off with Vicuna v1.5

TheBloke is already sharing models!

Vicuna v1.5 GPTQ

7B

Vicuna-7B-v1.5-GPTQ

Vicuna-7B-v1.5-16K-GPTQ

13B

Vicuna-13B-v1.5-GPTQ

Vicuna Model Card

Model Details

Vicuna is a chat assistant fine-tuned from Llama 2 on user-shared conversations collected from ShareGPT.

Developed by: LMSYS

Model type: An auto-regressive language model based on the transformer architecture

License: Llama 2 Community License Agreement

Finetuned from model: Llama 2

Model Sources

Repository: https://github.com/lm-sys/FastChat

Blog: https://lmsys.org/blog/2023-03-30-vicuna/

Paper: https://arxiv.org/abs/2306.05685

Demo: https://chat.lmsys.org/

Uses

The primary use of Vicuna is for research on large language models and chatbots. The target userbase includes researchers and hobbyists interested in natural language processing, machine learning, and artificial intelligence.

How to Get Started with the Model

Command line interface: https://github.com/lm-sys/FastChat#vicuna-weights

APIs (OpenAI API, Huggingface API): https://github.com/lm-sys/FastChat/tree/main#api

Training Details

Vicuna v1.5 is fine-tuned from Llama 2 using supervised instruction. The model was trained on approximately 125K conversations from ShareGPT.com.

For additional details, please refer to the "Training Details of Vicuna Models" section in the appendix of the linked paper.

Evaluation Results

Vicuna is evaluated using standard benchmarks, human preferences, and LLM-as-a-judge. For more detailed results, please refer to the paper and leaderboard.

16

1

Hello c/llm (lemmy.world)

submitted 2 years ago by [email protected] to c/[email protected]

0 comments fedilink

I noticed there didn't seem to be a community about large language models, akin to r/localllama. So maybe this will be it.

For the uninitiated, you can easily try a bleeding edge LLM in your browser here.

If you loved that, some places to get started with local installs and execution are here-

https://github.com/ggerganov/llama.cpp

https://github.com/oobabooga/text-generation-webui

https://github.com/LostRuins/koboldcpp

https://github.com/turboderp/exllama

and for models in general, the renowned TheBloke provides the best and fastest releases-

https://huggingface.co/TheBloke