this post was submitted on 10 May 2024
30 points (100.0% liked)

Selfhosted


For the last three days, my server (Fedora) has been stopping all Podman containers after 2-3 hours. I can start all the containers again, but the same thing happens after a while. I do not know where to look for the problem.

In top, I found an OOM message. I assume that the system runs out of memory and stops all services. How can I find the problem? I can’t find anything in the container logs.

I can see that systemctl status is always “starting” and never becomes “running”, but I do not know how to proceed.

all 17 comments
[–] [email protected] 11 points 11 months ago (1 children)

The trouble with diagnosing memory issues is that, by the time one happens, there is usually no memory left to handle logging the problem.

I've found that the easiest approach is to set up a file as additional swap space, enable it with swapon, and then see if the problem disappears, either partially or fully.
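
For reference, a minimal sketch of setting up a temporary swap file (size and path are just examples; on Btrfs, the Fedora default, swap files additionally need copy-on-write disabled, so check the btrfs docs first):

```
# create and activate a 4 GiB swap file (size is an example)
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# confirm it is active
swapon --show
free -h

# remove it again after testing
sudo swapoff /swapfile
sudo rm /swapfile
```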

[–] [email protected] 3 points 11 months ago (5 children)

I've got way too much RAM for swap to be useful at all. Good idea though.

[–] [email protected] 17 points 11 months ago

There is no such thing as too much RAM...

[–] [email protected] 10 points 11 months ago (1 children)

If something you're running has a memory leak then it doesn't matter how much RAM you have.

You can try adding memory limits to your containers to see if that limits the splash damage. That way you would hopefully see only one container (the bad one) dying.
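
A sketch of what that can look like with Podman (the container name, image, and limit values here are placeholders, not from the original post):

```
# run a container with a hard memory cap
podman run -d --name mydb --memory 512m --memory-swap 512m \
    -e MARIADB_RANDOM_ROOT_PASSWORD=1 docker.io/library/mariadb

# check current memory usage of all running containers
podman stats --no-stream
```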

[–] [email protected] 2 points 11 months ago

That's neat. Thank you.

So far I've been following a bottom-up strategy: I keep adding containers each day (or after many hours) and wait for the system to stop them. I also looked up how to limit memory usage. It's a great idea to limit all containers and see which one fails. Thanks!

[–] [email protected] 2 points 11 months ago (1 children)

How do you know that you have too much RAM? Have you set up a monitoring solution like InfluxDB to track RAM usage over time?
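
Even without a full InfluxDB/Grafana stack, a quick-and-dirty way to get a RAM history is to log the output of free on a schedule; a minimal sketch (log path and interval are arbitrary):

```
# crontab entry: append a timestamped memory snapshot every minute
* * * * * echo "$(date -Is) $(free -m | awk '/^Mem:/ {print $3 " MiB used, " $7 " MiB available"}')" >> "$HOME/mem-usage.log"
```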

[–] [email protected] 1 points 11 months ago

I observed it during resource-hungry usage. I never had issues with it, not even close.

[–] [email protected] 1 points 11 months ago (1 children)
[–] [email protected] 1 points 10 months ago

They could mean that they have swap but it's not being used.

[–] [email protected] 1 points 11 months ago

I'm just curious how much RAM you think that is.

[–] [email protected] 6 points 11 months ago

Are you running them from your user session? If so, when you log out it will stop your processes, unless you have enabled 'linger' mode.
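
If that turns out to be the case, enabling linger is a one-liner (assuming systemd's loginctl, which Fedora ships):

```
# allow this user's services and containers to keep running after logout
loginctl enable-linger $USER

# verify
loginctl show-user $USER | grep -i linger
```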

[–] [email protected] 3 points 11 months ago (1 children)

When I had an issue with the mariadb daemon being killed, I think either dmesg or syslog had an entry reading "Out of memory: Kill process..." or similar.
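
On Fedora you would typically find those entries in the kernel log, for example:

```
# kernel ring buffer, human-readable timestamps
sudo dmesg -T | grep -iE 'out of memory|oom-killer|killed process'

# or via the journal (add -b -1 for the previous boot)
journalctl -k | grep -i oom
```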

[–] [email protected] 3 points 11 months ago

I'll have a look, thx

[–] [email protected] 3 points 11 months ago

I would start all containers except one. If everything works, that one is the cause of the problem. Keep trying with a different container each time.

[–] [email protected] 2 points 11 months ago

If you’re seeing an OOM-killer message, note that it doesn’t necessarily kill the problem process. By default the kernel hands out memory upon request, regardless of whether it has RAM to back the allocation. When a process later writes to that memory and the kernel determines that there is no physical RAM to store the write, it invokes the OOM killer, which then selects a process and kills it. MySQL (and MariaDB) use large quantities of RAM for cache, and because the kernel by default lies about how much is available, they often end up using more than the system can handle.

If you have many databases in containers, set memory limits for those containers; that should make all the databases play nicer together. Additionally, you may want to disable overcommit in the kernel. This makes the kernel return an out-of-memory error to a process that tries to allocate more RAM than is actually available, instead of lying about free RAM to processes that ask, and often greatly increases stability.
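
A sketch of the overcommit change (the ratio value is an example; read the vm.overcommit_memory documentation before making this permanent):

```
# allocations beyond swap + (overcommit_ratio % of RAM) now fail immediately
sudo sysctl vm.overcommit_memory=2
sudo sysctl vm.overcommit_ratio=80

# persist across reboots (file name is arbitrary)
printf 'vm.overcommit_memory = 2\nvm.overcommit_ratio = 80\n' | sudo tee /etc/sysctl.d/90-overcommit.conf
```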

[–] [email protected] 2 points 11 months ago

Install atop, basically 'top' on steroids with history. It defaults to capturing performance data every 5 minutes; I usually change it to 1 minute on production systems.
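
A sketch of how that can look (the config location below is how it's typically done on Fedora/RHEL-style systems, so treat the exact path and variable name as an assumption):

```
# install and enable history logging
sudo dnf install atop
sudo systemctl enable --now atop

# change the sampling interval to 60 seconds
sudoedit /etc/sysconfig/atop      # set LOGINTERVAL=60, then restart the service

# replay a recorded day; 't' steps forward, 'T' backward through samples
atop -r /var/log/atop/atop_$(date +%Y%m%d)
```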