It’s literally tokens. Doesn’t matter if it completes the next word or the next phrase, it’s still completing the next most likely token 😎😎 It can’t think, can’t reason; it can witch’s-brew a facsimile of something done before
glizzyguzzler
You can prove it’s not by doing some matrix multiplication and seeing it’s matrix multiplication. Much easier way to go about it
Too deep on the AI propaganda there, it’s completing the next word. You can give the LLM base umpteen layers to make complicated connections, still ain’t thinking.
The LLM corpos are trying to get nuclear plants to power their gigantic data centers while AAA devs aren’t trying to buy nuclear plants. That says it’s a straw man, and you’re also wrong.
Using a pre-trained and memory-crushed LLM that can run on a small device won’t take up too much power. But that’s not what you’re thinking of. You’re thinking of the LLM only accessible via ChatGPT’s API that has a yuge context length and massive matrices that need hilariously large amounts of RAM and compute power to execute. And it’s still a facsimile of thought.
It’s okay they suck and have very niche actual use cases - maybe it’ll get us to something better. But they ain’t gold, they ain’t smart, and they ain’t worth destroying the planet.
Can’t help but here’s a rant on people asking LLMs to “explain their reasoning” which is impossible because they can never reason (not meant to be attacking OP, just attacking the “LLMs think and reason” people and companies that spout it):
LLMs are just matrix math to complete the most likely next word. They don’t know anything and can’t reason.
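To make the “matrix math, then pick the most likely next word” bit concrete, here’s a toy sketch in Python. Everything in it is made up for illustration: a 3-word vocab, a 2-number hidden state, and a tiny output matrix. It also assumes greedy picking (always take the top word), which is only one way to choose; real models stack way more layers and usually sample.

```python
import math

# Made-up toy values, not taken from any real model.
VOCAB = ["cat", "dog", "mat"]
HIDDEN = [1.0, 0.5]                      # stand-in for the model's final hidden state
OUTPUT_WEIGHTS = [[0.2, 0.1],            # one row of weights per vocab word
                  [0.1, 0.3],
                  [1.5, 1.0]]


def matvec(matrix, vector):
    """Plain matrix-vector product: one score (logit) per row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Turn scores into probabilities that sum to 1."""
    peak = max(logits)                   # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token(hidden, output_weights, vocab):
    """Greedy choice: return the highest-probability word and all probabilities."""
    probs = softmax(matvec(output_weights, hidden))
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best], probs


TOKEN, PROBS = next_token(HIDDEN, OUTPUT_WEIGHTS, VOCAB)
print(TOKEN)
```

With these toy numbers the third row scores highest, so the pick is "mat".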
Anything you read or hear about LLMs or “AI” being “asked questions” or told to “explain its reasoning”, or talk about how they’re “thinking”, is just AI propaganda to make you think they’re doing something LLMs literally can’t do but people sure wish they could.
In this case it sounds like people who don’t understand how LLMs work are eating that propaganda up and approaching LLMs like there’s something to talk to or discern from.
If you waste egregiously high amounts of gigawatts to put everything that’s ever been typed into matrices you can operate on, you get a facsimile of the human knowledge that went into typing all of that stuff.
It’d be impressive if the environmental toll making the matrices and using them wasn’t critically bad.
TL;DR: LLMs can never think or reason, and anyone talking about them thinking or reasoning is bullshitting. They utilize almost everything that’s ever been typed to give (occasionally) reasonably useful outputs that are the most basic bitch shit, because that’s the most likely next word, at the cost of environmental disaster
I got my parents to get a NAS box and stuck it in their basement. They need to back up their stuff anyway. I put in two 18 TB drives (mirrored BTRFS raid1) from server part deals (peeps have said that site has jacked their prices, look for alts). They only need like 4 TB at most. I made a backup samba share for myself. It’s the cheapest Synology box possible, and I used their software to make the samba share with a quota.
I then set up a wireguard connection on an RPi, taped that to the NAS, and I wireguard into their local network with a batch script. Then I mount the samba share and use restic to back up my data. It works great. Restic is encrypted, I don’t have to pay for storage monthly, their electricity is cheap af, they have backups, I keep tabs on it, everyone wins.
Next step is to go the opposite way for them, but no rush on that goal, I don’t think their basement would get totaled in a fire and I don’t think their house (other than the basement) would get totaled in a flood.
If you don’t have a friend or relative to do a box-at-their-house (peeps might be enticed with reciprocal backups), restic still fits the bill. Destination is encrypted, has simple commands to check data for validity.
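For the rough shape of the restic side, here’s a minimal sketch in Python that only builds the command lines; nothing runs unless you call run_all yourself. The repo path, source folder, and password file are placeholders I made up, so point them at your own mounted share. It assumes the repo was already created once with restic init.

```python
import subprocess

# Hypothetical paths: swap in your own mount point, data folder, and password file.
REPO = "/mnt/parents-nas/restic-repo"
SOURCE = "/home/me/data"
PASSWORD_FILE = "/home/me/.restic-password"


def restic_cmd(repo, password_file, *args):
    """Build one restic invocation as an argv list (no shell involved)."""
    return ["restic", "-r", repo, "--password-file", password_file, *args]


def backup_plan(repo, source, password_file):
    """Return the commands in order: back up, then verify the repository."""
    return [
        restic_cmd(repo, password_file, "backup", source),
        # --read-data re-reads all the stored data, so expect it to be slow over a VPN.
        restic_cmd(repo, password_file, "check", "--read-data"),
    ]


def run_all(commands):
    """Run each command in turn and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


PLAN = backup_plan(REPO, SOURCE, PASSWORD_FILE)
```

Calling run_all(PLAN) after the share is mounted would do the backup and then the validity check. Plain restic check (without --read-data) is the quicker option if the link is slow.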
Rclone crypt is not good enough. Too many issues (path length limits, password “obscured” but otherwise there, file structure preserved even if names are encrypted). On a VPS I use rclone to be a pass-through for restic to backup a small amount of data to a goog drive. Works great. Just don’t fuck with the rclone crypt for major stuff.
Lastly I do use rclone crypt to upload a copy of the restic binary to the destination, as the crypt means the binary can’t be fucked with and the binary there means that is all you need to recover the data (in addition to the restic password you stored safely!).
I was hoping the distros would just do the scrub/balance work for you - makes it no effort then! Good to know OpenSUSE does it for ya. Searching it looks like Fedora doesn’t have anything built in sadly, but the posts are +1 yr old so maaaybe they’ve done something.
It’s great for single drive, raid 0, and raid 1. Don’t use it for higher RAID levels, it is not acceptable for that (raid 10 obv ok). It can still lose data on raid 5/6.
I’m not sure of the tools that Fedora includes to manage BTRFS, but these scripts are great: https://github.com/kdave/btrfsmaintenance (you use them to scrub and balance). Balance redistributes blocks, and scrub checks if bits have unexpectedly changed due to bit rot (hardware issue or cosmic ray). Scrub weekly for essential photos, important docs, and the like. Monthly for everything else. Balance monthly, or on demand if free drive space is tight and you want a bit more bits.
RAID 1 will give you bit rot detection with scrub and will self-recover from said bit rot (assuming both drives don’t mystically have the same bit flip, which is very unlikely). Single drive will just detect.
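If you’d rather see what the scrub and balance calls look like by hand, here’s a minimal sketch in Python. It is not how the btrfsmaintenance scripts do it, just the underlying btrfs commands plus the weekly/monthly rule from above. The mount point and the 50% usage filter are my own example values.

```python
import subprocess


def scrub_cmd(mount_point):
    """btrfs scrub; -B runs it in the foreground and prints stats when it finishes."""
    return ["btrfs", "scrub", "start", "-B", mount_point]


def balance_cmd(mount_point, usage_percent=50):
    """btrfs balance limited to data chunks at or below usage_percent full."""
    return ["btrfs", "balance", "start", f"-dusage={usage_percent}", mount_point]


def scrub_interval_days(essential):
    """Weekly for essential data, monthly for everything else."""
    return 7 if essential else 30


def is_scrub_due(days_since_last, essential):
    """True once the interval for this kind of data has passed."""
    return days_since_last >= scrub_interval_days(essential)


def run(cmd):
    """Run one maintenance command (needs root) and raise if it fails."""
    subprocess.run(cmd, check=True)
```

For example, is_scrub_due(8, essential=True) says yes while is_scrub_due(8, essential=False) says not yet, and run(scrub_cmd("/mnt/photos")) would kick off the scrub.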
BTRFS snapshot then send/receive is excellent for a quick backup.
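A minimal sketch of that snapshot then send/receive flow in Python, assuming both the source and the backup drive are BTRFS. All paths are hypothetical examples. The snapshot has to be read-only (-r) for btrfs send to accept it.

```python
import subprocess


def snapshot_cmd(subvolume, snapshot_path):
    """Read-only snapshot; btrfs send only works on read-only snapshots."""
    return ["btrfs", "subvolume", "snapshot", "-r", subvolume, snapshot_path]


def send_cmd(snapshot_path):
    """Serialize the snapshot to stdout."""
    return ["btrfs", "send", snapshot_path]


def receive_cmd(backup_dir):
    """Recreate the snapshot inside backup_dir from stdin."""
    return ["btrfs", "receive", backup_dir]


def snapshot_and_send(subvolume, snapshot_path, backup_dir):
    """Take the snapshot, then pipe btrfs send into btrfs receive."""
    subprocess.run(snapshot_cmd(subvolume, snapshot_path), check=True)
    sender = subprocess.Popen(send_cmd(snapshot_path), stdout=subprocess.PIPE)
    try:
        subprocess.run(receive_cmd(backup_dir), stdin=sender.stdout, check=True)
    finally:
        sender.stdout.close()
    if sender.wait() != 0:
        raise RuntimeError("btrfs send failed")
```

Usage would look like snapshot_and_send("/home", "/home/.snapshots/today", "/mnt/backup"), run as root.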
Remember that a BTRFS snapshot will keep all files in the snapshot, even if you delete them off the live drive. Delete 500 GB of stuff, but the space didn’t reduce? Probably a snapshot is remembering that 500 GB. Delete the snapshot and your space is back.
You can make subvolumes inside a BTRFS volume, which are basically folders but you can snapshot just them. Useful for snapshotting your essential docs folder more often than everything else (scrub, though, runs on the whole filesystem, not per subvolume).
Lastly, you can disable copy-on-write (cow) for volumes. Reduces their safety but increases write speed, good for caches and I’ve read VM drive images need it for performance.
Overall, great. Built-in and no need to muck with ZFS’s extra install steps, but you get the benefits ZFS has (as long as you’re ok to be limited to RAID 1)
Odd, I’ll try to deploy this when I can and see!
I’ve never had a problem with a volume being on the host system, except with user permissions messed up. But if you haven’t given it a user parameter it’s running as root and shouldn’t have a problem. So I’ll see sometime and get back to you!
That’s pretty damn clever
I try to slap read_only on anything I’d face the Internet with to further restrict exploit possibilities, would be abs great if you could make it work! I just follow all reqs on the security cheat sheet, with read_only being one of them: https://cheatsheetseries.owasp.org/cheatsheets/Docker_Security_Cheat_Sheet.html
With how simple it is I guessed that running as a user and restricting with cap_drop: all wouldn’t be a problem.
For read_only many containers just need tmpfs: /tmp in addition to the volume for the db. I think many containers just try to contain temporary file writing to one directory to make applying read_only easier.
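For reference, here’s roughly what those settings come out to as a plain docker run call, sketched in Python so the pieces are easy to see. The image name, volume name, db path, and the 1000:1000 user are placeholders I made up; the flags are the docker run counterparts of the compose keys read_only, tmpfs, cap_drop, and user.

```python
def hardened_run_cmd(image, db_volume, db_path, user="1000:1000"):
    """Build a docker run argv with a read-only root filesystem and no capabilities."""
    return [
        "docker", "run", "--detach",
        "--read-only",                          # compose: read_only: true
        "--tmpfs", "/tmp",                      # compose: tmpfs: /tmp (writable scratch space)
        "--cap-drop", "ALL",                    # compose: cap_drop: all
        "--user", user,                         # compose: user (run as non-root)
        "--volume", f"{db_volume}:{db_path}",   # the one persistent writable path
        image,
    ]


# Hypothetical example values, not a real image.
CMD = hardened_run_cmd("example/app:latest", "app-db", "/data")
print(" ".join(CMD))
```

If a container still fails with read_only on, it’s usually writing somewhere besides /tmp, and that path needs its own tmpfs or volume.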
So again, I’d abs use it with read_only when you get the time to tune it!!
Looks awesome and very efficient, does it also run with read_only: true (with a db volume provided, of course!)? Many containers just need a /tmp, but not always
Improper comparison; an audio file isn’t the basic action on data, it is the data; the audio codec is the basic action on the data
“An LLM model isn’t really an LLM because it’s just a series of numbers”
But the action of turning the series of numbers into something of value (audio codec for an audio file, matrix math for an LLM) is an action that can be analyzed
And clearly matrix multiplication cannot reason any better than an audio codec algorithm. It’s matrix math, it’s cool we love matrix math. Really big matrix math is really cool and makes real sounding stuff. But it’s just matrix math, that’s how we know it can’t think