this post was submitted on 20 Jun 2025
29 points (100.0% liked)
Linguistics
1246 readers
1 users here now
Welcome to the community about the science of human Language!
Everyone is welcome here: from laypeople to professionals, Historical linguists to discourse analysts, structuralists to generativists.
Rules:
- Instance rules apply.
- Be reasonable, constructive, and conductive to discussion.
- Stay on-topic, specially for more divisive subjects. And avoid unnecessary mentioning topics and individuals prone to derail the discussion.
- Post sources when reasonable to do so. And when sharing links to paywalled content, provide either a short summary of the content or a freely accessible archive link.
- Avoid crack theories and pseudoscientific claims.
- Have fun!
Related communities:
- [email protected]
- [email protected]
- [email protected]
- [email protected]
- [email protected]
- [email protected]
Resources:
Grammar Watch - contains descriptions of the grammars of multiple languages, from the whole world.
founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
If it's a hoax, it's a bloody great job - they started it centuries ago, but it's still trolling us in 2025. It's worth to check how they did it. (@[email protected] posted a video with one way they could've done it.)
And if it's a lost language, my guess is like Bax: Indo-Aryan language, either Romani or related to. That should explain why it's so hard to decode - we're basically looking for your typical European language when it's something way less typical.
One of the big problems with Voynichese is the very peculiar way it behaves structurally, and especially with regards to its characters. Here is a great video about it.
This video has some great info, and it's easier to approach than videos talking directly about the entropy issue.
Hypothetically speaking, if there's a language encoded there, the problem of relatively few letters can be still addressed by multiple phonemes sharing the same letter.
For example, not representing VOT distinctions - so pairs like "peat" vs. "bead" are written the same, even if phonemically distinct. It seems the Southwestern Paleohispanic worked like this, so even if it's syllabary-ish you'd see way less symbols than you'd expect for one (only ~30).
That introduces some complexity though - it's the sort of thing you'd expect to see when a language is trying to adapt the writing system of another, and so far we didn't attest the writing system elsewhere. Plus it makes Ockham's Razor scream bloody murder.
It's also notable for the repetitive strings it produces that is very unlike most (all?) known languages, such as the famous: "qokeedy qokedy shedy tchedy [...] qokal otedy qokedy qokedy dal qokedy qokedy rgam."
It's also possible the manuscript could contain two (or more) different languages, or different encoding methods, depending on what it is. See: https://www.voynich.nu/extra/lang.html