this post was submitted on 20 Mar 2025
139 points (100.0% liked)


I made a Lemmy instance with a custom algorithm that keeps only the 20% most unique (= most interesting?) posts. It does this by calculating a similarity score between every post on my instance and all posts that came before it. The 80% of posts with the highest similarity scores get removed instantly.
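A minimal sketch of how a filter like this might work. It is not the instance's actual code, which has not been published. The similarity measure (Jaccard overlap of word sets), the batch ranking, and all function names are assumptions made for illustration; the real filter removes posts as they arrive and may use something quite different.

```python
from __future__ import annotations

import re


def tokens(text: str) -> set[str]:
    """Lowercased word set; a stand-in for whatever features the real filter uses."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared words divided by all words (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_to_earlier(posts: list[str]) -> list[float]:
    """For each post, the highest similarity to any earlier post (0.0 for the first)."""
    seen: list[set[str]] = []
    scores: list[float] = []
    for text in posts:
        current = tokens(text)
        scores.append(max((jaccard(current, earlier) for earlier in seen), default=0.0))
        seen.append(current)
    return scores


def keep_most_unique(posts: list[str], keep_fraction: float = 0.2) -> list[str]:
    """Keep the keep_fraction of posts with the lowest scores, in original order."""
    if not posts:
        return []
    scores = similarity_to_earlier(posts)
    n_keep = max(1, round(len(posts) * keep_fraction))
    # sorted() is stable, so ties are broken in favour of earlier posts.
    ranked = sorted(range(len(posts)), key=lambda i: scores[i])
    return [posts[i] for i in sorted(ranked[:n_keep])]
```

Because each post is only compared with posts that came before it, the first post on a topic scores low in this sketch and later near-duplicates score high.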

The idea would be that this allows me to cut through the noise that's running through the communities, similar to how xkcd-signal attempted to do 20 years ago.

The instance is mostly meant for reading, not posting. So it has a very open federation policy (for now).

If anything, this is experimental. So please let me know what you think! You can see the type of stuff that gets removed in the modlog (https://lemmy.coffee/modlog).

top 17 comments
[–] [email protected] 26 points 1 day ago

If there are a bunch of posts on a particular topic, shouldn’t it keep at least one of them? Otherwise it would tend to completely filter out the most significant or interesting topics.

[–] [email protected] 16 points 1 day ago* (last edited 1 day ago) (1 children)

Interesting. One of my instance's guiding philosophies is "Quality over Quantity". I've taken different steps toward achieving that (defederate from the Reddit repost instances, disallow pretty much all content bots, manually/locally mod duplicate posts, etc).

Do you plan to publish your algorithm/filter? Would be interested in seeing if it could be tuned and possibly reduce some of the workload for me.

[–] [email protected] 10 points 1 day ago (1 children)

Do you plan to publish your algorithm/filter?

In an ideal world sure. But I'd have to think about that some more, because in principle I don't want people to game it :)

[–] [email protected] 11 points 1 day ago (1 children)

Lemmy's license is AGPL, so you would need to at least publish changes to Lemmy itself 😉

(I don't know if e.g. the code for the algorithm is separate, in order to have a closed source algorithm with an open source Lemmy fork)

[–] [email protected] 2 points 1 day ago (1 children)

Does GPL/AGPL require you to publish the code even if you are not selling the software? As in I could run a library computer with my custom Linux distro without giving anyone the source, but I wouldn't be able to publish it or sell it only as binary blobs, right?

[–] [email protected] 6 points 1 day ago (1 children)

Selling is outside the scope of the licence; you can do whatever you want with monetisation, be it free or paid-for.
But anyone who uses your software, locally under the GPL, or locally or through a remote service under the AGPL, has the right to request a copy of the source code, and you are obliged to provide it.

[–] [email protected] 2 points 1 day ago
[–] [email protected] 7 points 1 day ago

I was curious what would happen to the ratio of political posts, specifically Trump/Elon, to posts from other communities, but it feels like at least as much as in All on lemmy.world.

None of my superb owls look to have made it through, but I didn't see them removed in the mod log. We're a pretty large community, so I'd have thought some would have gotten through. I don't recall if I saw stuff from any of the animal comms.

[–] [email protected] 9 points 1 day ago (1 children)
[–] [email protected] 6 points 1 day ago

That's the idea, yes.

[–] [email protected] 3 points 1 day ago (1 children)

I've been through a few pages, could only find this post about Lemmy apps from [email protected] ; https://lemmy.coffee/post/6860?scrollToComments=true , with a single comment (mine).

No posts from [email protected], while it is much more active. Do you know why?

On the other hand, [email protected] , [email protected] and [email protected] seem to be doing fine

[–] [email protected] 4 points 1 day ago (2 children)

I added the larger communities before starting to remove posts, so there may be historical posts still hanging around. Maybe everything from BuyFromEU was deleted?

The best way to see the kind of stuff that stays is via the homepage: ALL > Top, last N hours.

[–] [email protected] 1 points 1 day ago

Cool idea! Created an account to check it out.

[–] [email protected] 1 points 1 day ago (1 children)

This is very interesting. I've been thinking of how a similar but different system could be implemented. The front page of any instance always seems to have two to four posts by the same person, and I've been following a rule that if I notice it I block them to remove their clutter of posts from my feed. Unfortunately, most of these accounts are brand new ones posting memes, so it feels like for every two I swat down, four more take their place.

I was looking for an instance that allowed me to mute all new accounts, or less likely an instance that filters out posts by the same person, but it doesn't seem such a thing has been created.

This isn't quite what I wanted, and if I'm not mistaken it only applies to posts inside this instance, since the algorithm is removing them? Or does it filter out all similar posts across the fediverse feed? Still, it's close in concept.
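The two filters wished for above (muting new accounts and limiting how many posts one person can have in the feed) could be sketched client-side roughly as follows. The `Post` class, its fields, and the default thresholds are hypothetical; no existing instance or app is known to implement exactly this.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Post:
    title: str
    author: str
    author_created: datetime  # when the author's account was registered


def filter_feed(posts, now, min_account_age=timedelta(days=14), max_per_author=1):
    """Drop posts from accounts younger than min_account_age and
    keep at most max_per_author posts per author, preserving feed order."""
    shown = Counter()
    feed = []
    for post in posts:
        if now - post.author_created < min_account_age:
            continue  # mute new accounts
        if shown[post.author] >= max_per_author:
            continue  # this author already has enough posts in the feed
        shown[post.author] += 1
        feed.append(post)
    return feed
```

Raising `max_per_author` or lowering `min_account_age` makes the filter more permissive.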

[–] [email protected] 2 points 1 day ago

PieFed places an icon next to the username to help highlight such aspects of a person's account. The one for "new account, less than 2 weeks old" is very useful. Others need tweaking such as "potential unregistered bot that posts far more often than comments", and "contentious user with far more downvotes than upvotes". I find it useful not to block people but to simply scroll past or to tailor my response to knowing that info.

In addition to PieFed, there are some Lemmy apps that will do this too, although I am not sure which ones (perhaps check out Sync and Connect) - the trick here is ofc to auto apply it to all accounts as you read through your feed.
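A rough sketch of account flags like the ones described above. This is not PieFed's implementation: only the two-week cutoff comes from the comment, while the `AccountStats` fields and the two ratio thresholds are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class AccountStats:
    created: datetime
    post_count: int
    comment_count: int
    upvotes: int
    downvotes: int


def account_flags(stats, now, post_ratio=10.0, downvote_ratio=2.0):
    """Return a list of labels to show next to a username."""
    flags = []
    if now - stats.created < timedelta(weeks=2):
        flags.append("new account")
    # "Posts far more often than comments": the ratio is an assumed threshold.
    if stats.post_count > post_ratio * max(stats.comment_count, 1):
        flags.append("potential bot")
    # "Far more downvotes than upvotes": the ratio is an assumed threshold.
    if stats.downvotes > downvote_ratio * stats.upvotes:
        flags.append("contentious")
    return flags
```

The function only labels accounts and hides nothing, which matches the idea of scrolling past or tailoring a reply rather than blocking.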