this post was submitted on 21 Dec 2023
273 points (97.9% liked)


Hey everyone,

This isn't an announcement, I just wanted people's thoughts on this.

I think everyone knows searching the fediverse could be better. Googling doesn't work too well, etc. So I wanted to do my part and help out.

Indexing all posts, etc is quite a lot to handle, so I wanted to start small and just focus on video search. I've started indexing videos from Peertube and other video websites. (Even YouTube but this could be removed to just focus on independent sites)

I know Peertube has their own search engine for videos. I will be reaching out to them. Compared to theirs, I'm planning for my site to have other video sources and be easier to use.

So that leads to feedback from you guys.

  • What do you think about indexing videos posted on the fediverse and other independent platforms?
  • Are there similar services?
  • Am I just wasting my time?
[–] [email protected] 32 points 1 year ago (16 children)

I found FediSearch, and also this post basically saying that a fediverse search engine would just be used as a tool by trolls.

[–] [email protected] 11 points 1 year ago

It's worth noting that since FediSearch, Mastodon has actually natively implemented opt-in search on posts.

[–] [email protected] 9 points 1 year ago (4 children)

That’s a good point. But those people can be banned? I guess Reddit handles this by moderation and archiving old posts.

[–] [email protected] 4 points 1 year ago

Yes, but moderation teams on the fediverse are very small, and by the nature of it, trolls can make hundreds of accounts on different servers, all trolling, that would each need to be individually sought out and banned.

It is a game of cat & 100 mice

[–] [email protected] 4 points 1 year ago (1 children)

That post wasn't claiming that a search engine would only be used by trolls; it was explaining that they shut down their project because a chunk of the fediverse thinks that and complains about any search engine project. Discoverability is one of the network's biggest challenges and a search engine could really help with that.

[–] [email protected] 4 points 1 year ago (1 children)

Yes, not only used by trolls, but it would be a tool that could be leveraged by trolls. And I think the fediverse makes it easier to establish instances for marginalized groups, but also has more admins that just don't want trolls because nobody here is making $ off them like the corporate socials are. I think adding search that tries to vacuum up everyone's posts in the fediverse and make them easily sortable/targetable without instance admins' permission isn't cool. If someone is running a general instance that covers nothing that a troll could latch onto and wants the instance catalogued and searchable, then that's fine by me. I don't think bots should be doing that to the fediverse as a whole without admin permission though.

[–] [email protected] 2 points 1 year ago (3 children)

I don't think an admin's permission has anything to do with it. If you post publicly on the fediverse, your posts are public. You should have the option to opt out of any indexing (just like you do for the rest of the open web). But saying it's OK for you to read this post if it happens to come across your feed, but you shouldn't be allowed to find it via a search, is ridiculous. Users get to make the choice with each post whether it's public or not, but they don't get to control how people consume those public posts.
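For anyone wondering what opting out "like the rest of the open web" usually looks like in practice, it is typically a robots.txt check before indexing. Here is a minimal sketch using Python's standard library. The bot name `ExampleFediSearchBot`, the hosts, and the rules are all made up for illustration, and nothing here says any fediverse crawler actually works this way.

```python
from urllib.robotparser import RobotFileParser


def may_index(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: one search bot is shut out entirely,
# every other crawler is only kept out of /private/.
EXAMPLE_ROBOTS = """\
User-agent: ExampleFediSearchBot
Disallow: /

User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for agent in ("ExampleFediSearchBot", "OtherBot"):
        allowed = may_index(EXAMPLE_ROBOTS, agent, "https://video.example/videos/1")
        print(agent, "may index:", allowed)
```

This only covers a whole-site opt-out by the admin. A per-user or per-post opt-out would need some other signal that the crawler agrees to honor.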

[–] [email protected] 24 points 1 year ago* (last edited 1 year ago) (1 children)

Well, please make sure it respects post privacy at least, but also realize that on the microblogging side of the fediverse, they may not take kindly to this prospect at all. People who start these kinds of projects are often harassed or at least receive passive hostility. Making it opt-in instead of opt-out in some capacity is best.

[–] [email protected] 20 points 1 year ago (1 children)

A good search engine would be quite important. One thing that annoyed me back on the site that should not be named was that their search engine was completely useless. It couldn't even find posts when I entered verbatim text from them.

Having a good search engine that can actually find a post I was looking for would be a major plus for the fediverse.

[–] [email protected] 4 points 1 year ago

Yep, the idea is to simulate the type of results you get from Google. People trust Lemmy answers more than spam sites nowadays.

[–] [email protected] 18 points 1 year ago (1 children)

Why wouldn't people want to have a search engine? Without it, the Fediverse stands no chance against the non-free internet. Everything posted here would be much more valuable if it were searchable. Right now, a comment is viewed only until its post gets less popular. Any other site of this kind displays answers that are decades old. Privacy isn't an issue, as everything posted here is available to everyone on the internet.

[–] [email protected] 4 points 1 year ago (1 children)

I mean they are posting on the public internet, they should know that it can be read by anyone. I like the idea of users opting out.

[–] [email protected] 12 points 1 year ago (1 children)

If you give users the choice to opt out, all the privacy-focused communities won't be searchable.
What if someone who opts out posts a comment and someone who opts in answers?
The Fediverse is public, a search engine doesn't show anything that isn't already open for anyone to see.

[–] [email protected] 8 points 1 year ago

Good point. They should know they are making public comments. If you want it private then send a private message.

[–] [email protected] 11 points 1 year ago (1 children)

Is this something you can point YaCy at?

[–] [email protected] 2 points 1 year ago (2 children)

I heard it’s not optimized well but I’ll take a look at it.

[–] [email protected] 11 points 1 year ago (2 children)

You should federate the search engine so that folks can defed from the search as desired.

But then we would need a search engine for all the search engines...

[–] [email protected] 17 points 1 year ago* (last edited 1 year ago)

Everything you post on the Fediverse is public.
If you don't want to show up in an internet search, post your stuff on your private server and only give access to the people you want to invite.

[–] [email protected] 8 points 1 year ago (1 children)

There is already Sepiasearch specifically for Peertube.

[–] [email protected] 5 points 1 year ago

It's limited to only Peertube and it's not the most intuitive. I want to work with them on expanding this.

[–] [email protected] 8 points 1 year ago (1 children)

I love the idea, especially from a technical standpoint!

How big is the fediverse today? How many posts are there? What kind of algorithms are you using to store the results? Do you scan sites and then their connected sites, or do you have a premade list?

More technical information please 😊!

[–] [email protected] 3 points 1 year ago

The fediverse is a few thousand servers running Mastodon, Lemmy, etc. I can't say how many posts there are, but there are a lot.

So on the more technical side, I plan on using a lightweight, fast search engine called Sonic (it's written in Rust). I have already used it in other projects and it can handle billions of messages / posts. But it has a cost: it doesn't have faceted search, for example if you want to exclude certain texts from the results. I think this is a fair trade-off. The other solution would be to use something more mature like ElasticSearch, but it'll be expensive (I'm assuming not much money will be made from this, and I'm talking about donations).
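To illustrate the trade-off: when the engine can't exclude texts itself, one possible workaround is to filter the returned results in application code. This is only a rough sketch under that assumption. The function name `exclude_terms` and the sample titles are made up, and it doesn't touch Sonic's actual API at all.

```python
from typing import Iterable, List


def exclude_terms(results: Iterable[str], excluded: Iterable[str]) -> List[str]:
    """Drop every result that contains any excluded term (case-insensitive)."""
    lowered = [term.lower() for term in excluded if term]
    return [
        result
        for result in results
        if not any(term in result.lower() for term in lowered)
    ]


if __name__ == "__main__":
    # Hypothetical titles, standing in for whatever the engine returned.
    hits = ["PeerTube intro video", "YouTube mirror", "Lemmy tutorial"]
    print(exclude_terms(hits, ["youtube"]))
```

The downside is that filtering happens after the query, so a page of results can come back shorter than requested.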

For scanning sites there are premade lists to start with and it'll be possible to scan new sites from other instances if found. So a bit of both.
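The "bit of both" approach could be sketched roughly like this: start from a seed list, then do a breadth-first walk over whatever other instances each one reports. `fetch_peers` is a hypothetical callback you would have to supply (how a given server exposes its known peers depends on the software), and the hostnames below are made up.

```python
from collections import deque
from typing import Callable, Iterable, List


def discover_instances(
    seeds: Iterable[str],
    fetch_peers: Callable[[str], Iterable[str]],
    max_instances: int = 1000,
) -> List[str]:
    """Breadth-first discovery: visit the seed hosts, then the hosts they know."""
    seen = set()
    queue = deque()
    for host in seeds:
        if host not in seen:
            seen.add(host)
            queue.append(host)

    visited: List[str] = []
    while queue and len(visited) < max_instances:
        host = queue.popleft()
        visited.append(host)
        # fetch_peers is supplied by the caller; it returns hosts known to `host`.
        for peer in fetch_peers(host):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return visited


if __name__ == "__main__":
    # Fake federation graph standing in for real network calls.
    fake_graph = {
        "a.example": ["b.example", "c.example"],
        "b.example": ["a.example", "d.example"],
    }
    print(discover_instances(["a.example"], lambda h: fake_graph.get(h, [])))
```

The `max_instances` cap is there so a crawl can't grow without bound. A real crawler would also want rate limiting and error handling around the network calls.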

[–] [email protected] 4 points 1 year ago (1 children)

I support the bigger picture. Rather than an independent site, wouldn't it be more practical to work with current fediverse app developers for Lemmy, Mastodon, etc. to integrate a search engine within the apps?

[–] [email protected] 2 points 1 year ago

I’m reaching out to see their thoughts. But there are limitations to what they can index.

[–] [email protected] 2 points 1 year ago

The fedi elite will snipe you from a bell tower.
