this post was submitted on 20 Mar 2025

Technology

[–] [email protected] 6 points 19 hours ago (1 children)

Last I checked, Cloudflare requires the user to have JavaScript and cookies enabled. My institution doesn't want to require those, because doing so would likely block legitimate users as well as bots.

[–] [email protected] 1 points 18 hours ago (1 children)

Huh? I can reach my site via curl, which has neither. How did you come up with this random set of requirements?

[–] [email protected] 0 points 17 hours ago (1 children)

Odd. I just tried

curl https://www.scrapingcourse.com/cloudflare-challenge

and got

Enable JavaScript and cookies to continue

I'm clearly not on the same setup as you are, but my off-the-cuff guess is that your curl command was issued from a system that Cloudflare already recognized (IP whitelist, cookies, I dunno).
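
For anyone who wants to script the same check rather than eyeball curl output, here is a rough sketch in Python. It only classifies a response you have already fetched. The body marker is the exact text I got back above. The `cf-mitigated: challenge` header check is an assumption based on my reading of Cloudflare's docs, and the function name `looks_like_challenge` is made up for illustration.

```python
from typing import Mapping

# Text returned by the curl test above when the challenge page was served.
CHALLENGE_MARKER = "Enable JavaScript and cookies to continue"


def looks_like_challenge(headers: Mapping[str, str], body: str) -> bool:
    """Heuristically decide whether a response is a challenge page.

    This is a guess-level check, not an official detection method.
    """
    # Header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value.strip().lower() for name, value in headers.items()}

    # Assumption: Cloudflare marks challenged responses with this header.
    if lowered.get("cf-mitigated") == "challenge":
        return True

    # Fall back to the marker text seen in the curl output.
    return CHALLENGE_MARKER in body
```

Feed it the headers and body from whatever HTTP client you use. A plain page served through Cloudflare (no marker, no challenge header) comes back False, which would match what the parent commenter sees on their own site.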

Anyways, I'm reading through this blog post on using cURL with Cloudflare-protected sites and I'm finding it interesting.

[–] [email protected] 1 points 15 hours ago

Of course their challenge requires those things. How else could they implement it? Most users will never be presented with a challenge, though, and it is trivial to disable if you never want to challenge anyone. I was just saying CF blocks ML crawlers.