This has happened to me personally. I have a very small git server sitting on a very small instance. I noticed my traffic was spiking for the past few days and thought nothing of it.
When I looked at the logs, it was 6 IP addresses: 1 was Amazon's LLM search and the other 5 were OpenAI. I added them to fail2ban and thought nothing of it. About 10 minutes later, OpenAI had a multitude of new IPs hitting the server again, specifically the git repos. I looked them up again to confirm that, yes, it totally was OpenAI.
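For anyone wanting to do the same kind of log check, here is a rough Python sketch of counting requests per client IP. It assumes the IP is the first whitespace-separated field on each line (as in the common/combined access log format). The function name `top_clients` and the sample lines are made up for illustration and are not from my actual logs.

```python
from collections import Counter

def top_clients(log_lines, n=5):
    """Return the n busiest client IPs as (ip, request_count) pairs.

    Assumes the client IP is the first whitespace-separated field.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)

# Made-up sample lines using documentation-only IP ranges.
sample = [
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET /repo.git/info/refs HTTP/1.1" 200 512',
    '203.0.113.7 - - [01/Jan/2025:00:00:02 +0000] "GET /repo.git/HEAD HTTP/1.1" 200 23',
    '198.51.100.4 - - [01/Jan/2025:00:00:03 +0000] "GET / HTTP/1.1" 200 1024',
]

for ip, hits in top_clients(sample):
    print(ip, hits)
```

On a real server you would pass an open log file instead of `sample`, then look up who owns the top addresses before banning anything.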
I had to create an AI blackhole with Python and create new rules just to stop the LLM madness. It worked, but I'm tempted to put a very small CAPTCHA, like "give me 2+2", on a form you have to fill in to see my code.
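To give an idea of what that tiny CAPTCHA could look like, here is a minimal sketch in Python. It is not what I actually run. The names `make_challenge` and `check_answer` are hypothetical, and it assumes the form sends the nonce and signature back as hidden fields along with the answer. It is only a speed bump: a solved challenge could be replayed unless the server also remembers which nonces were already used.

```python
import hashlib
import hmac
import secrets

# Per-process secret; a real deployment would need to persist this (assumption).
SECRET = secrets.token_bytes(32)

def _sign(nonce, answer):
    """HMAC over the nonce and the expected answer."""
    message = f"{nonce}:{answer}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()

def make_challenge():
    """Return (question, nonce, signature) for a small addition problem."""
    a = secrets.randbelow(9) + 1  # 1..9
    b = secrets.randbelow(9) + 1  # 1..9
    nonce = secrets.token_hex(8)
    return f"What is {a} + {b}?", nonce, _sign(nonce, str(a + b))

def check_answer(nonce, answer, signature):
    """True if the submitted answer matches the signed challenge."""
    return hmac.compare_digest(_sign(nonce, answer.strip()), signature)

question, nonce, signature = make_challenge()
print(question)
```

The server never has to store the expected answer, since it is baked into the signature that comes back with the form.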
Worst part: I had a robots.txt that totally blocked indexing of my sites. But they just didn't care.
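For reference, this is roughly how a well-behaved crawler would read a block-everything robots.txt, sketched with Python's standard `urllib.robotparser`. The file contents, the `ExampleBot` user agent, the URL, and the helper name `is_allowed` are all illustrative assumptions, not my real setup.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

def is_allowed(robots_txt, user_agent, url):
    """Check whether robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.org/repo.git"))
```

The catch is that robots.txt is only a request. Nothing on the server side enforces it, so a crawler that ignores it has to be stopped some other way.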
It's making the open web hard to keep up for small entities, that's for sure.