theterrasque

joined 2 years ago
[–] [email protected] 1 points 1 week ago

Well, it wasn't a comment on the quality of the model, just that the context limitation has already been largely overcome by one company, and others will probably follow (and improve on it further) over time. Especially as "AI Coding" gets more marketable.

That said, was this the new Gemini 2.5 Pro you tried, or the old one? I haven't tried the new model myself, but I've heard good things about it.

[–] [email protected] 2 points 1 week ago

Yeah, I've been seeing the same. Purely economically, hiring junior developers doesn't make sense any more. AI is faster, cheaper and usually writes better code too.

The problem is that you need junior developers working and getting experience, otherwise you won't get senior developers. I really wonder what development as a profession will look like in 10 years.

[–] [email protected] 0 points 1 week ago (2 children)

Working on a big codebase, I don't even get the idea to ask an AI, you just can't feed enough context to the AI that it's really able to generate meaningful code...

That's not a hard limit. For example, Google's models can handle a 2-million-token context window.

https://ai.google.dev/gemini-api/docs/long-context
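To get a feel for whether a given codebase would fit in a window that size, here is a rough sketch. It assumes about 4 characters per token, which is only a common rule of thumb and not the tokenizer any particular model uses; the function names and the extension list are made up for illustration.

```python
import os

# Assumption: ~4 characters per token is a rough average, not a real tokenizer.
CHARS_PER_TOKEN = 4
# The 2-million-token figure mentioned above.
CONTEXT_WINDOW = 2_000_000


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def codebase_tokens(root: str, extensions=(".py", ".js", ".ts", ".go", ".rs")) -> int:
    """Sum the estimated tokens of all source files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                # Ignore undecodable bytes so odd files don't abort the scan.
                with open(path, encoding="utf-8", errors="ignore") as f:
                    total += estimate_tokens(f.read())
    return total


def fits_in_context(token_count: int, window: int = CONTEXT_WINDOW) -> bool:
    """True if the estimate is within the context window."""
    return token_count <= window
```

Calling `fits_in_context(codebase_tokens("."))` from a repo root gives a ballpark answer. For real numbers you'd want the provider's own token counting instead of the character heuristic.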

[–] [email protected] 3 points 2 weeks ago

Well, anything else just wouldn't be Christian, you know. I'd hate to have to report you..

[–] [email protected] 5 points 2 weeks ago

In the wise words of Londo Mollari

Only an idiot fights a war on two fronts. Only the heir to the throne of the kingdom of idiots would fight a war on twelve fronts.

[–] [email protected] 4 points 3 weeks ago (1 children)

Since I already use ZFS for my data storage, I just created a private dataset for sensitive data. I also have my services split based on whether they're sensitive or not, so the non-sensitive stuff comes up automatically and the sensitive stuff waits for me to log in and unlock the dataset.
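A minimal sketch of what that unlock step could look like, assuming the private dataset uses ZFS native encryption and the sensitive services are systemd units. The dataset and unit names in the usage note are hypothetical placeholders, not my actual setup.

```python
import subprocess


def unlock_commands(dataset: str, services: list[str]) -> list[list[str]]:
    """Build the commands to unlock a dataset and start dependent services."""
    cmds = [
        # Load the encryption key (prompts for a passphrase if keylocation=prompt).
        ["zfs", "load-key", dataset],
        # Mount the now-unlocked dataset.
        ["zfs", "mount", dataset],
    ]
    # Start the sensitive services only after the data is available.
    cmds += [["systemctl", "start", unit] for unit in services]
    return cmds


def run_unlock(dataset: str, services: list[str]) -> None:
    """Run the commands in order, stopping at the first failure."""
    for cmd in unlock_commands(dataset, services):
        subprocess.run(cmd, check=True)
```

Something like `run_unlock("tank/private", ["vaultwarden.service"])` would be run by hand after logging in. The non-sensitive services just stay enabled as normal so they start on boot.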

[–] [email protected] 8 points 3 weeks ago

Don't let the man get you down

[–] [email protected] 1 points 3 weeks ago* (last edited 3 weeks ago)

I'm sorry, but what about it is ill-informed or opinion? The fact is, it can do things no other image generator can do, open source or not. It can also effortlessly do things that would require a lot of tinkering with ControlNet in ComfyUI, or even making custom LoRAs. It's a multimodal model that can do both image and text as input and output, and does it well. All other useful image generators are diffusion based, which don't read a prompt in the same way, and are more about weighting patterns based on keywords than any real understanding of the prompt. That's why they struggle with relatively simple things like "a full glass of wine" or "a horse riding an astronaut on the moon". If I'm wrong about this, please prove me wrong. Nothing would make me happier than finding an open source model that can do what OpenAI's new image model can do, really. I already run llama.cpp servers and ComfyUI locally, and I have my own AI server in the basement with a P40 and a 3090. Please, please prove me wrong here.

I love open models, and have been running them locally since the first Llama model, but that doesn't mean I willfully ignore what Claude, OpenAI and Google develop and pretend it doesn't exist. Rather, I want awareness that it does exist, and I want an open source version of it.

[–] [email protected] 1 points 3 weeks ago (2 children)

Ah yes, I forgot we live in a post-truth society where reality doesn't matter and only your feelings are important. And since your feelings say AI bad, proprietary bad, and reddit bad, you don't have to actually think or take reality into consideration.

[–] [email protected] 1 points 3 weeks ago (4 children)

I know them, and have used them a bit. I even mentioned them in an earlier comment. The capabilities of OpenAI's new model are on a different level in my experience.

https://www.reddit.com/r/StableDiffusion/comments/1jlj8me/4o_vs_flux/ - read the comments there. That's a community dedicated to running local diffusion models. They're familiar with all the tricks. They're pretty damn impressed too.

I can't help but feel that people here either haven't tried the new OpenAI image model, or have never actually used any of the existing AI image generators before.

[–] [email protected] 2 points 3 weeks ago* (last edited 3 weeks ago) (6 children)

No other model on the market can do anything like that. The closest is diffusion based, where you could train a LoRA on a person's look or a specific piece of clothing, then generate multiple times and / or use ControlNet to sorta control the output. That quickly becomes hours or days of work, plus it's quite technical to set up and use.

OpenAI's new model is a paradigm shift in both what the model can do and how you use it, and it can easily and effortlessly produce things that were extremely difficult or impossible without complicated procedures and post processing in Photoshop.

Edit: Some examples. Try to make any of these in any of the existing image generators.

[–] [email protected] 4 points 3 weeks ago* (last edited 3 weeks ago) (10 children)
  • Autoregressive model
  • Multimodal with the LLM
  • Can keep consistency between images
  • Much better at text rendering
  • Can combine images, like you have one image and you upload a picture of a jacket and say "put this on him" and it does it
  • Can upload a picture of yourself and say "put me on the beach", and if you don't like it you can tell it to do a different type of beach, and then say "and put me on a white horse and give me some nice beach wear", for example.

It understands what you're telling it, and can generate images from vague descriptions, combine things from different images just by telling it, modify it and understand the context - like knowing that "me" is the person in the image, for example.

Edit: From OpenAI - "4o image generation is an autoregressive model natively embedded within ChatGPT"
