this post was submitted on 19 Mar 2025
369 points (100.0% liked)


Researchers have found that large language models (LLMs) tend to parrot buggy code when tasked with completing flawed snippets.

That is to say, when shown a snippet of shoddy code and asked to fill in the blanks, AI models are just as likely to repeat the mistake as to fix it.
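For instance (a made-up illustration, not an example from the paper), given a snippet with an off-by-one bug, a model asked to complete it will often carry the bug forward rather than flag it:

// Made-up illustration, not from the paper: a prompt containing an
// off-by-one bug that a model is prone to continuing in kind.
fn sum(xs: &[i32]) -> i32 {
    let mut total = 0;
    // Bug: `0..=xs.len()` runs one index past the end;
    // it should be `0..xs.len()`.
    for i in 0..=xs.len() {
        total += xs[i]; // panics on the final iteration
    }
    total
}
// Asked to write a matching `product` function, a model will often
// reproduce the same `..=` mistake instead of correcting it.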

[–] [email protected] 10 points 3 days ago* (last edited 3 days ago) (1 children)

100%. As a solo dev who used to work corporate, I compare it to having a junior engineer who completes every task instantly. If you give it something well-documented and not too complex, it'll be perfect. If you give it something more complex or newer tech, it might work, but it may have some mistakes or ill-advised shortcuts.

I've also found it pretty good for when a dependency I'm evaluating has shit documentation. Not always correct, but sometimes it'll spit out some APIs I didn't notice.

Edit: Oh, I should also mention, I've found TDD works pretty well with AI. Since I'm building the tests anyway, they often give the AI a good description of what I'm looking for, and save some time.
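Something like this, as a toy sketch (the `slugify` function and its cases are invented for illustration):

// The AI's job: fill this in so the tests below pass.
fn slugify(title: &str) -> String {
    todo!()
}

// Written by hand first; these double as the spec handed to the AI:
// "implement `slugify` so these pass."
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn lowercases_and_hyphenates() {
        assert_eq!(slugify("Hello World"), "hello-world");
    }

    #[test]
    fn drops_punctuation() {
        assert_eq!(slugify("Rust, fearlessly!"), "rust-fearlessly");
    }
}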

[–] [email protected] 4 points 3 days ago (1 children)

I've found it okay for getting a general feel for stuff, but I've also been given insidiously bad code: functions and data structures that look similar enough to the real thing but are deeply wrong or non-existent.

[–] [email protected] 1 points 3 days ago* (last edited 3 days ago)

Mmm, it sounds like you're using it in a very different way than I do; by the time I'm using an LLM, I generally have way more than a general feel for what I'm looking for. People rag on AI for being a "fancy autocomplete", but that's literally what I like to use it for. I'll feed it a detailed spec for what I need, give it a skeleton function with type definitions, and tell the AI to fill it in. It generally fills in basic functions pretty well with that level of definition (YMMV depending on the scope of the function).

This lets me focus more on the code design/structure and validation, while the AI handles a decent amount of grunt work. And if it does a bad job, I would have written the spec and skeleton anyway, so it's more like a bonus if it works. It's also very good at imitation, so it can help avoid double work on similar functionality.

A kind of shortened/naive example of how I use it:

/* Example of another db update function within the app */
/* UnifiedEventUpdate and UnifiedEvent type definitions */

Help me fill in this function

/// Updates event properties, and children:
///   - If `event.updated` is newer than existing, update as normal
///   - If `event.updated` is older than existing, error
///   - If no `event.updated` is provided, assume updated to be now()
/// For updating Content(s):
///   - If `content.id` exists, update the existing content
///   - If `content.id` does not exist, create a new content
///   - If an existing content isn't present in the update, delete it
pub fn update_event(
    conn: &mut Conn,
    event: UnifiedEventUpdate,
) -> Result<UnifiedEvent, Error> {
    // (the AI fills in the body from here)
}
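And for contrast, the rough shape of completion I'd hope to get back. This is a minimal, self-contained sketch where every type, field, and helper is invented to stand in for the real `Conn` and schema; it's not actual model output:

// Hypothetical, in-memory stand-ins for the real Conn and schema.
use std::collections::HashMap;
use std::time::SystemTime;

#[derive(Debug)]
pub enum Error {
    NotFound,
    StaleUpdate,
}

#[derive(Clone)]
pub struct Content {
    pub id: u64,
    pub body: String,
}

pub struct ContentUpdate {
    pub id: Option<u64>, // None = create a new content
    pub body: String,
}

#[derive(Clone)]
pub struct UnifiedEvent {
    pub id: u64,
    pub updated: SystemTime,
    pub contents: Vec<Content>,
}

pub struct UnifiedEventUpdate {
    pub id: u64,
    pub updated: Option<SystemTime>, // None = assume now()
    pub contents: Vec<ContentUpdate>,
}

pub struct Conn {
    pub events: HashMap<u64, UnifiedEvent>,
}

pub fn update_event(
    conn: &mut Conn,
    event: UnifiedEventUpdate,
) -> Result<UnifiedEvent, Error> {
    let existing = conn.events.get_mut(&event.id).ok_or(Error::NotFound)?;

    // No `updated` provided: assume now(); older than existing: error.
    let updated = event.updated.unwrap_or_else(SystemTime::now);
    if updated < existing.updated {
        return Err(Error::StaleUpdate);
    }
    existing.updated = updated;

    // Upsert incoming contents; anything not re-sent gets deleted.
    let mut next_id = existing.contents.iter().map(|c| c.id).max().unwrap_or(0) + 1;
    let mut merged = Vec::new();
    for c in event.contents {
        let id = c.id.unwrap_or_else(|| {
            // No id = create: assign a fresh one.
            let id = next_id;
            next_id += 1;
            id
        });
        merged.push(Content { id, body: c.body });
    }
    existing.contents = merged;

    Ok(existing.clone())
}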