this post was submitted on 07 Jul 2024
387 points (100.0% liked)

Sardonic Grin (mander.xyz)
submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]
 
[–] [email protected] 40 points 1 year ago (4 children)

It's a fundamental problem with the tech in general. It inherently has no concept of "I don't know" and will just be confident, specific, and wrong.

[–] [email protected] 23 points 1 year ago (3 children)

That's blatantly untrue. My plant ID app gives multiple suggestions with certainty percentages.

[–] [email protected] 6 points 1 year ago (2 children)
[–] [email protected] 6 points 1 year ago

inaturalist does this, and also lets other people suggest an ID so you can get a consensus.

[–] [email protected] 2 points 11 months ago
[–] [email protected] 4 points 1 year ago

My app does this too!

Feeling like half these commenters hating on this feature use one bad program and think the whole concept is bad.

[–] [email protected] 3 points 1 year ago* (last edited 1 year ago) (1 children)

Probably because your app has an actual database of plants to compare with instead of feeding it into an AI.

Edit: The term AI is getting to be a little useless these days. What I meant to say was that it's not using image recognition as implemented by a multi-modal LLM. It's using the more traditional machine learning algorithms that came before "Attention Is All You Need".

[–] [email protected] 8 points 1 year ago (1 children)

Why do you think so, and how do you think the plants are compared without AI?

Image classification/object detection AI (usually) gives you a confidence value for every result. It's a natural consequence of their architecture.

[–] [email protected] 2 points 1 year ago (2 children)

Weren't image recognition algorithms like the first types of AI that got good enough to be useful?

[–] [email protected] 3 points 1 year ago

No. AI and, what you're more likely to be referring to, machine learning have had applications for decades. Basic techniques were in use as far back as the '60s, mostly for quick things, and 1D data analysis was useful long before images (voice and stuff like biometrics). But there are many more types of AI. Bayesian networks (still in the learned category) were huge breakthroughs and still see a lot of use today. Decision trees, Markov chains, and first order logic are the most common video game AI and usually rely on expert tuning rather than learned results.

AI is a huge field that's been around longer than you'd expect, and it permeates a lot of tech. Image stuff is just the hot application since it's deep learning based stuff that started around 2009 with a bunch of papers that helped get actual beneficial learning in deeper models (I always thought it started roughly with Deep Boltzmann Machines, but there's a lot of work in that era that chipped away at the problem). The real revolution was general purpose GPU programming getting to a state where these breakthroughs weren't just theoretical.

Before that, we already used a lot of computer vision, and other techniques, learned and unlearned, for a lot of applications. Most of them would probably bore you, but there are a lot of safety critical anomaly detectors.

[–] [email protected] 2 points 1 year ago

It really depends on what you call AI, but just to put things into context: XKCD 1425 was released in 2014. Compare that to the timeline of AI on Wikipedia.

[–] [email protected] 10 points 1 year ago (1 children)

This is blatantly false. Classification tasks like this all have a level of certainty for each possible category - it's just up to the person writing the software to interpret those levels of certainty in a way that's useful to the user, whether that's saying "I don't know" when the certainties are too spread out, or providing a list of options like other people in this thread have said their apps do. The problem is that "100% certainty" comes off well with the general public, so there's a financial incentive to make the system seem more certain than it is by using a layer (from memory it's called Softmax?) that will return only the category with the highest degree of certainty.
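To make the "interpret the certainties" part concrete, here's a minimal sketch of the two options mentioned above: say "I don't know" when the top score is too low, and hand back a ranked list either way. The function name `suggest`, the 0.6 cutoff and the example scores are all made up for illustration, not taken from any real app.

```python
def suggest(confidences, min_top=0.6, k=3):
    """Turn per-class scores into an answer plus a short ranked list.

    confidences: dict mapping label -> score (hypothetical classifier output).
    min_top:     arbitrary cutoff below which we refuse to name a single class.
    k:           how many ranked suggestions to return.
    """
    # Sort labels from most to least confident.
    ranked = sorted(confidences.items(), key=lambda item: item[1], reverse=True)
    top_label, top_score = ranked[0]
    # Abstain when even the best guess is weak.
    answer = top_label if top_score >= min_top else "I don't know"
    return answer, ranked[:k]


# Scores are concentrated on one class: name it.
print(suggest({"rose": 0.90, "tulip": 0.06, "daisy": 0.04}))
# Scores are spread out: abstain, but still show the options.
print(suggest({"rose": 0.40, "tulip": 0.35, "daisy": 0.25}))
```

Where to put the cutoff is a product decision, and it only helps as far as the scores themselves can be trusted.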

[–] [email protected] 5 points 1 year ago (1 children)

The key issue here is that 'level of certainty' doesn't really mean what you would like it to.

You get back a number, yes, but it can change according to what's visible in the background, the angle the plant's at, how close it is to the camera, and how nice the camera you're using is (professional photographers use expensive cameras and take shots of different things to everyone else).

Interpreting this score as "how safe is it to eat the plant" is a really bad idea. You will still eat the wrong plant. These scores can lead to very confident random guessing when you show it a plant it's never seen before.

And no, softmax is a trick for making the scores all sum to one, so you get back a confidence for every possible thing the image could be of.
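For reference, a minimal softmax in plain Python. The input numbers are arbitrary raw scores (logits) picked for the example; the point is only that every class gets a score and the scores sum to one.

```python
import math


def softmax(logits):
    """Map raw scores to positive values that sum to one."""
    # Subtracting the max is a standard trick to avoid overflow in exp().
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


scores = softmax([2.0, 1.0, 0.1])  # arbitrary example logits
print(scores)       # one score per class, largest logit gets the largest score
print(sum(scores))  # sums to one, up to floating point rounding
```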

[–] [email protected] 1 points 1 year ago (1 children)

I feel like you're viewing this from the wrong angle, or at the very least we're viewing it from different angles. You seem to be doing a binary classification (is this plant edible?) rather than a group classification (what plant is this?) where edibility is an attribute of the plant to be returned to the user (yes; no; when green; only the roots; etc.). The latter is the approach most of these apps take: classify the image into a species (or list of potential species), then give the user details such as identifying features, common growing areas, edibility, and lookalikes. You're right about softmax, it's been a couple of years since I've done the programming side of this so my terminology is a bit rusty.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

I'm not describing binary classification, I'm describing multiclass. "Group classification" isn't really a thing. Yes, your ML system probably guesses what kind of plant it is and then looks up the edibility of components.

The problem with this is how they will handle rare plants that aren't in the dataset, or that are in the dataset but with insufficient data to be recognised.

Because multiclass assumes that it's seen representative data on all possible outputs (e.g. plant types) it will tend to be dangerously confident on plant types it hasn't seen before.

This is because it can rule out other classes. E.g. if you're trying to classify as rose, tulip, or daisy and you get a bramble, your classifier is likely to be very certain it's a rose because tulips and daisies don't have thorns. So your softmax score is likely to show heavy confidence in rose even though it's actually none of them.

This is exactly what can go wrong when you try to use the softmax/standard multiclass approach and come across an interesting rare mushroom or wild carrot. You don't want it to guess which type of plant in the database it's most like, even if this guess comes with scores, you want it to say that it genuinely doesn't know and you shouldn't eat it.
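The rose/tulip/daisy case can be sketched with a toy classifier. Everything here is hand-picked for illustration: the three features, the weights and the bramble's feature vector are assumptions, not a trained model. It just shows how "has thorns" alone can push softmax to near-total confidence in rose when the classifier has no way to answer "none of these".

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical features: [has_thorns, cup_shaped_flower, yellow_centre]
# Hand-set weights: thorns count for rose and against the other two.
WEIGHTS = {
    "rose":  [4.0, 0.0, 0.0],
    "tulip": [-4.0, 4.0, 0.0],
    "daisy": [-4.0, 0.0, 4.0],
}


def classify(features):
    """Return {label: softmax score} for a linear toy classifier."""
    labels = list(WEIGHTS)
    logits = [sum(w * f for w, f in zip(WEIGHTS[label], features)) for label in labels]
    return dict(zip(labels, softmax(logits)))


# A bramble: thorny, but neither cup-shaped nor yellow-centred.
bramble = [1.0, 0.0, 0.0]
scores = classify(bramble)
print(scores)  # rose gets almost all of the probability mass
```

A minimum-score cutoff doesn't rescue this case, because the top score is already very high.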

[–] [email protected] 7 points 1 year ago

uhhh do you have any clue how it actually works? i mean maybe there's some sort of visual AI tech that doesn't let you make it say "idk fam" but the standard stuff just gives a point value to each result, and you could just.. have a minimum limit..

and like i'm pretty certain the current chatbots available generally are capable of responding that they don't know, they're certainly capable of "recognizing" when it's a topic they're not allowed to talk about.

[–] [email protected] 7 points 1 year ago (1 children)

This actually is a symptom of the sort of "beneficial" overfit in Deep Learning. As someone whose research is in low data, long tails, and few shot learning, there are a few things that smaller networks did better in generalization, and one thing they particularly did better (without explicit training for it) is gauging uncertainty. How well those confidence scores match reality is usually referred to as calibration. Calibrating deep networks can yield decent probabilities that can be used to show uncertainty.
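As one example of what calibrating can look like, here's a sketch of temperature scaling, which is a common post-hoc method but my pick for illustration rather than something named above. You divide the logits by a temperature T chosen to minimise negative log-likelihood on held-out data. The held-out set and the grid of candidate temperatures below are made up.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (T > 1 softens the scores)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nll(examples, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    losses = [-math.log(softmax(logits, temperature)[label]) for logits, label in examples]
    return sum(losses) / len(losses)


def fit_temperature(examples, candidates):
    """Pick the candidate temperature with the lowest held-out NLL (simple grid search)."""
    return min(candidates, key=lambda t: nll(examples, t))


# Made-up held-out set of (logits, true_label): always very sure of class 0,
# but wrong one time in three, i.e. overconfident.
held_out = [
    ([8.0, 0.0, 0.0], 0),
    ([8.0, 0.0, 0.0], 0),
    ([8.0, 0.0, 0.0], 1),
]
candidates = [0.5 + 0.25 * i for i in range(39)]  # 0.5, 0.75, ..., 10.0

best_t = fit_temperature(held_out, candidates)
print(best_t)
print(softmax([8.0, 0.0, 0.0]))          # uncalibrated scores
print(softmax([8.0, 0.0, 0.0], best_t))  # softened scores after scaling
```

This only rescales confidence on the kind of data it was fitted on; it doesn't by itself give the network a way to flag something it has never seen.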

There are other tricks for this. My favorite strategies prep the network for learning new things. Large margin training and the like are a good thing to look into. Having space in the output semantic space (the layer immediately before the output or earlier for encoder decoder style networks) allows for larger regions for distinct unknown values to be separated from the known ones, which helps inherently calibrate the network.