Science Memes

Welcome to c/science_memes @ Mander.xyz!

A place for majestic STEMLORD peacocking, as well as memes about the realities of working in a lab.

Rules

Don't throw mud. Behave like an intellectual and remember the human.
Keep it rooted (on topic).
No spam.
Infographics welcome, get schooled.

This is a science community. We use the Dawkins definition of meme.


Breast Cancer (mander.xyz)

submitted 11 months ago by [email protected] to c/[email protected]


[+] [email protected] 67 points 11 months ago (2 children)

[removed by mod]

[–] [email protected] 22 points 11 months ago (1 children)

That's actually really smart. But that info wasn't given to doctors examining the scan, so it's not a fair comparison. It's a valid diagnostic technique to focus on the particular problems in the local area.

"When you hear hoofbeats, think horses not zebras" (outside of Africa)

[–] [email protected] 8 points 11 months ago (2 children)

AI is weird. It may not have been given the information explicitly. Instead, it could be picking up an artifact in the scan itself caused by the different equipment. For example, if one scanner produced lower-resolution scans than the others and you resized all of the scans to match the lowest resolution, the AI might be picking up on the resizing artifacts, which are absent from the scans that were low resolution to begin with.
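A toy sketch of that resizing effect, under loud assumptions: the "scans" here are hypothetical 1-D lists of pure noise, and `noisy_scan` and `downsample_by_two` are made-up helpers, not anything from a real imaging pipeline. Averaging pixels during a resize smooths the noise, so a resized scan carries a statistical fingerprint that a natively low-resolution scan lacks.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def noisy_scan(n_pixels, noise_sd=1.0):
    """Hypothetical 1-D 'scan': a flat signal plus independent per-pixel noise."""
    return [random.gauss(0.0, noise_sd) for _ in range(n_pixels)]

def downsample_by_two(pixels):
    """Resize to half the length by averaging each adjacent pair of pixels."""
    return [(pixels[i] + pixels[i + 1]) / 2 for i in range(0, len(pixels) - 1, 2)]

# Scanner A is natively 5000 px; scanner B is 10000 px and gets resized to match.
native_low_res = noisy_scan(5000)
resized_high_res = downsample_by_two(noisy_scan(10000))

sd_native = statistics.pstdev(native_low_res)
sd_resized = statistics.pstdev(resized_high_res)
print(f"noise SD, native low-res scan: {sd_native:.2f}")
print(f"noise SD, resized scan:        {sd_resized:.2f}")
```

Both lists end up the same length, but the resized one is visibly smoother, so a model could learn to tell the scanners apart without ever being told which machine was used.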

[–] [email protected] 4 points 11 months ago

I'm saying that info is readily available to doctors in real life. They are literally in the hospital and know what the socioeconomic background of the patient is. In real life they would be able to guess the same.

[–] [email protected] 3 points 11 months ago (1 children)

That is quite a statement that it still had a better detection rate than doctors.

What is more important: saving lives or not offending people?

[+] [email protected] 3 points 11 months ago (1 children)

[removed by mod]

[–] [email protected] 4 points 11 months ago (2 children)

Citation needed.

Usually detection rates are given on a new set of samples. On the samples they used for training, the detection rate would be 100% by definition.

[–] [email protected] 4 points 11 months ago* (last edited 11 months ago) (1 children)

Right, there's typically separate "training" and "validation" sets for a model to train, validate, and iterate on, and then a totally separate "test" dataset that measures how effective the model is on similar data that it wasn't trained on.

If the model gets good results on the validation dataset but worse results on the test dataset, that typically means it's "overfit". Essentially, the model started memorizing frivolous details specific to the validation set. Those details improve evaluation results on that specific dataset, but they do nothing for, or even hurt, results on the test set and other datasets that weren't part of training. Basically, the model failed to abstract what it's supposed to detect, only managing good results in validation through brute memorization.
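A minimal sketch of why scores on already-seen data say little, assuming an extreme case: the hypothetical `Memorizer` below is a pure lookup table, and the labels are coin flips, so there is nothing real to learn. It is an illustration of brute memorization, not a model anyone in this thread described.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical data: (sample_id, label) pairs with coin-flip labels.
samples = [(i, random.randint(0, 1)) for i in range(3000)]
random.shuffle(samples)
train, validation, test = samples[:2000], samples[2000:2500], samples[2500:]

class Memorizer:
    """A 'model' that only stores the labels it saw during training."""

    def fit(self, data):
        self.seen = dict(data)  # sample_id -> label
        return self

    def predict(self, sample_id):
        return self.seen.get(sample_id, 0)  # unseen sample: always guess 0

def accuracy(model, data):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(model.predict(x) == y for x, y in data) / len(data)

model = Memorizer().fit(train)
train_acc = accuracy(model, train)
val_acc = accuracy(model, validation)
test_acc = accuracy(model, test)
print(f"train: {train_acc:.2f}  validation: {val_acc:.2f}  test: {test_acc:.2f}")
```

The memorizer is perfect on the data it was fit on and roughly a coin flip on the held-out splits, which is why reported detection rates need to come from data the model has never seen.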

I'm not sure that's quite what's happening in maven's description, though. If it's real, my initial guess is an unrepresentative dataset plus a failure to reach high accuracy to begin with. I buy that there's a correlation between machine specs and positive cases, but I'm sure it's not a perfect correlation. Like maven said, old areas get new machines sometimes. If the model's accuracy was never high to begin with, that correlation may just be the model's best guess. Even though I'm sure it would always take machine specs into account as long as they're part of the dataset, if actual symptoms correlate more strongly with positive diagnoses than machine specs do, then I'd expect the model to evaluate primarily on symptoms, and thus be more accurate. Sorry, this got longer than I wanted.
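The "whichever signal correlates more strongly wins" idea can be sketched with a deliberately crude one-feature model. Everything here is an assumption for illustration: the agreement rates (65%, 80%, 90%), the feature names, and the helpers `make_cases` and `pick_feature` are invented, and a real classifier would weigh features together rather than pick one.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

def make_cases(n, symptom_agreement, machine_agreement):
    """Hypothetical cases: each binary feature matches the true label with the given probability."""
    cases = []
    for _ in range(n):
        label = random.randint(0, 1)
        symptom = label if random.random() < symptom_agreement else 1 - label
        old_machine = label if random.random() < machine_agreement else 1 - label
        cases.append({"symptom": symptom, "old_machine": old_machine, "label": label})
    return cases

def agreement(cases, feature):
    """Accuracy obtained by predicting the label directly from one feature."""
    return sum(c[feature] == c["label"] for c in cases) / len(cases)

def pick_feature(train):
    """One-feature 'model': use whichever feature agrees with the label most often in training."""
    return max(["symptom", "old_machine"], key=lambda f: agreement(train, f))

results = {}
for symptom_agreement in (0.65, 0.90):
    # Machine type tracks the diagnosis 80% of the time in training,
    # but is pure chance (50%) at a new site with mixed equipment.
    train = make_cases(5000, symptom_agreement, machine_agreement=0.80)
    new_site = make_cases(5000, symptom_agreement, machine_agreement=0.50)
    chosen = pick_feature(train)
    results[symptom_agreement] = (chosen, agreement(train, chosen), agreement(new_site, chosen))
    print(symptom_agreement, results[symptom_agreement])
```

When symptoms are the weaker signal, the model latches onto the machine and falls to chance at the new site. When symptoms are the stronger signal, it uses them and its accuracy carries over.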

[–] [email protected] 2 points 11 months ago

It's no problem to have a longer description if you want to get nuance. I think that's a good description and fair assumptions. Reality is rarely as black and white as reddit/lemmy wants it to be.