this post was submitted on 02 May 2025
This can be the new slogan of our development. :')
I have convinced management to switch to a modern server. In addition, we hope that refactoring our approach (no random reads, no dedupe processes for a whole table, etc.) will lead us somewhere.
Actually now. We are adding a product-processing layer to a system that is already in production and already handles multiple millions of products on a daily basis. Since we have to process not only the new/updated products but also catch up on the historical (older) ones, it's a massive amount of products. Because the order is not important, we thought we could use a random approach to catch up. But I see now that this is a major bottleneck in our design.
So no. No narrowing.
Also no, IMO. Since we don't want a product to be processed twice, we want to ensure deduplication, which requires knowledge of all already-processed products. Hence the comparison against the whole table every time.
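For illustration only, the "no random reads" refactor could look roughly like the sketch below: walk the table in primary-key order and remember the last ID handled. Everything here is an assumption made for the example, not our actual setup: the `products` table, its integer `id` key, the `process` callback, and `sqlite3` standing in for the real database. Under those assumptions, "already processed" during the catch-up reduces to `id <= checkpoint`, so no whole-table comparison is needed for the historical pass.

```python
import sqlite3


def backfill_in_order(conn, process, start_after=0, batch_size=1000):
    """Process rows in ascending id order; return the last id handled.

    Hypothetical helper: `products(id, name)` and `process` are assumptions.
    """
    checkpoint = start_after
    while True:
        # Keyset pagination: sequential reads, each batch resumes after the checkpoint.
        rows = conn.execute(
            "SELECT id, name FROM products WHERE id > ? ORDER BY id LIMIT ?",
            (checkpoint, batch_size),
        ).fetchall()
        if not rows:
            return checkpoint
        for product_id, name in rows:
            process(product_id, name)
        # Persisting this value would let an interrupted run pick up where it stopped.
        checkpoint = rows[-1][0]


# Small in-memory demo with ten made-up products.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (id, name) VALUES (?, ?)",
    [(i, f"product-{i}") for i in range(1, 11)],
)

seen = []
last_id = backfill_in_order(conn, lambda pid, name: seen.append(pid), batch_size=3)

# Resuming from the checkpoint touches nothing a second time.
seen_on_resume = []
resumed_id = backfill_in_order(
    conn, lambda pid, name: seen_on_resume.append(pid), start_after=last_id
)
```

New or updated products arriving during the catch-up would still need their own path; this only covers the historical pass.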
Sorry for taking so long to get back to you on this, but I'm not always on Lemmy. There's always more code to be written - you know how it is, I'm sure.
Given the constraints you outline, one other avenue of attack could be to consider the time-sensitivity of product updates and the relative priority thereof. If it's acceptable for updates to products to lag somewhat, you can at least perform them at a lower rate over a longer time, thus reducing hardware load at any given moment. If the periodic updates are made to the same per-product values, you could even potentially get smart and replace queued updates that have not yet been performed when they're superseded by a subsequent change before they're actually committed, thus further reducing load.
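A rough sketch of that "replace queued updates" idea, in Python purely for illustration since I don't know your stack. The class name, the `apply` callback, and the `max_per_second` throttle are all made up for the example, and it assumes a later update fully supersedes an earlier one for the same product.

```python
import time
from collections import OrderedDict


class CoalescingUpdateQueue:
    """Holds at most one pending update per product (hypothetical helper)."""

    def __init__(self):
        self._pending = OrderedDict()

    def put(self, product_id, update):
        # A newer update for the same product overwrites the queued one.
        # The product keeps its original place in line.
        self._pending[product_id] = update

    def __len__(self):
        return len(self._pending)

    def drain(self, apply, max_per_second=None):
        """Apply pending updates oldest-first, optionally throttled."""
        delay = 1.0 / max_per_second if max_per_second else 0.0
        applied = 0
        while self._pending:
            product_id, update = self._pending.popitem(last=False)
            apply(product_id, update)
            applied += 1
            if delay:
                time.sleep(delay)  # spread the load out over time
        return applied


queue = CoalescingUpdateQueue()
queue.put(1, {"price": 10})
queue.put(2, {"price": 5})
queue.put(1, {"price": 12})  # supersedes the first update for product 1
pending_before = len(queue)

committed = []
applied_count = queue.drain(lambda pid, upd: committed.append((pid, upd)))
```

Three updates came in but only two writes happen, and product 1 is written once with its latest value. If your updates are partial (different fields each time), you'd want to merge them rather than overwrite.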