Uplifting News
With a CPU or even a GPU, there are a bunch of inefficiencies in every task, because they're designed to be able to do pretty much anything. Your H.265 media decoder isn't going to be doing much when you're keeping a running sum of the number of a certain type of bond in a list of chemicals.
With ASICs and, to a lesser extent, FPGAs, you can make it so that nearly every transistor is doing useful work at every moment, which makes them wildly efficient for a single repetitive task, such as running statistical analysis on a huge dataset. This is because, rather than being limited by the parallelism the CPU or GPU offers, you can design the "program" with as much parallelism as the task itself allows. That means if you stream one input in per clock cycle, then after a fixed delay you get one result out per clock cycle, including your update function so long as it's simple enough (e.g. a moving average, a running sum, or even just writing to memory).
This is one specific application of FPGAs (static streaming), but it's the one that's relevant here.
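To make the "one in per clock, one out per clock after a delay" idea concrete, here is a rough software sketch. It is only a cycle-by-cycle toy model in Python, not how you would actually program an FPGA. The function name `simulate_pipeline`, the `depth` parameter, and the choice of a running sum as the update function are all assumptions made for illustration.

```python
from collections import deque


def simulate_pipeline(inputs, depth=3):
    """Toy cycle-by-cycle model of a fixed-depth streaming pipeline.

    One value enters per clock cycle. After `depth` cycles of latency,
    one value leaves per cycle and is folded into a running sum.
    `depth` is a hypothetical pipeline length, not a real device figure.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")

    stream = list(inputs)
    stages = deque([None] * depth)  # pipeline registers, newest on the left
    running_sum = 0                 # the simple "update function"
    outputs = []                    # (cycle, running_sum) for each result

    # Run long enough to feed every input and drain the pipeline.
    for cycle in range(len(stream) + depth):
        # The value in the last stage leaves the pipeline this cycle.
        leaving = stages.pop()
        if leaving is not None:
            running_sum += leaving
            outputs.append((cycle, running_sum))

        # A new value (or a bubble, once the stream is empty) enters.
        entering = stream[cycle] if cycle < len(stream) else None
        stages.appendleft(entering)

    return outputs


if __name__ == "__main__":
    for cycle, total in simulate_pipeline([1, 2, 3, 4], depth=3):
        print(f"cycle {cycle}: running sum = {total}")
```

With four inputs and a depth of 3, nothing comes out for the first three cycles, and then a result appears on every consecutive cycle. In this model, making the pipeline deeper only adds to the initial delay and does not reduce the steady rate of one result per cycle.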
So it sounds like we're designing the instruction pipeline for maximum parallelism for our task. I was surprised to learn that the first commercial FPGAs were available as early as the '80s. I can see how this would have been an extremely effective option before CUDA became available.