Friday, August 2, 2024

Chips At Stake


Integrating dimensions to get more out of Moore's Law and advance electronics
Jan 2024, phys.org

Moore's Law - the number of transistors on a chip doubles roughly every two years (but a physical limit exists; a quick doubling calculation is sketched after these notes)

More Moore - vertically stacking multiple layers of semiconductor devices to beat the limit (3D integration)

More than Moore - transistors made from 2D materials (using 2D materials, not just 2D-like layers of 3D materials)

Monolithic Moore - monolithic 3D integration builds each new tier of 2D devices directly on top of the previous one on a single substrate, instead of fabricating separate chips and bonding them together, so there is no chip-to-chip bonding between layers and it saves space
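For scale, the doubling rule is just exponential growth. A quick back-of-envelope sketch in Python (the starting count and time span are made-up numbers, not from the article):

```python
def transistors(n0: float, years: float, doubling_period: float = 2.0) -> float:
    """Projected transistor count after `years`, doubling every `doubling_period` years."""
    return n0 * 2 ** (years / doubling_period)

# e.g. a 10-billion-transistor chip today projects to ~320 billion in 10 years
print(f"{transistors(10e9, 10):.3g}")   # -> 3.2e+11
```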

via Penn State: Darsith Jayachandran et al, Three-dimensional integration of two-dimensional field-effect transistors, Nature (2024). DOI: 10.1038/s41586-023-06860-5



Emulating neurodegeneration and aging in artificial intelligence systems
Apr 2024, phys.org

"We used IQ tests performed by large language models (LLMs) and, more specifically, the LLaMA 2, to introduce the concept of 'neural erosion. This deliberate erosion involves ablating synapses or neurons or adding Gaussian noise during or after training, resulting in a controlled decline in the LLMs' performance."

The researchers found that when they deliberately ablated (i.e., removed) some of the artificial synapses or neurons of the LLaMA 2 model, its performance on IQ tests declined, following a particular pattern.

"The LLM loses abstract thinking abilities, followed by mathematical degradation, and ultimately, a loss in linguistic ability, responding to prompts incoherently. ... We are now conducting further tests to better understand this observed pattern."

Interestingly, this 'neural erosion' pattern aligns with the neurodegeneration patterns observed in humans.
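The mechanics are simple enough to sketch. Below is a minimal PyTorch version of the idea, with a toy model standing in for LLaMA 2; the erosion levels, which parameters get eroded, and the evaluation hook are illustrative assumptions, not the authors' actual setup:

```python
import copy
import torch

@torch.no_grad()
def erode_weights(module: torch.nn.Module,
                  ablate_frac: float = 0.0,
                  noise_std: float = 0.0) -> None:
    """Post-training 'erosion': zero out a random fraction of each weight
    matrix (ablating synapses) and/or add Gaussian noise to the weights."""
    for param in module.parameters():
        if param.dim() < 2:          # leave biases and norm parameters alone
            continue
        if ablate_frac > 0.0:
            mask = torch.rand_like(param) < ablate_frac
            param[mask] = 0.0        # ablate a random fraction of synapses
        if noise_std > 0.0:
            param.add_(torch.randn_like(param) * noise_std)  # inject noise

# Erode a fresh copy of the trained model at increasing severity and re-run
# whatever evaluation stands in for the paper's IQ tests.
trained = torch.nn.Sequential(torch.nn.Linear(128, 256), torch.nn.ReLU(),
                              torch.nn.Linear(256, 10))
for frac in (0.0, 0.1, 0.3, 0.5):
    eroded = copy.deepcopy(trained)
    erode_weights(eroded, ablate_frac=frac)
    # score = evaluate_iq_test(eroded)  # hypothetical evaluation harness
```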

via University of California Irvine: Antonios Alexos et al, Neural Erosion: Emulating Controlled Neurodegeneration and Aging in AI Systems, arXiv (2024). DOI: 10.48550/arxiv.2403.10596

Post Script: A new kind of chip called an NPU (neural processing unit) is like the new GPU, which was the new CPU:
Your current PC probably doesn’t have an AI processor, but your next one might
Feb 2024, Ars Technica

The comments on this article are the clearest examination of what NPUs will actually do that I've read anywhere, which is no surprise given that the Ars forum is known for the technical proficiency of its members (a sketch of how an application opts into an NPU follows the list):
  • From the article, Intel Senior Director of Technical Marketing Robert Hallock  - "Camera segmentation, this whole background blurring thing... moving that to the NPU saves about 30 to 50 percent power versus running it elsewhere."
  • RTS - It is pretty shocking looking at powermetrics on macOS to see how rarely the Neural Engine is even powered, as most applications (even AI-exclusive applications like stable diffusion/LLM frontends) just run any AI work they need on the CPU or GPU.
  • AlicePlaysWithRockets - I use it daily in Final Cut Pro and meeting camera effects.
  • LlamaDragon - NPUs can be used by Photoprism to speed up importing of images (it tries to ID pics with dogs or cars or "cooking" and so on), or by something like Frigate (camera monitoring) to do similar ID-ing in real time; it might be useful.
  • AmanoJyaku - [why] It has to do with CPUs being general purpose, and therefore able to do anything, vs. hardware accelerators that do specific things. Hardware accelerators can't do most of the things CPUs can do, but the things they can do can be done much faster than CPUs do them. In turn, this makes them more power efficient. The trend started with graphics cards, continued with sound cards and network cards, and has since grown to include other devices, some of which accelerate things as minute as image decoding. ... Theoretically, the most efficient device is one that has accelerators for everything you do on a device. However, there will always be new things, so CPUs are unlikely to be eliminated. As for what makes general-purpose CPUs different from task-specific accelerators, forum posts can't easily sum this up. That's a topic explained in college courses, and the basis for entire careers.
  • autostop - The Neural Engine on the Apple chips is what makes "Live Text" (universal OCR) work on the Macintosh and iPhone. (You can open a scanned PDF, or a screen grab, and the mouse pointer turns into an I-beam and you can highlight and copy just as if it were text in Word or something.)
  • Siosphere - For users who view photos a lot on their laptop/desktop, it could be used for on-device people finding, or computing interpolation for zooming in/out of a photo so it is smoother/faster. ... It could be used for generic type-ahead suggestions, which if they suck is super annoying, but when they work well are a very useful timesaver. (When they actually suggest what I was actually going to write, I'm really pleased; when they suggest something else they are annoying, but they are getting better.) ... Better searching capabilities by understanding more of what you are searching for on your computer, being smarter about intent: are you searching for a specific file by name, type of file, application, etc. ... Generalized automation of repetitive tasks, things you might currently script yourself could be automatically found and set up (again, this is annoying if it doesn't do what you want, but I'm just saying best case it automates in the exact way you are wanting). ... Faster Windows Hello (if you use that), better camera processing for video calls, better workload predicting for changing the task priority to save battery on laptops, or give more CPU power to the things that need it. ... There are a lot of subtle ways that a dedicated AI chip could be used, and there will be so many more when it is just an available resource any developer can tap into.
  • Isaacc7 - NPUs can be used to accelerate many tasks like voice recognition, camera processing, live subtitles, etc. 
  • Toastr - Also things like on-device automatic image tagging of faces for your photo library, which is a big plus for me who wants to organize my family photo library but doesn't like the privacy implications of cloud photo services like Google Photos. ... Or, for another example:
  • The self-hosted Frigate NVR software supports object detection via CPU, GPU, or NPU. Sure, you could run it on an RTX 4080, but you could also just throw a $25 Coral Accelerator in a mini PC and get plenty of performance but with a device that draws just a few watts.
  • longhornchris04 - Side note, while GPUs can do the highly parallel low precision computing, they are designed for graphics which generally require a higher level of precision. So yes, GPUs can do the work, and do it quite easily, but they are often overkill for the job and thus are less efficient at it. 
  • or just anandtech: https://www.anandtech.com/show/20046/intel-unveils-meteor-lake-architecture-intel-4-heralds-the-disaggregated-future-of-mobile-cpus/4
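For a concrete sense of what "the application has to actually opt in" means (the RTS comment above about the Neural Engine sitting idle), here is a minimal sketch using ONNX Runtime execution providers with a CPU fallback. The provider preference list and the model filename are assumptions; which providers actually map to an NPU depends on the platform and how the runtime was built:

```python
import onnxruntime as ort

def make_session(model_path: str) -> ort.InferenceSession:
    """Pick an accelerator-backed execution provider if one is available,
    otherwise fall back to the plain CPU provider."""
    available = ort.get_available_providers()
    preferred = [
        "QNNExecutionProvider",     # Qualcomm NPUs (Windows on Arm)
        "DmlExecutionProvider",     # DirectML on Windows (GPU, some NPUs)
        "CoreMLExecutionProvider",  # Apple Neural Engine / GPU
        "CPUExecutionProvider",
    ]
    providers = [p for p in preferred if p in available]
    return ort.InferenceSession(model_path, providers=providers)

# e.g. session = make_session("background_segmentation.onnx")  # hypothetical model file
# Unless applications do something like this, the AI work silently runs on the
# CPU or GPU and the NPU sits idle.
```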
