
Xeon D 2700 comes in 16-core flavors. At 2GHz, with AVX512 and 1x FMA per clock tick, that's at least 1TFlop, since other pipelines are still available to contribute to the processing.

2GHz x 16 SIMD lanes (32 bits per lane) x 16 cores x 2 ops per instruction (Multiply+Accumulate) == 1TFlop or thereabouts.
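
Spelled out as a quick sketch, in case anyone wants to plug in their own numbers. The function name and parameters are mine, purely for illustration, and this is theoretical peak only: it ignores memory bandwidth, AVX512 clock throttling, and anything the other pipelines add.

```python
def peak_flops(clock_hz, simd_lanes, cores, fma_per_clock=1, ops_per_fma=2):
    """Theoretical peak FLOP/s: every core issues fma_per_clock FMAs per cycle."""
    return clock_hz * simd_lanes * cores * fma_per_clock * ops_per_fma

# Numbers from the comment above: 2GHz, 512-bit vectors of 32-bit lanes, 16 cores.
lanes_fp32 = 512 // 32  # 16 lanes per AVX512 register
xeon_d = peak_flops(2_000_000_000, lanes_fp32, 16)

print(f"{xeon_d / 1e12:.3f} TFlop/s")  # 1.024 TFlop/s
```

Setting `fma_per_clock=2` doubles the per-core figure, which is the adjustment for parts with two FMA units.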

> Obviously, throw a 64, 96, or 128+ core EPYC or the latest Xeons at it and that will make it cry (while gobbling up hundreds of watts doing so ;)

And those are closer to 4 to 10 TFlops. The bigger Xeons do 2x FMA (or more) per clock tick IIRC. So not only are there more cores, each core can do double the work.

We are talking about the "little" Xeon D with relatively low clocks. But 512-bit FMA really gets going.

---------

GPUs are well into the 20TFlop to 50TFlop range.

-------

Although I'm calculating 32-bit Flops, you may have done 64-bit Flops. It's a bit ambiguous, but we might not be apples-to-apples yet.
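
To show how much the precision matters, here is a rough sketch of the same peak estimate at both lane widths. The helper name and its defaults (2GHz, 16 cores, 512-bit vectors, 1 FMA per clock) are my assumptions, taken from the Xeon D figures in this comment; it is a back-of-envelope number, not a benchmark.

```python
def peak_tflops(bits_per_lane, clock_hz=2_000_000_000, cores=16,
                vector_bits=512, ops_per_fma=2):
    """Theoretical peak in TFlop/s for one FMA per clock per core."""
    lanes = vector_bits // bits_per_lane  # wider elements mean fewer lanes
    return clock_hz * lanes * cores * ops_per_fma / 1e12

fp32 = peak_tflops(32)  # 16 lanes per register
fp64 = peak_tflops(64)  # 8 lanes per register

print(f"FP32: {fp32:.3f} TFlop/s, FP64: {fp64:.3f} TFlop/s")
# FP32: 1.024 TFlop/s, FP64: 0.512 TFlop/s
```

So a 64-bit figure would be half the 32-bit one on the same hardware, which is enough to throw off a comparison.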



