Tachyum, creator of Prodigy, which the company bills as the world’s first Universal Processor, has completed vector-based High-Performance LINPACK (HPL) testing on the Prodigy FPGA.
LINPACK, the benchmark widely used to rank supercomputers, gauges a system’s floating-point computing power by timing the solution of a large, dense system of linear equations. Tachyum’s successful execution of HPL tests using 1024-bit vectors on Prodigy’s FPGA shows that the vector unit produces correct results on this standard workload.
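To make concrete what a LINPACK-style run measures, here is a minimal sketch in plain Python: it solves a dense system by Gaussian elimination with partial pivoting, checks the residual, and converts the elapsed time into a floating-point rate. This is an illustration only, not the HPL code or anything Tachyum ran. The function names are hypothetical, and the operation count 2n³/3 + 2n² is the figure commonly quoted for LINPACK, used here as an assumption.

```python
import random
import time


def solve_dense(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for k in range(n):
        # Partial pivoting: bring the largest entry of column k to the diagonal.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if m[p][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]
        # Eliminate column k from every row below the pivot row.
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def linpack_flop_count(n):
    """Nominal operation count for an n-by-n solve (assumed formula)."""
    return (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2


def max_residual(a, x, b):
    """Largest absolute component of a*x - b, a simple correctness check."""
    return max(
        abs(sum(aij * xj for aij, xj in zip(row, x)) - bi)
        for row, bi in zip(a, b)
    )


if __name__ == "__main__":
    n = 100  # tiny compared with real HPL problem sizes
    random.seed(0)
    a = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [random.uniform(-1.0, 1.0) for _ in range(n)]

    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = time.perf_counter() - start

    print("max residual:", max_residual(a, x, b))
    print("MFLOP/s:", linpack_flop_count(n) / elapsed / 1e6)
```

The residual check matters as much as the timing: a run only counts if the computed solution is numerically correct, which is why a passing HPL run is a useful functional test for new floating-point hardware.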
This accomplishment follows prior scalar LINPACK benchmarks, affirming Prodigy’s prowess with its IEEE-compliant scalar Floating-Point Unit (FPU). Tachyum has now progressed to demonstrate the capabilities of its vector unit, designed with cutting-edge features to deliver industry-leading performance.
Key to Prodigy’s performance is its vector unit, which has two pipelines, each with a 1024-bit-wide data path. Executing two 1024-bit SIMD operations per cycle, Prodigy delivers 32 double-precision IEEE fused multiply-add (FMA) operations per cycle, which amounts to 64 double-precision floating-point operations per cycle per core.
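The per-cycle figures follow directly from the stated widths, and the arithmetic can be sketched as below. The function names are hypothetical. The clock rate is not given in this article, so it is left as an input, and the per-chip estimate is a theoretical peak under that assumption rather than a measured result.

```python
def fma_per_cycle(pipelines, datapath_bits, element_bits):
    """Fused multiply-adds issued per cycle across all vector pipelines."""
    lanes_per_pipeline = datapath_bits // element_bits
    return pipelines * lanes_per_pipeline


def flops_per_cycle(pipelines, datapath_bits, element_bits):
    """Each FMA counts as two floating-point operations (multiply and add)."""
    return 2 * fma_per_cycle(pipelines, datapath_bits, element_bits)


def peak_gflops(cores, flops_per_cycle_per_core, clock_ghz):
    """Theoretical peak in GFLOP/s; clock_ghz is a caller-supplied assumption."""
    return cores * flops_per_cycle_per_core * clock_ghz


if __name__ == "__main__":
    # Two pipelines, 1024-bit data path, 64-bit double-precision elements.
    print(fma_per_cycle(2, 1024, 64))    # 32 FMA operations per cycle
    print(flops_per_cycle(2, 1024, 64))  # 64 FLOPs per cycle per core
```

Each 1024-bit pipeline holds 16 double-precision values, so two pipelines give 32 FMA operations per cycle, and counting the multiply and the add separately doubles that to 64.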
Moreover, Prodigy’s innovative memory access micro-architecture supports unaligned data, ensuring optimal processing without the performance penalties seen in other architectures. These features, combined with high clock rates, position Tachyum to lead in vectorized data processing performance.
Dr. Radoslav Danilak, Tachyum’s CEO, highlighted the complexity of achieving such milestones in supercomputing, emphasizing the meticulous verification process, including ensuring correct wiring of vectors, accurate reporting of IEEE flags, and seamless data-shuffling vector operations.
Tachyum’s focus now shifts towards finalizing vector unit verification and testing with FPU for AI matrix operations. Dr. Danilak expressed confidence in moving toward volume production in 2024, aiming to fulfill a multibillion-dollar sales pipeline.
Prodigy-powered data center servers offer versatility across computational domains, switching between AI/ML, HPC, and cloud workloads on a single architecture. This versatility eliminates the need for dedicated AI hardware, significantly reducing costs while improving data center performance, power efficiency, and economics.
Prodigy integrates 192 high-performance, custom-designed 64-bit compute cores. According to Tachyum, it delivers up to 4.5 times the performance of x86 processors for cloud workloads, up to 3 times that of the highest-performing GPUs for HPC, and 6 times for AI applications.
Tachyum’s recent breakthrough not only propels the company towards its production goals but also signifies a remarkable leap in supercomputing capabilities, potentially redefining the landscape of computational power and versatility in data centers.