IA3 2016 Sixth Workshop on Irregular Applications: Architectures and Algorithms (SC16 Workshop) @ Salt Lake City, UT
Data compression can be a very powerful way to improve the performance of unstructured-grid calculations
Many real problems (e.g. FEM or other CAE problems) require solving large systems of linear equations:
\[A \boldsymbol{x} = \boldsymbol{b}\]
\(A\) is a large, sparse, and irregular matrix in most real problems
Iterative methods are well suited to such systems
Sparse matrix-vector multiplication (SpMV) is the most time-consuming step
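To make the SpMV kernel concrete, here is a minimal sketch in Python. It assumes the compressed sparse row (CSR) layout, which the text above does not specify, and the name `spmv_csr` is hypothetical.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed layout)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is read at a data-dependent position: the irregular access.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Example matrix: [[4, 0, 1], [0, 3, 0], [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
y = spmv_csr(row_ptr, col_idx, values, [1.0, 2.0, 3.0])
```

Each stored nonzero contributes one multiply and one add, while requiring one matrix value, one column index, and one input-vector element to be read from memory.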
Rank | Computer | Rpeak (PFLOPS) | HPCG (PFLOPS) | HPCG/Rpeak |
---|---|---|---|---|
1 | MilkyWay-2 | 54.9 | 0.58 | 1.1% |
2 | K computer | 11.3 | 0.55 | 4.9% |
3 | Sunway TaihuLight | 125.4 | 0.37 | 0.3% |
On modern computers, memory bandwidth is too low relative to arithmetic instruction throughput for SpMV
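A rough way to see this imbalance is a bandwidth-bound estimate. The sketch below assumes that SpMV performance is limited only by the traffic for matrix data and that each nonzero costs 2 floating-point operations. The function name and the numbers in the example are illustrative placeholders, not measurements from any machine listed above.

```python
def spmv_bound_gflops(bandwidth_gb_per_s, bytes_per_nonzero):
    """Upper bound on SpMV GFLOPS when matrix traffic is the only bottleneck."""
    flops_per_nonzero = 2.0  # one multiply and one add per stored nonzero
    # GB/s divided by bytes per nonzero gives giga-nonzeros per second.
    giga_nonzeros_per_s = bandwidth_gb_per_s / bytes_per_nonzero
    return flops_per_nonzero * giga_nonzeros_per_s


# Illustrative values only: 100 GB/s, 8-byte value + 4-byte column index.
uncompressed = spmv_bound_gflops(100.0, 12.0)
# Halving the bytes read per nonzero doubles the bound.
compressed = spmv_bound_gflops(100.0, 6.0)
```

Under this simple model the bound scales inversely with the bytes read per nonzero, which is why shrinking the matrix representation can raise achievable SpMV throughput.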
Achieved 168.06 GFLOPS with 8 nodes (32 PEZY-SCs)
93% of the theoretical limit
Compression | Achieved GFLOPS | Theoretical GFLOPS | Speedup vs. no compression (achieved / theoretical) |
---|---|---|---|
None | 11.6 | 12.5 | x1.0 / x1.0 |
Data table | 15.9 | 34.8 | x1.4 / x2.8 |
Data+Index table | 32.4 | 326.0 | x2.8 / x26.1 |
NOTE: The theoretical estimate ignores random access to the input vector
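The compression scheme itself is not described here. As one possible reading of the "Data table" row, the sketch below assumes that repeated matrix values are stored once in a lookup table and that each nonzero keeps only a small integer code. The CSR layout and all function names are hypothetical, and the index-table variant is not sketched.

```python
def build_value_table(values):
    """Map each value to an integer code into a table of distinct values."""
    table = []
    code_of = {}
    codes = []
    for v in values:
        if v not in code_of:
            code_of[v] = len(table)
            table.append(v)
        codes.append(code_of[v])
    return table, codes


def spmv_csr_value_table(row_ptr, col_idx, codes, table, x):
    """CSR SpMV that looks up each matrix value through the table."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # The small table can stay in fast memory; only codes are streamed.
            acc += table[codes[k]] * x[col_idx[k]]
        y[i] = acc
    return y


# Example matrix: [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
table, codes = build_value_table(values)
y = spmv_csr_value_table(row_ptr, col_idx, codes, table, [1.0, 1.0, 1.0])
```

In this example seven stored values reduce to a two-entry table. The benefit depends on how many distinct values the matrix contains, which is often small for stencil-like or FEM matrices on regular material properties but is not guaranteed in general.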
Compression techniques will improve linear-equation solvers in CAE and other applications on current and future HPC systems!