D3.3 – Initial report on the performance characteristics on relevant hardware for upcoming supercomputers
This deliverable documents the initial performance analysis results obtained for all six TREX flagship applications. The focus was, in particular, on the scaling of the applications and their ability to exploit parallelism at all levels of modern HPC architectures. This ranges from the efficient use of SIMD instructions to the use of highly parallel compute accelerators such as Graphics Processing Units (GPUs).
When assessing the scalability of the applications, it must be taken into account that they differ in how far they can, in principle, be parallelized. Some of the applications, e.g. QMC=Chem, implement methods designed for scalability, which could be demonstrated using up to 32,768 CPU cores. Furthermore, very encouraging results have been obtained for the GPU acceleration of TurboRVB on JUWELS Booster, currently Europe’s fastest supercomputer.
The performance results collected for this deliverable will help to guide further work and optimisations during the second half of the project.