References
- [1] M. Adams, P. Colella, D. T. Graves, J. N. Johnson, Keen, N. D., T. J. Ligocki, D. F. Martin, P. W. McCorquodale, D. Modiano, P. Schwartz, T. Sternberg, and B. van Straalen. Chombo software package for AMR applications - design document. Technical Report LBNL-6616E, Lawrence Berkeley National Laboratory, Jan 2015.
- [2] S. Balay, W. D. Gropp, L. C. McInnes, and B. F. Smith. Efficient management of parallelism in object oriented numerical software libraries. In Modern Software Tools in Scientific Computing, pages 163–202. Birkhäuser Press, 1997.10.1007/978-1-4612-1986-6_8
- [3] W. Bangerth, R. Hartmann, and G. Kanschat. deal.II – a general purpose object oriented finite element library. ACM Trans. Math. Softw., 33(4):24/1–24/27, 2007.10.1145/1268776.1268779
- [4] P. Bastian, C. Engwer, D. Göddeke, O. Iliev, O. Ippisch, M. Ohlberger, S. Turek, J. Fahlke, S. Kaulmann, S. Müthing, and D. Ribbrock. EXA-DUNE: Flexible pde solvers, numerical methods and applications. In Euro-Par 2014: Parallel Processing Workshops, volume 8806 of Lecture Notes in Computer Science, pages 530–541. Springer, 2014.10.1007/978-3-319-14313-2_45
- [5] M. Bauer, F. Schornbaum, C. Godenschwager, M. Markl, D. Anderl, H. Köstler, and U. Rüde. A python extension for the massively parallel multiphysics simulation framework walberla. International Journal of Parallel, Emergent and Distributed Systems, 31(6):529–542, 2016.10.1080/17445760.2015.1118478
- [6] B. Bergen, T. Gradl, F. Hülsemann, and U. Rüde. A massively parallel multigrid method for finite elements. Computing in Science and Engineering, 8(6):56–62, 2006.10.1109/MCSE.2006.102
- [7] B. Bergen and F. Hülsemann. Hierarchical hybrid grids: data structures and core algorithms for multigrid. Numer. Linear Algebra Appl., 11:279–291, 2004.10.1002/nla.382
- [8] M. Blatt, A. Burchardt, A. Dedner, C. Engwer, J. Fahlke, B. Flemisch, C. Gersbacher, C. Grüser, F. Gruber, C. Gräninger, D. Kempf, R. Klöfkorn, T. Malkmus, S. Müthing, M. Nolte, M. Piatkowski, and O. Sander. The distributed and unified numerics environment, version 2.4. Archive of Numerical Software, 4(100):13–29, 2016.
- [9] M. Bolten, F. Franchetti, P. H. J. Kelly, C. Lengauer, and M. Mohr. Algebraic description and automatic generation of multigrid methods in SPIRAL. Concurrency and Computation: Practice and Experience, 29(17):4105:1–4105:11, 2017. Special Issue on Advanced Stencil-Code Engineering.10.1002/cpe.4105
- [10] T. Brandvik and G. Pullan. SBLOCK: A framework for efficient stencil-based PDE solvers on multi-core platforms. In 2010 10th IEEE International Conference on Computer and Information Technology, pages 1181–1188, Jun 2010.10.1109/CIT.2010.214
- [11] M. Christen, O. Schenk, and H. Burkhart. PATUS: A code generation and autotuning framework for parallel iterative stencil computations on modern microarchitectures. In 2011 IEEE International Parallel Distributed Processing Symposium, pages 676–687, May 2011.10.1109/IPDPS.2011.70
- [12] C. Coarfa, Y. Dotsenko, J. Mellor-Crummey, F. Cantonnet, T. El-Ghazawi, A. Mohanti, Y. Yao, and D. Chavarría-Miranda. An evaluation of global address space languages: Co-array Fortran and unified parallel C. In Proceedings of the Tenth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ‘05, pages 36–47, New York, NY, USA, 2005. ACM.10.1145/1065944.1065950
- [13] Z. DeVito, N. Joubert, F. Palaciosy, S. Oakley, M. Medina, M. Barrientos, E. Elsen, F. Ham, A. Aiken, K. Duraisamy, E. Darve, J. Alonso, and P. Hanrahan. Liszt: A domain specific language for building portable mesh-based PDE solvers. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC), pages 1–12. ACM, 2011.10.1145/2063384.2063396
- [14] H. C. Edwards, C. R. Trott, and D. Sunderland. Kokkos: Enabling manycore performance portability through polymorphic memory access patterns. Journal of Parallel and Distributed Computing, 74(12):3202 – 3216, 2014. Special issue on Domain-Specific Languages and High-Level Frameworks for High-Performance Computing.10.1016/j.jpdc.2014.07.003
- [15] R. D. Falgout, J. E. Jones, and U. M. Yang. The design and implementation of hypre, a library of parallel high performance preconditioners. In Numerical Solution of Partial Differential Equations on Parallel Computers, pages 267–294, Berlin, Heidelberg, 2006. Springer.10.1007/3-540-31619-1_8
- [16] M. Frigo, C. E. Leiserson, and K. H. Randall. The implementation of the Cilk-5 multithreaded language. SIGPLAN Not., 33(5):212–223, May 1998.10.1145/277652.277725
- [17] K. Fürlinger, C. Glass, A. Knüpfer, J. Tao, D. Hünich, K. Idrees, M. Maiterth, Y. Mhedheb, and H. Zhou. DASH: Data structures and algorithms with support for hierarchical locality. In Euro-Par 2014 Workshops (Porto, Portugal), pages 542–552, 2014.10.1007/978-3-319-14313-2_46
- [18] B. Gmeiner, T. Gradl, H. Köstler, and U. Rüde. Highly parallel geometric multigrid algorithm for hierarchical hybrid grids. In K. Binder, G. Münster, and M. Kremer, editors, NIC Symposium 2012, volume 45 of Publication series of the John von Neumann Institute for Computing, pages 323–330, Jülich, Germany, 2012.
- [19] B. Gmeiner, M. Huber, L. John, U. Rüde, and B. Wohlmuth. A quantitative performance study for Stokes solvers at the extreme scale. J. Comput. Sci., 17(3):509–521, 2016.10.1016/j.jocs.2016.06.006
- [20] B. Gmeiner, H. Köstler, M. Stürmer, and U. Rüde. Parallel multigrid on hierarchical hybrid grids: a performance study on current high performance computing clusters. Concurrency and Computation: Practice and Experience, 26(1):217–240, 2014.10.1002/cpe.2968
- [21] B. Gmeiner, U. Rüde, H. Stengel, C. Waluga, and B. Wohlmuth. Performance and Scalability of Hierarchical Hybrid Multigrid Solvers for Stokes Systems. SIAM J. Sci. Comput., 37(2):C143–C168, 2015.10.1137/130941353
- [22] B. Gmeiner, U. Rüde, H. Stengel, C. Waluga, and B. Wohlmuth. Towards textbook efficiency for parallel multigrid. Numer. Math. Theory Methods Appl., 8:2246, 2015.10.4208/nmtma.2015.w10si
- [23] T. Gysi, T. Grosser, and T. Hoefler. MODESTO: Data-centric analytic optimization of complex stencil programs on heterogeneous architectures. In Proceedings of the 29th ACM on International Conference on Supercomputing, ICS ‘15, pages 177–186, New York, NY, USA, 2015. ACM.10.1145/2751205.2751223
- [24] T. Gysi, C. Osuna, O. Fuhrer, M. Bianco, and T. C. Schulthess. STELLA: A domain-specific tool for structured grid methods in weather and climate models. In Proceedings International Conference on High Performance Computing, Networking, Storage and Analysis (SC), pages 41:1–41:12. ACM, Nov 2015.10.1145/2807591.2807627
- [25] M. Heisig. Petalisp: A common lisp library for data parallel programming. In 11th European Lisp Symposium, page 4, 2018.
- [26] M. Heisig and H. Köstler. Petalisp: run time code generation for operations on strided arrays. In Proceedings of the 5th ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, pages 11–17. ACM, 2018.10.1145/3219753.3219755
- [27] M. A. Heroux, R. A. Bartlett, V. E. Howle, R. J. Hoekstra, J. J. Hu, T. G. Kolda, R. B. Lehoucq, K. R. Long, R. P. Pawlowski, E. T. Phipps, A. G. Salinger, H. K. Thornquist, R. S. Tuminaro, J. M. Willenbring, A. Williams, and K. S. Stanley. An overview of the Trilinos project. ACM Trans. Math. Softw., 31(3):397–423, 2005.10.1145/1089014.1089021
- [28] L. V. Kale and S. Krishnan. CHARM++: A portable concurrent object oriented system based on C++. SIGPLAN Notices, 28(10):91–108, Oct 1993.10.1145/167962.165874
- [29] N. Kohl, D. Thönnes, D. Drzisga, D. Bartuschat, and U. Rüde. The hyteg finite-element software framework for scalable multigrid solvers. International Journal of Parallel, Emergent and Distributed Systems, 0(0):1–20, 2018.
- [30] H. Köstler, C. Schmitt, S. Kuckuk, F. Hannig, J. Teich, and U. Rüde. A scala prototype to generate multigrid solver implementations for different problems and target multi-core platforms. Int. J. of Computational Science and Engineering, 14(2):150–163, 2017.10.1504/IJCSE.2017.082879
- [31] H. Köstler, M. Stürmer, and T. Pohl. Performance engineering to achieve real-time high dynamic range imaging. Journal of Real-Time Image Processing, pages 1–13, 2013.10.1007/s11554-012-0312-3
- [32] S. Kronawitter, S. Kuckuk, H. Köstler, and C. Lengauer. Automatic data layout transformations in the exastencils code generator. Modern Physics Letters A, 28(03):1850009, 2018.10.1142/S0129626418500093
- [33] S. Kronawitter, S. Kuckuk, H. Köstler, and C. Lengauer. Automatic data layout transformations in the ExaStencils code generator. Parallel Processing Letters, 28(03):1850009, 2018.10.1142/S0129626418500093
- [34] S. Kronawitter, S. Kuckuk, and C. Lengauer. Redundancy elimination in the ExaStencils code generator. In Algorithms and Architectures for Parallel Processing, pages 159–173, Cham, 2016. Springer International Publishing.10.1007/978-3-319-49956-7_13
- [35] S. Kuckuk, G. Haase, D. A. Vasco, and H. Köstler. Towards generating efficient flow solvers with the ExaStencils approach. Concurrency and Computation: Practice and Experience, 29(17):4062:1–4062:17, 2017. Special Issue on Advanced Stencil-Code Engineering.10.1002/cpe.4062
- [36] S. Kuckuk and H. Köstler. Automatic generation of massively parallel codes from ExaSlang. Computation, 4(3):27:1–27:20, 2016. Special Issue on High Performance Computing (HPC) Software Design.10.3390/computation4030027
- [37] S. Kuckuk and H. Köstler. Whole program generation of massively parallel shallow water equation solvers. In 2018 IEEE International Conference on Cluster Computing (CLUSTER), pages 78–87, Sept 2018.10.1109/CLUSTER.2018.00020
- [38] S. Kuckuk and H. Kstler. Automatic generation of massively parallel codes from exaslang. Computation, 4(3):27:1–27:20, 2016. Special Issue on High Performance Computing (HPC) Software Design.10.3390/computation4030027
- [39] S. Kuckuk, L. Leitenmaier, C. Schmitt, D. Schönwetter, H. Köstler, and D. Fey. Towards virtual hardware prototyping for generated geometric multigrid solvers. Technical Report CS 2017-01, Technische Fakultät, 2017.
- [40] C. Lengauer, S. Apel, M. Bolten, A. Größlinger, F. Hannig, H. Köstler, U. Rüde, J. Teich, A. Grebhahn, S. Kronawitter, et al. Exastencils: Advanced stencil-code engineering. In European Conference on Parallel Processing, pages 553–564. Springer, 2014.10.1007/978-3-319-14313-2_47
- [41] C. Lengauer, S. Apel, M. Bolten, A. Größlinger, F. Hannig, H. Köstler, U. Rüde, J. Teich, A. Grebhahn, S. Kronawitter, S. Kuckuk, H. Rittich, and C. Schmitt. ExaStencils: Advanced stencil-code engineering. In L. Lopes et al., editors, Euro-Par 2014: Parallel Processing Workshops, volume 8806 of Lecture Notes in Computer Science (LNCS), pages 553–564. Springer, 2014.10.1007/978-3-319-14313-2_47
- [42] A. Logg, K.-A. Mardal, and G. N. Wells. Automated Solution of Differential Equations by the Finite Element Method, volume 84 of Lecture Notes in Computational Science and Engineering (LNCSE). Springer, 2012.10.1007/978-3-642-23099-8
- [43] N. Maruyama, K. Sato, T. Nomura, and S. Matsuoka. Physis: An implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers. In SC ‘11: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1–12, Nov 2011.10.1145/2063384.2063398
- [44] G. R. Mudalige, I. Reguly, M. B. Giles, C. Bertolli, and P. H. J. Kelly. OP2: An active library framework for solving unstructured mesh-based applications on multi-core and many-core architectures. In Proc. Innovative Parallel Computing (InPar), San Jose, California, May 2012. IEEE.10.1109/InPar.2012.6339594
- [45] G. Ofenbeck, T. Rompf, and M. Püschel. Staging for generic programming in space and time. SIGPLAN Not., 52(12):15–28, Oct 2017.10.1145/3170492.3136060
- [46] M. Püschel, F. Franchetti, and Y. Voronenko. Spiral, volume 4, pages 1920–1933. Springer, 2011.
- [47] F. Rathgeber, D. A. Ham, L. Mitchell, M. Lange, F. Luporini, A. T. T. Mcrae, G.-T. Bercea, G. R. Markall, and P. H. J. Kelly. Firedrake: Automating the finite element method by composing abstractions. ACM Trans. on Mathematical Software (TOMS), 43(3):24:1–24:27, 2016.10.1145/2998441
- [48] P. Rawat, M. Kong, T. Henretty, J. Holewinski, K. Stock, L.-N. Pouchet, J. Ramanujam, A. Rountev, and P. Sadayappan. SDSLc: A multi-target domain-specific compiler for stencil computations. In Proc. 5th Int’l Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC), pages 6:1–6:10. ACM, Nov 2015.10.1145/2830018.2830025
- [49] C. Schmitt, S. Kuckuk, F. Hannig, H. Köstler, and J. Teich. Exa-Slang: A domain-specific language for highly scalable multigrid solvers. In Proc. 4th Int’l Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC), pages 42–51. IEEE Computer Society, Nov. 2014.10.1109/WOLFHPC.2014.11
- [50] C. Schmitt, M. Schmid, F. Hannig, J. Teich, S. Kuckuk, and H. Köstler. Generation of multigrid-based numerical solvers for FPGA accelerators. In Proc. 2nd Int’l Workshop on High-Performance Stencil Computations (HiStencils), pages 9–15, Jan. 2015.
- [51] C. Schmitt, M. Schmid, S. Kuckuk, H. Köstler, J. Teich, and F. Hannig. Reconfigurable hardware generation of multigrid solvers with conjugate gradient coarse-grid solution. Parallel Processing Letters, 28(04):1850016, 2018.10.1142/S0129626418500160
- [52] J. Schmitt, H. Köstler, J. Eitzinger, and R. Membarth. Unified code generation for the parallel computation of pairwise interactions using partial evaluation. In 2018 17th International Symposium on Parallel and Distributed Computing (ISPDC), pages 17–24. IEEE, 2018.10.1109/ISPDC2018.2018.00012
- [53] Y. Tang, R. A. Chowdhury, B. C. Kuszmaul, C.-K. Luk, and C. E. Leiserson. The Pochoir stencil compiler. In Proceedings of the Twenty-Third Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 117–128. ACM, 2011.10.1145/1989493.1989508
- [54] U. Trottenberg, C. Oosterlee, and A. Schüller. Multigrid. Academic Press, San Diego, CA, USA, 2001.
- [55] A. Vogel, S. Reiter, M. Rupp, A. Nägel, and G. Wittum. UG 4: A novel flexible software system for simulating pde based models on high performance computers. Computing and Visualization in Science, 16(4):165–179, 2013.10.1007/s00791-014-0232-9
- [56] T. Weinzierl. The peano softwareparallel, automaton-based, dynamically adaptive grid traversals. ACM Transactions on Mathematical Software (TOMS), 45(2):14, 2019.10.1145/3319797