|Theory||4||2nd semester||Nicola Bombieri|
|Laboratory||2||2nd semester||Nicola Bombieri|
This course provides theoretical and practical knowledge of programming and analyzing advanced computing architectures, with an emphasis on multiprocessor and GPU platforms. At the end of the course, students will have to demonstrate the ability to identify parallel programming techniques, also in a research context, by analyzing application efficiency and considering both functional and non-functional design constraints (correctness, performance, energy consumption). This knowledge will allow students to analyze performance and profile code, identifying critical regions and the corresponding optimizations in light of the architectural characteristics of the platform. Students will also be able to compare parallel patterns and select the best one for a given use case and, when defining the structure of the optimized code, to make appropriate architectural choices for the target application and platform. Finally, students will be able to continue studying autonomously in the fields of parallel programming languages and software development for parallel embedded platforms.
Theory module (32 h):
-) Intro to parallelism and parallel architectures.
-) Programming parallel architectures.
-) Models of parallel programming.
-) Measurement and analysis of performance, Amdahl’s law, and metrics for performance analysis.
-) Pipeline: basic and advanced concepts.
-) Instruction-level parallelism (ILP).
-) Memory hierarchy: basic and advanced concepts.
-) Advanced optimization techniques of cache performance.
-) Thread-level parallelism (TLP).
-) General-purpose graphics processing units (GP-GPU).
-) Intro to non-functional constraints: power consumption and energy efficiency.
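As an illustration of the performance-analysis topic above, Amdahl's law bounds the speedup of a program in which only a fraction p of the run time can be parallelized over n processors: S(n) = 1 / ((1 - p) + p / n). The following is a minimal sketch in C; the function names `amdahl_speedup`, `amdahl_limit` and `parallel_efficiency` are hypothetical and chosen only for this example, they are not taken from the course material.

```c
#include <assert.h>

/* Amdahl's law: speedup on n processors when a fraction p (0..1)
   of the sequential run time can be parallelized. */
double amdahl_speedup(double p, int n)
{
    assert(p >= 0.0 && p <= 1.0);
    assert(n >= 1);
    return 1.0 / ((1.0 - p) + p / (double)n);
}

/* Upper bound on the speedup as n grows without limit.
   Requires p < 1, otherwise the bound is infinite. */
double amdahl_limit(double p)
{
    assert(p >= 0.0 && p < 1.0);
    return 1.0 / (1.0 - p);
}

/* Parallel efficiency: speedup divided by the number of processors. */
double parallel_efficiency(double p, int n)
{
    return amdahl_speedup(p, n) / (double)n;
}
```

For example, with p = 0.9 the call `amdahl_speedup(0.9, 4)` gives a speedup of about 3.08, while `amdahl_limit(0.9)` shows that no number of processors can exceed a speedup of about 10.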
Lab module (24 h):
-) Parallel compilers for multicore architectures (OpenMP).
-) Parallel compilers for cluster architectures (MPI).
-) GP-GPU programming: CUDA.
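To give a flavor of the directive-based style used with OpenMP on multicore architectures, the sketch below parallelizes a dot product with a `parallel for` directive and a `reduction` clause. It is only an illustrative example, not an exercise from the lab; the function name `dot_product` is an assumption made here.

```c
#include <assert.h>

/* Dot product of two vectors of length n.
   The loop iterations are divided among the threads; each thread
   accumulates into a private copy of `sum`, and the reduction clause
   combines the private copies when the loop ends. */
double dot_product(const double *a, const double *b, long n)
{
    assert(n >= 0);
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

With GCC or Clang the directive is enabled by the `-fopenmp` flag; without it the pragma is ignored and the same code runs sequentially, which is convenient for checking correctness before measuring speedup.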
To pass the exam, students have to demonstrate that they:
- have understood the principles of parallel programming
- are able to describe the concepts clearly and exhaustively, without digressions
- are able to apply the acquired knowledge to solve application scenarios presented as exercises, questions and projects.
The exam consists of a written test containing multiple-choice questions, open-ended questions, and exercises related to both the theory and lab modules. Alternatively, the student can develop a project assigned by the teacher.
|Theory||John Hennessy, David Patterson||Computer Architecture - A Quantitative Approach (6th edition)||Morgan Kaufmann||2018||978-0-12-811905-1|
|Theory||David B. Kirk, Wen-mei W. Hwu||Programming Massively Parallel Processors - A Hands-on Approach (3rd edition)||Morgan Kaufmann||2017||978-0-12-811986-0|