Theses

This section of the Web site provides thesis and project proposals for students. The main topics include Cloud computing, Edge computing, Deep Learning applications, and GPGPU system scheduling.

If you are interested in one of the proposals, contact me by e-mail.

A Reinforcement Learning framework for the runtime management of next generation Smart Glasses (in Collaboration with Luxottica)

Virtual reality and augmented reality have gained popularity over the last decades, and companies are pushing to make these technologies profitable for everyday life. This is why some companies are designing smart glasses that can be used in healthcare, industry, and entertainment. These smart glasses are meant to run Artificial Intelligence (AI) applications that rely on Deep Neural Networks (DNNs). Most of these applications have real-time constraints that must be considered while designing the smart glasses. Unfortunately, despite the advancement of chip design technology, many devices still have limited computational and energy capacity when running AI applications. One commonly adopted solution is to offload some computations to an edge device or to the cloud, alleviating the workload on the smart glasses and hence reducing their power consumption. Offloading, however, comes at the price of data transfer latency: when the smart glasses choose to offload part of their computation, they must send data to the other device over a network channel, which introduces a latency that depends on the channel throughput.
Since there are real-time constraints, we must ensure that the end-to-end execution time (computing time plus data transfer time) across all the devices remains below a specific value. There should therefore be a mechanism to switch among different DNN configurations, taking into account the state of the network and the current battery level of the smart glasses, to ensure that the constraints are met.
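As a concrete illustration of this trade-off, the sketch below compares hypothetical DNN partitioning points under a latency budget. All timings, data sizes, the configuration names, and the channel throughput are invented for illustration, not measurements from the project.

```python
def end_to_end_time(local_ms, remote_ms, data_bytes, throughput_bps):
    """End-to-end time = on-device compute + data transfer + remote compute."""
    transfer_ms = data_bytes * 8 / throughput_bps * 1000
    return local_ms + transfer_ms + remote_ms

# Hypothetical partitioning points of one DNN:
# (on-device ms, remote ms, bytes sent over the channel).
configs = {
    "all-on-glasses": (120.0, 0.0, 0),
    "split-layer-5": (40.0, 30.0, 150_000),
    "all-offloaded": (5.0, 60.0, 600_000),
}

deadline_ms = 100.0    # maximum admissible end-to-end time
throughput_bps = 50e6  # assumed 50 Mbit/s channel

feasible = {
    name: end_to_end_time(l, r, b, throughput_bps)
    for name, (l, r, b) in configs.items()
    if end_to_end_time(l, r, b, throughput_bps) <= deadline_ms
}
print(feasible)  # only the split configuration meets the deadline here
```

Note that which configurations are feasible changes with the channel throughput, which is exactly why the decision must be made at runtime.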

This project aims at designing a Reinforcement Learning (RL) framework for the runtime management of next generation smart glasses. An RL agent will choose at runtime which DNN configuration to run so as to minimize the energy consumption and the 5G connection cost while meeting the real-time constraint (end-to-end execution time below the maximum admissible value). The framework should be DNN-independent, meaning that it should support different DNNs with different configurations and partitioning points.
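A minimal stand-in for such an agent is an epsilon-greedy bandit over the available configurations; the configuration names and the per-configuration rewards below (energy cost plus a penalty for missed deadlines, folded into a single negative number) are purely illustrative assumptions.

```python
import random

random.seed(0)  # for reproducibility

configs = ["full-local", "split", "full-offload"]
# Hypothetical mean reward per configuration: energy cost plus a penalty for
# missed deadlines, already folded into a single (negative) number.
true_reward = {"full-local": -5.0, "split": -2.0, "full-offload": -8.0}

q = {c: 0.0 for c in configs}  # running estimate of each reward
counts = {c: 0 for c in configs}
eps = 0.1                      # exploration rate

for step in range(2000):
    if random.random() < eps:
        choice = random.choice(configs)  # explore
    else:
        choice = max(q, key=q.get)       # exploit current estimate
    reward = true_reward[choice] + random.gauss(0, 0.5)  # noisy observation
    counts[choice] += 1
    q[choice] += (reward - q[choice]) / counts[choice]   # incremental mean

print(max(q, key=q.get))
```

A real agent would condition its choice on state (network throughput, battery level), turning the bandit into a full RL problem, but the explore/exploit structure stays the same.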

References

  1. Y. Kang et al. Neurosurgeon: Collaborative intelligence between the cloud and mobile edge. Proc. 22nd Int. Conf. Archit. Support Program. Lang. Oper. Syst. (ASPLOS), pp. 615-629, 2017.
  2. N. M. Kumar, N. K. Singh, and V. Peddiny. Wearable smart glass: Features, applications, current progress and challenges. 2018 Second International Conference on Green Computing and Internet of Things (ICGCIoT), pp. 577-582. IEEE, 2018.
  3. H.-N. Wang et al. Deep reinforcement learning: A survey. Front. Inf. Technol. Electron. Eng., vol. 21, no. 12, pp. 1726-1744, 2020.

Optimal Component Placement and Runtime Management of Artificial Intelligence Applications in Edge Systems

Edge computing is a distributed computing paradigm that brings computation and data storage closer to the location where they are needed, in order to improve execution times and save bandwidth. Nowadays, edge computing systems include remote cloud servers, edge servers, and sensors (which have recently begun to provide some limited computing capabilities as well).

Artificial Intelligence (AI) is becoming pervasive today, with the worldwide AI software platform market forecast to grow significantly through 2023, approaching USD 11.8 billion in revenue at a compound annual growth rate of 35.3%. Many of the benefits of this evolution will come from using edge computing resources. Many companies are evaluating the use of edge computing for data collection, processing, and online analytics to reduce application latency and data transfers. A growing number of use cases, e.g., predictive maintenance, machine vision, and healthcare, can benefit from AI applications spanning edge-to-cloud infrastructures. Edge intelligence, i.e., edge-based inferencing, will become the foundation of all industrial AI applications, while most new applications will involve AI components at various levels of the edge-to-cloud infrastructure.

While the main advantage of edge systems is improved application performance through reduced latency, edge resources usually have less computing capacity than the cloud and can become a bottleneck in the computation. Moreover, the workload can fluctuate at runtime, since a varying number of users can connect to the system and different data volumes can be generated at different times. Therefore, the assignment of application components to resources should change so as to guarantee Quality of Service (QoS) constraints. QoS constraints usually include response time constraints predicating on application component execution times (e.g., a frame of a complex image-processing application must be processed in less than 100 ms) or application throughput (e.g., 40 frames per second must be processed to identify security violations in a video surveillance system).

The goal of this project is to develop reinforcement learning and Bayesian optimization methods to optimize the placement of AI application components on edge-to-cloud infrastructures and to manage the edge system at runtime so as to cope with workload variations.
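To make the placement problem concrete, here is a toy exhaustive search over a three-component application on two tiers, keeping only placements that satisfy a response time constraint. All component names, execution times, network costs, and the deadline are invented; a real solver would replace the brute-force enumeration with the randomized greedy or Bayesian methods mentioned above.

```python
import itertools

components = ["preprocess", "inference", "postprocess"]
# Hypothetical execution time (ms) of each component on each tier.
exec_ms = {
    ("preprocess", "edge"): 10, ("preprocess", "cloud"): 4,
    ("inference", "edge"): 60, ("inference", "cloud"): 20,
    ("postprocess", "edge"): 5, ("postprocess", "cloud"): 2,
}
net_ms = {"edge": 5, "cloud": 60}  # assumed network cost of touching each tier
deadline_ms = 100                  # QoS: response time constraint

def response_time(placement):
    """Sum of execution times plus network cost for every tier used."""
    tiers = set(placement)
    return (sum(exec_ms[(c, t)] for c, t in zip(components, placement))
            + sum(net_ms[t] for t in tiers))

# Brute-force search over the 2^3 placements, keeping only QoS-feasible ones.
best = min(
    (p for p in itertools.product(["edge", "cloud"], repeat=len(components))
     if response_time(p) <= deadline_ms),
    key=response_time,
)
print(best, response_time(best))
```

With these numbers the all-edge placement wins because the assumed cloud round-trip dominates, but a workload spike on the edge would change `exec_ms` and shift the optimum, which is why runtime management is needed.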

References

  1. H. Sedghani, F. Filippini, D. Ardagna. A Random Greedy based Design Time Tool for AI Applications Component Placement and Resource Selection in Computing Continua. IEEE Edge 2021 Proceedings (2021 IEEE International Conference on Edge Computing). 32-40. Guangzhou, China (online). 2021.
  2. H. Sedghani, F. Filippini, D. Ardagna. A randomized greedy method for AI applications component placement in Computing Continua. IEEE JCC 2021 Proceedings (12th IEEE International Conference on JointCloud Computing). Short paper. Online, 1-6. doi: 10.1109/JCC53141.2021.00022.
  3. E. Galimberti, B. Guindani, F. Filippini, H. Sedghani, D. Ardagna, S. Risco, G. Moltó, M. Caballer. OSCAR-P and aMLLibrary: Performance Profiling and Prediction of Computing Continua Applications. AIPerf 2023@ICPE Workshop (1st International Workshop for Performance Modeling, Prediction, and Control). ACM ICPE Companion Proceedings. 139-146. 2023.

Bayesian Optimization for Sizing High Performance Computing Systems

Advisors: Prof. Alessandra Guglielmi, Prof. Danilo Ardagna

Today, High-Performance Computing (HPC) systems are deeply changing our society, e.g., in the financial sector or in healthcare. Companies are becoming more and more aware of the benefits of data processing technologies; in almost every sector, most industries use or plan to use machine learning or simulation-based techniques.

Picking the right HPC cluster configuration for recurring applications is hard: there can be tens of possible configurations to pick from, and choosing poorly can lead to performance degradation and higher costs to run an application. Identifying the best configuration within such a broad spectrum of alternatives is therefore challenging.

The aim of this research work is to build accurate machine learning models that estimate the performance of HPC applications from only a few test runs on reference systems, and to identify optimal or close-to-optimal configurations. Bayesian methods will be combined with traditional performance modelling techniques, including computer system simulations and bounding techniques.
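One simple way to combine performance modelling with a few test runs is to fit an Amdahl-style bound t(n) ≈ a + b/n to the observed runtimes and use it for sizing. The sketch below is only an illustration of this idea: the model form, the test-run data, the deadline, and the per-node price are all assumptions, and the proposal's Bayesian methods would replace this plain least-squares fit.

```python
def fit_amdahl(runs):
    """Least-squares fit of t = a + b / n over (nodes, seconds) pairs."""
    xs = [1.0 / n for n, _ in runs]
    ys = [t for _, t in runs]
    m = len(runs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (m * sxy - sx * sy) / (m * sxx - sx * sx)
    a = (sy - b * sx) / m
    return a, b

# Three hypothetical test runs on a reference system: (nodes, runtime in s).
runs = [(1, 1000.0), (2, 520.0), (4, 280.0)]
a, b = fit_amdahl(runs)

deadline_s = 300.0       # the job must finish within this time
cost_per_node_hour = 2.0  # assumed price of one node

def predicted(n):
    """Predicted runtime on n nodes under the fitted model."""
    return a + b / n

# Cheapest node count (among 1..32) whose predicted runtime meets the deadline.
best = min(
    (n for n in range(1, 33) if predicted(n) <= deadline_s),
    key=lambda n: n * predicted(n) / 3600 * cost_per_node_hour,
)
print(best, predicted(best))
```

Because node-hour cost n·t(n) = a·n + b grows with n under this model, the cheapest deadline-respecting configuration is the smallest feasible cluster.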

References

  1. Brochu, V. M. Cora, N. de Freitas. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning.
  2. Alipourfard, H. H. Liu, J. Chen, S. Venkataraman, M. Yu, M. Zhang. CherryPick: Adaptively Unearthing the Best Cloud Configurations for Big Data Analytics. NSDI 2017 Proceedings.
  3. Palermo, G. Accordi, D. Gadioli, E. Vitali, C. Silvano, B. Guindani, D. Ardagna, A. R. Beccari, D. Bonanni, C. Talarico, F. Lunghini, J. Martinovic, P. Silva, A. Bohm, J. Beranek, J. Krenek, B. Jansik, L. Crisci, B. Cosenza, P. Thoman, P. Salzmann, T. Fahringer, L. Alexander, G. Tauriello, T. Schwede, J. Durairaj, A. Emerson, F. Ficarelli, S. Wingbermuhle, E. Lindahl, D. Gregori, E. Sana, S. Coletti, P. Gschwandtner. Tunable and Portable Extreme-Scale Drug Discovery Platform at Exascale: the LIGATE Approach. ACM International Conference on Computing Frontiers. 1-7.
  4. B. Guindani, D. Ardagna, A. Guglielmi. MALIBOO: When Machine Learning meets Bayesian Optimization. IEEE SmartCloud 2022 Proceedings (The 7th IEEE International Conference on Smart Cloud). 1-9. Shanghai, China. 2022.

Job Scheduling and Optimal Capacity Allocation Problems for Deep Learning Training Jobs with Stochastic Execution Times

The Deep Learning (DL) paradigm has gained remarkable popularity in the last few years. DL models are often used to tackle complex problems in fields such as image recognition and healthcare; however, training such models requires very large computational power. The recent adoption of GPUs as general-purpose parallel processors has partially fulfilled this need, but the high costs of this technology, even in the Cloud, dictate the need for efficient capacity planning and job scheduling algorithms that reduce operational costs via resource sharing. Starting from an already developed heuristic approach, based on random greedy and path relinking, for the joint capacity planning and job scheduling problem, the aim of this work is to extend the existing framework to a more general setting, exploring the stochastic nature of the jobs' expected execution times due to the variability in the number of iterations needed to achieve a target accuracy during training.
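The scheduling side of the problem can be sketched with a classic longest-processing-time list heuristic over expected durations, here treated as point estimates of the stochastic execution times. The job names, durations, and GPU count below are invented for illustration; the proposal's heuristics operate on a far richer model (deadlines, heterogeneous GPUs, costs).

```python
import heapq

jobs = {"resnet": 9.0, "bert": 14.0, "gan": 4.0, "lstm": 7.0}  # expected hours
n_gpus = 2

# Min-heap of (current load, gpu id); each job goes to the least-loaded GPU,
# processing jobs in decreasing order of expected duration (LPT rule).
pool = [(0.0, g) for g in range(n_gpus)]
heapq.heapify(pool)
assignment = {}
for job, hours in sorted(jobs.items(), key=lambda kv: -kv[1]):
    load, gpu = heapq.heappop(pool)
    assignment[job] = gpu
    heapq.heappush(pool, (load + hours, gpu))

makespan = max(load for load, _ in pool)
print(assignment, makespan)
```

Under stochastic execution times, the point estimates above would be replaced by distributions, and the schedule would hedge against jobs that run longer than expected.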

References

  1. F. Filippini, M. Lattuada, M. Ciavotta, A. Jahani, D. Ardagna, E. Amaldi. Hierarchical Scheduling in on-demand GPU-as-a-Service Systems. SYNASC 2020: 125-132.
  2. F. Filippini, D. Ardagna, M. Lattuada, E. Amaldi, M. Ciavotta, M. Riedl, K. Materka, P. Skrzypek, F. Magugliani, M. Cicala. ANDREAS: Artificial intelligence traiNing scheDuler foR accElerAted resource clusterS. EMSICC@FiCloud 2021 Workshop Proceedings (7th International Workshop on Energy Management for Sustainable Internet-of-Things and Cloud Computing). 388-393. 2021.
  3. F. Filippini, M. Lattuada, M. Ciavotta, A. Jahani, D. Ardagna, E. Amaldi. A Path Relinking Method for the Joint Online Scheduling and Capacity Allocation of DL Training Workloads in GPU as a Service Systems. IEEE Transactions on Services Computing. 16(3). 1630-1646. 2023.

Some Previous Thesis Works