This PhD is written in french.
Ordonnancement Multiprocesseurs et Distribué Temps réel
"Scheduling on Parallel and Distributed Real Time Systems"
PhD Thesis Abstract:
In this thesis, we consider several scheduling problems of parallel computations. A parallel application is modeled by a Directed Acyclic Graph (DAG), where vertices represent tasks and edges the precedence constraints between them. Our contributions fit into three contexts.
The first problem considered is the scheduling and clustering of general and large task graphs with arbitrary execution/communication time onto parallel processing systems with distributed memory. In this context, we propose two algorithms: 1-- An efficient clustering algorithm, based on the technique of critical path scheduling. It is guided by a set of rules in order to reduce considerably the number of processors used in the scheduling process. In order to reduce the time complexity of the proposed algorithm, the length computation of the critical path is done incrementally from one step to another. 2-- A scheduling algorithm on a bounded number of processors. The task selection for scheduling is reduced to only one instead of all and processor selection is reduced to only two instead of all. This algorithm is both fast and powerful compared to other algorithms in the literature.
The second context of this thesis deals with the problem of reliability and scheduling in heterogeneous computing systems. We propose an algorithm which takes into account not only the time makespan but also the failure probability of the application. The heuristic is validated by a series of experimentations and the results obtained are promising.
Finally, we present a new distributed dynamic scheduling scheme for sporadic real-time jobs (DAGs) on arbitrary wide networks. Jobs arrive on any site at any time and compete for the computational resources of the network. The scheduling algorithm presented in this thesis, is based upon a new concept of Computing Spheres in order to determine a good neighborhood of sites that may cooperate for the execution of a job if it cannot be guaranteed locally. This leads to a noticeable increase of the number of accepted jobs. The salient feature of this new concept is that it allows the algorithm to be performed on arbitrary wide networks since it uses a limited number of sites and communication links.
Keywords: reliability, scheduling, clustering, real-time systems, parallel systems, distributed computing systems, directed acyclic graphs DAGs, bi-criteria scheduling, heterogeneous systems.