Final year project guidance:: projectwale

Deep Learning-based Job Placement in Distributed Machine Learning Clusters

Abstract

Production machine learning (ML) clusters commonly host a variety of distributed ML workloads, e.g., speech recognition and machine translation. While server sharing among jobs improves resource utilization, interference among co-located ML jobs can lead to significant performance degradation. Existing cluster schedulers (e.g., Mesos) are interference-oblivious in their job placement, which results in suboptimal resource efficiency. Interference-aware job placement has been studied in the literature, but existing approaches rely on detailed workload profiling and interference modeling, which does not generalize across workloads. This paper presents Harmony, a deep learning-driven ML cluster scheduler that places training jobs in a manner that minimizes interference and maximizes performance (i.e., minimizes training completion time). Harmony is based on a carefully designed deep reinforcement learning (DRL) framework augmented with reward modeling. The DRL framework employs state-of-the-art techniques to stabilize training and improve convergence, including an actor-critic algorithm, job-aware action space exploration, and experience replay. Because reward samples are commonly unavailable for many placement decisions, we build an auxiliary reward prediction model, which is trained on historical samples and used to produce rewards for unseen placements. Experiments using real ML workloads in a Kubernetes cluster of 6 GPU servers show that Harmony outperforms representative schedulers by 25% in terms of average job completion time.
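The abstract names experience replay as one of the techniques used to stabilize DRL training. The following is a minimal sketch of a generic replay buffer with uniform sampling, not Harmony's actual implementation. The names `Transition` and `ReplayBuffer`, the state encoding (GPUs in use per server), and the reward values are all assumptions made for illustration.

```python
import random
from collections import deque, namedtuple

# One placement experience (hypothetical layout):
# cluster state, chosen server index, observed reward, resulting state.
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state"])


class ReplayBuffer:
    """Fixed-capacity buffer with uniform random sampling (illustrative helper)."""

    def __init__(self, capacity, seed=None):
        # A bounded deque drops the oldest experience once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state):
        self.buffer.append(Transition(state, action, reward, next_state))

    def sample(self, batch_size):
        # Never request more items than are currently stored.
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


buf = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    # Toy values: the state is a tuple of GPUs in use per server,
    # and the reward is a made-up negative completion time.
    buf.add(state=(step, 0), action=step % 2,
            reward=-float(step), next_state=(step + 1, 0))

batch = buf.sample(2)  # mini-batch that a learner would train on
```

Sampling past experiences at random, rather than training only on the most recent one, reduces the correlation between consecutive training samples, which is the usual motivation for this technique.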

Modules

1) Login
2) Registration
3) Exam system
4) Prediction of job

Algorithms

Deep neural network (DNN)
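As a rough illustration of what a DNN computes, the sketch below runs the forward pass of a tiny two-layer feed-forward network in plain Python. The layer sizes, the weights in `params`, the input features, and the helper names (`dense`, `relu`, `sigmoid`, `forward`) are all made up for this example; the project's actual network is not described in this listing.

```python
import math


def relu(values):
    # Rectified linear unit applied element-wise.
    return [max(0.0, x) for x in values]


def dense(inputs, weights, biases):
    # Fully connected layer: one weight row and one bias per output unit.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, params):
    # Hidden layer with ReLU, then a single sigmoid output in (0, 1).
    hidden = relu(dense(features, params["W1"], params["b1"]))
    (logit,) = dense(hidden, params["W2"], params["b2"])
    return sigmoid(logit)


# Hypothetical, hand-picked parameters (a trained model would learn these).
params = {
    "W1": [[0.5, -0.2], [0.1, 0.4]],
    "b1": [0.0, 0.1],
    "W2": [[1.0, -1.0]],
    "b2": [0.0],
}

score = forward([1.0, 2.0], params)  # a value strictly between 0 and 1
```

In practice the weights would be fitted with a library such as those shipped with Anaconda; writing the layers out by hand here only makes the arithmetic visible.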

Software And Hardware

• Hardware:
  Processor: i3, i5
  RAM: 4 GB
  Hard disk: 16 GB
• Software:
  Operating system: Windows 2000/XP/7/8/10
  Tools: Anaconda, Jupyter, Spyder
  Frontend: Python
  Backend: MySQL

Price

₹10000 (INR)

Year

2019


Call us on +91 9004670813
