Spectral Clustering and Bounded-load Consistent Hashing for Data Placement in Heterogeneous Geo-distributed Systems


T.V. Rohini and Dr. M.V. Ramakrishna
Abstract

There has been considerable progress in processing data placed in heterogeneous distributed environments. Geo-distributed environments built from commodity processors have evolved to handle data volumes ranging from hundreds of terabytes to petabytes. In this paper we propose a novel approach to dynamic data placement that combines spectral clustering with consistent hashing. The system is modeled as a hypergraph: we construct the corresponding hypergraph incidence matrix and apply a spectral clustering algorithm to it. Bounded-load consistent hashing is then used to store data on the storage nodes, which provides load balancing, fault tolerance, and scalability. In addition to the storage and processing capacities of the nodes, network bandwidth is considered as a constraint on data placement. The goal of data placement optimization is to minimize access latency and maximize the throughput of the storage system. Simulation results are presented that show the merits of our system in comparison with existing systems.
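The bounded-load consistent hashing step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `BoundedLoadRing`, the parameters `c` and `vnodes`, and the node and key names are all hypothetical, and the sketch assumes equal node capacities, whereas the paper also accounts for heterogeneous storage, processing, and bandwidth constraints. Each key is hashed onto a ring and assigned to the first node clockwise whose load is below the cap ceil(c * m / n), where m is the number of keys placed so far (including the new one) and n is the number of nodes.

```python
import bisect
import hashlib
import math


def _hash(value: str) -> int:
    """Map a string to a stable 64-bit point on the hash ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class BoundedLoadRing:
    """Illustrative consistent-hash ring with a per-node load bound.

    Assumption: all nodes have equal capacity; `c` >= 1 is the load factor.
    """

    def __init__(self, nodes, c=1.25, vnodes=64):
        self.c = c
        self.load = {node: 0 for node in nodes}
        self.placement = {}  # key -> node
        # Each node gets `vnodes` virtual points to smooth the distribution.
        self.ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self.points = [point for point, _ in self.ring]

    def capacity(self, total_keys: int) -> int:
        """Per-node cap: ceil(c * m / n) for m keys on n nodes."""
        return math.ceil(self.c * total_keys / len(self.load))

    def place(self, key: str) -> str:
        """Assign `key` to the first clockwise node that is below the cap."""
        if key in self.placement:
            return self.placement[key]
        cap = self.capacity(len(self.placement) + 1)
        start = bisect.bisect_right(self.points, _hash(key))
        for step in range(len(self.ring)):
            _, node = self.ring[(start + step) % len(self.ring)]
            if self.load[node] < cap:
                self.load[node] += 1
                self.placement[key] = node
                return node
        # Unreachable for c >= 1: some node is always below the cap.
        raise RuntimeError("no node below capacity")


ring = BoundedLoadRing(["node-a", "node-b", "node-c", "node-d", "node-e"])
for i in range(1000):
    ring.place(f"block-{i}")
max_load = max(ring.load.values())
```

With `c = 1.25`, no node in this sketch holds more than 25% above the average load (rounded up), which is the load-balancing property the abstract relies on. A smaller `c` tightens the balance at the cost of more keys spilling past their first-choice node.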

Volume 11 | 06-Special Issue

Pages: 306-315