An IDA-Based Parallel Storage Scheme in the Scientific Data Grid

Weizhong Lu; Yuanchun Zhou; Lei Liu; Baoping Yan

doi:10.2481/dsj.009-006

Data Science Journal

Volume 9 (2010): Issue 0

Open Access

May 2010

Abstract

It is important to improve data reliability and data access efficiency for data-intensive applications in a data grid environment. In this paper, we propose an Information Dispersal Algorithm (IDA)-based parallel storage scheme for massive data distribution and parallel access in the Scientific Data Grid. The scheme partitions a data file into unrecognizable blocks and distributes them across many target storage nodes according to the user profile and system conditions. Only a subset of the blocks, which remote clients can download in parallel, is required to reconstruct the data file. The scheme can be deployed on top of current grid middleware. A demonstration and experimental analysis show that the IDA-based parallel storage scheme offers better data reliability and data access performance than existing data replication methods. Furthermore, the scheme has the potential to considerably reduce storage requirements for large-scale databases on a data grid.
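
The dispersal idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the prime field GF(257), the Vandermonde encoding matrix, and the names `disperse`, `reconstruct`, `n` (blocks produced) and `m` (blocks needed) are all assumptions chosen for illustration. It shows only how a file can be turned into `n` blocks such that any `m` of them rebuild it.

```python
P = 257  # assumed prime just above the byte range, so every byte is a field element


def _row(x, m):
    """Vandermonde row [1, x, x^2, ..., x^(m-1)] modulo P."""
    return [pow(x, j, P) for j in range(m)]


def disperse(data: bytes, n: int, m: int):
    """Encode data into n blocks (keyed 1..n); any m of them rebuild it."""
    if not 0 < m <= n < P:
        raise ValueError("need 0 < m <= n < 257")
    padded = data + bytes(-len(data) % m)  # zero-pad to a multiple of m
    chunks = [padded[i:i + m] for i in range(0, len(padded), m)]
    blocks = {}
    for x in range(1, n + 1):
        row = _row(x, m)
        # One symbol per chunk: the dot product of the row and the chunk.
        blocks[x] = [sum(r * b for r, b in zip(row, c)) % P for c in chunks]
    return blocks, len(data)


def _invert(matrix):
    """Invert a square matrix modulo P by Gauss-Jordan elimination."""
    m = len(matrix)
    aug = [list(r) + [int(i == j) for j in range(m)] for i, r in enumerate(matrix)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], P - 2, P)  # modular inverse via Fermat
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(a - f * b) % P for a, b in zip(aug[r], aug[col])]
    return [r[m:] for r in aug]


def reconstruct(subset: dict, m: int, length: int) -> bytes:
    """Rebuild the original bytes from any m blocks of the dispersal."""
    if len(subset) < m:
        raise ValueError("not enough blocks")
    xs = sorted(subset)[:m]
    inverse = _invert([_row(x, m) for x in xs])
    out = bytearray()
    for k in range(len(subset[xs[0]])):
        column = [subset[x][k] for x in xs]
        for row in inverse:
            out.append(sum(r * c for r, c in zip(row, column)) % P)
    return bytes(out[:length])


blocks, length = disperse(b"scientific data grid", n=5, m=3)
survivors = {k: blocks[k] for k in (2, 4, 5)}  # pretend blocks 1 and 3 are lost
restored = reconstruct(survivors, m=3, length=length)
```

In this sketch each block holds roughly 1/m of the file, so the n = 5, m = 3 dispersal stores about 5/3 of the original size while tolerating the loss of any two blocks; plain replication would need three full copies to tolerate the same loss. A production scheme would typically work over GF(2^8) so that block symbols fit in a byte.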

DOI: https://doi.org/10.2481/dsj.009-006 | Journal eISSN: 1683-1470

Language: English

Page range: 29 - 41

Published on: May 19, 2010

Published by: Ubiquity Press

In partnership with: Paradigm Publishing Services

Publication frequency: 1 issue per year

Keywords: Parallel, Storage, Data Grid

© 2010 Weizhong Lu, Yuanchun Zhou, Lei Liu, Baoping Yan, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.
