Challenges of big data storage and management
Abstract
The amount of data generated daily by industries, large organizations and research institutes is increasing at a very fast rate. These huge volumes of data need to be kept not just for analytic purposes, but also in compliance with laws and service level agreements to protect and preserve data. Storage and management are major concerns in this era of big data. The ability of storage devices to scale to meet the rate of data growth while improving access time and data transfer rate is equally challenging. These factors, to a considerable extent, determine the overall performance of data storage and management. Big data storage requirements are complex and thus need a holistic approach to mitigate their challenges. This paper examines the challenges of big data storage and management. In addition, we examine existing big data storage and management platforms and provide useful suggestions for mitigating these challenges.
Keywords: big data, storage systems, challenges, performance.
This work is licensed under a Creative Commons Attribution 4.0 International License.