Tuesday, June 21 • 2:50pm - 3:15pm
Deduplicating Compressed Contents in Cloud Storage Environment

Data compression and deduplication are two common approaches to increasing storage efficiency in the cloud environment. Both users and cloud service providers have economic incentives to compress their data before storing it in the cloud. However, our analysis indicates that compressed packages of different data and differently compressed packages of the same data are usually fundamentally different from one another at the byte level, even when the data they contain is largely redundant. Existing data deduplication systems cannot detect redundant data among them. We propose the X-Ray Dedup approach, which extracts from these packages their unique metadata, such as the “checksum” and “file length” information, and uses it as the compressed file’s content signature to help detect and remove file-level data redundancy. Our evaluations show that X-Ray Dedup can break through the boundaries of compressed packages and significantly reduce their storage space requirements, thus further optimizing storage space in the cloud.
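
As an illustration only, the idea of using per-file metadata as a content signature can be sketched in Python. The sketch assumes ZIP packages, whose headers record a CRC-32 checksum and an uncompressed length for each member; the abstract does not name the package formats X-Ray Dedup supports, and the function names here (`build_zip`, `member_signatures`, `shared_members`) are hypothetical, not taken from the authors' system.

```python
import io
import zipfile


def build_zip(files, compression):
    """Pack {name: bytes} into an in-memory ZIP using the given method."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=compression) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def member_signatures(package_bytes):
    """Map (CRC-32, uncompressed length) -> member name.

    Only the ZIP headers are read; no member is decompressed.
    """
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return {
            (info.CRC, info.file_size): info.filename
            for info in zf.infolist()
            if not info.is_dir()
        }


def shared_members(pkg_a, pkg_b):
    """Names (as stored in pkg_a) of members whose signatures also occur in pkg_b."""
    sig_a = member_signatures(pkg_a)
    sig_b = member_signatures(pkg_b)
    return sorted(sig_a[s] for s in sig_a.keys() & sig_b.keys())
```

Given two packages that contain the same file, one built with `zipfile.ZIP_STORED` and the other with `zipfile.ZIP_DEFLATED`, the package bytes differ, yet `shared_members` still reports the common file. Because CRC-32 is short, a matching signature is best treated as a candidate duplicate that a real system would likely confirm with a stronger check before discarding data.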

Tuesday June 21, 2016 2:50pm - 3:15pm MDT
Denver Marriott City Center 1701 California Street, Denver, CO 80202