The Details of Dropbox H.264 Lossless Compression
A couple of weeks ago we came across an article about Dropbox engineers recreating the Pied Piper algorithm from a popular TV show. Dropbox developed a lossless compression algorithm for H.264 and JPEG files, and our team set out to evaluate this solution and extract some concrete details.
It quickly became clear that a re-compressed H.264 file is no longer an H.264 file and can only be used for interim storage. Moreover, this compression method is only effective under one of two conditions: the file uses CAVLC as its entropy coder, or the file is coded with PU and TU blocks of maximal size. Both conditions can only be met if the H.264 codec is set for maximal coding speed.
About the Pied Piper project
Video is one of the heaviest data formats, so it is no surprise that video processing, transfer and storage services have to consider compression as a viable solution. There are a few options out there. In August 2015 Dropbox unveiled its take on the problem: an original, yet incomplete, algorithm for H.264 standard video files. The company's main focus is storing its customers' files. An average user cares little about how these files are stored; what matters is being able to download exactly the same files, intact, the way they were uploaded. Hence the Dropbox algorithm is a lossless one. The result of the compression, however, is not a video file in the source format.
Our article aims to assess the efficiency of this algorithm when compressing diverse video files. For that purpose, we'll be using Solveig Multimedia's Zond 265, an analyzer of H.264 and H.265 video files, as an auxiliary tool.
Estimating efficiency of the Pied Piper project compressor
The compressor's source code and test files are available on GitHub. For starters, we'll compile the compressor and evaluate it on the test files. Afterwards, we'll measure its effectiveness against real-life video files.
Build instructions for the Pied Piper compressor are only available for Linux. In fact, they amount to a single file, the "piedpiper_make" script. Hence we boot Ubuntu Linux x64 and enter three commands:
After the compilation, check your current folder for the compressor files:
- h264dec – the compressor and decompressor executable file
- so.0, libopenh264.so – a dynamic auxiliary library and a link to it.
Compression is performed by the command:
./h264dec ./source-file.264 ./destination-file.pip
and decompression by the command:
./h264dec ./compressed-file.pip ./decompressed-file.264
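The two commands above can be wrapped in a small script that also checks the lossless claim and measures the saving. This is a minimal sketch, not part of the Pied Piper repository: the helper names (`run_h264dec`, `round_trip_report`) are our own, and we assume the `h264dec` binary sits in the current folder and takes exactly the two positional arguments shown above.

```python
import hashlib
import subprocess
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_h264dec(src, dst, binary="./h264dec"):
    """Call the tool as in the article: <binary> <input> <output>.

    The binary location is an assumption; adjust it to your build folder.
    """
    subprocess.run([binary, str(src), str(dst)], check=True)


def round_trip_report(source, packed, restored):
    """Compare sizes and hashes of the source, .pip and restored files."""
    src_size = Path(source).stat().st_size
    pip_size = Path(packed).stat().st_size
    return {
        "source_bytes": src_size,
        "packed_bytes": pip_size,
        # Positive when the .pip file is smaller than the source.
        "saving_percent": 100.0 * (src_size - pip_size) / src_size,
        # Lossless means the restored file is byte-identical to the source.
        "lossless": sha256_of(source) == sha256_of(restored),
    }
```

A typical session would call `run_h264dec("tibby.264", "tibby.pip")`, then `run_h264dec("tibby.pip", "tibby.out.264")`, and finally `round_trip_report("tibby.264", "tibby.pip", "tibby.out.264")`. The output file names here are placeholders of our choosing.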
Pied Piper test files
According to the Git repository, the Dropbox team used these files as test material: "black.264", "tibby.264", "walk.264", "BA1_FT_C.264", "BAMQ2_JVC_C.264". We loaded them into Zond 265 and found that they were all encoded with the same settings (see the Zond 265 screenshots, Bitstream tab, pic. 1 - 3 for the "tibby.264" file). The files' main properties are the use of CAVLC (PPS, entropy_coding_mode_flag: 0) and the absence of B-frames (SPS, max_num_reorder_frames: 0). We picked the first three files for our efficiency tests.
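For readers without an analyzer at hand, the entropy coder can also be read straight from the bitstream. The sketch below is our own illustration, not how Zond 265 works: it assumes a raw Annex B ".264" stream, finds each Picture Parameter Set (nal_unit_type 8) and reads entropy_coding_mode_flag, which follows two Exp-Golomb coded ids. It does not look at max_num_reorder_frames, which sits deep inside the optional VUI part of the SPS.

```python
def strip_emulation_prevention(data):
    """Drop each 0x03 byte that follows two zero bytes inside a NAL unit."""
    out = bytearray()
    zeros = 0
    for byte in data:
        if zeros >= 2 and byte == 3:
            zeros = 0
            continue
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)


class BitReader:
    """Read single bits and ue(v) codes from a byte string, MSB first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, in bits

    def bit(self):
        byte = self.data[self.pos >> 3]
        value = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return value

    def ue(self):
        """Unsigned Exp-Golomb code: n zeros, a one, then n value bits."""
        leading_zeros = 0
        while self.bit() == 0:
            leading_zeros += 1
        value = 0
        for _ in range(leading_zeros):
            value = (value << 1) | self.bit()
        return (1 << leading_zeros) - 1 + value


def iter_nal_units(stream):
    """Yield NAL units from an Annex B stream (00 00 01 start codes)."""
    starts = []
    i = stream.find(b"\x00\x00\x01")
    while i != -1:
        starts.append(i + 3)
        i = stream.find(b"\x00\x00\x01", i + 3)
    for n, begin in enumerate(starts):
        end = starts[n + 1] - 3 if n + 1 < len(starts) else len(stream)
        # Trailing zeros belong to the next (4-byte) start code.
        yield stream[begin:end].rstrip(b"\x00")


def entropy_coders(stream):
    """Return the set of entropy coders named by the PPS units found."""
    found = set()
    for nal in iter_nal_units(stream):
        if not nal or (nal[0] & 0x1F) != 8:  # nal_unit_type 8 = PPS
            continue
        reader = BitReader(strip_emulation_prevention(nal[1:]))
        reader.ue()  # pic_parameter_set_id
        reader.ue()  # seq_parameter_set_id
        found.add("CABAC" if reader.bit() else "CAVLC")
    return found
```

Reading a file with `open("tibby.264", "rb").read()` and passing the bytes to `entropy_coders` should, for the test files described above, report CAVLC only.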
Pic. 1. Sequence Parameter Set (SPS) block for "tibby.264" file
Pic. 2. Picture Parameter Set block for "tibby.264" file
Pic. 3. Video stream structure for "tibby.264" file
Other test files
Users obtain video files in many ways: shooting a video with a camera (e.g., on a cell phone), downloading it from numerous video services (YouTube, VK, Vimeo, Facebook, etc.), or using software with recoding functionality.
We select the "VID_20150917_131139.264" file as an ordinary cell-phone video clip. Like the previous samples, it contains no B-frames, but it uses CABAC rather than CAVLC as its entropy coder. The compressor returns an error for YouTube files (they include B-frames and use CABAC), so we exclude those from our tests.
For recoding software, we opt for the ffmpeg console utility ("libx264" module). To anticipate the results: compression is only possible with the "ultrafast" preset, while the "superfast" preset yields no result. The test files are "tractor-ultrafast.264" and "tractor-superfast.264".
The test results are displayed in Tables 1 and 2. The quantitative data for PU and TU blocks was collected using Zond 265 (Stream Stats tab). Pic. 4 shows a screenshot of the "tibby.264" file data.
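Test files of this kind can be produced with commands along the following lines. This is a sketch under assumptions: we do not reproduce the exact command line used for the tests, and the input name "source-clip.y4m" is a hypothetical placeholder. The ffmpeg options themselves (`-c:v libx264`, `-preset`, `-an`, `-f h264`) are standard and write a raw H.264 stream without audio.

```python
def x264_encode_command(source, preset, output):
    """Build an ffmpeg command that writes a raw H.264 (Annex B) stream."""
    return [
        "ffmpeg", "-y",       # overwrite the output file if it exists
        "-i", source,         # input clip (hypothetical name in the demo)
        "-an",                # drop audio, the compressor handles video only
        "-c:v", "libx264",    # encode with the libx264 module
        "-preset", preset,    # speed/efficiency trade-off of the encoder
        "-f", "h264",         # raw elementary stream, no container
        output,
    ]


def tractor_commands(source="source-clip.y4m"):
    """Commands for the two presets compared in the article."""
    return {
        preset: x264_encode_command(source, preset, f"tractor-{preset}.264")
        for preset in ("ultrafast", "superfast")
    }
```

Each list can be passed to `subprocess.run(command, check=True)` on a machine with ffmpeg installed.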
Pic. 4. PU and TU blocks data for "tibby.264" file
Table 1. Pied Piper compressor. Efficiency test results.
Table 2. Pied Piper compressor. Efficiency test results.
As the tables above show, the current version of the Pied Piper algorithm only works in two scenarios: the file uses CAVLC as its entropy coder, or the file is coded with PU and TU blocks of maximal size. Both scenarios require an H.264 codec set for maximal coding speed, which in practice means fairly large files. Such are the files created by ffmpeg with libx264 and the "ultrafast" preset.
That's it for now. We appreciate the time taken to go through this research with our team. Looking forward to your feedback!