# Design Study of a Computer System Employing Memory Compression

[Examensarbete på avancerad nivå]

Memory compression is a promising technique for computer systems to increase cache and memory capacity, leading to a decrease of the number of required lower-level accesses
without adding signicant cost or energy consumption to a system. This thesis answers some of the questions arising when implementing a Human algorithm as compression
algorithm on main memory content, such as compression factor compared to other compression algorithms and feasibility for compression in a system.

This thesis contains a compression analysis of two scale-out benchmarks from CloudSuite. The two benchmarks, Data analytics and Graph analytics, were set up running on virtual machines. While the benchmarks were running, the virtual machines had their memory extracted at multiple occasions. By using the virtual-to-physical address translation for the benchmarks' processes, in combination with the extracted memory sample, it was possible to create images of the memory used by the benchmarks' processes. These images were analysed with several dierent compression algorithms including Huffman, BI, and FPC.

The Huffman algorithm uses a value frequency table (VFT) which contains values and their frequency. The VFT is used to create the encoding for the Human tree where values with high frequency are given a short code word. A larger VFT will give a better compression factor but reduce speed, increase area and energy consumption. This thesis shows that the size of the VFT can be kept small and that the gains in compression factor of having a large VFT signicantly decreased with VFT sizes over 1024 entries.
The results in the compression analysis show that the memory compression factor for the two CloudSuite benchmarks is around 2.7X for a VFT with 1024 entries.

In a traditional computer system blocks and pages are xed in size. A computer system using memory compression must allow for blocks and pages to vary in size, however, it
would be infeasible to allow for all dierent block and page sizes. The thesis shows the impact on the Huffman compression factor when only particular compression factors or compressed sizes are allowed. Four additional block addressing bits result in a compression loss of between 4%-7%. A similar analysis for pages show that adding four or six additional addressing bits result in compression losses of between 6%-9% and 1%-2%, respectively.

Following the compression analysis, a few compression schemes were implemented and evaluated in hardware. One of these schemes, the Huffman coding scheme, was also implemented as part of a proof-of-concept. The proof-of-concept was implemented on a ZedBoard, an evaluation board that incorporates both a dual ARM core as well as programmable logic. The ordinary datapath between the cores and the memory was rerouted to pass through the programmable logic of the ZedBoard. In this programmable
logic, the Huffman compressing and decompressing logic was implemented.