Run-Length Encoding (RLE) Explained

Run-Length Encoding (RLE) is one of the simplest and oldest forms of lossless data compression. Its core idea is to reduce the size of repeating sequences of data, often called "runs." Instead of storing each individual data item in a run, RLE stores the data item and the count of its repetitions.
How Does RLE Work?
Imagine you have a sequence of data like: AAAAABBBCCDAA
.
RLE would encode this as:
A
repeats 5 times: Encode as5A
(orA5
, depending on implementation)B
repeats 3 times: Encode as3B
C
repeats 2 times: Encode as2C
D
repeats 1 time: Encode as1D
(or justD
if single occurrences aren't encoded with a count)A
repeats 2 times: Encode as2A
So, the original sequence of 13 characters could be represented more compactly, for example, as 5A3B2C1D2A
(10 characters, in this specific representation).
The actual implementation can vary. For example:
- Some RLE schemes only encode runs of three or more identical characters to avoid expanding data where there are few runs (e.g.,
ABCDE
would become1A1B1C1D1E
, which is longer). - Special markers might be used to differentiate between literal data and run-count/data pairs.
Example
Consider a small black and white bitmap image line represented by WWWWBBWWWBBBBW
(W for white pixel, B for black pixel).
Using RLE, this could be encoded as:
- 4 White pixels:
4W
- 2 Black pixels:
2B
- 3 White pixels:
3W
- 4 Black pixels:
4B
- 1 White pixel:
1W
The encoded string: 4W2B3W4B1W
. The original string had 14 characters, while the RLE version (in this format) has 10 characters.
Advantages of RLE
- Simplicity: It's very easy to implement and understand.
- Speed: Compression and decompression are generally very fast due to the straightforward logic.
- Effective for specific data types: Works very well on data with many long runs of identical values, such as simple icons, line drawings, and some types of bitmap images (especially those with large areas of uniform color), or intermediate data in scientific computations.
Disadvantages of RLE
- Inefficiency with non-repetitive data: If the data has few or no runs, RLE can actually increase the file size. For example, compressing
ABCDEFG
might result in1A1B1C1D1E1F1G
. - Limited compression ratios: For complex data like photographs or text documents, RLE typically achieves poor compression compared to more sophisticated algorithms like LZW or Huffman coding.
Common Use Cases
- Image Compression: Early image formats like PCX and BMP (in some modes) use RLE. It's also a component in some fax machine standards (ITU-T T.4 standard).
- Video Compression: Can be used for compressing sequences of identical frames or regions within frames.
- Intermediate Compression Step: Sometimes RLE is used as a preliminary step before applying another, more complex compression algorithm.
- Computer Graphics: Used in areas like encoding masks or simple textures.
Further Reading
For a more technical dive into data compression techniques, including RLE, you can explore resources like the GeeksforGeeks article on Run-Length Encoding.