In telecommunications, 6b/8b is a line code that expands 6-bit codes to 8-bit symbols for the purposes of maintaining DC-balance in a communications system.[1]
The 6b/8b encoding is a balanced code --
each 8-bit output symbol contains 4 zero bits and 4 one bits. So the code can, like a parity bit, detect all single-bit errors.
The number of 8-bit patterns with 4 bits set is the binomial coefficient = 70. Further excluding the patterns 11110000 and 00001111, this allows 68 coded patterns: 64 data codes, plus 4 additional control codes.
Coding rules
The 64 possible 6-bit input codes can be classified according to their disparity, the number of 1 bits minus the number of 0 bits:
Ones
Zeros
Disparity
Number
0
6
−6
1
1
5
−4
6
2
4
−2
15
3
3
0
20
4
2
+2
15
5
1
+4
6
6
0
+6
1
The 6-bit input codes are mapped to 8-bit output symbols as follows:
The 20 6-bit codes with disparity 0 are prefixed with 10 Example: 000111 → 10000111 Example: 101010 → 10101010
The 15 6-bit codes with disparity +2, other than 001111, are prefixed with 00 Example: 010111 → 00010111
The 15 6-bit codes with disparity −2, other than 110000, are prefixed with 11 Example: 101000 → 11101000
The remaining 20 codes: 12 with disparity ±4, 2 with disparity ±6, 001111, 110000, and the 4 control codes, are assigned to codes beginning with 01 as follows:
Type
Input
Output
Type
Input
Output
Complement
−6
000000
01011001
+6
111111
01100110
01_xx__x
−4
000001
01110001
+4
111110
01001110
01xx____
000010
01110010
111101
01001101
000100
01100101
111011
01011010
01x____x
001000
01101001
110111
01010110
010000
01010011
101111
01101100
01_____xx
100000
01100011
011111
01011100
−2
110000
01110100
+2
001111
01001011
01____x__
Control
K 000111
01000111
Control
K 111000
01111000
K 010101
01010101
K 101010
01101010
No data symbol contains more than four consecutive matching bits, and because the patterns 11110000 and 00001111 are excluded, no data symbol begins or ends with more than three identical bits.
Thus, the longest run of identical bits that will be produced is 6. (I.e. this is a (0,5) RLL code, with a worst-case running disparity of +3 to −3.)
Any occurrence of 6 consecutive identical bits constitutes a comma sequence or sync mark or syncword; it identifies the symbol boundaries precisely.
Those 6 bits straddle the inter-symbol boundary with exactly 3 of those identical bits at the end of one symbol, and 3 of those identical bits at the start of the following next symbol.