Skip to content

Instantly share code, notes, and snippets.

@noahdominic
Created October 29, 2018 03:04
Show Gist options
  • Save noahdominic/8d6b52918005dbdb04e853355af51bbc to your computer and use it in GitHub Desktop.
Save noahdominic/8d6b52918005dbdb04e853355af51bbc to your computer and use it in GitHub Desktop.
error detection

Error detection

Networks must be able to transfer data from one device to another with acceptable accuracy. For most applications, a system must guarantee that the data received are identical to the data transmitted. However, data can be corrupted during transmission. Thus, applications must have a mechanism for detecting and correcting errors.

Introduction

Types of Errors

  1. Single-bit error. Happens when only one (1) bit of a given data unit (byte, character, packet) is changed from 0 to 1 or 1 to 0.
  2. Burst error Happens when two or more bits of a given data unit is changed. This kind of error is more likely to happen because most noises don’t last fast enough to affect only one byte. For example, if one is transmitting data at 1 kbps and a noise occurs at 1/100 s, 10 bits will be affected. If one is transmitting at 1 Mbps, the same noise affects 10000 bits.

Redundancy

To detect errors, we need to send some extra bits with our data. The process of sending additional bits along with the data is called redundancy.

Sender {
    message = some_bit_value
    encoded_message = generator(message)
    send(encoded_message)
}

Receiver{ receive(encoded_message) isAccepted = checker(encoded_message) if(isAccepted) message = extract(encoded_message) else discardMessage() }

Coding

Redundancy is achieved through various coding schemes. The sender adds redundant bits through a process that creates a relationship between the redundant bits and the actual data bits. The receiver checks the relationships between the two sets of bits to detect or correct the errors. The ratio of redundant bits to the data bits and the robustness of the process are important factors in any coding scheme.

Block Coding

In block coding, we divide our message into blocks, each of kkk bits, called datawords. We add rrr redundant bits to each block to make the length n=k+rn = k + rn=k+r. The resulting n-bit blocks are called codewords.

In this model, we have a set of datawords, each of size kkk, to create a combination of 2k2^k2k datawords, and a set of codewords, each of size n=k+rn = k + rn=k+r, to create a set of 2k+r2^{k+r}2k+r codewords. The block coding process is one-to-one; the same datawords is always encoded as the same codeword. Since n>kn > kn>k, we have 2n−2k2^n - 2 ^ k2n2k codewords that are not used.

Example

For example, let us assume that k=2k = 2k=2 and n=3n = 3n=3. The table below shows the list of datawords and codewords.

Datawords Codewords
00 000
01 011
10 101
11 110

Assume that the sender encodes datawork 01 as 011 and sends it to the receiver. Consider the following cases:

  1. The receiver receives 011. It is a valid codeword. The receiver extracts 01.
  2. The codeword is corrupted. 111 is received. This is not a valid codeword and it is discarded.
  3. The code word is corrupted. 000 is received. This is a valid codeword and is accepted. The receiver incorrectly extracts 00. The corruption is undetected.

Hamming Distance

The Hamming distance is one of the central concepts in error control. The Hamming distance is the distance between two words of the same size.

Minimum Hamming Distance

The Hamming distance between the received codewords and the sent codewords is the number of bits corrupted during transmission.
If our code is to detect up to sss errors, the minimum distance between valid codewords must be s+1s + 1s+1.

Let’s assume a code

Datawords Codewords
00 000
01 011
10 101
11 110

d(000,011)=2d(000, 011) = 2d(000,011)=2
d(000,101)=2d(000, 101) = 2d(000,101)=2
d(000,110)=2d(000, 110) = 2d(000,110)=2
d(011,101)=2d(011, 101) = 2d(011,101)=2
d(011,110)=2d(011, 110) = 2d(011,110)=2
d(101,110)=2d(101, 110) = 2d(101,110)=2
dmin=2d_{min}=2dmin=2
This code has a minimum Hamming distance of 222. And could therefore guarantee to catch an error of size 111. Since s+1=dmins + 1 = d_{min}s+1=dmin.

Linear Block codes

A code win which the exclusive OR of two valid codewords creates another valid codeword.
The minimum Hamming distance of a linear block code is the number of 1s on the nonzero valid codeword with the smallest number of 1s.

Our first code is an example of a linear block code

Datawords Codewords
00 000
01 011
10 101
11 110

Simple parity check

In this code, a k-bit dataword is changed to an n-bit codeword where n = k + 1. This extra bit, called the parity bit, is selected to make the total number of 1s even.

Datawords Codewords
0000 00000
0001 00011
0010 00101
0011 00110
0100 01001
0101 01010
0110 01100
0111 01111
1000 10001
1001 10010
1010 10100
1011 10111
1100 11000
1101 11011
1110 11101
1111 11110

The parity bit is generated by adding the four bits in the datawords. The result is the parity bit.
r0=a3+a2+a1+a0(modulo2)r_0 = a_3 + a_2 + a_1 + a_0 (modulo2)r0=a3+a2+a1+a0(modulo2)

The receiver adds all the bits in the received codeword. The result is the syndrome. It’s 000 when the number of 1s is even.
s0=b3+b2+b1+b0+q0(modulo2)s_0=b_3+b_2+b_1+b_0+q_0 (modulo2)s0=b3+b2+b1+b0+q0(modulo2)

Example

Let’s send 1011. We’ll encode it as 10111. There are 5 possible outcomes.

  1. No error occurs. The received codewords is 10111. The syndrome is 0. Dataword 1011 is received.
  2. A single-bit error occurs. 10011. The syndrome is 1. Rejected.
  3. A single-bit error occurs. 10110 is received. The syndrome is 1. Rejected.
  4. Burst error. 00110 is received. Syndrome is 0. Dataword 011 is received.
  5. Burst error. 01011. Syndrome is 1. Rejected.

Hamming codes

In this code, the minimum hamming distance is 3. If m = 3, then n = 7 and k = 4.

Redundant bits are generated and verified by performing three parity checks on a2+a1+a0a_2+a_1+a_0a2+a1+a0, a3+a2+a1a_3+a_2+a_1a3+a2+a1, and a1+a0+a3a_1+a_0+a_3a1+a0+a3.

Checksum

Suppose our data is a list of 4-bit numbers. In addition to these numbers, we send the sum of the numbers. For example, let’s send [7,11,12,0,6][7, 11, 12, 0 ,6][7,11,12,0,6], we will have to include [36][36][36], resulting in a final transmission of [7,11,12,0,6,36][7, 11, 12, 0, 6, 36][7,11,12,0,6,36]. The receiver adds the five numbers and compares it to the sum.

Sometimes the inverse of the sum is sent. If we use our previous example, we’d be sending [7,11,12,0,6,−36][7, 11, 12, 0, 6, -36][7,11,12,0,6,36]. The receiver adds all the numbers. If the result is 000, there were no errors.

References

Behrouz A. Forouzan & Sophia Chung Fegan. Data Communications and Networking [4th Ed.], 2007.
Samuel Zheler. Crates [picture]. Available: unsplash.com
EMI. Trade ad for Beatles’ 1964 Grammys. — This is a version with just the Beatles isolated from the ad, 1965. Available: commons.wikimedia.org

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment