# Number TheoryHelp with encoding matrices

#### Yuuki

##### Member
In my textbook, the encoding matrix H was given as the matrix such that
if U is a codeword then HUT is the zero vector.
In this case, the number of columns in H would be the length of the codeword.
But in another explanation, I read that an encoding matrix H is such that if M is the message and U the codeword with the check digits appended, HM = U.
In this case, the number of columns would be the length of the message.

This question arose in the context of the following question, so I may ask questions related to it later, after I understand better what exactly an encoding matrix is.

To transmit the eight possible messages of three binary symbols, a code appends three further symbols: if the message is ABC, the code transmits ABC(A+B)(A+C)(B+C). Write down the encoding matrix for this code. How many errors can it detect and correct?

#### chisigma

##### Well-known member
In my textbook, the encoding matrix H was given as the matrix such that
if U is a codeword then HUT is the zero vector.
In this case, the number of columns in H would be the length of the codeword.
But in another explanation, I read that an encoding matrix H is such that if M is the message and U the codeword with the check digits appended, HM = U.
In this case, the number of columns would be the length of the message.

This question arose in the context of the following question, so I may ask questions related to it later, after I understand better what exactly an encoding matrix is...

To transmit the eight possible messages of three binary symbols, a code appends three further symbols: if the message is ABC, the code transmits ABC(A+B)(A+C)(B+C). Write down the encoding matrix for this code. How many errors can it detect and correct?...
The [n,k] code has generating matrix that in standard form $G = [I_{k} | A ]$, with n=6 and k=3 is written as...

$$G = \left| \begin{array}{ccc}1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 \end{array} \right| \ (1)$$

... and from (1) You derive the parity check matrix in the form $H = [- A^{T}|I_{n-k}]$ ...

$$H = \left| \begin{array}{ccc}1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \end{array} \right| \ (2)$$

The minimum distance can be computed from the 8 possible codewords...

0 0 0 0 0 0
0 0 1 0 1 1
0 1 0 1 0 1
0 1 1 1 1 0
1 0 0 1 1 0
1 0 1 1 0 1
1 1 0 0 1 1
1 1 1 0 0 0

... and it is easy to see that the minimun distance is $D_{min}=3$, so that the code is capable to correct 1 error in the block and detect [but not correct...] 2 errors in the block...

Kind regards

$\chi$ $\sigma$

Last edited:

#### Yuuki

##### Member
Thank you, I now understand.
I see that I was confused between generating matrix and encoding matrix.

I have a tag-along question: are the following properties unique to an encoding matrix for a Hamming code matrix, or do they hold in general?
- if V is a non-codeword, HVT shows exactly which symbol is wrong.
- each column of H is a binary representation of its column number. (I think holds only for Hamming codes from the problem above. if so, is it true that every encoding matrix for a (2n-1, 2n-1-1n) can be obtained by permuting the columns of one for a Hamming code encoding matrix?)

Last edited:

#### chisigma

##### Well-known member
Thank you, I now understand.
I see that I was confused between generating matrix and encoding matrix.

I have a tag-along question: are the following properties unique to an encoding matrix for a Hamming code matrix, or do they hold in general?
- if V is a non-codeword, HVT shows exactly which symbol is wrong.
- each column of H is a binary representation of its column number. (I think holds only for Hamming codes from the problem above. if so, is it true that every encoding matrix for a (2n-1, 2n-1-1n) can be obtained by permuting the columns of one for a Hamming code encoding matrix?)
... if V is a non-codeword, HVT shows exactly which symbol is wrong?...

Unfortunately in general that's not true!... if a single error occurred in a symbols block, then the syndrome HVT shows You the position of the error, but if a double error occurred, because half of de codeword have distance 3 and half have distance 4, Your are not able to correct the corrupted word. A possible strategy to improove security is to renounce to the capability to correct errors and maintain the capability to detect errors, and in this case all double errors are detected. From this point of view the proposed code is not optimal and a stanrad [7,4] Hamming code, for which is...

$$G = \left| \begin{array}{ccc}1 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{array} \right| \ (1)$$

... and...

$$H = \left| \begin{array}{ccc}1 & 1 & 0 & 1 & 1 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 1 \end{array} \right| \ (2)$$

... has greater efficiency and, because is for it $D_{min} = 3$, allows the correction of all single errors and the detection of all double errors...

... each column of H is a binary representation of its column number. I think holds only for Hamming codes from the problem above. If so, is it true that every encoding matrix for a (2n-1, 2n-1-1n) can be obtained by permuting the columns of one for a Hamming code encoding matrix?...

Yes, that's correct...

Kind regards

$\chi$ $\sigma$

#### Yuuki

##### Member
Thank you.

1. I understand "n-error-correcting" is a property such that if a non-codeword has at most n errors, the original codeword can be uniquely retrieved. Is n-error correcting equivalent to having a parity check matrix such that HVT shows exactly which (at most n) symbols are wrong for non-codewords?
?

2.
has greater efficiency and
can you explain what the two words "security" and "efficiency" mean?
why is the hamming code more efficient than the [6, 3] code above?
From what I looked up, "efficiency" means less computational effort; how does this relate to the matrices of each code?

3.
A possible strategy to improove security is to renounce to the capability to correct errors and maintain the capability to detect errors
so for a code, detecting an error is more important than correcting them?

4.
if a double error occurred, because half of de codeword have distance 3 and half have distance 4
can you give me a hint to how I can arrive at this?

Last edited:

#### chisigma

##### Well-known member
1. I understand 'n-error-correcting' is a property such that if a non-codeword has at most n errors, the original codeword can be uniquely retrieved. Is n-error correcting equivalent to having a parity check matrix such that HVT shows exactly which (at most n) symbols are wrong for non-codewords?...

For semplicity sake let's consider the case n=1, which menas that the minimum distance among its codewords must be not less than 3. In reception You compute the syndrome S= HVT and if S is not the zero vector You know that the received message contains one or more errors. At this point You have two alternatives :

a) You suppose that only one error occurred and proceed to correct it...

b) You 'suspect' that more than one error occurred, don't accept the 'risk' and require, if possible, the repetition of the message...

2. ... can you explain what the two words 'security' and 'efficiency' mean?... why is the Hamming code more efficient than the [6, 3] code above?... from what I looked up, 'efficiency' means less computational effort; how does this relate to the matrices of each code?...

I mean 'efficiency' of a block code the ratio between information bits and transmitted bits in each block, i.e. the ratio $S= \frac{k}{n}$. A [6,3] code has $S= \frac{1}{2}$ and a [7,4] Hamming code has $S=\frac{4}{7} > \frac{1}{2}$ and that means that You have an higher 'information rate'...

3. ... so for a code, detecting an error is more important than correcting them?...

As explained in 1., that's depends from Your 'strategy' to oppose to errors...

4. ... if a double error occurred, because half of de codeword have distance 3 and half have distance 4... can you give me a hint to how I can arrive at this?...

Very well!... the 8 possible codewords of the [6,3] code are reported here...

0 0 0 0 0 0
0 0 1 0 1 1
0 1 0 1 0 1
0 1 1 1 1 0
1 0 0 1 1 0
1 0 1 1 0 1
1 1 0 0 1 1
1 1 1 0 0 0

If You suppose to have trasmitted the codeword 0 0 0 0 0 0 , it is easy to see that there are 4 codewords with distance 3 and 3 words with distance 4 from it. You can easily check that the same if for all the remaining codewords...

Kind regards

$\chi$ $\sigma$

#### Yuuki

##### Member
I mean 'efficiency' of a block code the ratio between information bits and transmitted bits in each block, i.e. the ratio S=kn. A [6,3] code has S=12 and a [7,4] Hamming code has S=47>12 and that means that You have an higher 'information rate'...
So I guess the intuitive approach as to why a code is a Hamming code is more "efficient" is because you can pack more content into the same amount of data.

Thank you so much for the answers.
For the moment, I have no more questions.