lab_encoding/questions.md
2026-03-09 13:59:17 -04:00


# Boolean questions
Create the following variables.
```
a = Bits("11110000")
b = Bits("10101010")
```
For each of the following bytes, give an equivalent
expression which uses only `a`, `b`, and bit operators.
The answers to the first two questions are given.
1. 01010101
~b
2. 00000101
~a & ~b
3. 00000001
~a >> 3
4. 10000000
a << 3
5. 01010000
a & ~b
6. 00001010
~a & b
7. 01010000
a & ~b
8. 10101011
b | (~a >> 3)
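
One way to check these answers without the lab's `Bits` class is to redo them with plain Python integers. This is only a sketch, and it rests on an assumption: values are treated as 8 bits wide, with anything shifted past either end discarded. The helper `byte()` is a hypothetical name introduced here to do that masking.

```python
# Sketch: re-check the answers above using plain Python ints.
# Assumption: values are 8 bits wide; byte() is a hypothetical helper
# that keeps only the low 8 bits.
MASK = 0xFF

def byte(value):
    """Keep only the low 8 bits and format them as a bit string."""
    return format(value & MASK, "08b")

a = 0b11110000
b = 0b10101010
not_a = ~a & MASK  # mask first: ~ on a Python int gives a negative number
not_b = ~b & MASK

answers = {
    1: byte(not_b),
    2: byte(not_a & not_b),
    3: byte(not_a >> 3),
    4: byte(a << 3),
    5: byte(a & not_b),
    6: byte(not_a & b),
    8: byte(b | (not_a >> 3)),
}

for number, bits in answers.items():
    print(number, bits)
```

Each printed bit string should match the byte given in the question with the same number.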
## Integer questions
These questions are difficult! Try exploring ideas with `Bits`
in Terminal, a paper and pencil, and a whiteboard. And definitely
talk with others.
9. If `a` represents a positive integer, and `one = Bits(1, length=len(a))`,
give an expression equivalent to `-a`, but which does not use negation.
~a + one
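
The flip-and-add-one rule can be tried out with plain Python integers. This is a sketch, not the lab's `Bits` class: `negate` and `to_signed` are hypothetical helpers, and the 8-bit width is an assumption.

```python
# Sketch: two's-complement negation with plain ints.
# Assumption: 8-bit values; negate() and to_signed() are hypothetical helpers.
def negate(value, length=8):
    """Flip every bit, add one, and keep only `length` bits."""
    mask = (1 << length) - 1
    return (~value + 1) & mask

def to_signed(value, length=8):
    """Read a `length`-bit pattern as a two's-complement integer."""
    sign_bit = 1 << (length - 1)
    return value - (1 << length) if value & sign_bit else value

five = 0b00000101
minus_five = negate(five)
print(format(minus_five, "08b"))  # 11111011
print(to_signed(minus_five))      # -5
```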
10. It is extremely easy to double a binary number: just shift all the bits
to the left. (`a << 1` is twice `a`.) Explain why this trick works.
Shifting to the left moves every bit over by one position and puts a 0 on the right. Multiplying by 2 is the same as adding a number to itself. In that addition, each 1 is added to another 1, and 1 + 1 = 10 in binary, so it contributes nothing to its own position and carries a 1 to the next position. The result is the original bit pattern moved one place to the left, which is exactly what the shift produces. For example, 1 = 0001, 2 = 0010, 4 = 0100, 8 = 1000. Here the 1 moves to the next position with each doubling. The same is true for 3 = 0011, 6 = 0110, 12 = 1100.
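
A short check of the same idea with plain Python integers (a sketch; `place_values` is a hypothetical helper, not part of the lab). It lists the powers of two that make up a number, so the effect of the shift on each place value is visible.

```python
# Sketch: a left shift moves every bit to a place worth twice as much.
def place_values(n):
    """List the powers of two that add up to n."""
    return [1 << k for k in range(n.bit_length()) if (n >> k) & 1]

for n in (1, 3, 6, 25):
    shifted = n << 1
    print(format(n, "08b"), "->", format(shifted, "08b"), shifted == 2 * n)

print(place_values(6))       # [2, 4]
print(place_values(6 << 1))  # [4, 8]
```

Every place value in the shifted number is double the matching one in the original, so the total doubles too.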
11. Consider the following:
```
>>> hundred = Bits(100, 8)
>>> hundred
01100100
>>> (hundred + hundred)
11001000
>>> (hundred + hundred).int
-56
```
Apparently 100 + 100 = -56. What's going on here?
When expressing data that can be positive or negative, the first bit represents the sign: 0 is non-negative and 1 is negative. In this example, we have 8 bits to work with. If the first bit denotes the sign, we have 7 bits left to express the integer. Since each bit has 2 possibilities (1 or 0), those 7 bits give 2^7 = 128 non-negative values. One of those values is 0, so the largest integer we can store is 127. 100 + 100 should be 200, which is more than 127, so the sum overflows. The first bit does not simply switch the sign; it adds -128 to the value of the remaining bits. So, if we take the result of hundred + hundred, 11001000, we get -128 + 64 + 8 = -56.
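
The arithmetic can be reproduced with plain Python integers. This is a sketch under the assumption that the sum is stored in 8 bits; `to_signed` is a hypothetical helper that gives the first bit a weight of -128.

```python
# Sketch: why 100 + 100 reads as -56 in 8 bits.
# Assumption: 8-bit storage; to_signed() is a hypothetical helper.
def to_signed(value, length=8):
    """Read the low `length` bits as a two's-complement integer."""
    value &= (1 << length) - 1
    sign_bit = 1 << (length - 1)
    return value - (1 << length) if value & sign_bit else value

total = 100 + 100              # 200 as an ordinary integer
wrapped = total & 0xFF         # what fits in 8 bits
print(format(wrapped, "08b"))  # 11001000
print(to_signed(wrapped))      # -56
print(-128 + 64 + 8)           # -56
```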
12. What is the bit representation of negative zero? Explain your answer.
An integer plus its negative should be 0. Since 0 + 0 is already 0, nothing needs to change. So, -0 should be exactly the same as 0. Also, if we use the algorithm to make a positive integer negative, we flip all the bits and add 1. This gives -0 = 11111111 + 1 = 00000000 = 0, because the carry out of the leftmost bit is dropped. So our logic holds.
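
The dropped carry is easier to see step by step. A sketch with plain Python integers, assuming an 8-bit width:

```python
# Sketch: negating zero by flipping the bits and adding one.
# Assumption: 8-bit values, so a ninth bit is discarded.
zero = 0b00000000
flipped = ~zero & 0xFF           # all eight bits set
plus_one = flipped + 1           # needs nine bits
negative_zero = plus_one & 0xFF  # ninth bit discarded

print(format(flipped, "08b"))        # 11111111
print(format(plus_one, "09b"))       # 100000000
print(format(negative_zero, "08b"))  # 00000000
```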
13. What's the largest integer that can be represented in a single byte?
Explain your reasoning.
A single byte has a length of 8 bits. Since the first bit contributes the sign, the remaining 7 bits can store 2^7 = 128 non-negative integers. One of these integers is 0, so the maximum integer we can represent is 127, which is 01111111.
14. What's the smallest integer that can be represented in a single byte?
Explain your reasoning.
The first bit of a byte adds -128 to the value. The rest of the bits add positive amounts. So the smallest we can have is -128, which is 10000000.
15. What's the largest integer that can be represented in `n` bits?
Explain your reasoning.
Using our method from before, we cannot include the first bit in determining the max, so we are left with (n-1) bits that contribute to it. Since every bit has two possibilities, those bits give 2^(n-1) non-negative integers (the positive integers and 0). The count includes 0, so the largest value is one less than the count, and we get 2^(n-1) - 1 as the maximum integer for `n` bits.
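
The formulas in the last three answers can be checked by brute force for small `n`. This sketch uses plain Python integers; `to_signed` and `signed_range` are hypothetical helpers that interpret every `n`-bit pattern with the first bit weighted as -2^(n-1).

```python
# Sketch: find the smallest and largest n-bit two's-complement values
# by trying every bit pattern. Helper names are hypothetical.
def to_signed(pattern, n):
    """Read an n-bit pattern as a two's-complement integer."""
    sign_bit = 1 << (n - 1)
    return pattern - (1 << n) if pattern & sign_bit else pattern

def signed_range(n):
    """Return (smallest, largest) over all n-bit patterns."""
    values = [to_signed(p, n) for p in range(1 << n)]
    return min(values), max(values)

for n in (4, 8, 12):
    low, high = signed_range(n)
    print(n, low, high, low == -(2 ** (n - 1)), high == 2 ** (n - 1) - 1)
```

For `n = 8` this gives -128 and 127, matching the single-byte answers.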
## Text questions
16. Look at the bits for a few different characters using the `utf8` encoding.
You will notice they have different bit lengths:
```
>>> Bits('a', encoding='utf8')
01100001
>>> Bits('ñ', encoding='utf8')
1100001110110001
>>> Bits('♣', encoding='utf8')
111000101001100110100011
>>> Bits('😍', encoding='utf8')
11110000100111111001100010001101
```
When it's time to decode a sequence of utf8-encoded bits, the decoder
somehow needs to decide when it has read enough bits to decode a character,
and when it needs to keep reading. For example, the decoder will produce
'a' after reading 8 bits but after reading the first 8 bits of 'ñ', the
decoder realizes it needs to read 8 more bits.
Make a hypothesis about how this could work.
The first byte or bytes store how to find a character. If they are all 0s, the decoder can move to the next byte and just read that. If it contains 10000000, it needs 2 bytes; if it contains 11000000, it needs 3; etc. If there is just one byte that determines length, we can have characters up to 255 bytes long (not including the location byte). This gives 255 x 8 = 2040 bits, which allows for 2^2040 characters. This seems to be a plausible amount of room to encode every character from every language.
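
A hypothesis like this can be tested against the four examples in the question. The sketch below uses Python's built-in `str.encode` instead of the lab's `Bits` class, and `leading_ones` is a hypothetical helper. For these four characters, the first byte of a one-byte character starts with 0, the first byte of a longer character starts with as many 1s as the character has bytes, and every following byte starts with 10.

```python
# Sketch: compare the first byte of each character with its encoded length.
# leading_ones() is a hypothetical helper, not part of the lab.
def leading_ones(byte):
    """Count the 1 bits at the front of an 8-bit value."""
    count = 0
    for position in range(7, -1, -1):
        if (byte >> position) & 1:
            count += 1
        else:
            break
    return count

for ch in ("a", "ñ", "♣", "😍"):
    encoded = ch.encode("utf-8")
    first = encoded[0]
    # Bytes after the first one, shown as bit strings.
    rest = [format(b, "08b") for b in encoded[1:]]
    print(ch, len(encoded), format(first, "08b"), leading_ones(first), rest)
```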