lab_encoding/questions.md

# Boolean questions

Create the following variables.

```
a = Bits("11110000")
b = Bits("10101010")
```

For each of the following bytes, give an equivalent
expression which uses only `a`, `b`, and bit operators.
The answers to the first two questions are given.

1. 01010101

~b

2. 00000101

~a & ~b

3. 00000001

~a >> 3

4. 10000000

a << 3

5. 01010000

a & ~b

6. 00001010

~a & b

7. 01010000

a & ~b

8. 10101011

b | (~a >> 3)
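The answers above can be checked with plain Python integers. This is a minimal sketch, assuming `Bits` behaves like a fixed-width 8-bit value with a logical right shift. The `MASK` constant, the `show` helper, and the `checks` list are illustrative names and are not part of the lab's `Bits` class.

```python
MASK = 0xFF  # keep only the low 8 bits, mimicking a one-byte value

a = 0b11110000
b = 0b10101010

def show(value):
    """Format an integer as an 8-bit binary string."""
    return format(value & MASK, "08b")

# Each pair is (target byte, candidate expression).
# Python integers are unbounded, so ~a is negative; it is masked
# to 8 bits before shifting right so that zeros are shifted in.
checks = [
    ("01010101", ~b),
    ("00000101", ~a & ~b),
    ("00000001", (~a & MASK) >> 3),
    ("10000000", a << 3),
    ("01010000", a & ~b),
    ("00001010", ~a & b),
    ("10101011", b | ((~a & MASK) >> 3)),
]

for target, value in checks:
    print(target, show(value), show(value) == target)
```

Every line should end in `True` if the expression produces the target byte.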

## Integer questions

These questions are difficult! Try exploring ideas with `Bits`
in Terminal, with paper and pencil, or on a whiteboard. And definitely
talk with others.

9. If `a` represents a positive integer, and `one = Bits(1, length=len(a))`,
   give an expression equivalent to `-a`, but which does not use negation.

~a + one
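A quick way to test the flip-and-add-one idea is to add the result back to the original number and see whether the sum wraps around to zero. This sketch uses plain Python integers masked to 8 bits rather than the lab's `Bits` class; `LENGTH`, `MASK`, and `negate` are illustrative names.

```python
LENGTH = 8
MASK = (1 << LENGTH) - 1  # 0b11111111

def negate(value):
    """Flip every bit, add one, and keep only the low LENGTH bits."""
    return (~value + 1) & MASK

five = 0b00000101
minus_five = negate(five)

print(format(minus_five, "08b"))                  # 11111011
print(format((five + minus_five) & MASK, "08b"))  # 00000000
```

The sum is zero once the carry out of the leftmost bit is dropped, which is what we expect from a number and its negative.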

10. It is extremely easy to double a binary number: just shift all the bits
    to the left. (`a << 1` is twice `a`.) Explain why this trick works.

Shifting to the left moves every bit over by one position and puts a 0 on the right. Multiplying by 2 is the same as adding a number to itself. In that addition, a 1 in any position is added to another 1, which gives 0 in that position with a 1 carried to the next position. So every 1 ends up one position to the left, which is exactly what the shift does. Another way to see it: each position is worth twice as much as the position to its right, so moving a bit one place to the left doubles its value. For example, 1 = 0001, 2 = 0010, 4 = 0100, 8 = 1000. Here the 1 moves to the next position with each doubling. The same is true for 3 = 0011, 6 = 0110, 12 = 1100.
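The place-value argument can be sketched in plain Python. The helper `place_values` is an illustrative name and is not part of the lab's `Bits` class; it lists the powers of two that add up to a number, so we can see each one double after the shift.

```python
def place_values(n):
    """List the powers of two that add up to n, smallest first."""
    return [1 << i for i in range(n.bit_length()) if n & (1 << i)]

for n in [1, 3, 6, 12]:
    shifted = n << 1
    print(format(n, "08b"), place_values(n), "->",
          format(shifted, "08b"), place_values(shifted))
```

Each place value in the shifted number is twice the matching place value in the original, so the total is doubled as well.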

11. Consider the following:
   ```
   >>> hundred = Bits(100, 8)
   >>> hundred
   01100100
   >>> (hundred + hundred)
   11001000
   >>> (hundred + hundred).int
   -56
   ```
   Apparently 100 + 100 = -56. What's going on here?

   When a value can be positive or negative, the first bit carries the sign: 0 for non-negative and 1 for negative. In this example, we have 8 bits to work with. If the first bit denotes the sign, we have 7 bits left to express the rest of the integer. Since each bit has 2 possibilities (1 or 0), those 7 bits give 2^7 = 128 non-negative values. One of those values is 0, so the largest integer we can represent is 127. 100 + 100 should be 200, which is larger than 127, so the result overflows. The first bit does not simply switch the sign. It actually adds -128 to the value of the remaining bits. So if we take the result of hundred + hundred, 11001000, we get -128 + 64 + 8 = -56.
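The arithmetic can be reproduced in plain Python by giving the leading bit a negative weight. This is a sketch using an ordinary string of bits rather than the lab's `Bits` class; `signed_value` is an illustrative name.

```python
def signed_value(bits):
    """Sum the place values of a bit string, with a negative weight on the first bit."""
    length = len(bits)
    total = 0
    for i, bit in enumerate(bits):
        weight = 1 << (length - 1 - i)
        if i == 0:
            weight = -weight  # the leading bit counts as -2^(length-1)
        if bit == "1":
            total += weight
    return total

wrapped = format((100 + 100) % 256, "08b")  # keep only the low 8 bits
print(wrapped)                # 11001000
print(signed_value(wrapped))  # -56
```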

12. What is the bit representation of negative zero? Explain your answer.

An integer plus its negative should be 0. Since 0 + 0 is already 0, nothing needs to change. So -0 should be exactly the same as 0. We can also use the algorithm for making a positive integer negative: flip all the bits and add 1. This gives -0 = 11111111 + 1 = 00000000 = 0, because the carry out of the leftmost bit is dropped. So our logic holds.

13. What's the largest integer that can be represented in a single byte?
    Explain your reasoning.

    A single byte has 8 bits. Since the first bit contributes to the sign, the remaining 7 bits can store 2^7 = 128 non-negative values. One of these values is 0, so the maximum integer we can represent is 127, which is 01111111.

14. What's the smallest integer that can be represented in a single byte?
    Explain your reasoning.

    The first bit of a byte adds -128 to the value. The rest of the bits add positive amounts. So the smallest integer we can have is -128, which is 10000000.

15. What's the largest integer that can be represented in `n` bits?
    Explain your reasoning.

    Using our method from before, we cannot include the first bit in determining the max. So we are left with (n-1) bits that contribute to the max. Since every bit has two possibilities, we take 2^(n-1) to find the number of non-negative integers (the positive integers and 0) we can store. Then we subtract 1 from that value to account for 0, since 0 uses up one of those patterns, so we get 2^(n-1) - 1 as the maximum integer for `n` bits.
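The limits from the last three questions can be tabulated for a few bit lengths. This is a small sketch in plain Python, assuming the same signed representation used in the answers above; `signed_range` is an illustrative name.

```python
def signed_range(n):
    """Return (smallest, largest) for an n-bit signed integer."""
    smallest = -(1 << (n - 1))     # only the leading bit set: -2^(n-1)
    largest = (1 << (n - 1)) - 1   # every bit except the leading one: 2^(n-1) - 1
    return smallest, largest

for n in [4, 8, 16]:
    print(n, signed_range(n))
```

For `n = 8` this gives `(-128, 127)`, matching the single-byte answers.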

## Text questions

16. Look at the bits for a few different characters using the `utf8` encoding.
    You will notice they have different bit lengths:

    ```
    >>> Bits('a', encoding='utf8')
    01100001
    >>> Bits('ñ', encoding='utf8')
    1100001110110001
    >>> Bits('♣', encoding='utf8')
    111000101001100110100011
    >>> Bits('😍', encoding='utf8')
    11110000100111111001100010001101
    ```

    When it's time to decode a sequence of utf8-encoded bits, the decoder
    somehow needs to decide when it has read enough bits to decode a character,
    and when it needs to keep reading. For example, the decoder will produce
    'a' after reading 8 bits but after reading the first 8 bits of 'ñ', the
    decoder realizes it needs to read 8 more bits.

    Make a hypothesis about how this could work.

The first byte tells the decoder how many bytes the character uses. In the examples, 'a' starts with 0 and is a single byte, so a leading 0 could mean "just read this byte." The other characters start with 110, 1110, and 11110 and are 2, 3, and 4 bytes long, so the number of leading 1s in the first byte could be the number of bytes to read. Every byte after the first starts with 10, which could mark it as a continuation of the same character. The remaining bits would then hold the character's number. With 4 bytes, that leaves 21 bits, which allows for 2^21 (about 2 million) characters. This seems to be a plausible amount of room to encode every character from every language.
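One way to test a hypothesis like this is to compare the first byte of each character with the number of bytes in its encoding. This sketch uses Python's built-in `str.encode` rather than the lab's `Bits` class; `leading_ones` is an illustrative name.

```python
def leading_ones(byte):
    """Count the 1 bits at the start of an 8-bit value."""
    count = 0
    for shift in range(7, -1, -1):
        if byte & (1 << shift):
            count += 1
        else:
            break
    return count

for char in ["a", "ñ", "♣", "😍"]:
    encoded = char.encode("utf-8")
    first_byte = format(encoded[0], "08b")
    print(char, len(encoded), first_byte, leading_ones(encoded[0]))
```

Each line shows the character, its length in bytes, the bits of its first byte, and how many 1s that byte starts with, so the pattern can be read off directly.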