AVR Coding Part 4: Why Binary is Crucial

In written language, numbers are recorded using the decimal symbols 0 through 9.  When microcontrollers need to record numbers, they often do so using collections of voltages.  The binary symbols 0 and 1 represent these voltage states, and each symbol carries one unit of information.

Why is this important?

  1. Decimal, binary and other systems have relationships between the number of symbols written, their left-to-right order, and what each symbol represents.
  2. Microcontrollers use binary to control hardware-level functions that vary from model to model.
  3. Out of context, binary and decimal may resemble each other.  Misidentifying which is which can lead to costly mistakes.

Example: consider the string “101101101.”  Taken as a decimal number it comes out to about 101.1 million.  But as a binary number it converts to “365” (i.e. the number of days in a year).  So which interpretation is correct?  This project will explore this and a few other quirks of binary and decimal notation.
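As a rough illustration of the ambiguity, the sketch below reads the same digit string under two different bases.  The helper name parse_digits is made up for this example; it simply wraps the standard C library function strtoul, which takes the base as its third argument.

```c
#include <stdlib.h>

/* Hypothetical helper: interpret a digit string in the given base
 * (2 for binary, 10 for decimal). strtoul is standard C. */
unsigned long parse_digits(const char *text, int base)
{
    return strtoul(text, NULL, base);
}
```

Calling parse_digits("101101101", 2) yields 365, while parse_digits("101101101", 10) yields 101101101.  The digits never changed; only the assumed base did.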

Notice of Non-Affiliation and Disclaimer: As of the publication date, we are not affiliated with, associated with, authorized with, endorsed by, compensated by, or in any way officially connected with Microchip Technology Inc., Atmel Corporation, or their owners, subsidiaries or affiliates.  The names Microchip Technology Inc., AVR, as well as related names, marks, emblems, and images are trademarks of their respective owners.

External Links: Links to external web pages have been provided as a convenience and for informational purposes only. Unboxing Tomorrow and Voxidyne Media bear no responsibility for the accuracy, legality or content of the external site or for that of subsequent links. Contact the external site for answers to questions regarding its content.

Scope

This project will focus on positive, countable numbers (i.e. positive integers).  Signed numbers and fractional numbers are outside of today’s scope, but a good reference for them is: “Understanding Floating Point Numbers: Concepts in Computer Systems,” volume 2.

Positional Number Systems

The decimal system is the system of everyday informal use.  These numbers are built up from an “alphabet” of ten possible symbols or digits (0 through 9).  To count beyond the original ten digits, extra digits must be added to adjacent positions called place values.

Decimal

The opening example of “365” expresses the same quantity as:

300\ +\ 60\ +\ 5

Another way of looking at it:

3(100)\ +\ 6(10)\ +\ 5(1)

Or:

(3\times{10}^2)+(6\times{10}^1)+(5\times{10}^0)

Each digit has a power of ten associated with it. This is why the decimal system is also called base 10 (or alternatively: denary or radix-10).

Binary

Binary is built from an “alphabet” of only two symbols (0 and 1), and each symbol has a power of two associated with it.  Consider the binary example: 101…

(1\times2^2)+(0\times2^1)+(1\times2^0)=5

Other counting systems follow the same convention, except with a different base, and a different set of permissible symbols.  In general form:

N=\sum_{k=0}^{n-1}{a_kb^k}

…where b is the base (also called radix), n is the total number of symbols written, k is a symbol’s position counted from zero at the rightmost symbol (the exponent that sets its place value), and a_k is the symbol occupying position k.
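The general form above can be sketched in C as follows.  This is a minimal sketch, not a library routine: the function name positional_value and its parameters are assumptions made for illustration, the digits are passed most-significant first (the way they are written), and overflow is not checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: evaluate a list of digits, written most-significant
 * first, in the given base. Multiplying the running total by the base before
 * adding each digit gives every digit its power of the base. */
unsigned long positional_value(const uint8_t *digits, size_t count,
                               unsigned int base)
{
    assert(base >= 2);                /* a positional system needs >= 2 symbols */
    unsigned long total = 0;
    for (size_t i = 0; i < count; i++) {
        assert(digits[i] < base);     /* each symbol must belong to the base */
        total = total * base + digits[i];
    }
    return total;
}
```

With the digits {1, 0, 1} and base 2 the function returns 5.  With the digits {3, 6, 5} it returns 365 for base 10 and 869 for base 16.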

SYSTEM              BASE
Binary              2
Octal               8
Decimal (denary)    10
Hexadecimal         16
Table 1: List of Positional Number Systems Important to AVR Programming

Consider the message “365” as a decimal (b = 10) value…

(3\times{10}^2)+(6\times{10}^1)+(5\times{10}^0)\ =\ 365

Versus the same “365” as a hexadecimal (b = 16) value…

(3\times{16}^2)+(6\times{16}^1)+(5\times{16}^0)\ =\ 869

Numbers in Electronic Storage

Practical embedded software will eventually need to store symbols in memory.  For binary, the smallest possible unit of information is a bit, and a memory cell can store exactly one bit at a time.

By definition, a byte is the smallest addressable unit of information on a computer.  The AVR core uses an 8-bit byte convention for its memory, and therefore memory cells are also grouped in 8s.  Just like a positional number system, the order of bits matters just as much as the bits themselves.  Therefore, each bit is labelled according to its exponent…

Figure 1: Bit Order as used in Microchip Documentation and AVR-GCC Coding

The most significant bit (MSbit) is the bit with the highest exponent, and the least significant bit (LSbit) has the lowest.
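To make the bit labels concrete, here is a small sketch that reads one bit of a byte by its exponent and reports that bit’s place value.  The names read_bit and bit_weight are hypothetical and are not part of avr-libc; position 0 is the LSbit and position 7 is the MSbit.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the bit at `position` is set, else 0.
 * `position` is the bit's exponent and must be in the range 0 to 7. */
uint8_t read_bit(uint8_t value, uint8_t position)
{
    return (uint8_t)((value >> position) & 1u);
}

/* Hypothetical helper: the place value (2 to the power `position`)
 * of a bit within a byte. `position` must be in the range 0 to 7. */
uint8_t bit_weight(uint8_t position)
{
    return (uint8_t)(1u << position);
}
```

For the byte 0x80, read_bit(0x80, 7) returns 1 and read_bit(0x80, 0) returns 0.  The MSbit carries a weight of bit_weight(7), which is 128, and the LSbit carries a weight of 1.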

Numbers too Large for 1 Byte

Consider what happens if 365 is converted from decimal to binary.  This conversion amounts to:

365=(1\times2^8)+(0\times2^7)+(1\times2^6)+(1\times2^5)+(0\times2^4)+(1\times2^3)+(1\times2^2)+(0\times2^1)+(1\times2^0)

Which is the 9-bit pattern…

Figure 2: Integer “365” Stored as Binary Symbols
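One way to carry out this conversion in code is repeated division by 2, where each remainder becomes the next bit starting from the LSbit.  The sketch below assumes a 16-bit input; the function names bits_needed and to_binary_string are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of binary digits needed to write `value`
 * (zero still needs one digit). */
size_t bits_needed(uint16_t value)
{
    size_t count = 1;
    while (value > 1) {
        value /= 2;
        count++;
    }
    return count;
}

/* Hypothetical helper: write `value` as '0'/'1' characters, MSbit first,
 * followed by a terminating NUL. Returns the number of digits written,
 * or 0 if the buffer is too small. */
size_t to_binary_string(uint16_t value, char *out, size_t out_size)
{
    size_t digits = bits_needed(value);
    if (out_size < digits + 1) {
        return 0;
    }
    out[digits] = '\0';
    for (size_t i = digits; i > 0; i--) {
        out[i - 1] = (char)('0' + (value % 2)); /* remainder is the next bit */
        value /= 2;
    }
    return digits;
}
```

For 365, bits_needed returns 9 and to_binary_string fills the buffer with 101101101, one bit more than a single byte can hold.  For 255, bits_needed returns 8.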

The 9th bit (on the left) has overflowed the 8-bit byte and needs to be stored elsewhere.  One solution is to move it to an entirely new byte in memory, and then treat the 2 bytes as a 16-bit pair…

Figure 3: A 9-bit Value Recorded Using Two 8-bit Bytes

Under this plan, the byte containing the LSbit is called the least-significant byte (LSB); and the byte containing the MSbit becomes the most-significant byte (MSB).  For clarity, this blog series will deliberately use “LSB” or “MSB” to indicate bytes and “LSbit” or “MSbit” to indicate bits.

Figure 4: A 2-Byte Value in Memory
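The split into an LSB and an MSB can be sketched with a mask and a shift.  The helper names below are hypothetical; the operations themselves are ordinary C.

```c
#include <stdint.h>

/* Hypothetical helper: the byte holding bits 0 through 7 (the LSB). */
uint8_t low_byte(uint16_t value)
{
    return (uint8_t)(value & 0xFFu);
}

/* Hypothetical helper: the byte holding bits 8 through 15 (the MSB). */
uint8_t high_byte(uint16_t value)
{
    return (uint8_t)(value >> 8);
}

/* Hypothetical helper: rebuild the 16-bit value from its two bytes. */
uint16_t join_bytes(uint8_t msb, uint8_t lsb)
{
    return (uint16_t)(((uint16_t)msb << 8) | lsb);
}
```

For 365, low_byte returns 0x6D (binary 01101101) and high_byte returns 0x01, which is the overflowed 9th bit.  Calling join_bytes(0x01, 0x6D) gives back 365.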

AVRs are Little Endian

This raises the question: why did Figure 4 print the LSB left of the MSB?

AVRs follow the little-endian convention when it comes to multi-byte values in memory.  This convention always stores the LSB at a lower address in memory than the MSB.  Although this convention is not mandatory, knowing the endianness of what is in memory is crucial, especially during troubleshooting.

Here is how the arrangement would look if 4 bytes (i.e., 32 bits) were required…

Figure 5: A 4-Byte Value in Little-Endian Representation
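The little-endian layout can be written out explicitly, byte by byte, so the result does not depend on the machine that runs the code.  This is only a sketch; store_le32 and load_le32 are names invented for this example, and index 0 of the array stands in for the lowest address.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value in little-endian order.
 * out[0] receives the LSB and out[3] receives the MSB. */
void store_le32(uint32_t value, uint8_t out[4])
{
    for (uint8_t i = 0; i < 4; i++) {
        out[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Hypothetical helper: rebuild the 32-bit value from little-endian bytes. */
uint32_t load_le32(const uint8_t in[4])
{
    uint32_t value = 0;
    for (uint8_t i = 0; i < 4; i++) {
        value |= (uint32_t)in[i] << (8 * i);
    }
    return value;
}
```

Storing 0x12345678 produces the bytes 0x78, 0x56, 0x34, 0x12 in ascending address order.  Storing 365 produces 0x6D, 0x01, 0x00, 0x00.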

Closing Remarks

Side note: If I’m not mistaken, the 8-bit byte tradition was started by the IBM System/360 of the 1960s.

Regardless, programmers can clarify the base of a number by attaching a specific prefix to it…

BASE           b     PREFIX    EXAMPLE
Binary         2     0b        0b10101010
Octal          8     0         0365
Decimal        10    (none)    1337
Hexadecimal    16    0x        0xFF
Table 2: List of Base Prefixes and Suffixes

Note that C interprets a leading zero on an integer literal as an octal (base 8) prefix, so 0365 is read as an octal value rather than as decimal 365.
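The short sketch below writes one literal in each notation.  The constant names are made up for illustration.  One assumption to flag: the 0b prefix is accepted by GCC (including avr-gcc) as an extension and was only standardized in C23, so other compilers may reject it.

```c
/* Illustrative constants, one per notation. The names are hypothetical. */
enum {
    FROM_BINARY  = 0b10101010, /* 0b prefix: GCC extension, standard in C23 */
    FROM_OCTAL   = 0365,       /* leading zero: octal, not decimal 365 */
    FROM_DECIMAL = 1337,       /* no prefix: decimal */
    FROM_HEX     = 0xFF        /* 0x prefix: hexadecimal */
};
```

Here FROM_BINARY equals 170, FROM_HEX equals 255, and FROM_OCTAL equals 245, which shows how a stray leading zero silently changes a value.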

References

[1] E. Sakk, Understanding Floating Point Numbers: Concepts in Computer Systems, vol. 2, 2018.
[2] S. Tanna, Advanced Binary for Programming & Computer Science: Logical, Bitwise and Arithmetic Operations, and Data Encoding and Representation, 1st ed., Answers 2000 Limited, 2018.
[3] “avr-gcc – GCC Wiki,” gcc.gnu.org, 27 June 2021. [Online]. Available: https://gcc.gnu.org/wiki/avr-gcc. [Accessed 10 May 2022].

Important Notice: This article and its contents (the “Information”) belong to Unboxing-tomorrow.com and Voxidyne Media LLC. No license is granted for the use of it other than for information purposes. No license of any intellectual property rights is granted.  The Information is subject to change without notice. The Information supplied is believed to be accurate, but Voxidyne Media LLC assumes no responsibility for its accuracy or completeness, any error in or omission from it or for any use made of it.  Liability for loss or damage resulting from any reliance on the Information or use of it (including liability resulting from negligence or where Voxidyne Media LLC was aware of the possibility of such loss or damage arising) is excluded.