
Question--something I've always idly wondered: Is a byte ALWAYS 8 bits?


G+_Sean Miller

Question--something I've always idly wondered: Is a byte ALWAYS 8 bits? On a 64-bit operating system, shouldn't a byte be 64 bits?

I have always thought of a byte as a single character, and more bits allow for more characters, like drawing with a 64-crayon box instead of a 16-crayon box. If it's in terms of the size of your alphabet, UTF-8 is a 32-bit encoding scheme: it takes 32 bits to represent a single character. So if it's based on the size of the alphabet, then a byte would be 32 bits for UTF-8 and 7 bits for ASCII.

Is a byte = 8 bits an outdated idea?


Paul Hutchinson I think you and Michael Hagberg are talking about different things. You seem to be talking about the WORD and DWORD programming definitions, while Michael is talking about computer architecture.

 

In computer architecture, a word is the "largest natural size for arithmetic", which is the size of the registers. This is typically the size of the data bus, but doesn't have to be. When we refer to a 64-bit OS, we're referring to the size of the registers and memory addresses, which usually matches the width of the data bus. The more data we can transfer at once, the more efficient the data transaction is.

 

On ARM Cortex-M, the word size is 32 bits. On MSP430 it is 16 bits. On PIC/Atmel, it is 8 bits.
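
To make that concrete, here is a small C sketch you can build for any of those targets; the sizes in the comment are just typical values, and the printout depends entirely on the compiler and architecture:

#include <stdio.h>

int main(void)
{
    /* The "natural" sizes the compiler uses for this target. Typical
       results: 2-byte int on MSP430, 4-byte int on ARM Cortex-M,
       8-byte pointers on a 64-bit desktop OS. */
    printf("sizeof(int)    = %zu bytes\n", sizeof(int));
    printf("sizeof(long)   = %zu bytes\n", sizeof(long));
    printf("sizeof(void *) = %zu bytes\n", sizeof(void *));
    return 0;
}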

 

I'm not 100% sure whether WORD and DWORD are actually part of the C standard; the standard itself refers to int, long int, long long, etc. instead. I'll have to look in my copy of the "C Reference Manual" (Harbison and Steele) when I get to work.

 

Speaking of work, I need to stop typing and get out of here! Cheers!


Paul Hutchinson If you want to be really confused, look back at the history of computing. This stuff evolved organically, and there were no standards. Computers were expensive, so every bit counted.

 

Modern computers use 8-bit bytes. It didn't use to be that way! You could have 4-bit bytes, 6-bit bytes, 7-bit bytes, or whatever else you could dream up.
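
Even today the C standard only promises that a byte is at least 8 bits. A minimal sketch that checks what your own platform uses (CHAR_BIT comes from the standard limits.h header):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* CHAR_BIT is the number of bits in a byte on this implementation.
       The C standard guarantees CHAR_BIT >= 8; on mainstream hardware
       it is exactly 8, but some DSPs report 16 or 32. */
    printf("This platform has %d-bit bytes\n", CHAR_BIT);
    return 0;
}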

 

I think the strangest device I've come across (that is still relatively modern) is a 4-bit microcontroller used to control an LCD for a watch.


Sean Miller Also, looking back at the comments, I think we all glossed over that bit at the end of your question about byte encoding.

 

UTF-8 is actually not fixed at 32 bits; it is variable length. Each character uses 1 to 4 bytes. This is done for backwards compatibility with 7-bit ASCII, and to allow larger character sets as needed.
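
As an illustration, a minimal C sketch (only the lead-byte patterns, with no validation of continuation bytes or code-point ranges) showing how the first byte of a UTF-8 sequence announces how many bytes follow:

#include <stdio.h>

/* Return how many bytes a UTF-8 sequence occupies, judged from its lead
   byte, or 0 if the byte cannot start a sequence. */
static int utf8_sequence_length(unsigned char lead)
{
    if (lead < 0x80)           return 1; /* 0xxxxxxx: plain 7-bit ASCII   */
    if ((lead & 0xE0) == 0xC0) return 2; /* 110xxxxx: 2-byte sequence     */
    if ((lead & 0xF0) == 0xE0) return 3; /* 1110xxxx: 3-byte sequence     */
    if ((lead & 0xF8) == 0xF0) return 4; /* 11110xxx: 4-byte sequence     */
    return 0;                            /* continuation byte or invalid  */
}

int main(void)
{
    /* Lead bytes of 1-, 2-, 3- and 4-byte sequences:
       'A' (U+0041), U+00E9, U+20AC, U+1F600. */
    unsigned char leads[] = { 0x41, 0xC3, 0xE2, 0xF0 };
    for (int i = 0; i < 4; i++)
        printf("lead byte 0x%02X -> %d byte(s)\n",
               leads[i], utf8_sequence_length(leads[i]));
    return 0;
}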

 

However, whether a character is always stored in a fixed 4-byte slot (even when only 1 byte is needed to represent it) is, of course, application dependent. I wouldn't be surprised if applications just use 4-byte characters for everything for simplicity of implementation.


#huh

Nibble = half-byte

https://en.m.wikipedia.org/wiki/Nibble

 

In computing, a nibble (occasionally nybble or nyble to match the spelling of byte) is a four-bit aggregation, or half an octet. It is also known as half-byte or tetrade. In a networking or telecommunication context, the nibble is often called a semi-octet, quadbit, or quartet. A nibble has sixteen (2^4) possible values. A nibble can be represented by a single hexadecimal digit and called a hex digit.
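
A quick C illustration of the half-byte idea: split a byte into its two nibbles, each of which maps to exactly one hex digit:

#include <stdio.h>

int main(void)
{
    unsigned char byte = 0xA7;

    /* A nibble is 4 bits, so one byte holds two of them. */
    unsigned char high = (byte >> 4) & 0x0F;  /* upper nibble: 0xA */
    unsigned char low  = byte & 0x0F;         /* lower nibble: 0x7 */

    /* Each nibble is exactly one hexadecimal digit. */
    printf("0x%02X -> high nibble %X, low nibble %X\n", byte, high, low);
    return 0;
}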


I don't mean to pick nits here, but to be absolutely clear we need to specify whether we are talking about data types from a specific language, or computer architecture.

 

If you are using a Windows machine and a C/C++ compiler, then the type definitions documented at the link below (which define WORD, DWORD, etc.) are applicable:

 

docs.microsoft.com - Windows Data Types | Microsoft Docs

 

This would align with the sizes specified by Paul Hutchinson.
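
For what it's worth, a minimal sketch of what those documented sizes look like in code; this assumes a Windows toolchain with windows.h and C11 static assertions, and it won't build elsewhere:

/* Only builds with a Windows toolchain; windows.h supplies the WORD and
   DWORD typedefs, and _Static_assert needs C11. */
#include <windows.h>

/* Per that documentation, these widths are fixed: they do not grow to
   64 bits just because the OS is 64-bit. */
_Static_assert(sizeof(WORD)  == 2, "WORD is 16 bits");
_Static_assert(sizeof(DWORD) == 4, "DWORD is 32 bits");

int main(void)
{
    return 0;
}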

 

This differs from the computer architecture term "word", which, as I explained previously, could be anything, but is probably 8, 16, 32, or 64 bits. If you are on a desktop, it is probably 32 or 64. If you are on a microcontroller, it could be any of them.

 

The possible ambiguity of word, dword, etc., is the reason I don't use these definitions when writing code. I use the standard types instead (stdint.h): uint8_t, uint16_t, etc.

 

It is common for me to write microcontroller code that sends data to a windows app. If I use the standard type definitions in stdint.h, I don't have to worry as much about portability. I can share serial packet processing code, and it will just work.
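
For example, a minimal sketch of the idea (the packet fields here are made up for illustration, not taken from any real project): serialise the header byte by byte with fixed-width types, so the wire format is identical on the microcontroller and the desktop regardless of each compiler's native int size or struct padding.

#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet header, invented for this example. */
typedef struct {
    uint8_t  command;
    uint16_t length;
    uint32_t payload_crc;
} packet_header_t;

/* Serialise the header into a byte buffer, least significant byte first.
   Because every field has a fixed width, the output is always 7 bytes,
   independent of struct padding on either platform. */
static size_t pack_header(const packet_header_t *h, uint8_t *buf)
{
    size_t i = 0;
    buf[i++] = h->command;
    buf[i++] = (uint8_t)(h->length & 0xFF);
    buf[i++] = (uint8_t)(h->length >> 8);
    buf[i++] = (uint8_t)(h->payload_crc & 0xFF);
    buf[i++] = (uint8_t)((h->payload_crc >> 8) & 0xFF);
    buf[i++] = (uint8_t)((h->payload_crc >> 16) & 0xFF);
    buf[i++] = (uint8_t)(h->payload_crc >> 24);
    return i;
}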


