Unit: Digital Information
ASCII Encoding
American Standard Code for Information Interchange. A standard encoding scheme for simple text files with nearly universal portability.
Integer
Any number that can be written without a fractional component. The same term is used in both programming and in math, so hopefully it's familiar to you.
A programmer is designing a computer program for a weather application that will use a variety of data sources. They want to understand which data sources are analog so that they can think carefully about the process of converting analog data into digital data. Which of these data sources is analog? A. The predicted inches of snowfall on a winter day for each city in New York B. Thirty seconds of sounds from a thunderstorm in Miami, Florida C. Two years of weekly high temperatures for Houston, Texas D. The daily time of sunrise and sunset in Detroit, Michigan
B. Thirty seconds of sounds from a thunderstorm in Miami, Florida Why? Analog data has values that change continuously over time or space and must be sampled to be stored digitally. Sounds are an example of analog data that would need to be sampled for digital storage.
Converting Decimal to Binary
1. Draw dashes for each of the bits. If the number is less than 16, draw 4 dashes. Otherwise, for numbers up to 255, draw 8 dashes. Bigger numbers than that require more bits and take a while to do by hand, so let's focus on the smaller numbers. 2. Write the powers of 2 under each dash. Start under the right-most dash, writing 1, then keep multiplying by 2. 3. Now start at the left-most dash and ask yourself "Is the number greater than or equal to this place value?" If you answer yes, then write a 1 in that dash and subtract that amount from the number. If you answer no, then write a 0 and move to the next dash.
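The three steps above can be sketched in Python. This is a minimal illustration, not a standard routine: the function name `to_binary` is made up for this example, and it only handles the 0 to 255 range that the card covers.

```python
def to_binary(number):
    """Convert an integer from 0 to 255 to binary using the dash method."""
    if not 0 <= number <= 255:
        raise ValueError("this sketch only handles 0 through 255")

    # Step 1: 4 dashes for numbers below 16, otherwise 8 dashes.
    width = 4 if number < 16 else 8

    # Step 2: place values are powers of 2, largest on the left.
    place_values = [2 ** i for i in range(width - 1, -1, -1)]

    # Step 3: work left to right, subtracting each place value that fits.
    bits = ""
    for place in place_values:
        if number >= place:
            bits += "1"
            number -= place
        else:
            bits += "0"
    return bits


print(to_binary(5))   # 0101
print(to_binary(72))  # 01001000
```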
Bytes
A bit is the smallest piece of information in a computer, a single value storing either 0 or 1. A byte is a unit of digital information that consists of 8 of those bits. A byte represents different types of information depending on the context. It might represent a number, a letter, or a program instruction. It might even represent part of an audio recording or a pixel in an image.
Lossy Compression
A data compression technique in which some amount of data is lost. This technique reduces file size by discarding redundant or less noticeable information, so the original data cannot be perfectly reconstructed.
Which of these values can be stored in a single bit? A. 0 B. 11 C. [0,1,0] D. 4
A. 0 Why? A bit can store one of two values, either 0 or 1.
Merlin has a file with his favorite wizard spell: ABRACADABRA ALAKAZAM In a standard uncompressed representation, each letter would likely be represented with its ASCII encoding (A as 0100 0001, B as 0100 0010, etc.). The Huffman coding compression algorithm reduces the number of bits required to represent data by choosing new bit codes for each source symbol, and choosing short codes for the most frequent source symbols. If the computer applies Huffman coding to the wizard spell, which letter is the most likely to be represented with a shorter bit code? A. A B. B C. R D. K
A. A
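Why A? Huffman coding gives the shortest codes to the most frequent symbols, so the answer comes down to counting letters. The sketch below only does that counting step (it does not build the Huffman tree), and the variable names are chosen for this example.

```python
from collections import Counter

spell = "ABRACADABRA ALAKAZAM"

# Count how often each letter appears, ignoring the space.
letter_counts = Counter(ch for ch in spell if ch != " ")

# The most frequent letters are the best candidates for short codes.
for letter, count in letter_counts.most_common(3):
    print(letter, count)
```

A appears 9 times, far more often than any other letter in the spell, so it is the most likely to receive a short code.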
How many values can a binary digit store? A. A binary digit can store one of two values (0 or 1). B. A binary digit can store ten values at a time. C. A binary digit can store one of ten values (0-9). D. A binary digit can store two values at a time.
A. A binary digit can store one of two values (0 or 1). Why? A binary digit can either store 0 or 1.
Railways have devices called "axle counters" that count how many train axles have passed by and help decide if a train has fully passed part of a track. However, because of a bug in the design of the axle counter logic, a train that has exactly 256 axles will result in a count of 0, and its existence will be ignored. Thus, trains that run on these buggy railways must have fewer than 256 axles. What is the most likely cause of this bug? A. Integer overflow error B. Limited precision floating-point numbers C. Round-off error in floating-point arithmetic D. Incorrect use of integer instead of floating-point representation
A. Integer overflow error
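Why? A count that wraps to 0 at exactly 256 is what you would expect from a counter stored in 8 bits, since 8 bits can only hold 0 through 255. The card does not state the counter's width, so the 8-bit default below is an assumption, and `count_axles` is a hypothetical function that simulates the wraparound (Python's own integers do not overflow).

```python
def count_axles(axles, bits=8):
    """Simulate a fixed-width counter that wraps around when it overflows."""
    limit = 2 ** bits  # an 8-bit counter has 256 possible values: 0 to 255
    counter = 0
    for _ in range(axles):
        counter = (counter + 1) % limit  # wraps back to 0 after limit - 1
    return counter


print(count_axles(255))  # 255
print(count_axles(256))  # 0
```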
Byte pair encoding is a compression algorithm that replaces repeated pairs of characters in a string with a character that isn't in the data, and creates a table of replacement mappings. Here's a quote from Dr. Seuss: "Think left and think right and think low and think high. Oh, the thinks you can think up if only you try!" Which of the following character pairs would the algorithm replace? A. in B. ft C. nl D. th
A. in D. th
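Why? Byte pair encoding looks for pairs that repeat. Counting the adjacent character pairs in the quote shows which ones qualify. This sketch covers only the counting step, not the replacement table, and it is case-sensitive, so the "Th" in the first word is counted separately from "th".

```python
from collections import Counter

quote = ("Think left and think right and think low and think high. "
         "Oh, the thinks you can think up if only you try!")

# Count every pair of adjacent characters in the quote.
pair_counts = Counter(quote[i:i + 2] for i in range(len(quote) - 1))

for pair in ["in", "ft", "nl", "th"]:
    print(pair, pair_counts[pair])
```

The pairs "in" and "th" each occur 6 times, while "ft" and "nl" each occur only once, so only the first two are worth replacing.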
Nora is learning to use image-editing applications and doesn't understand when to lower the quality setting for JPEG saving. What's a good use case for using a lower quality setting? A. Saving photos that you want to print out and frame B. Saving a photo of a textbook for easy reading offline C. Saving a photo of a multi-line bar graph with small text labels D. Saving a thumbnail that will link to a full-sized version
D. Saving a thumbnail that will link to a full-sized version Why? A thumbnail does not need to have all the original details since the goal is to give someone an idea of the original photo, which they can then click to see. This is a great use case for low-quality compression.
Binary Numbers
Number system with a base of 2, unlike the number systems most of us use that have bases of 10 (decimal numbers), 12 (measurement in feet and inches), and 60 (time). Binary numbers are preferred for computers for precision and economy. An electronic circuit that can detect the difference between two states (on-off, 0-1) is easier and less expensive to build than one that could detect the differences among ten states (0-9).
Lossless Compression
A data compression algorithm that allows the original data to be perfectly reconstructed from the compressed data.
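A quick way to see lossless compression in action is a round trip through Python's built-in `zlib` module. This is just one lossless compressor picked for illustration, and the sample data is made up for this example.

```python
import zlib

# Hypothetical sample data: a highly repetitive byte string.
original = b"ABRACADABRA ALAKAZAM " * 50

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")
print(restored == original)  # True: the original is perfectly reconstructed
```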
Consider a computer that uses 4 bits to represent positive integers and uses all 4 bits to represent the value. Which of the following operations would result in integer overflow? A. 1+14 B. 15+1 C. 4x4 D. 3x3 E. 6+1 F. 15x1
B. 15+1 C. 4x4 Why? B. In a 4-bit system that uses all bits to represent the value, the highest decimal number that can be represented is 15. This expression results in 16, so it would result in integer overflow. C. In a 4-bit system that uses all bits to represent the value, the highest decimal number that can be represented is 15. This expression results in 16, so it would result in integer overflow.
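The same check can be run for all six options. In this sketch, `overflows` is a hypothetical helper that compares each result against the largest value that fits in 4 bits.

```python
def overflows(result, bits=4):
    """Return True if result is too large for an unsigned integer of this width."""
    largest = 2 ** bits - 1  # 15 when bits is 4
    return result > largest


options = {
    "A. 1+14": 1 + 14,
    "B. 15+1": 15 + 1,
    "C. 4x4": 4 * 4,
    "D. 3x3": 3 * 3,
    "E. 6+1": 6 + 1,
    "F. 15x1": 15 * 1,
}

for label, result in options.items():
    print(label, "=", result, "overflow" if overflows(result) else "fits")
```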
A scientist is researching the effects of magnesium supplements on depression and is considering whether to publish the research in a conventional journal or an open-access journal. In terms of access, who would benefit the least from the decision to publish in an open-access journal? A. A high school mental health counselor B. A layperson with a history of depression C. A researcher at a top-tier university D. A doctor in a developing country
C. A researcher at a top-tier university Why? Top-tier universities typically provide a way for their students and researchers to access research published in conventional journals. Often, that involves the university paying a large yearly subscription fee to the journal.
Consider these files: A 3-second audio recording of a baby's first word. A 30-second video recording of a baby's first steps. Is it possible for the 30-second video recording to have a smaller file size than the 3-second audio recording? A. Yes, but only if the 30-second video is recorded with a smartphone. B. Yes, but only if the 30-second video is saved using a patented standard. C. Yes, if the 30-second video is compressed with a lossy compression algorithm. D. No, the 30-second video will always take up more space than a 3-second audio recording.
C. Yes, if the 30-second video is compressed with a lossy compression algorithm. Why? A lossy compression algorithm reduces file size by removing detail, so this is a good explanation.
A startup is developing a new web browser with a focus on accessibility for visually impaired users. The startup founder is considering the benefits and drawbacks of releasing the code online under an open-source license. What would be a consequence of releasing the code with an open-source license? A. The startup would be giving up their exclusive rights and the code would be free of copyright restrictions. B. Other companies would be able to use the code inside their software and could license their version of the code under an open-source license. C. The startup would not be able to make money by selling the browser. D. Other companies and individuals would be able to view and use the code according to the open-source license conditions.
D. Other companies and individuals would be able to view and use the code according to the open-source license conditions. Why? Every open source license has a different set of conditions that describe how others are allowed to use, modify, and distribute the code.
Bits (binary digits)
Computers store information using bits. A bit (short for "binary digit") stores either the value 0 or 1. A single bit can only represent two different values. That's not very much, but that's still enough to represent any two-valued state.
Storing Text in Binary
Computers store more than just numbers in binary. But how can binary numbers represent non-numbers such as letters and symbols? As it turns out, all it requires is a bit of human cooperation. We must agree on encodings, mappings from a character to a binary number.
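As a small illustration of such an encoding, the sketch below maps each character to its 8-bit ASCII pattern. The function name `text_to_binary` is made up for this example, and it assumes the text contains only ASCII characters.

```python
def text_to_binary(text):
    """Return the 8-bit ASCII pattern for each character, separated by spaces."""
    patterns = []
    for ch in text:
        code = ord(ch)  # the number the encoding assigns to this character
        if code > 127:
            raise ValueError("this sketch only handles ASCII characters")
        patterns.append(format(code, "08b"))  # write that number as 8 bits
    return " ".join(patterns)


print(text_to_binary("AB"))  # 01000001 01000010
```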
A musician decides to make a digital recording of their concert to share it with fans who cannot attend the concert in person. What is true about the process of converting the concert's audio waves into a digital recording? A. If the recording equipment uses a very small sampling interval, the digital recording will be a very good representation but will not contain every detail. B. In order for the digital recording to be a more accurate representation of the concert, the recording should use fewer bits to represent each sample. C. To capture the most detail in the digital recording of the concert, the recording process must take samples at the largest interval possible. D. As long as the musician purchases a computer with enough storage capacity, the digital recording can record every detail of the original analog data.
A. If the recording equipment uses a very small sampling interval, the digital recording will be a very good representation but will not contain every detail. Why? An analog signal contains infinite detail, a value for every moment in time. There is currently no such thing as infinite storage, thus a digitized recording can never contain every detail. (It may contain enough details to fool human ears, however.)
Number limits, overflow, and roundoff
When computer programs store numbers in variables, the computer needs to find a way to represent that number in computer memory. Computers use different strategies based on whether a number is an integer or not. Due to limitations in computer memory, programs sometimes encounter issues with roundoff, overflow, or precision of numeric variables.
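Roundoff is easy to see with decimal fractions. The sketch below assumes Python's usual floating-point numbers, which cannot store 0.1 or 0.2 exactly, so their sum is not exactly 0.3.

```python
import math

total = 0.1 + 0.2

print(total == 0.3)              # False, because of roundoff
print(math.isclose(total, 0.3))  # True: compare with a tolerance instead
```

Python's own integers grow as needed and do not overflow, so this example only demonstrates roundoff.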
Which of these lists correctly orders the binary numbers from smallest to largest? A. 00010101, 00010001, 00100111, 01001000 B. 00010001, 00010101, 00100111, 01001000 C. 00010001, 00100111, 00010101, 01001000 D. 00010001, 00010101, 01001000, 00100111
B. 00010001, 00010101, 00100111, 01001000 Why? This list correctly orders the binary numbers (00010001 = 17, 00010101 = 21, 00100111 = 39, 01001000 = 72).
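To check an ordering like this, convert each binary string to its decimal value and sort by that value. A minimal sketch using Python's built-in `int` with base 2 (the starting order of the list is arbitrary):

```python
binary_numbers = ["00100111", "01001000", "00010001", "00010101"]

# int(bits, 2) reads a string of 0s and 1s as a base-2 number.
ordered = sorted(binary_numbers, key=lambda bits: int(bits, 2))

for bits in ordered:
    print(bits, "=", int(bits, 2))
```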
When writing binary data, we often put a space between each byte to make it easier for humans to read. Consider this binary data: 1011011010110001 Which option puts a space after each byte? A. 1 0 1 1 0 1 1 0 1 1 0 0 0 1 B. 10110110 10110001 C. 10 11 01 10 10 11 00 01 D. 1011 0110 1011 0001
B. 10110110 10110001 Why? A byte is 8 bits and a bit is a binary digit that stores either 0 or 1. This choice correctly puts a space after every 8 bits.
Consider this sequence of bits: 1011001010010010 How many bytes long is that sequence of bits? A. 16 B. 2 C. 4 D. 8
B. 2
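Why? The sequence has 16 bits, and 16 bits divided by 8 bits per byte is 2 bytes. The sketch below splits a bit string into 8-bit groups, which answers both this card and the spacing card before it. The function name `split_into_bytes` is made up for this example.

```python
def split_into_bytes(bits):
    """Split a string of 0s and 1s into groups of 8 bits."""
    if len(bits) % 8 != 0:
        raise ValueError("the number of bits must be a multiple of 8")
    return [bits[i:i + 8] for i in range(0, len(bits), 8)]


print(" ".join(split_into_bytes("1011011010110001")))  # 10110110 10110001
print(len(split_into_bytes("1011001010010010")))       # 2
```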
Which of these lists correctly counts from 1 to 5 in binary? A. 0001, 0010, 0100, 0011, 0101 B. 0001, 0010, 0101, 0100, 0011 C. 0001, 0010, 0011, 0101, 0100 D. 0001, 0010, 0011, 0100, 0101 E. 0001, 0010, 0100, 0101, 0011
D. 0001, 0010, 0011, 0100, 0101
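Why? Counting in binary works like counting in decimal, except that each place rolls over after 1 instead of 9. A short sketch that prints 1 through 5 as 4-bit binary using Python's built-in `format`:

```python
# "04b" means: binary digits, padded with leading zeros to 4 places.
counting = [format(n, "04b") for n in range(1, 6)]

print(", ".join(counting))  # 0001, 0010, 0011, 0100, 0101
```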