Discovering connections within Information 10-22
10) Which BEST describes the term metadata? A Data about other data. It is often compared to the information that previous generations saw in the card catalog in the library — not the book itself, but information about the book. B Information, often quantitative or qualitative, stored by a computer and used in its operations and calculations. C After or beyond. Often used to describe abstraction. D A collaborative meeting of statisticians, computer programmers, economists, and others at which data are discussed, analyzed and processed.
Choice 'A' is the correct answer. Metadata is data about data. For instance, if one takes a photo on a cell phone, the picture itself is the data. The cell phone also saves metadata about the photo. This may include information identifying the phone on which the picture was taken, the date and time the photo was taken, the GPS location of the phone when the picture was taken, and so on. This other information about the photo is the metadata, as opposed to the data that is the picture itself. Explanation of Distractors: Choice 'B' is incorrect because the stored information is data, not metadata. Choice 'C' is incorrect. Meta is Greek for "after" or "beyond." However, the term we are interested in is metadata. Choice 'D' is incorrect. Metadata is not a meeting about data.
15) Wireless companies collect metadata for each phone call made on their network, including the number of the caller, the number called, the location of the cell tower used for the call, and the duration of each call. Which of the following can NOT be discovered from the metadata collected by cell phone companies? A What was said in your phone conversation. B When and for how long you spoke to your psychiatrist over the phone. C Whether you called a suicide prevention hotline last winter. D Where you were when you called home last Friday night.
Choice 'A' is the correct answer. The content of the conversation is the data being sent. The wireless companies are not collecting that data. Explanation of Distractors: Wireless companies do, however, know whom you called and for how long. Thus, the metadata from your calls can be used to determine when and for how long you spoke with your psychiatrist. The metadata can also be used to determine whether you called a specific phone number last winter. Thus, Choices 'B' and 'C' are not correct. Cell phone metadata also allows your wireless company to know your approximate location according to which cell tower was used for your call. They may know more precise locations using the GPS location sensors in most phones today. Thus, Choice 'D' is not correct.
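The distinction in question 15 can be sketched as a record type. This is only an illustration: the class name `CallMetadata`, its field names, and every value below are hypothetical, and the `started_at` timestamp is an assumption added because the explanation says the metadata reveals when a call took place.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CallMetadata:
    """Hypothetical record of what a carrier stores about one call."""
    caller_number: str
    called_number: str
    tower_location: str
    duration_seconds: int
    started_at: str  # assumption: a timestamp is stored with each call


# All values are made up for illustration.
record = CallMetadata(
    caller_number="555-0100",
    called_number="555-0199",
    tower_location="Tower 12, Main St",
    duration_seconds=1800,
    started_at="2016-01-15T19:30",
)

field_names = [f.name for f in fields(CallMetadata)]
print(field_names)

# The record describes the call but has no field for what was said.
print("conversation_audio" in field_names)
```

The record is enough to answer "who, when, where, and for how long" (choices B, C, and D), but nothing in it holds the content of the conversation (choice A).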
18) Which of the following is an accurate statement of what this chart is showing? A Generally speaking, since 2004, more people have searched "Harvard University" than have searched "University of Texas at Austin", "University of Florida", "Texas A&M University", or "University of Miami" B Since sometime around 2012, Texas A&M University has become the favorite university of more people. C The general decline in the searched term "Harvard University" might be due to the rising cost to go to Harvard University. D Generally speaking, the volume of Internet searches is increasing over time because the number of people using the Internet is also increasing.
Choice 'A' is the correct answer. While the number of searches for "Harvard University" is on the decline, we can see from the vertical bar chart on the left that it averages more searches than any of the others. Explanation of Distractors: Choice 'B' is incorrect because, while there was an increase in the number of searches for "Texas A&M University", that does not mean it is the favorite college. The increase in searches could come from interest in new majors, news articles, or sports teams. Choice 'C' is incorrect because, while we can see a decline in the number of searches, we cannot say it is due to cost. We do not have that data and, even if we did, there is no way to prove it is causation instead of correlation. Choice 'D' is incorrect. If you look at the search term "University of Miami" you will notice a steady number of searches. This shows that, despite the growth
22) Which of the following statements are true in regards to the charts? A Last item purchased Average purchase amount (in dollars) Internal customer reference number B Last item purchased Product category preference Average purchase amount (in dollars) C Address Last item purchased Product category preference D Last item purchased Telephone number Average purchase amount (in dollars)
Choice 'B' is the correct answer. Any of the following pieces of data could be used to help identify an individual customer: customer name, social security number, address, telephone number, and internal customer reference number. None of these appears in Choice 'B'. Explanation of Distractors: Choice 'A' is an incorrect answer. "Last item purchased" and "Average purchase amount (in dollars)" are individually safe to use. However, "Internal customer reference number" could be used to look up the actual customer and link the customer to the other information. Choice 'C' is an incorrect answer. "Last item purchased" and "Product category preference" are individually safe to use. However, "Address" could be used to look up the actual customer and link the customer to the other information. Choice 'D' is an incorrect answer. "Last item purchased" and "Average purchase amount (in dollars)" are individually safe to use. However, "Telephone number" could be used to look up the actual customer and link the customer to the other information.
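The reasoning for question 22 amounts to checking each answer choice against the list of identifying fields. A minimal sketch, assuming the five fields named in the explanation are the complete list of identifiers; the variable and function names are illustrative only:

```python
# Fields the explanation lists as able to identify an individual customer.
IDENTIFYING_FIELDS = {
    "Customer name",
    "Social security number",
    "Address",
    "Telephone number",
    "Internal customer reference number",
}

# The fields offered by each answer choice in question 22.
choices = {
    "A": ["Last item purchased", "Average purchase amount (in dollars)",
          "Internal customer reference number"],
    "B": ["Last item purchased", "Product category preference",
          "Average purchase amount (in dollars)"],
    "C": ["Address", "Last item purchased", "Product category preference"],
    "D": ["Last item purchased", "Telephone number",
          "Average purchase amount (in dollars)"],
}


def identifying_fields_in(field_list):
    """Return the fields in field_list that could identify a customer."""
    return [name for name in field_list if name in IDENTIFYING_FIELDS]


for label, field_list in choices.items():
    print(label, identifying_fields_in(field_list))

safe_choices = [label for label, field_list in choices.items()
                if not identifying_fields_in(field_list)]
print(safe_choices)  # ['B']
```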
21) In order to encode colors, we use the RGB color model, in which varying intensities of red, green, and blue light are added together to reproduce a broad array of colors. In order to represent 16,777,216 different colors in fewer bits, we represent each color (red, green, and blue) as a byte of information. Which of the following could be a color code in this system? A F1D B 09AC13 C 10AG3F D 91D3AB7
Choice 'B' is the correct answer. Since each color is represented as a byte of information and each byte can be expressed as two hexadecimal digits, the correct response contains six hexadecimal digits. The largest hexadecimal digit that we can use is F, since binary 1111 = 15 and 15 in hexadecimal is F. Red, green, and blue use 8 bits each, which gives each decimal values from 0 to 255. This makes 256 × 256 × 256 = 16,777,216 possible colors. Explanation of Distractors: Choice 'A' is incorrect because this would be a color in a 12-bit system. In a 12-bit system, 4 bits are reserved for each color, so each color can be represented with a single hexadecimal digit. Choice 'C' is incorrect because, while it has the correct number of digits, it contains a G, and the highest possible hexadecimal digit is F. Since the maximum number that can be expressed in 4 bits is binary 1111 = 15, the highest hexadecimal digit we can use is F. Choice 'D' is incorrect because there are seven digits. Since each color is represented by a byte, each is assigned two hexadecimal digits. Therefore seven digits is one more than the six digits required.
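The two rules used above (exactly six digits, each a hexadecimal digit) can be checked mechanically. The following is a minimal sketch; the function names `is_rgb_hex_code` and `to_rgb` are made up for this illustration.

```python
import string

HEX_DIGITS = set(string.hexdigits)  # 0-9, a-f, A-F


def is_rgb_hex_code(code):
    """True if code is exactly six hexadecimal digits (one byte per color)."""
    return len(code) == 6 and all(ch in HEX_DIGITS for ch in code)


def to_rgb(code):
    """Split a valid six-digit code into (red, green, blue), each 0-255."""
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


for choice in ["F1D", "09AC13", "10AG3F", "91D3AB7"]:
    print(choice, is_rgb_hex_code(choice))

print(to_rgb("09AC13"))  # (9, 172, 19)
print(256 ** 3)          # 16777216
```

Only 09AC13 passes both checks: F1D is too short, 10AG3F contains the non-hexadecimal character G, and 91D3AB7 is one digit too long.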
11) In June of 2015, the press reported that hackers had accessed data kept by the federal government on millions of current employees and retirees. The data breach was thought to include not only social security numbers, but also work history, salary, benefits, and training data. The biggest concern following these press reports concerns which of the following AP® Computer Science Principles standards? A The effective use of large data sets requires computational solutions. B Maintaining privacy of large data sets containing personal information can be challenging. C Software tools, including spreadsheets and databases, help to efficiently organize and find trends in information. D Transforming information can be effective in communicating knowledge gained from data.
Choice 'B' is the correct answer. The biggest concern in the press reports focused on the issues regarding protecting personal information in large data sets. Explanation of Distractors: It is true that large data sets require computation, but that was not the major worry following this hacking incident. Thus, Choice 'A' is incorrect. Similarly, the data could be used to spot trends or communicate knowledge, but these again were not the biggest concern. Thus, Choices 'C' and 'D' are incorrect.
13) A powerful hurricane hit the coast of Alabama. In order to better analyze the situation, tweets sent by people in the state of Alabama during the brunt of the storm were accumulated and stored in a large database. The next day, meteorologists used this data to analyze the strength of the hurricane. After analysis of the tweets, the meteorologists concluded that the storm was fairly weak. However, this did not coincide with the devastating aftermath of the storm captured by their own cameramen. Which of the following would be the MOST likely explanations for the discrepancy between the tweets and the video footage? A The people tweeting during the brunt of the storm sent their tweets before they evaluated the damage to their property. B Since the tweets were captured during the brunt of the storm, they reflected reality more than the actual video taken by the cameramen. C The hurricane disrupted cell service for those people affected most during the brunt of the storm. D The people with the most damage were preoccupied during the brunt of the storm, so tweeting was not a high priority for them at that
Choice 'C' is a correct answer, since it is very likely that cell phone service would be disrupted under the extreme conditions of a hurricane. Therefore, a majority of the tweets are probably from people on the remnants of the hurricane, who were less affected by the storm. Choice 'D' is a correct answer. Those people who are fighting for survival and protecting their families are less likely to think about tweeting during the brunt of the storm. Therefore, a majority of the tweets are probably from people on the remnants of the hurricane, who were less affected by the storm. Explanation of Distractors: Choice 'A' is an incorrect answer. Although there is a possibility that some people may inadvertently tweet a positive message before realizing the damage to their property, this is less likely than the other stated possibilities. Most people tend to take stock of the situation and report the truth. Choice 'B' is an incorrect answer. A picture is worth a thousand words. It would be difficult to dispute actual video footage from the hurricane and ignore it in favor of tweets which could be based on false information.
19) A small t-shirt company sells its shirts manually by recording sales using paper and pencil. The company then adjusts its inventory daily using a spreadsheet saved locally on a computer's hard drive at its one store. Production of new t-shirts is based on information from this spreadsheet. Business, however, has exploded in recent months, and the company has decided to open another store across town. It also wants to make sure that it is equipped to expand further in the future. The company is prepared to make reasonable expenditures to expand its business. Which of the following would be the best approach for the company to manage its inventory and allow it to expand in the future? A The company should replace its current spreadsheet with a new spreadsheet stored on a cloud that can be accessed remotely by its other stores. B The company should provide separate inventory spreadsheets at each store location and combine the information into the master company inventory spreadsheet on a nightly basis. C The company should replace its current spreadsheet with new cash registers connected to a centralized inventory management system that can be accessed remotely by all of its stores. D To save money, the company should continue to use its current spreadsheet approach and have each store communicate their adjusted inventory levels at the end of each week
Choice 'C' is the correct answer. Since the company is expanding fairly rapidly and wants to look into future expansion plans, it would be logical to scrap the current approach of recording sales on paper and entering inventory into a spreadsheet. A more reasonable approach would be to upgrade their system by using cash registers that are integrated with an online inventory management system. One advantage of this new approach is that inventory levels would remain up-to-date throughout the day as sales are being made, and therefore the company could make more timely decisions on what needed to be restocked. Explanation of Distractors: Choice 'A' is an incorrect answer. This approach would be an improvement over the current approach, since the new spreadsheet could be accessed by all stores. However, it would pose other issues such as how to limit a store to only its individual inventory levels. Also, there is no mention of improving the "paper and pencil" approach to recording sales, which would be very cumbersome as the company is expanding. Choice 'B' is an incorrect answer. This approach could possibly work, but it isn't much of an improvement over the current approach. Also, it would be labor-intensive and possibly error-prone to combine the spreadsheets from each store each night. Additionally, there is no mention of improving the "paper and pencil" approach to recording sales, which would be very cumbersome as the company is expanding. Choice 'D' is an incorrect answer. This approach is basically keeping the existing spreadsheet and updating it with the inventory information from each store on a weekly basis. Since the data is merged on a weekly basis, the central inventory levels are not up-to-date throughout the week, and it would be difficult to schedule production of new t-shirts. Also, there is no mention of improving the "paper and pencil" approach to recording sales, which would be very cumbersome as the company is expanding.
12) Which of the following is the most correct statement about what the above chart is displaying? A In general, from 2012 through 2015, the consumer demand for apples was very erratic. B In general, from years 2007 to 2010, oranges were a more popular fruit item than apples. C From 2012 to 2015, there is more variation in the number of searches for "apple" than the number of searches for "orange", "banana" or "peach". D The relatively level line for the search term "banana" is probably because bananas are eaten year-round.
Choice 'C' is the correct answer. The chart shows multiple peaks and valleys from 2012 through 2015 for the search term "apple", which means that there is a considerable amount of variation. However, the line for search term "orange" generally shows a gradual decline in the number of searches, without a great deal of peaks and valleys. The lines for search terms "banana" and "peach" are fairly horizontal, which means that there is very little variation in the number of times these search terms were entered. Choice 'A' is an incorrect answer. Google Trends shows the frequency of different search terms during a time period. It does not explain why the search terms were used. Therefore, this graph does not tell you anything about the demand for apples or any of the other fruit terms. Choice 'B' is an incorrect answer. Google Trends shows the frequency of different search terms during a time period. It does not explain why the search terms were used. Therefore, this graph does not tell you anything about the popularity of these fruit terms. For example, the search term "orange" may have been entered more frequently because it is a color of a sports team and not because it is a popular fruit item. Choice 'D' is an incorrect answer. Google Trends shows the frequency of different search terms during a time period. It does not explain why the search terms were used. Therefore, this graph does not tell you anything about the number of bananas eaten throughout the years.
9) Richard sent a letter to Juanita. It was an old-fashioned, snail-mail letter sent via the US Postal Service. Which of the following is best described as the metadata about this letter? A The US Postal Service, who picks up the letter from the blue box at the end of Richard's street and delivers it to Juanita's mailbox a couple of days later. B The letter itself, in which Richard says he misses Juanita terribly and is looking forward to seeing her over the next school break. C The envelope, which has both Richard's and Juanita's addresses, a stamp, and the postmark indicating from where the letter was sent. D The mail carrier, who realized that the envelope had been sorted into the wrong pile and got it into the correct mailbox despite the poor handling at the post office.
Choice 'C' is the correct answer. The data about the data is metadata. In this case, the information on the envelope is the metadata. The envelope contains information about the letter, that is, the sender, the intended recipient, and a postmark indicating approximately where the letter was mailed from. Explanation of Distractors: Choices 'A' and 'D' are incorrect. The Postal Service and the mail carrier are very important in the whole process, but not really significant to this question. That is, they transport the data, but are neither the data being sent nor the metadata about the data being sent. Choice 'B' is incorrect. The letter itself is the data being sent, but not the metadata.
8) The Bureau of Economic Analysis collects data on the size of the US economy. The most widely followed measure of the size of the economy is the Gross Domestic Product, which attempts to measure the value of all goods and services produced in the US in a given year. The spreadsheet below summarizes the data for 2004-2014.
Year    Gross Domestic Product
2004    13,774
2005    14,234
2006    14,614
2007    14,874
2008    14,830
2009    14,419
2010    14,784
2011    15,021
2012    15,369
2013    15,710
2014    16,086
Note: Billions of dollars, adjusted for inflation, 2009 dollars.
Which of the following is BEST supported by the data in the spreadsheet? A The GDP is larger every year. B The GDP will never go above 20 trillion dollars. C The GDP is trending upwards. D The GDP only falls during even years.
Choice 'C' is the correct answer. The trend is clearly upwards: despite a dip in 2008 and 2009, GDP increases from 13,774 in 2004 to 16,086 in 2014. Choice 'B' is incorrect. We cannot use the values in the spreadsheet to estimate a maximum value over time. Choices 'A' and 'D' are incorrect. The spreadsheet shows the GDP falling in both 2008 and 2009, an even and an odd year.
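The claims about question 8 can be checked directly against the spreadsheet values. A minimal sketch that lists the years in which GDP fell and the overall change; the variable names are illustrative only:

```python
# GDP in billions of 2009 dollars, copied from the spreadsheet in question 8.
gdp = {
    2004: 13774, 2005: 14234, 2006: 14614, 2007: 14874,
    2008: 14830, 2009: 14419, 2010: 14784, 2011: 15021,
    2012: 15369, 2013: 15710, 2014: 16086,
}

years = sorted(gdp)

# Years in which GDP was lower than in the year before.
decline_years = [year for prev, year in zip(years, years[1:])
                 if gdp[year] < gdp[prev]]
print(decline_years)  # [2008, 2009]

# Overall change from the first year to the last.
overall_change = gdp[years[-1]] - gdp[years[0]]
print(overall_change)  # 2312
```

The declines in 2008 (even) and 2009 (odd) rule out Choices 'A' and 'D', while the net gain of 2,312 billion dollars over the period is consistent with the upward trend in Choice 'C'.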
20) Based solely on the patterns and trends of this data, which of the following predictions would be the LEAST logical? A Sales of blenders in July, 2016 will be approximately $64,000. B The profit on blender sales in June, 2016 will be approximately $16,000. C The profit on toaster sales in June, 2016 will be positive. D Sales of toasters in July, 2016 will be approximately $20,000.
Choice 'D' is the correct answer. Looking at toaster sales across months, there does not seem to be a discernible trend or pattern. Therefore, it would be a pure guess to predict sales of $20,000 in July, 2016. Choice 'A' is an incorrect answer, since it is a logical prediction. When evaluating blender sales, the sales amounts are roughly doubling every month. Therefore, it is logical to predict that blender sales in June, 2016 would be about $32,000 and sales in July, 2016 would be approximately $64,000. Choice 'B' is an incorrect answer, since it is a logical prediction. When evaluating blender profit, it appears that profit is consistently 50% of sales dollars. Since sales of blenders are approximately doubling every month, June sales should be about $32,000 and profit on these sales would be about $32,000 × 0.5 = $16,000. Choice 'C' is an incorrect answer, since it is a logical prediction. When evaluating toaster profitability, it appears that the company was struggling in the first few months and losing money on toaster sales. However, the negative profitability improved ste
7) One of the features of Gapminder is a search feature that allows users to highlight a certain country. The country's data is tracked throughout the years, so at the end of the animation, the user can see the country's path. Here's the 2015 graph again, this time with the search feature used to find Cambodia: Note, the data point labeled 1910 is the first data point available for Cambodia. What trends are evident in the graph above that would NOT be identifiable without the search feature? Select TWO answers. A There were only two stretches of time when Cambodia followed the general trend of other countries: as income increases, life expectancy increases. B A significant event in Cambodia's history must have impacted its life expectancy. C In 2015, the life expectancy in Cambodia was about 70 years old. D In 2015, the income per person in Cambodia was about 3500.
Choices 'A' and 'B' are both correct. The search feature allows the user to see Cambodia's path on the graph through time as opposed to one moment in time. From this path, we can see that the upward trend occurred for two stretches of time. We can also see a dramatic dip in life expectancy, indicating a significant historical event. Explanation of Distractors: Choices 'C' and 'D' are both true facts, but they could be seen by a viewer without the search feature enabled. The search feature gives users a path as opposed to a snapshot of data.
17) On December 31, a television meteorologist boldly concludes that next year's total snowfall will be approximately 5.0 inches. Of the following data sets, which best supports that prediction? Note: Actual snowfall numbers for each year are recorded in inches.
Choices 'A' and 'B' are the correct answers. Choice 'A' is correct, because the snowfall amounts for the four most recent years appear to show a trend. Each year's snowfall amount is approximately 75% of the prior year's snowfall amount. Since the current year's snowfall amount is 6.7 inches, then 6.7 × 0.75 ≈ 5.03. Therefore, approximately 5 inches would be a logical prediction. Choice 'B' is correct, because the snowfall amounts for the five years appear to show a pattern. Every other year has a snowfall amount of around 14 inches and the remaining alternating years tend to have a snowfall level of about 5 inches. The pattern is 14, 5, 14, 5, 14, etc. Since the current year had a snowfall amount of 14, the pattern would tend to suggest that next year's snowfall level would be about 5 inches. Explanation of Distractors: Choice 'C' is an incorrect answer. The snowfall amounts do not show a discernible pattern or trend. Using this data set to predict a total snowfall level of 5 inches next year would be an outlandish guess. Choice 'D' is an incorrect answer. There is a slight pattern here, but it does not support a snowfall level of 5 inches next year. The pattern shows that each year shows a slight but steady increase from two years prior. Using this data set, the meteorologist might conclude that next year's snowfall amount might be in the 8-10 inch range, since CY -1 has a snowfall amount of 6.8 inches and two years before that had a snowfall amount of 4.6 inches.
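The three extrapolations described for question 17 can be sketched with the figures quoted in the explanation. The original data sets are not reproduced here, so this uses only the stated numbers (6.7 inches and a 75% ratio for Choice 'A', the rounded 14, 5, 14, 5, 14 pattern for Choice 'B', and 6.8 and 4.6 inches for Choice 'D'); the function names are hypothetical.

```python
def next_by_ratio(current, ratio):
    """Extrapolate a trend in which each year is a fixed fraction of the last."""
    return current * ratio


def next_by_alternation(history):
    """Extrapolate a two-year alternating pattern: repeat the value from two years ago."""
    return history[-2]


def next_by_two_year_step(last_year, three_years_ago):
    """Extrapolate a steady increase over two-year gaps to next year."""
    return last_year + (last_year - three_years_ago)


# Choice A: current year is 6.7 inches, each year about 75% of the prior year.
prediction_a = next_by_ratio(6.7, 0.75)

# Choice B: rounded pattern quoted in the explanation, current year last.
prediction_b = next_by_alternation([14, 5, 14, 5, 14])

# Choice D: CY-1 was 6.8 inches; two years before that was 4.6 inches.
prediction_d = next_by_two_year_step(6.8, 4.6)

print(f"A: {prediction_a:.1f} inches")  # A: 5.0 inches
print(f"B: {prediction_b} inches")      # B: 5 inches
print(f"D: {prediction_d:.1f} inches")  # D: 9.0 inches
```

Choices 'A' and 'B' both land near 5 inches, while the pattern in Choice 'D' points to roughly 9 inches, inside the 8-10 inch range given in the explanation.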
16) Companies spend a lot of time and effort collecting information into large data sets which they interrogate to search for patterns in the data. Which of the following are examples where searching for patterns in the data could help answer a question or verify a hypothesis? A A high school principal analyzes the current grades across subjects of all junior students to try to determine which students will likely enroll at the local college. B The county election board analyzes the election's mayoral vote count to determine the winner and runner-up of the election. C An online search engine analyzes the buying habits of its users to determine what types of products the user may be interested in purchasing in the future. D An online book clearing house analyzes its monthly sales history to determine which books had no sales during the past month.
Choices 'A' and 'C' are the correct answers. Choice 'A' is a correct answer. The principal is taking into account the grades per subject for all juniors and using this information to try to determine which students are the most likely to enroll at the local college. In other words, the principal is trying to discover patterns in the data that would signal the likelihood of students going to the local college. Choice 'C' is a correct answer. The online search engine is capturing data on its users and attempting to find patterns in the data that help predict which products the users are interested in purchasing in the future. Explanation of Distractors: Choice 'B' is an incorrect answer. The election board is simply tallying up the vote counts of the mayoral candidates to determine the winner and the runner-up. They do not have to investigate patterns in order to do this task. Choice 'D' is an incorrect answer. The book clearing house is looking at sales and filtering the data to only show the books with sales equal to zero last month. It is not necessary to look at patterns in order to get this information.