Unit 5: Big Data AP Practice Questions
Both small and big businesses can benefit from using big data in their organization. Which of the following are ways businesses could use big data to their advantage?
- Big data could be used to give the company a competitive edge by looking at data that can tell them where they are lagging behind in comparison to other companies, allowing them to make changes accordingly.
- Big data could help keep track of how well a promotional offer worked for a company and can determine if it would be worthwhile to run the promotion again.
- Big data regarding customer behavior could be collected and analyzed in real time and then used to implement real time solutions.
(ALL OF THESE)
Web browsers such as Google Chrome and Mozilla Firefox allow users to browse the web in an anonymous session. When the session is ended, any browsing history and cookies created from the session are deleted. Which of the following statements best describes the security situation when browsing using anonymous mode?
Although local browsing data will not be stored, websites can see your IP and track your activity if you log in to accounts on these sites.
Consider the following relational database which contains census data. Which of the following would be the result if a user were to query this database for any "City" with a population between 100,000 and 1,000,000?
Anaheim, Austin, Charlotte, Tempe
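A range query like the one in this question can be sketched in a few lines. The census table itself is not reproduced here, so the city names and populations below are made-up placeholders, and the table and column names (`census`, `city`, `population`) are assumptions for illustration only. Note that SQL's `BETWEEN` includes both endpoints.

```python
import sqlite3

# Hypothetical rows: fictional cities with placeholder populations,
# not the census data referred to in the question.
rows = [
    ("Smallville", 50_000),
    ("Midtown", 250_000),
    ("Lakeside", 900_000),
    ("Metropolis", 3_000_000),
]

# Build a throwaway in-memory database with an assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE census (city TEXT, population INTEGER)")
conn.executemany("INSERT INTO census VALUES (?, ?)", rows)

# BETWEEN is inclusive on both ends of the range.
query = """
    SELECT city FROM census
    WHERE population BETWEEN 100000 AND 1000000
    ORDER BY city
"""
matches = [city for (city,) in conn.execute(query)]
conn.close()

print(matches)  # ['Lakeside', 'Midtown']
```

Only the rows whose population falls inside the range are returned; the cities that are too small or too large are filtered out.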
A drug company is developing a new drug, and has already achieved some success with laboratory experiments on tissue samples. Before conducting trials of the drugs in living organisms the company will use these results, and medical knowledge about the body, to develop a simulation of how the drugs might affect the entirety of a person's body when administered. Which of the following is NOT a valid reason why the drug company might wish to do this?
The simulation will completely eliminate the need to conduct tests in living organisms before the drug is released to market, saving the company money.
A web browser uses locally cached data to speed up load-times for recently visited websites by a user. Which of the following is a likely negative consequence of this feature?
The usable storage space of the device on which the browser is running will decrease.
There are many computer applications that have been designed to help people search through large data sets to find patterns. However, not all questions require a search for a hidden pattern. Seeking the answer to which of the following questions is least likely to require an investment in software?
Which contestants took the top three prizes in a talent show at a neighborhood block party?
Students are using data collected from a non-profit organization to try to convince the school board that their school should be in session year-round with several week-long breaks as opposed to the usual 9 months on and 3 months off. Information that was collected by this organization was as follows.
- The location of the school (city and country)
- The number of students at the school
- Whether it was year-round or had the normal 3-month summer break
- Scores on standardized tests (AP, SAT, ACT, etc)
- The student handbook of rules and regulations
- Results from a survey from teachers and students about happiness level and motivation level
They decided to make an infographic in order to try to easily display the data they had analyzed. Which of the following would be the best information to put on their infographic to try to convince the school board to change the schedule? Select two answers.
1. Association rules showing links between motivation and happiness levels and the type of schooling students were receiving.
2. A regression analysis of standardized test scores comparing the two different types of schooling.
In Fantasy Football, participants compete against one another by choosing certain players from different NFL teams that they think will do the best on any particular week. Top Fantasy Football players spend hours every day looking at huge databases of statistics related to the players and the teams often using spreadsheets and software tools to gain new insights and choose the best players. This process could be considered an example of which of the following?
Data mining
Many universities have multiple campuses which students can attend. For example, The Pennsylvania State University has 24 total campuses. Although there are different campuses, some staff and employees have access to student records from all campuses in a large database. Which of the following is NOT a relevant factor which should be considered by a University in the development and creation of a database of this type?
How to ensure that there is a complete copy of the database stored at every campus
For situations that may be too dangerous, costly, or otherwise too difficult to test in the real world, what do computer scientists create in order to help discover new knowledge and create new hypotheses related to the situation they are studying?
Simulations
The table below shows the time a computer system takes to complete a specified task on the customer data of different-sized companies. Based on the information in the table, which of the following tasks is likely to take the longest amount of time when scaled up for a very large company of approximately 100,000 customers?
Sorting data
A highway has just been enlarged to consist of two lanes in each direction. The government body responsible for the highway is considering two different sets of rules for drivers on the newly upgraded highway. One set of rules will allow drivers to overtake using any of the lanes, while the other will allow drivers to overtake only using the left-hand lane. To help understand the effect these two sets of rules will have on traffic flows and congestion, the government contracts a company to build a computer simulation. Which of the following statements about this simulation is true?
The simulation will likely require some simplifications and assumptions to be made about the behavior of drivers on the road.
When NASA (National Aeronautics and Space Administration) first tried to launch a man into orbit, there were a lot of factors that had to be considered when making the calculations (at the time these were done by hand) for the launch. Some factors that needed to be included are the following weather features - temperature, wind speed, humidity and dew point. Today, these calculations are done with computers, but this data still needs to be accessed for the computer to complete these calculations. Suppose that this data is to be inputted manually by the astronauts using information from the site weather.com. How should this information be presented to the astronauts so that they can enter it into the computers easily and correctly?
A structured table of only the variable and value pairs needed to be entered into the computers from weather.com's forecast for the city from which they are launching.
A messaging company keeps track of the identities of the sender and receiver of every message sent, as well as the content of the message being sent. Which of the following could fall under the category of metadata?
I - How many words were in a message?
II - Was there an emoji used in a message?
III - What time was a message received?
I, II and III
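The three items in this question are all facts about a message rather than the message itself. A minimal sketch of deriving them, assuming a hypothetical helper named `describe_message` and a deliberately rough emoji check that only covers a few common Unicode blocks:

```python
from datetime import datetime, timezone


def describe_message(text, received_at):
    """Return descriptive facts about a message (hypothetical helper)."""
    # Rough emoji test: only a few common emoji code point ranges are
    # checked, so this is an approximation, not a complete detector.
    has_emoji = any(
        0x1F300 <= ord(ch) <= 0x1FAFF or 0x2600 <= ord(ch) <= 0x27BF
        for ch in text
    )
    return {
        "word_count": len(text.split()),        # I: how many words
        "has_emoji": has_emoji,                 # II: was an emoji used
        "received_at": received_at.isoformat(), # III: time received
    }


# Example message and timestamp are invented for illustration.
info = describe_message(
    "See you at 5 \U0001F600",
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
print(info)
```

The returned dictionary describes the message without storing its content, which is what distinguishes metadata from the data it describes.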
A popular restaurant collects data on the food their patrons are ordering. They hope that this will allow them to be better informed about what items they need to order in preparation for the next week. What would be the best way for the restaurant to collect this data with that end goal in mind?
Record the major meat, vegetable and fruit components of the meals each customer ordered (steak - corn - apples, fish - peas - peaches, etc).
Which of the following is NOT a benefit of making digital information and scientific databases openly available across the internet?
Inaccurate and misleading data can be more easily disseminated to scientific researchers.
Which of the following terms describes the conversion of data, formatted for human use, to a format that can be more easily used by automated computer processes?
Screen Scraping
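As a rough illustration of screen scraping, the sketch below pulls the cell text out of an HTML table so a program can work with it directly. The page snippet and its values are invented for the example, and `CellCollector` is a hypothetical class name; only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # start a new row
        elif tag == "td":
            if not self.rows:             # tolerate a cell with no <tr>
                self.rows.append([])
            self.rows[-1].append("")      # start a new, empty cell
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            self.rows[-1][-1] = self.rows[-1][-1].strip()
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data     # accumulate text inside the cell


# Invented page fragment formatted for human readers.
page = (
    "<table>"
    "<tr><td>Temperature</td><td>72 F</td></tr>"
    "<tr><td>Wind speed</td><td>8 mph</td></tr>"
    "</table>"
)

parser = CellCollector()
parser.feed(page)
print(parser.rows)  # [['Temperature', '72 F'], ['Wind speed', '8 mph']]
```

The output is a plain list of rows that an automated process can use, with the presentation markup stripped away.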
A recent computer science graduate is looking to design a computer software tool that helps with big data analysis. Which of the following features would be useful to include in their program?
I - A sort tool that can organize the data in numerical or alphabetical order
II - A search tool that helps the user to quickly locate specific information from the data
III - A graphing tool that will create bar graphs and scatter plots
I, II and III
An infographic displays the relative frequencies of the 100 most common emojis used in text messaging for each of the last 12 months. Which of the following conclusions cannot be drawn from such a representation of emoji usage?
You can determine the average age of emoji users based on emoji use
Which of the following tasks best shows an example where the searching and sorting techniques of big data may be involved? There are TWO correct answers.
1. Creating a seating chart for a classroom based on an alphabetized list of student names
2. Keeping track of all employees' email use to see how many personal or work-related emails are sent during work time to check for productivity
A local high school recently won the girls' volleyball championship and has been rewarded with $5,000 to purchase merchandise for the team. The coach is trying to surprise the players with the merchandise, so instead of asking them what sizes they want, he will attempt to figure out how many items of each size need to be ordered based on the information from the volleyball program. The following information is stored in the program for each player:
- Name
- Age
- Grade
- Height
- Weight
- Jersey number
- Position
What would be the best way for the coach to use this information in order to order sizes that work for the majority of the team?
Sort the data by height and weight and order smaller sizes for the girls that are shorter and weigh less and order larger sizes for the girls that are taller and weigh more.
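Sorting on two fields at once can be sketched with a tuple key. The roster below is invented, and the field names `height_in` and `weight_lb` are assumptions; the records are ordered by height first, with weight used to break ties.

```python
# Invented roster; names and measurements are placeholders.
players = [
    {"name": "Player A", "height_in": 70, "weight_lb": 150},
    {"name": "Player B", "height_in": 64, "weight_lb": 120},
    {"name": "Player C", "height_in": 67, "weight_lb": 135},
    {"name": "Player D", "height_in": 64, "weight_lb": 110},
]

# Tuples compare element by element, so height is the primary key
# and weight decides the order among players of equal height.
by_size = sorted(players, key=lambda p: (p["height_in"], p["weight_lb"]))

order = [p["name"] for p in by_size]
print(order)  # ['Player D', 'Player B', 'Player C', 'Player A']
```

With the roster ordered from smallest to largest, the coach can assign smaller sizes to the front of the list and larger sizes to the back.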
Google has access to a lot of data. One way to make use of the data collected by Google is to examine the relative popularity of search terms using the Google Trends feature. This tool allows users to identify trends across geographies and time, and within categories like Real Estate, Sports, Shopping, Pets & Animals, Books & Literature and Arts & Entertainment. Google Trends would be most helpful for determining which of the following?
Which week is the best week to send advertisements to parents who want Fidget Spinners or other popular toys for their children around the holidays.