ITM Final
Natural Language Processing NLP uses
- Artificial Intelligence AI - Machine Learning ML - Language Processing LP - Deep Learning DL
Which of the following are all the same value in a normal distribution?
- Median - Mean - Mode
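A minimal sketch of the point above, assuming a small made-up sample that is perfectly symmetric around its center (real data only approximates a normal distribution):

```python
import statistics

# Hypothetical sample, symmetric around 3 like a tiny bell curve.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5]

mean = statistics.mean(sample)      # arithmetic average
median = statistics.median(sample)  # middle value when sorted
mode = statistics.mode(sample)      # most frequent value

print(mean, median, mode)
```

In a symmetric, single-peaked distribution all three measures land on the same value.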
When using colors to visually distinguish between dimensions in a Tableau visualization, which of the following uses of color is most commonly applied?
Categorical
Color and formatting should be used in Tableau to:
Draw attention to relevant data
Tree maps and heat maps use which of the following to show proportional size of values?
Size and Color
Which of the following applies to many-to-many relationships but not to one-to-many relationships?
You need a third table to create the relationship
Deep Learning DL - Current Uses
- Autonomous vehicles - Computer vision - Language recognition - Translations - Generating image captions - Adding color to black and white photos
Five common characteristics of quality data
- Complete - Accurate - Consistent - Unique - Timely
Machine Learning ML - DATA
- Data is gathered through human inputs - The ML has access to either view or view and edit the database - Quality data is required for the ML to be accurate - The confidence score is critical for the ML component to work properly - Machine Learning (ML) works best with a large data set
Which of the following are properties of primary keys?
- Each tuple must have a unique primary key - Several distinct attributes could be used together to form a primary key - It is the candidate key that is chosen as the principal means of identifying tuples within a relation
The three factors of the variety of data are:
- Form - Function - Source
All organizations need to understand and govern PII through which of the following?
- Identifying all sources of created, received, maintained, or transmitted PII - Evaluating all external sources of PII - Identifying all human, natural, and environmental threats to PII
Which of the following examples could cause a butterfly effect in an organization's data?
- Inaccurate customer records - Incomplete purchasing history - A cascading spelling mistake
Which of the following is true about a view:
- It can be used within a database to store table relationships for users to access - It conceptually contains the results of a query - If the underlying data in the tables and relations change, so will the results of the query
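A minimal sketch of the view behavior described above, using Python's built-in `sqlite3` module. The `students` table, the `honorRoll` view, and the sample rows are hypothetical names chosen for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (studentID INTEGER PRIMARY KEY, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Ana", 3.6), (2, "Ben", 2.8)],
)

# The view stores the query, not a copy of the rows.
conn.execute("CREATE VIEW honorRoll AS SELECT name FROM students WHERE gpa >= 3.0")

before = conn.execute("SELECT name FROM honorRoll ORDER BY name").fetchall()

# Change the underlying table; the view is not touched directly.
conn.execute("UPDATE students SET gpa = 3.4 WHERE studentID = 2")

after = conn.execute("SELECT name FROM honorRoll ORDER BY name").fetchall()
print(before, after)
```

Because the view is re-evaluated against the underlying table each time it is queried, the second result includes the updated row.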
Natural Language Understanding NLU uses
- Knowledge Discovery (What are the implications?) - Forensics and Sociolinguistics (Who said it, and why?) - Computational Linguistics (How is it said?) - Information Extraction (What is being said?)
Which of the following are technology challenges for big data?
- Managing huge volumes of data - Managing streams at an extremely fast and variable pace - Managing a variety of forms and functions of data - Processing data at a huge speed
Machine Learning ML
- Must be used with Artificial Intelligence (AI) - Allows the AI to get more intelligent over time - Reduces the time needed to train an AI model - Can increase the size of the database over time - Increases accuracy with little to no human involvement
Deep Learning DL - Future Uses
- Predicting earthquakes - Data augmentation - Brain cancer detection - Stock market predictions - Crime pattern analysis
NLP / NLU
- Required for speech recognition - Required for transcription - Often required for translation - Allows for natural conversations - Required anytime "emotion" is expressed by the AI - Used in both text and audio conversations cases
Which of the following is explained as the reason humans retain comparative advantage over artificial intelligence when addressing uncertainty and equivocality in decision making?
- Superior intuition - Imagination - Creativity
Machine Learning ML - Human Training
- The ML organizes the data and attempts to classify it - Humans must train that data for any changes to be made to the database - The goal of Machine Learning (ML) is not to eliminate the need of human training, but to reduce the time it takes to train large data sets.
Neural Networks NN
- Utilizes AI, ML, and often DL - Can utilize NLP and/or NLU - Models the human brain through the use of artificial neurons - Often requires large computing power (sometimes quantum computing) - Most complex use of Artificial Intelligence - Neural networks are intentionally built to be highly dynamic
Which of the following may be indicators of big data?
- Velocity - Veracity - Variety - Volume
Natural Language Understanding (NLU) asks which of the following questions in determining the context of the input provided?
- What are the implications? - How is it said? - Who said it, and why? - What is being said?
When considering the colors to use in visualization, which of the following should be considered?
- Whether the color adds value - The manner in which the color schemes may be interpreted - The accessibility / readability
The options for order by when writing a SQL statement are:
- asc - desc
Deep Learning DL
- is the most specific form of AI - is a subcategory of Machine Learning - allows us to model the human brain - allows us to teach the AI context of the situation - is difficult to use as it requires large computing power (quantum computers in some cases)
The three basic clauses of a SQL statement to select data are:
- select - from - where
Below are the clauses for a SQL statement. Put the SQL clauses in the order they should appear
1- Select 2- From 3- joins 4- where 5- group by 6- having 7- order by 8- ;
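The clause order above can be sketched in one query. This example runs against an in-memory SQLite database through Python's `sqlite3` module; the `customers` and `invoices` tables and their rows are assumptions made up for the illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customerID INTEGER PRIMARY KEY, state TEXT);
    CREATE TABLE invoices (invoiceID INTEGER PRIMARY KEY, customerID INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'MI'), (2, 'OH'), (3, 'IN');
    INSERT INTO invoices VALUES (1, 1, 80), (2, 1, 70), (3, 2, 120), (4, 3, 50), (5, 3, -10);
""")

# Clauses appear in the order: select, from, join, where, group by, having, order by, ;
query = """
    SELECT customers.state, SUM(invoices.amount) AS total
    FROM invoices
    JOIN customers ON invoices.customerID = customers.customerID
    WHERE invoices.amount > 0
    GROUP BY customers.state
    HAVING SUM(invoices.amount) > 100
    ORDER BY total DESC;
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The where clause filters individual rows before grouping, and the having clause filters the groups afterward.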
By what year does Ray Kurzweil predict that machines will be able to achieve the intelligence of human beings?
2029
Which of the following does not describe unstructured data?
A defined length, type, and format
What is a data lake?
A storage repository that holds a vast amount of raw data in its original format until the business needs it
Why might some use the left join above versus an inner join?
A student may not have a major declared yet
Joe is doing an analysis of his investment portfolio. His data contains variables that change due to factors outside the data-generating process and are independent of all other variables in the data. Which of the following predictive analytics uses describes the type of prediction he is doing?
Active Prediction
You created a scatterplot in Tableau that contains plotted data showing the number of class periods attended for a course vs the grade assigned for students. You are trying to see if there is a positive relationship between the two. Which feature / function will best aid you in this?
Adding trend lines to the scatterplot
Which of the following is a true statement about confidence scores?
Confidence scores are set for an Artificial Intelligence AI using the necessary confidence level that the AI output is the valid output to provide based upon its analysis.
The assurance that messages and information remain only to those authorized to view them
Confidentiality
Which of the following refers to artificial intelligence understanding the context of the input provided?
Connotation
Joe is working with Accounts Receivable data. [...] The database may be suffering from integrity issues due to which of the following quality characteristics?
Consistency
When loading data into Tableau, which of the following types of data usually appears under measures?
Continuous
The ZN function in Tableau can be used to do which of the following?
Convert null values for a field in a data set to a value of 0
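As a rough Python analogue of what ZN does (this is not Tableau code; the `zn` helper and the sample list are hypothetical), null values are replaced with 0 and everything else passes through unchanged:

```python
def zn(value):
    """Return 0 when the value is null (None), otherwise the value itself."""
    return 0 if value is None else value

# Hypothetical field with missing entries.
sales = [120, None, 45, None]
cleaned = [zn(v) for v in sales]
print(cleaned)
```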
Which aggregation function shows the number of records that meet a set of criteria?
Count
Which of the following would be an example of predictive analytics?
Creating an analysis of the number of cars that passed through a segment of freeway each day of the past two years to attempt to determine
Eric was asked to set up a visualization summarizing data on patients staying at the hospital based around the number of days they have been there. He has a data set that contains information on patients, which includes the date the patient was admitted (field admittedDate). Doing some research, he found that Tableau uses TODAY() to represent the current date. Which of the following calculated fields in Tableau would identify the number of days that patients have been at the hospital?
DATEDIFF('day',[admittedDate], TODAY())
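The same day-difference arithmetic can be sketched in Python (an analogue, not Tableau syntax). The `days_in_hospital` helper and the dates are hypothetical; the `today` parameter exists only so the example gives a repeatable result:

```python
from datetime import date

def days_in_hospital(admitted_date, today=None):
    """Number of whole days between the admission date and today."""
    today = today or date.today()
    return (today - admitted_date).days

# Fixed dates so the result does not depend on when this runs.
stay = days_in_hospital(date(2024, 3, 1), today=date(2024, 3, 15))
print(stay)
```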
Collecting information from many sources and storing them together into a single location is referred to as:
Data Aggregation
Which of the following is the collection of data from various sources for the purpose of data processing?
Data Aggregation
Coastal Operations has been collecting forms submitted by patients, results of their labs, photos, and x-rays of all the patients, among other structured and unstructured data. They have taken all the data and stored it in a central location in its original raw format. The data can then be extracted, cleansed, and utilized by their lab technicians and engineers for analysis of viruses and impacts to patients. The solution that they implemented to centralize all the data in its original form would be referred to as which of the following?
Data Lake
Which of the following is the process of analyzing data to extract information not offered by the raw data alone?
Data Mining
Tools used to find patterns and relationships in large volumes of information that predict future behavior and guide decision making are referred to as:
Data Mining Tools
Which of the following is a type of visualization in which you are presenting findings to an audience?
Declarative visualization
Which of the following fields in a data set would usually be found in the dimensions area in Tableau?
Departments
Bill runs a report of all the sales for the past quarter and puts it into a visualization to show his boss the results. This is an example of what type of analysis?
Descriptive
A summary of interpretation of a data set is an example of:
Descriptive Analytics
Which of the following keywords when used in a SQL select statement will remove duplicate records from the results?
Distinct
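A minimal sketch of the distinct keyword using Python's `sqlite3` module, assuming a hypothetical `enrollments` table with repeated values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollments (major TEXT)")
conn.executemany("INSERT INTO enrollments VALUES (?)", [("CS",), ("CS",), ("Math",)])

# Without distinct, every row comes back, duplicates included.
all_rows = conn.execute("SELECT major FROM enrollments ORDER BY major").fetchall()

# With distinct, duplicate records are removed from the results.
unique_rows = conn.execute("SELECT DISTINCT major FROM enrollments ORDER BY major").fetchall()
print(all_rows, unique_rows)
```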
Jackson has a set of data that lists profits for a particular product based upon each of the states in which the products are sold. Which of the following color schemes would be the best option for him to use for the profits for each state?
Diverging
During which of the following processes does information cleansing usually occur?
ETL Processes
Monique is a contractor with the federal government. She wants to make sure that all messages she sends have confidentiality and authenticity/proof of origin. Which of the following would she use to send her messages?
Encrypt the messages twice using asymmetric encryption, once with her private key and then again with the recipient's public key
Companies use data warehouses for each of the following except:
Enter and process invoices real-time as they are received
The principles and standards that guide our behavior toward other people
Ethics
Which of the following can be described as using if-then statements to capture human knowledge?
Expert systems
A data set is a collection of organized or unorganized data
False
A diverging color scheme would be one in which shades of a single color diverge
False
Box and Whisker plots are used for identifying correlation between two variables
False
Contemporary database systems provide a three-level hierarchy for naming relations. The top level of the hierarchy consists of schemas, each of which contain catalogs
False
Danielle sent a message to Bert using asymmetric encryption. The key used to encrypt the file is Bert's public key. Because his public key was used, Bert is able to validate that the file came from Danielle (Proof of origin)
False
Data Models show the details of the physical view information for a database
False
Data within a view is a duplicate copy of the data that is in the underlying tables related to the view
False
Deep learning is an alternative to machine learning once natural language understanding has been implemented for an artificial intelligence
False
Discrete data can take on any value within a range
False
IBM's Watson can only analyze structured data
False
If you are sending a message to a friend and you want to ensure confidentiality, you would encrypt with your public key and they would use your private key to decrypt the message
False
In most organizations, the managers in the operational areas would be more interested in less granular information, whereas the executive officers of the organization would be requesting more granular information.
False
Intuitive approaches to decision making rely on depth of information, analytical approaches focus on breadth by engaging a problem with a holistic and abstract view
False
Natural Language Processing may use a combination of artificial intelligence, iterative digital learning and deep learning
False
PKE (Public key encryption) uses a single common key between the sender and recipient of a message to encrypt and decrypt the message
False
The intersect operation does not remove duplicates. To remove duplicates, intersect all must be utilized
False
The only cause of poor quality of data is human error
False
The problem solving ability of AI is more useful for supporting intuitive rather than analytical decision making
False
The technique of organizing data into distinct segments that are defined before the analysis begins is referred to as cluster analysis
False
The validation set of training data for an advanced neural network is used only to test the final solution in order to confirm the actual predictive power of the network
False
Unstructured data extracts information from data and uses it to predict future trends and identify behavioral patterns
False
You should never share your public key with anyone
False
In a SQL query, if you were averaging grade points from a table of students' grades for each of the classes they took (field studentClass.grade) and you wanted to list those equal to or above 3.0, you would utilize the following line in your query: where avg(studentClass.grade) >= 3.0
False
What would be the output from a query if the following wildcard pattern were used? select locations.cityName from schema.locations where locations.cityName like '%or%';
Finds any cities that have "or" in any position
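A minimal sketch of this wildcard pattern, run against an in-memory SQLite database with Python's `sqlite3` module. The city names are made up, and the `schema.` prefix is dropped because the sample database has no such schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locations (cityName TEXT)")
conn.executemany(
    "INSERT INTO locations VALUES (?)",
    [("Portland",), ("New York",), ("Norfolk",), ("Boston",), ("Denver",)],
)

# % matches any run of characters, so '%or%' finds "or" in any position.
matches = conn.execute(
    "SELECT locations.cityName FROM locations "
    "WHERE locations.cityName LIKE '%or%' ORDER BY locations.cityName"
).fetchall()
print(matches)
```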
Which of the following describes Anscombe's Quartet?
Four data sets with nearly identical descriptive statistics (slope, average, median, etc.) yet have very different distributions and appear very different when graphed
Which of the following would violate a foreign-key constraint?
Having a value in the attribute for a foreign key that does not correspond to a value in the table which the foreign key is coming from
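A minimal sketch of a foreign-key violation using Python's `sqlite3` module. The `customers` and `invoices` tables are hypothetical, and SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute("CREATE TABLE customers (customerID INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE invoices ("
    "invoiceID INTEGER PRIMARY KEY, "
    "customerID INTEGER REFERENCES customers(customerID))"
)
conn.execute("INSERT INTO customers VALUES (1, 'Acme')")
conn.execute("INSERT INTO invoices VALUES (1, 1)")  # customer 1 exists, so this is fine

try:
    # Customer 99 does not exist in the referenced table.
    conn.execute("INSERT INTO invoices VALUES (2, 99)")
    violation = False
except sqlite3.IntegrityError:
    violation = True

print(violation)
```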
HIPAA is a regulation that applies to which industry?
Healthcare
You received a visualization from a colleague that contains a white grid on a black background. At the intersections of the white grid, grey blobs appear as an optical illusion. However, they disappear when you focus on the intersection. This is an example of which of the following?
Hermann Effect
Which of the following are better at making decisions when there is uncertainty?
Humans
Scott has data that contains a field showing the high temperature in °F in his town by day. He wants to be able to show the temperature for each day in one of two categories: a) Normal < 80 b) Above Normal >= 80. What would be the best way for him to accomplish this in Tableau?
IF [temperature] < 80 THEN "Normal" ELSE "Above Normal" END
Governance of the ethical and moral issues arising from the development and use of information technologies as well as the creation, collection, duplication, distribution and processing of information
Information Ethics
Which of the following is decreased when using a relational database?
Information Redundancy
Which of the following refers to the measure of the quality of information?
Information integrity
Which of the following is NOT a component of Artificial Intelligence?
Intuition Engine
A digital certificate:
Is a data file that identifies individuals or organizations online
Artificial Intelligence AI
Is a technology, not a specific software or hardware. It requires software and a database to communicate with it, and hardware to run on
Jill is creating a visualization in Tableau that is plotting points on a map. She decides to use the size mark in her visualization. What does this accomplish?
It differentiates the points based upon the values of the measures used by making larger values visually bigger points
What is the role of a foreign key?
It is an attribute that is the primary key of one table that appears as an attribute in another table. It acts to provide a logical relationship between the two tables
Which of the following describes a full outer join?
It preserves tuples in both relations
Which of the following charts are good for showing data changes over time?
Line Chart
Carli is creating a system using artificial intelligence. She has given it a large dataset and performed a lot of training for the artificial intelligence during the creation of the system, but she would like to reduce the time going forward for on-going training of the artificial intelligence. Which of the following should she implement to help her?
Machine Learning
Which of the following describes how machine learning aids in the process of training an artificial intelligence?
Machine learning takes inputs for which it could not provide a suitable output and classifies them into categories. Those categories can then be assigned an output by a human
Which of the following is a challenge with symmetric encryption?
Management and distribution of keys
Eric created a visualization with a lot of parallel bars, each of the same color. A pattern in the bars emerges making the visualization hard to see due to the visual noise created by the effect of the bars. Which of the following describes this scenario?
Moire Effect
The most complex form of artificial intelligence that is designed to be very dynamic is:
Neural Networks
The type of qualitative data that cannot be ranked, but can be used to count, group, and take proportion is:
Nominal
Which of the following is the first line of defense in securing information?
People
What does PII stand for?
Personally Identifiable Information
Joey is creating a model based upon past stock trading information. The purpose is to indicate to management of the best stock derivative arrangements and when to enter into them. This would be an example of what type of analysis?
Prescriptive
What uses techniques that create models indicating the best decision to make or course of action to take?
Prescriptive Analytics
Which of the following is used to uniquely identify a row or tuple in a table?
Primary Key
The right to be left alone when you want to be
Privacy
Nominal Data and Ordinal Data both are types of:
Qualitative Data
What feature of tableau would you utilize to label the percent of total that a slice of a pie chart makes up?
Quick Table Calculation
When using diverging colors on a diagram, which of the following is one of the least desirable color schemes when considering the ability for those with color blindness to be able to effectively read / use the visualization?
Red-Green Diverging
Which of the following is referred to as the use of social skills to trick people into revealing access credentials or other valuable information?
Social Engineering
If Joey wanted to encrypt a very large file (2 terabytes), which would be the best option for him to use?
Symmetric encryption
The pattern of reading that was originally based upon eye tracking behavior on websites but is applied to visualizations in general when determining the best layout for a dashboard is referred to as:
The F Pattern
Artificial Neural Networks are designed after which of the following?
The human brain
Size, Color, Label, and Detail are all examples of Tableau features that are found where?
The marks card
Pattern Discovery Is:
The process of identifying distinctive relationships between observations in a data set
When artificial neural networks are referred to as black boxes, which of the following is being referred to?
They provide little guidance on the intuitive logic behind their predictions
A null value means:
The value is unknown or does not exist
Early systems of AI used deterministic hard-coded logic. Which of the following describes why this method of creating AI became tenuous?
The world's store of information kept growing
Information itself has no ethics. Therefore who is responsible for developing ethical guidelines about how to manage it?
Those who own the information
When creating a histogram, what is the purpose of using the create bins feature within Tableau?
To group together bands of values into buckets for measures that represent continuous data
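The idea of binning can be sketched in Python (an illustration of the concept, not Tableau's implementation). The ages and the bin size of 10 are assumptions:

```python
from collections import Counter

# Hypothetical continuous measure.
ages = [3, 7, 12, 18, 21, 25, 38]
bin_size = 10

# Each value is assigned to the lower edge of its band: 0-9 -> 0, 10-19 -> 10, ...
bins = Counter((age // bin_size) * bin_size for age in ages)
histogram = dict(bins)
print(histogram)
```

Each key is a bucket and each count is the height of the corresponding histogram bar.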
What is the where clause in a SQL statement used for?
To select only those rows in the result relation of the from clause to satisfy a specified predicate
Which of the following sets of data are used in machine learning to adjust the weights on the neural network?
Training Set
What chart type would best show the hierarchical nature of data
Tree Map
A person can act legally but not be acting ethically
True
A schema diagram is a pictorial depiction of the schema of a database that shows the relations in the database, their attributes, and primary keys and foreign keys.
True
Big data is growing at an exponential rate
True
Companies that are using analytics to automate processes in the business are gaining through employees having more time to work on higher-value-added tasks
True
Deep Learning is a subset of Machine Learning
True
Dumpster diving is a method of obtaining information from users by going through discarded items
True
Human-AI symbiosis is effective because it allows for a blend of both analytic and intuitive approaches to decision making.
True
In a SQL statement, union is used to join two queries together
True
In a full outer join, all the records from the right and left tables that meet the criteria of the query will appear. This would include records from each table where there are no related records (tuples) in the other table
True
One example of continuous data is distance
True
One example of continuous data is height
True
Organizations may have inconsistent data definitions between their production systems / databases. This may be a reason for the organization to utilize a data warehouse
True
Qualitative data is categorical data
True
Tableau allows for connections to live data in a database for purposes of having dashboards that can be refreshed periodically at a predetermined frequency
True
Text in a novel is an example of unstructured data
True
The primary role of artificial intelligence is to interpret and analyze input to return an output
True
The select clause of the statement is used to list the attributes desired in the result of a query
True
The use of the and logical connective is to find tuples that meet two or more criteria
True
When utilizing an AI that conducts a conversation, either written or verbal, there would be a natural language processor (NLP) involved
True
With enough training through machine learning, a neural network can learn enough to begin to match the predictive accuracy of a human expert
True
In order to perform any actions on a database, a user must first connect to a database
True
Using the as statement in the select clause for a query will label the column or attribute's header in the results with the specified text. For example: select people.personName as 'Name' would return 'Name' as the column header rather than personName
True
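A minimal sketch of the renaming effect, using Python's `sqlite3` module and a hypothetical `people` table. The alias is written with double quotes here, which is the standard SQL way to quote an identifier:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (personName TEXT)")
conn.execute("INSERT INTO people VALUES ('Jill')")

cur = conn.execute('SELECT people.personName AS "Name" FROM people')

# cursor.description holds one entry per result column; item 0 is the header.
header = cur.description[0][0]
rows = cur.fetchall()
print(header, rows)
```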
Which of the following describes the veracity characteristic of big data?
Uncertainty and or untrustworthiness of data
Jeff has been collecting electronic memos written within his department for the past year. He is storing them in a data lake. What type of data has he collected?
Unstructured
In a dashboard, you would usually place the visualization that requires the most emphasis in which quadrant?
Upper Left
Zoey is sending a large file to Jade containing critical information for an actuarial analysis. Zoey wants to make sure that only Jade can read the file (Confidentiality) and that Jade can validate the integrity of the file (Complete and accurate). Which of the following would she use to send the file?
Use symmetric encryption to encrypt the file. Use asymmetric encryption to encrypt the symmetric key and hash digest of the file with Jade's public key
Donovan is creating a chart that utilizes a map. He wants to have the map show the borders of the different counties within each state. Where would he go to enable this on the map?
Use the Map Styles Menu Option
Which describes prescriptive analytics?
Uses techniques that create models indicating the best decision to make or course of action to take
Which of the following is not one of the five common characteristics of quality data?
Valid
Which of the following would be a reason to utilize a one-to-one relationship?
When you have attributes about tuples (records) for which not every tuple may have information. For example, if you have information about people and did not record physical characteristics such as height for every person, you may create a 1-1 relationship
When should you use multiple colors?
When you need to differentiate types of data
Which of the following charts is described in the chapter as functioning well for showing proportions (vs quantitative data)?
Word Clouds
Which type of encryption requires the use of two keys?
asymmetric encryption
If the following join statement is used to join two tables in a query, which of the following tables would all of the tuples in the relation appear in the results? full outer join schema.customers on invoices.customerID = customers.customerID
both customers and invoices
The as clause:
can be used to rename attributes in the results of the query
if the following join statement is used to join two tables in a query, which of the following tables would all of the tuples in the relation appear in the results? right outer join schema.customers on invoices.customerID = customers.customerID
customers
Which of the following describes the phenomenon where there is an incentive to record everything?
datafication
AI Moves from
denotation to connotation
Matt created a hash digest of a message he is sending. He encrypts the digest with his private key and encrypts the message using the recipient's public key. He used which of the following as part of this?
digital signature
The purpose of integrity constraints is to:
ensure that changes made to the database do not result in a loss of data consistency.
Regression Models are used to:
estimate the relationships among variables
Jacyln is writing a query that sums the profit by business line for her organization. For the sum to be performed against each business line, she would use which of the following to specify the level of aggregation that sum should be executed against? Note: The field for business line is table.businessLine and the field for profit is table.profit
group by table.businessLine
Mel is the owner of a prominent campground in Michigan. Which of the following is the correct clause to filter for that criteria?
having sum(reservations.nights) > 10
Primary Role of AI
interpret, analyze and respond to data
Which of the following is the human capacity to analyze alternatives with deep perception, transcending ordinary-level functioning based on simple rational thinking?
intuitive intelligence
if the following join statement is used to join two tables in a query, which of the following tables would all of the tuples in the relation appear in the results? Left outer join schema.customers on invoices.customerID = customers.customerID
invoices
Which of the following would not be considered part of the accurate characteristic of high quality information?
is aggregate information in agreement with detailed information?
Which of the following is used in a SQL statement where clause to show all records where a particular attribute has null values
is null
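A minimal sketch of `is null` using Python's `sqlite3` module, assuming a hypothetical `students` table where one student has no major recorded:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Ana", "Finance"), ("Ben", None)],  # None is stored as NULL
)

# is null finds rows where the attribute has no value.
undeclared = conn.execute("SELECT name FROM students WHERE major IS NULL").fetchall()

# Comparing with = NULL never matches, because NULL is unknown rather than a value.
with_equals = conn.execute("SELECT name FROM students WHERE major = NULL").fetchall()
print(undeclared, with_equals)
```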
When creating a view, the data that is returned from querying the view:
is stored in the tables that view queries
Encryption:
is used to scramble information into an alternative form that requires a key to read it.
What is needed to train a neural network?
large amounts of data
SELECT games.title, games.releaseDate, games.platform, highScores.score, highScores.player FROM schema.games WHERE highScores.score > 100000 He wants his query to pull all the games, even if there are no high scores listed in the table highScores. What would be the join statement below that Robert would utilize?
left join schema.highScores on highScores.gameID = games.gameID
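A minimal sketch of why a left join keeps every game, using Python's `sqlite3` module. The table layouts and rows are assumptions for illustration, and the `schema.` prefix is omitted because the sample database has no such schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE games (gameID INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE highScores (gameID INTEGER, player TEXT, score INTEGER);
    INSERT INTO games VALUES (1, 'Asteroids'), (2, 'Pong');
    INSERT INTO highScores VALUES (1, 'Robert', 150000);
""")

# Left join: every row from games appears, with NULL where no score exists.
left_rows = conn.execute(
    "SELECT games.title, highScores.score FROM games "
    "LEFT JOIN highScores ON highScores.gameID = games.gameID "
    "ORDER BY games.title"
).fetchall()

# Inner join: only games that have a matching high score appear.
inner_rows = conn.execute(
    "SELECT games.title, highScores.score FROM games "
    "JOIN highScores ON highScores.gameID = games.gameID "
    "ORDER BY games.title"
).fetchall()
print(left_rows, inner_rows)
```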
Details about the data is referred to as:
metadata
The integrity constraint that requires that an attribute in a tuple not be blank is:
not null
AI must have
parameters defined that control and limit the actions of the AI
the operator like in a SQL statement is used for:
pattern matching
Which of the following describes the basic premise of how artificial neural networks work?
receive inputs, process the inputs, provide an output
The concept that a value that appears in one relation for a given set of attributes must appear for a set of attributes in another relation is:
referential integrity
What should be your focus when designing your visualization?
the audience
What is supply chain management SCM?
the management of information flows between and among activities in a supply chain to maximize total supply chain effectiveness and corporate profitability
Which of the following is characterized as a lack of information about all alternatives or their consequences?
uncertainty
The integrity constraint that requires that no two tuples can have the same value for an attribute is:
unique
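A minimal sketch of the unique and not null integrity constraints, using Python's `sqlite3` module. The `people` table, the `email` attribute, and the `try_insert` helper are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (personID INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)"
)
conn.execute("INSERT INTO people (email) VALUES ('a@example.com')")

def try_insert(email):
    """Attempt an insert and report either 'ok' or the constraint error text."""
    try:
        conn.execute("INSERT INTO people (email) VALUES (?)", (email,))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)

duplicate_result = try_insert("a@example.com")  # same value twice breaks unique
missing_result = try_insert(None)               # blank value breaks not null
new_result = try_insert("b@example.com")        # satisfies both constraints
print(duplicate_result, missing_result, new_result)
```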
Big data is mostly (over 90 percent):
unstructured data
AI Moves from
using literal language to figurative language
Which of the following describes the speed of data?
velocity
When is having used instead of where?
When groups are present through the use of an aggregate function (such as avg, count, etc.) and conditions need to be applied to the groups
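A minimal sketch of having versus where, using Python's `sqlite3` module. The `studentClass` table mirrors the field name used elsewhere in this set, but its layout and rows are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE studentClass (studentID INTEGER, grade REAL)")
conn.executemany(
    "INSERT INTO studentClass VALUES (?, ?)",
    [(1, 4.0), (1, 3.0), (2, 2.0), (2, 3.0), (3, 3.0), (3, 3.0)],
)

# having applies the condition to each group after avg() is computed.
rows = conn.execute(
    "SELECT studentID, AVG(grade) FROM studentClass "
    "GROUP BY studentID HAVING AVG(grade) >= 3.0 ORDER BY studentID"
).fetchall()

# Putting the aggregate in where is rejected, since where runs on single rows.
try:
    conn.execute("SELECT studentID FROM studentClass WHERE AVG(grade) >= 3.0")
    where_failed = False
except sqlite3.OperationalError:
    where_failed = True

print(rows, where_failed)
```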