ITM final exam questions


measures and dimensions of cross tabs/text tables

1 or more dimensions 1 or more measures

Jill is creating a visualization in tableau that is plotting points on a map. She decides to use the 'size' mark in her visualization. what does this accomplish? A.) it differentiates the points based upon the values of the measures used by making larger values visually bigger points B.) it differentiates the points based upon the values of the dimensions used by making larger values visually bigger points C.) it allows her to adjust the size of the map that she is plotting to D.) it allows her to change whether she is summing or averaging the data used

A

Name the color scheme used in the following visualization: United States Map; Different shades of Brown, the darker meaning the higher percentile A. Sequential B. Diverging C. Categorical D. Exponential

A

In a right outer join between two tables in a query, which table would show all the tuples (records) that match the query criteria even if there were no related records in the other table? Example: select table1.field1, table2.field2 from schema.table1 right join schema.table2 on table2.field3 = table1.field3 A.) the table in the right outer join clause (ex. table2) B.) the table in the from clause (ex. table1) C.) the table with the first field in the select clause (ex. table1) D.) none of the above

A

look at screenshot 4 for tables A.) inner join pharmacy.prescriptions on prescriptions.patientID = patients.patientID B.) inner join pharmacy.patients on prescriptions.patientID = patients.patientID C.) inner join pharmacy.patients on patients.patientID = prescriptions.patientID D.) inner join pharmacy.prescriptions on prescriptions.rxNumber = patients.rxNumber E.) inner join pharmacy.prescriptions on prescriptions.rxNumber = patients.patientID

A

what chart type would be the best to show the hierarchical nature of data (i.e. how sub-components build up to their parent components)? A.) tree map B.) bar chart C.) scatterplot D.) bubble chart

A

Anscombe's Quartet

Four datasets that have very similar descriptive statistics (mean, variance, ... ), yet appear very different when graphed. Therefore, data visualization is a must-do task! Found in Lecture D Part 2 - Data Science

what does from do

From clause identifies the table from which to pull the data

dimensions

Tableau treats any field containing qualitative, categorical information as a dimension. This includes any field with text or dates values. Dimensions contain qualitative values such as names, dates, or geographical data

data dictionary

compiles all of the metadata about the data elements in the data model

Percent of total

computes a value as a percentage of all values within the table structure

Strengths of Asymmetric Cryptography

confidentiality, access control, authenticity and non-repudiation, integrity, no key distribution problem

Metadata

data that describes other data; details about data. Examples: size of number, number of colors, or resolution (how clear the image is)

Predictive Analytics

extracts information from data and uses it to predict future trends and identify behavioral patterns. Examples: predicting the spread of contagious diseases like Covid-19, or the probability of a person being affected by the disease. Weather - forecasting the temperature, rainfall, and cyclones. Finance - predicting fraudulent transactions and assessing risk in giving loans.

advantages of symmetric encryption

fast; don't need to know anything about a second key

data aggregation

in its simplest form, data aggregation is the process of compiling typically [large] amounts of information from a given database and organizing it into a more consumable and comprehensive medium.

data model

logical data structures that detail the relationships among data elements using graphics or pictures

Moire effect

occurs when viewing a set of lines or dots that is superimposed on another set of lines or dots, where the sets differ in relative size, angle, or spacing. This is seen when looking through window screens at another screen or background.

explain a one to one relationship (1:1)

One record relates to only one record on the other side (one attribute connected to one attribute). Example: physical attributes (height, weight, eye color), where one person can only relate to one row of physical attributes. Both tables share the same primary key. This is the least common relationship to find.

-

represents a range of characters C[a-b]t finds cat and cbt

_

represents a single character H_t finds hot, hat, and hit

^

represents any character not in the brackets h[^oa]t finds hit, but not hot and hat

[ ]

represents any single character within the brackets H[oa]t finds hot and hat, but not hit

%

represents zero or more characters bl% finds bl, black, blue, and blob

symbol map

uses dots or symbols on a map encoding data with position. Size and color can also be used to encode other data

diverging bar

uses height or length from a baseline with the bars diverging from the center. Provides a precise quantitative comparison for each diverging segment. Diverging stacked bar charts are great for showing the spread of negative and positive values, such as Strongly Disagree to Strongly Agree (without a Neutral category), because the bars align to each other around the midpoint.

what are the characteristics of big data

velocity, veracity, volume, and variety

dimensions and measures of line chart - discrete

1 date 0 or more dimensions 1 or more measures

Unlike continuous date values, discrete date values can I. Be sorted II. Not be sorted III. Include month or year IV. Include month and year A. I and IV only B. I and III only C. II and IV only D. II and III only

B

Which function would be used in tableau to show a line that represents the relationships between a set of data points that have been plotted (i.e. regression) A.) slope B.) trend lines C.) line chart D.) calculated fields

B

refer to screenshot 1 for question

C

Which of the following would be an example of predictive analytics? A.) creating a map of subway systems in a city showing the most used routes B.) creating a chart of sales by product for the prior year to identify which is the highest seller C.) creating a scatterplot of the batting averages for all outfielders in Major League Baseball D.) creating an analysis of the sales of a product for the prior year to identify what sales may be in the upcoming year

D

true or false - based upon the manner the tables are designed, a traveler could fly from the same departing airport to the same arriving airport multiple times on the same date

False

true or false - the additional table used to create a many to many relationship will have a primary key that only consists of a single unique attribute in all cases

False

true or false - when creating relationships between tables, foreign keys are always optional. Only primary keys are needed in each table

False

WHERE CustomerName LIKE '%a'

Finds any values that end with "a"

WHERE CustomerName LIKE '%or%'

Finds any values that have "or" in any position

best practices with pie charts

Use angle, arc, and area to show comparison.

Percent difference

calculates the percent difference from the previous column, across a table

Bins are created by taking a ____________________ and transforming it into a discrete dimension. Data Continuous measure Dimension Discrete

continuous measure

Categorical Color Scheme

contrasting colors for individual comparison Categorical colors help users map non-numeric meaning to objects in a visualization. These are designed to be visually distinct from one another. The Spectrum categorical 6-color palette has been optimized to be distinguishable for users with color vision deficiencies. bar chart = completely different colors (red, blue,yellow,purple,pink,green)

highlight table

encodes a data table using color to highlight the differences in the table numbers

pie chart

encodes data using angle, area, and arc to show a part-to-whole comparison

stacked bar chart

encodes data using height or length of bar and color by segment and shows categorical and part-to-whole comparisons. Use height or length from a common baseline with another bar plotted on top

line chart

encodes data using position and is a good chart to show trends over time. It is best to keep the time series on the x-axis and having the oldest time period on the left going to the next time period on the right

Tree Map

encodes data using size and color and is useful for hierarchical data or when there are a very large number of categories to compare. Uses a series of rectangles nested within each other to show data as a proportion to the whole.

bubble chart

encodes data using size of circle to show comparisons which is difficult for making precise quantitative comparison

asymmetric encryption - proof of origin

encrypt with private key of originator decrypt with public key of originator
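
The proof-of-origin flow above can be sketched with textbook RSA arithmetic. This is only a toy illustration: the tiny primes, the exponents, and the variable names are assumptions chosen for readability, and real systems use large keys with padding and hashing.

```python
# Toy RSA key pair for the originator (tiny textbook numbers, NOT secure).
p, q = 61, 53
n = p * q                 # public modulus
e = 17                    # originator's public exponent
d = 2753                  # originator's private exponent: (e * d) % ((p-1)*(q-1)) == 1

message = 65              # a small number standing in for the message (must be < n)

# Originator "encrypts" with the private key.
signed = pow(message, d, n)

# Anyone holding the originator's public key can decrypt.
recovered = pow(signed, e, n)

# Recovering the message shows it came from the holder of the private key.
print(recovered == message)
```

Because only the originator holds the private exponent, a successful decryption with the public key is what gives the proof of origin; it provides no confidentiality, since the public key is available to everyone.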

Hermann Effect

optical illusion that is characterized by "ghostlike" grey blobs perceived at the intersections of a white or light colored grid on a black background. The grey blobs will then disappear when looking directly at an intersection

what key will a sender never use to encrypt a message

receiver's private key

look at screenshot 6 for questions

screenshot 7 and 8 for answers C

standard sql format

select table.field1 as 'label', table.field2 as 'label2', table.field3 as 'label3' from schema.table where table.field1 operator criteria and table.field2 operator criteria and table.field3 operator criteria order by table.field1, table.field2;

Weaknesses of Asymmetric Cryptography

slow (should only be used for small files); computationally intensive

Prescriptive Analytics

techniques that create models indicating the best decision to make or course of action to take Prescriptive analytics is a type of data analytics—the use of technology to help businesses make better decisions through the analysis of raw data. ... It can be used to make decisions on any time horizon, from immediate to long term. Prescriptive analytics goes beyond simply predicting options in the predictive model and actually suggests a range of prescribed actions and the potential outcomes of each action. ... Google's self-driving car, Waymo, is an example of prescriptive analytics in action.

Big Data

the huge and complex data sets generated by today's sophisticated information generation, collection, storage, and analysis technologies

highlight

to emphasize

shaded map

uses color to encode quantitative data or categorical data

Explain a one-to-many relationship (1:n)

One element on this side relates to many elements on the other side. Example: a customer with Amazon can have multiple addresses in their Amazon account, but each address can only relate to one customer. The primary key of the 'one' table is posted as the foreign key in the 'many' table.

measures and dimensions of a pie chart

1 or more measures 1 or 2 dimensions

requirements for symbol map

1 geographic dimension 0 or more dimensions 0 to 2 measures

dimensions and measures of histogram

1 measure

dimensions and measures of stacked bar chart

1 or more dimension 1 or more measure

dimensions and measures of tree map

1 or more dimensions 1 or 2 measures

continuous

"forming an unbroken whole, without interruption"; green Continuous field values are treated as an infinite range. Generally, continuous fields add axes to the view.

discrete

"individually separate and distinct." blue Discrete values are treated as finite. Generally, discrete fields add headers to the view.

Volume (Big Data)

An inherent quality of big data that implies that big data contain a large amount of data.

what does select do

Reads data. Divided into 2 clauses. The select clause identifies which columns to return

inner join explain what is happening... SELECT Orders.OrderID, Customers.CustomerName FROM Orders INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID;

The INNER JOIN keyword selects all rows from both tables as long as there is a match between the columns. If there are records in the "Orders" table that do not have matches in "Customers", these orders will not be shown! The INNER JOIN keyword selects records that have matching values in both tables.
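
To see the inner join above run, here is a minimal sketch using Python's built-in `sqlite3` module. The table contents are made-up sample rows (an assumption, not course data); order 102 points at a customer that does not exist, and customer Ben has no orders.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO Orders VALUES (100, 1), (101, 1), (102, 99);
""")

# Only rows with a match on both sides survive the inner join.
inner_rows = conn.execute("""
    SELECT Orders.OrderID, Customers.CustomerName
    FROM Orders
    INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID
    ORDER BY Orders.OrderID
""").fetchall()

print(inner_rows)  # [(100, 'Ana'), (101, 'Ana')]
```

Order 102 and customer Ben both drop out because neither has a matching row in the other table.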

In Tableau, which of the following best describes a filter?

Used to limit what data is displayed on the sheet.

histogram

a diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval. Encodes data using height and shows a distribution. When you are trying to find the frequency of events within a population, you are looking at the distribution. Examples: the number of respondents to a survey by age, or the frequency of incoming calls by day. A bin is similar to the idea of putting data into categories

scatterplots

a graphed cluster of dots, each of which represents the values of two variables. The slope of the points suggests the direction of the relationship between the two variables. The amount of scatter suggests the strength of the correlation (little scatter indicates high correlation). uses position to show the relationship between two variables, typically measures that are plotted on a quantitative scale. An additional measure can be encoded using size. Use a scatter plot when investigating the relationship between different variables. Effective way to give a sense of trends, concentrations and outliers

Change Saturation

change in the intensity of the shade of color

what are characteristics of quality data

Accurate, complete, consistent, unique, timely (ACCUT). Mnemonic: "all cows can't urinate together"

left join explain what is happening... SELECT Customers.CustomerName, Orders.OrderID FROM Customers LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID ORDER BY Customers.CustomerName;

All of the data from Customers will be shown, and if there is matching data in the Orders table then that information will show. The LEFT JOIN keyword returns all records from the left table (Customers) and the matching records from the right table (Orders). If there is no match, the right-side columns come back as NULL.
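
A small runnable sketch of the left join, again with Python's `sqlite3` and made-up rows (the sample data is an assumption). Customer Ben has no orders, so the right-side column is empty for him; `sqlite3` reports SQL NULL as `None`.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO Orders VALUES (100, 1);
""")

# Every customer is returned, matched or not.
left_rows = conn.execute("""
    SELECT Customers.CustomerName, Orders.OrderID
    FROM Customers
    LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
    ORDER BY Customers.CustomerName
""").fetchall()

print(left_rows)  # [('Ana', 100), ('Ben', None)]
```

Swapping `LEFT JOIN` for `INNER JOIN` in the same query would drop the Ben row entirely.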

right join explain what is happening... SELECT Orders.OrderID, Employees.LastName, Employees.FirstName FROM Orders RIGHT JOIN Employees ON Orders.EmployeeID = Employees.EmployeeID ORDER BY Orders.OrderID;

All of the data from the Employees table will be shown, and if there is data that matches in the Orders table then that information will show. The RIGHT JOIN keyword returns all records from the right table (Employees) and the matching records from the left table (Orders). If there is no match, the left-side columns come back as NULL.

quick table calculations

allows you to apply a built in calculation to your visualization using pre-set configurations for the calculator type you choose. Performed after the results have been pulled from the data. Filters can be applied. Could be manually created

diverging color scheme

appropriate for data that varies from a norm or otherwise important class break; places emphasis on extreme high, extreme low, and middle range values Place a different moderately-dark hue at each of the four corners of the legend. ... The remaining colors are lighter than the corners, because they contain the midpoint of one of the two variables, and they are transitional hues that lie between their adjacent hues. "Sequential tells a different story." If your story emphasizes the highest (=darkest) values, go for a sequential color scale. If your story is about the lowest and highest values, go for a diverging scale.

YTD growth

calculates percentage change from the same time period in the previous year and then calculates a running total over a year

Moving average

calculates the average value based on a range around the current value

Compound growth rate

calculates the current value as a percentage from the first value

difference

calculates the difference from the previous column, across a table

Rank

calculates the integer rank of the value across the table

YTD total

calculates the running total from the beginning of the year across the table

Percentile

calculates the statistical percentile of the value across the table

In using best practices in Tableau, dimensions are mostly _______ while measures are mostly _____________.

categories, numbers

Changes in hue

change in color

changes to the value

change in the brightness of the color

simple hybrid system operation

The message is encrypted with a symmetric key; the symmetric key is then encrypted asymmetrically with the receiver's public key. The receiver decrypts it with the receiver's private key, now has the symmetric key, and can decrypt the message. Provides confidentiality and works with a large file.

asymmetric encryption - proof of origin and confidentiality

encrypts twice: encrypt with the private key of the originator, then encrypt with the receiver's public key; decrypt with the receiver's private key, then decrypt with the originator's public key. ct m1 --> ct m2 --> ct m2 --> ct m1

dimensions and measures of highlight table

1 or more dimensions 1 measure

dimensions and measures for bubble chart

1 or more dimensions 1 or 2 measures

dimensions and measures of diverging bar charts

1 or more dimensions 1 or more measures

dimensions and measures of line chart - continuous

1 date 0 or more dimensions 1 or more measures

requirements for a shaded map

1 geographic dimension 0 or more dimensions 0 or 1 measure

In determining the average sales per unit sold, why is the aggregate function (a) sum(sales) / sum(quantity) the correct function to use instead of (b) sales / quantity, and then averaging that calculation once it is added to the chart?

(a) takes the total sales and total quantity and determines the average across all units sold, where (b) determines the average only for each line item. (b) considers each line item of equal weight regardless of whether or not it has more or less quantity than other line items
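
A quick arithmetic sketch of the difference between (a) and (b), using two made-up line items (the numbers are assumptions for illustration only):

```python
# Hypothetical line items as (sales, quantity).
line_items = [(100.0, 1), (100.0, 10)]

# (a) ratio of sums: total sales divided by total units sold.
total_sales = sum(sales for sales, quantity in line_items)
total_quantity = sum(quantity for sales, quantity in line_items)
ratio_of_sums = total_sales / total_quantity

# (b) average of per-line ratios: every line item gets equal weight.
per_line = [sales / quantity for sales, quantity in line_items]
average_of_ratios = sum(per_line) / len(per_line)

print(round(ratio_of_sums, 2))   # 18.18
print(average_of_ratios)         # 55.0
```

Eleven units were sold for 200 in total, so about 18.18 per unit is the true average; (b) reports 55 because the single-unit line counts as much as the ten-unit line.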

two requirements of primary keys

- must be unique for each row (record in the table) - can be an aggregation of more than one field (also called concatenated, composite, or compound key)

disadvantages of symmetric encryption

- requires a separate key for everyone who wishes to communicate - must find a secure way to share the secret key with the other party - may not ensure authenticity - may not ensure confidentiality - does not ensure integrity

dimensions and measures of bar charts

0 or more dimensions 1 or more measures

dimensions and measures for scatterplots

0 or more dimensions 2 to 4 measures

dimensions and measures of line chart - dual lines

1 date 0 or more dimensions 2 or more measures

which of the following use of predictive analytics has variables that are changed due to factors outside the data-generating process and are independent of all other variables? A.) active prediction B.) multi-variable prediction C.) intervenors prediction D.) passive prediction

A

Order by

A SQL clause used to sort the output of a SELECT query. It takes the output from the select clause and orders the query results according to the specification within the order by clause. Records are sorted in ascending order by default; to sort in descending order, use the DESC keyword.
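
A minimal sketch of default (ascending) versus DESC ordering, run through Python's `sqlite3`. The `employees` table and its rows are hypothetical.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (fullName TEXT, income INTEGER);
    INSERT INTO employees VALUES ('Ana', 90), ('Ben', 120), ('Cy', 70);
""")

# Ascending is the default when no keyword is given.
ascending = [name for (name,) in conn.execute(
    "SELECT fullName FROM employees ORDER BY income")]

# DESC reverses the sort.
descending = [name for (name,) in conn.execute(
    "SELECT fullName FROM employees ORDER BY income DESC")]

print(ascending)   # ['Cy', 'Ana', 'Ben']
print(descending)  # ['Ben', 'Ana', 'Cy']
```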

candidate keys

A candidate key is a column or a set of columns that can qualify as a primary key in the database. Any attribute (column) in the table with unique values.

Marks Card

A card to the left of the view where you can drag fields to control mark properties such as type, color, size, shape, label, tooltip, and detail.

combo chart

A chart that combines two chart types, such as column and line, to plot different types of data, such as quantities and percentages. Chart that contains two chart types, such as column and line, to depict two types of data, such as individual data points and percentages.

dual-axis chart

A chart that has one series plotted on a secondary axis. Useful when comparing data series that use different scales or different types of measurements. A chart that uses the left side (or top) of the chart as one value axis (y-axis) and the right side (or bottom) of the chart as a second value axis (y-axis)

Having

A clause applied to the output of a GROUP BY operation to restrict selected rows. Similar to where but is concerned with groups, not individual rows ● If a group by clause is used, the having clause is applied to the groups created by the group by clause ● If a where clause is used and no group by clause is used, the having clause is applied to the output of the where clause and the output is treated as one group ● If no where clause and no group by are used, the having clause is applied to the output of the from clause and that output is treated as one group ● A where clause can receive input only from a from clause, but a having clause can receive input from a group by, where, or from clause. The HAVING clause was added to SQL because the WHERE keyword cannot be used with aggregate functions.
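
The split between WHERE (filters individual rows) and HAVING (filters groups) can be sketched as follows with Python's `sqlite3`. The `sales` table and its values are made up for illustration.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (rep TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('Ana', 50), ('Ana', 70), ('Ben', 20), ('Ben', 30), ('Cy', 200);
""")

# WHERE removes single rows first, GROUP BY bundles what is left,
# and HAVING then keeps only groups whose aggregate passes the test.
having_rows = conn.execute("""
    SELECT rep, SUM(amount) AS total
    FROM sales
    WHERE amount > 25
    GROUP BY rep
    HAVING SUM(amount) > 100
    ORDER BY rep
""").fetchall()

print(having_rows)  # [('Ana', 120), ('Cy', 200)]
```

Ben's 20 is dropped by WHERE, and his remaining group total of 30 is then dropped by HAVING.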

Primary Key

A field (or group of fields) that uniquely identifies a given entity in a table

data warehouse

A logical collection of information - gathered from many different operational databases - that supports business analysis activities and decision-making tasks A data warehouse is a central repository of information that can be analyzed to make more informed decisions. Data flows into a data warehouse from transactional systems, relational databases, and other sources, typically on a regular cadence. Business analysts, data engineers, data scientists, and decision makers access the data through business intelligence (BI) tools, SQL clients, and other analytics applications.

measures

A measure is a field that is dependent on the value of one or more dimensions. Tableau treats any field containing numeric (quantitative) information as a measure. You can measure this.

marks in tableau

A part of the view that visually represents one or more rows in a data source. A mark can be, for example, a bar, line, or square. You can control the type, color, and size of marks.

entity

A person, place, thing, transaction, or event about which information is stored (rows in a table contain entities)

foreign key

A primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables

Concatenated Key

A primary key that is composed of multiple columns is called a _____. Also known as composite primary key is a combination of two or more column values used to define a key in a table. ... Sometimes a single column does not contain sufficient information to distinguish one record from every other record in the table.

extract data

A saved subset of a data source that you can use to improve performance and analyze offline. You can create an extract by defining filters and limits that include the data you want in the extract.

bin

A user-defined grouping of measures in the data source.

running total

A(n) ___ is a summation that is continually adjusted to take account of items as they are "seen" by a program. adds totals across the table
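
As a quick illustration of the idea, a running total can be computed in Python with `itertools.accumulate`; the monthly figures below are made up.

```python
from itertools import accumulate

# Hypothetical monthly sales values.
monthly_sales = [100, 50, 25, 75]

# Each position holds the sum of everything seen so far.
running_total = list(accumulate(monthly_sales))

print(running_total)  # [100, 150, 175, 250]
```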

Aggregate functions

AVG - calculates the average of a set of values. COUNT - counts rows in a specified table or view. MIN - gets the minimum value in a set of values. MAX - gets the maximum value in a set of values. SUM - calculates the sum of values. performed on a group of records.

live data

Actual data that is processed by the operational system. Takes place when the system has been installed to ensure it is operating as expected. Live allows you real-time data

Symmetric Encryption

An encryption method whereby the same key is used to encode and to decode the message

Veracity (Big Data)

An inherent aspect of big data that implies that big data users must evaluate the accuracy and reliability of the collected data.

Velocity (Big Data)

An inherent quality of big data that implies that data are collected and can be analyzed and accessed quickly.

consider the following left outer join that queries students and their majors: select students.studentID, students.name, majors.majorName, majors.collegeName from university.students left join university.majors on majors.majorName = students.majorName; why might someone use the left join above versus an inner join? A.) a major may have multiple colleges that it relates B.) a student may not have declared a major yet C.) there may be majors that are no longer offered by the college D.) none of the above

B

which of the following would be most likely to contain the most unstructured data? A.) stuinfo B.) your personal music library C.) today's currency exchange rate table D.) student grade listings maintained by the registrar

B

benefits of a data warehouse

Benefits of a data warehouse include the following: Informed decision making Consolidated data from many sources Historical data analysis Data quality, consistency, and accuracy Separation of analytics processing from transactional databases, which improves performance of both systems

what is the purpose of bins

Bins combine a set of data into groups of equal size which makes the data and the view systematic. Bins play an essential role in data analysis as they provide a systematic data range which helps the user organise information in a better way. Bins are never used in calculations.

Which type of calculation could be used to distinguish data in a visualization that satisfies a certain criterion? A. A "Sum" calculation. B. A distinguish statement calculation. C. A calculation using "If" and "Then" statements. D. A calculation using a "Vlookup" statement.

C. A calculation using "If" and "Then" statements.

AJ wants to send Bryan a small message securely. He wants to make sure that only Bryan can read the message, thus ensuring confidentiality. Which of the following encryption methods would he use? A.) asymmetric encryption, with the message encrypted using his private key B.) symmetric encryption, with the message encrypted using his private key C.) asymmetric encryption, with the message encrypted using Bryan's public key D.) using hashing and asymmetric encryption, create a digital signature

C

For the next two questions... consider the following left outer join that queries students and their majors: select students.studentID, students.name, majors.majorName, majors.collegeName from university.students left join university.majors on majors.majorName = students.majorName; which of the following is the left table that will return each record (tuple) from the table, regardless of whether there is a corresponding record in the right table? A.) majorName B.) majors C.) students D.) university E.) none of the above

C

Jeff is preparing an analysis of sales year over year to determine what sales may be in the upcoming year based upon the relative seasonal sales cycles that his company experiences. This would be an example of what type of data analysis? A.) prescriptive analysis B.) descriptive analysis C.) predictive analysis D.) variation cycle analysis

C

When inner joining two tables with a many to many relationship, how many inner join clauses are needed in your query? A.) 0 B.) 1 C.) 2 D.) 3 E.) greater than 3

C
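
The two inner join clauses needed for a many-to-many relationship can be sketched with Python's `sqlite3`. The `meals`, `ingredients`, and `recipes` tables and their rows are hypothetical; `recipes` is the additional linking table with a concatenated primary key.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE meals (mealID INTEGER PRIMARY KEY, mealName TEXT);
    CREATE TABLE ingredients (ingredientID INTEGER PRIMARY KEY, ingredientName TEXT);
    CREATE TABLE recipes (
        mealID INTEGER REFERENCES meals(mealID),
        ingredientID INTEGER REFERENCES ingredients(ingredientID),
        PRIMARY KEY (mealID, ingredientID)
    );
    INSERT INTO meals VALUES (1, 'Omelette'), (2, 'Pancakes');
    INSERT INTO ingredients VALUES (10, 'Egg'), (11, 'Flour');
    INSERT INTO recipes VALUES (1, 10), (2, 10), (2, 11);
""")

# One join reaches the linking table, the second reaches the other side.
m2m_rows = conn.execute("""
    SELECT meals.mealName, ingredients.ingredientName
    FROM meals
    INNER JOIN recipes ON recipes.mealID = meals.mealID
    INNER JOIN ingredients ON ingredients.ingredientID = recipes.ingredientID
    ORDER BY meals.mealName, ingredients.ingredientName
""").fetchall()

print(m2m_rows)  # [('Omelette', 'Egg'), ('Pancakes', 'Egg'), ('Pancakes', 'Flour')]
```

Egg appears in two meals and Pancakes uses two ingredients, which is what makes the relationship many to many.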

Jerod received data on sales for a new gadget that his company has been selling. He wants to calculate the average number of units sold by each of their sales representatives for each of the three sales territories they have defined for the United States (east, central, & west). If a sales representative has not sold any units, the value in the data set is null (i.e. blank). What would be the most efficient way in Tableau for him to deal with values that are null in the data set? A.) export the data to Microsoft Excel or Google Sheets, use find and replace, then import the data back into Tableau B.) use the 'interpret null values' feature within the analysis menu C.) create a calculated field using the ZN function D.) he does not have to do anything. Tableau will automatically consider null values to represent 0 units

C

look at screenshot 5 for the next two questions if you had the tables above, what would the third table (labeled above with ???????) best represent A.) a food order from a table of customers at a restaurant B.) the bill of materials for manufacturing a car C.) recipes for meals D.) the ingredients that a person has in their pantry E.) the products in inventory at a grocery store

C

which of the following SQL statements will provide all the tuples (records) and attributes from the table 'employees' in which the individuals are less than 40 and make more than $150,000? A.) select * from hr.employees where employees.age < '40' and employees.income > '150000' B.) select employees.age < 40, employees.income > 150000 from hr.employees C.) select * from hr.employees where employees.age < 40 and employees.income > 150000 D.) select all from hr.employees where employees.age < 40 and employees.income > 150000

C

which of the following is a characteristic of a data lake? A.) the data seeps over to other tables B.) the data is pooled with other data to find patterns in aggregation C.) the data is stored in raw form until needed for processing or analysis D.) the data is stored in multiple locations for correlation

C

with reference to data granularity, which of the following groups of individuals would typically want to see information at the least granular (i.e. more coarse) level? A.) sales manager for a region B.) plant managers in charge of manufacturing operations C.) the board of directors D.) regional finance directors

C

What does Granularity mean in Tableau? A. The extent of the quality of the graph type. B. The visualization's picture definition. C. The extent to which the view is broken down. The less aggregated, the more granular. D. The extent to which the view is broken down. The more aggregated, the more granular.

C. The extent to which the view is broken down. The less aggregated, the more granular.

Bee works for a large auto dealership in their service department. She has a data set that contains information on services provided, which includes the date a vehicle came in for service (field dateIn) and the date service was completed (field dateComplete). Which of the following calculated fields in Tableau would identify the number of days that it took to complete a service? A.) DAYCALC([dateComplete] - [dateIn]) B.) COMPARE('count',[dateIn],[dateComplete]) C.) [dateComplete] - [dateIn] D.) DATEDIFF('day',[dateIn],[dateComplete])

D

if the concatenated primary key of table 3 did not include departingAirportCode and arrivingAirportCode, which of the following would become TRUE about the design of tables? A.) only one traveler could fly each day from one airport to another B.) a traveler could fly from the same departing airport to the same arriving airport multiple times a day C.) all travelers could only travel between two airports D.) a traveler could only travel once a day E.) none of the above

D

refer to screenshot 3 for tables what do the tables above show A.) airports and the employees / pilots that work at them B.) traveler frequent flyer information C.) primary airports for a set of travelers D.) travelers and flights they have taken or will take

D

what is the difference between the following SQL select clauses? Query 1: select employees.fullName as "Name" from hr.employees where employees.office = "Chicago" Query 2: select employees.fullName from hr.employees where employees.office = "Chicago" A.) neither query will return any records as 'chicago' should not be in quotes B.) query 2 will return all employees where query 1 will only return those values where an individual has only a first name listed C.) Query 1 will error as the syntax is not correct for the from clause D.) the column heading for the results on query 1 will show 'Name' vs. query 2 that will show 'fullName'

D

what is the relationship between meals and ingredients A.) many to one B.) multiple to one C.) one to many D.) many to many E.) one to one

D

unstructured data

Data that does not exist in a fixed location. It can include text documents, PDFs, voice messages, emails, and music libraries. Typically nonnumeric information formatted for human eyes and not easily understood by computers.

what does where do

Defines a condition that must be met in order for data to be returned. Can use LIKE to filter records by pattern. Cannot be used with aggregate functions.

structured data

Data that (1) are typically numeric or categorical; (2) can be organized and formatted in a way that is easy for computers to read, organize, and understand; and (3) can be inserted into a database in a seamless fashion. Examples of structured data include names, dates, addresses, credit card numbers, stock information, geolocation, and more. Structured data is highly organized and easily understood by machine language.

Which of the following is a common characteristic of quality data? A.) complete B.) accurate C.) unique D.) timely E.) all of the above

E

WHERE CustomerName LIKE '_r%'

Finds any values that have "r" in the second position

WHERE CustomerName LIKE 'a%'

Finds any values that start with "a"

WHERE CustomerName LIKE 'a_%_%'

Finds any values that start with "a" and are at least 3 characters in length

WHERE ContactName LIKE 'a%o'

Finds any values that start with "a" and end with "o"
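The four LIKE patterns above can be tried out with a small sketch using Python's built-in `sqlite3` module. The table and the sample names are made up for illustration, and `matches` is a hypothetical helper. Keep in mind that `_` matches exactly one character and `%` matches zero or more.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (CustomerName TEXT)")
names = ["alfredo", "ana", "ab", "bruno", "trudy", "a"]  # sample data (assumed)
conn.executemany("INSERT INTO customers VALUES (?)", [(n,) for n in names])


def matches(pattern):
    """Return the names that match a LIKE pattern, sorted alphabetically."""
    rows = conn.execute(
        "SELECT CustomerName FROM customers "
        "WHERE CustomerName LIKE ? ORDER BY CustomerName",
        (pattern,),
    )
    return [row[0] for row in rows]


print(matches("_r%"))    # ['bruno', 'trudy']
print(matches("a%"))     # ['a', 'ab', 'alfredo', 'ana']
print(matches("a_%_%"))  # ['alfredo', 'ana']
print(matches("a%o"))    # ['alfredo']
```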

group by

Keyword used to indicate a column or set of columns by which to bundle values together. It must be used when combining aggregate and non-aggregate values in a select clause. Used to group together types of information in order to summarize related data.
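A minimal sketch of GROUP BY, run through Python's `sqlite3` module. The `sales` table and its values are invented for the example. The second query uses HAVING, the standard SQL clause for filtering on an aggregate, since WHERE cannot do that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100), ("East", 50), ("West", 70)],  # sample rows (assumed)
)

# region is non-aggregate, SUM(amount) is aggregate, so GROUP BY is required.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)  # [('East', 150), ('West', 70)]

# Filtering on the aggregate itself goes in HAVING, not WHERE.
big_regions = conn.execute(
    "SELECT region, SUM(amount) FROM sales "
    "GROUP BY region HAVING SUM(amount) > 100"
).fetchall()
print(big_regions)  # [('East', 150)]
```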

Variety (Big Data)

Managing the complexity of multiple relational and non-relational data types and schemas; the different forms of structured and unstructured data.

Gestalt Principles

Principles that describe the brain's organization of sensory information into meaningful units and patterns. A series of web design principles that include proximity, closure, similarity, continuity, perception, organization, and symmetry.

downfalls of pie charts and why they aren't preferred

Problematic because it's difficult to make precise quantitative comparisons using angle, arc, and area, which can distort the comparison of the data. It's easier to read a single slice, or two slices being compared with each other, but the more slices there are, the more difficult it becomes.

AS (sql) what is going on here... SELECT column_name AS alias_name FROM table_name;

SQL aliases are used to give a table, or a column in a table, a temporary name. Aliases are often used to make column names more readable. An alias only exists for the duration of that query. An alias is created with the AS keyword.
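A small sketch of a column alias, using Python's `sqlite3` module. The `employees` table and its single row are made up for illustration. The alias changes only the column heading of the result, not the data returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (fullName TEXT, office TEXT)")
conn.execute("INSERT INTO employees VALUES ('Ada Park', 'Chicago')")  # sample row

plain = conn.execute(
    "SELECT fullName FROM employees WHERE office = 'Chicago'"
)
aliased = conn.execute(
    "SELECT fullName AS \"Name\" FROM employees WHERE office = 'Chicago'"
)

# cursor.description holds one tuple per result column; item 0 is the heading.
aliased_heading = aliased.description[0][0]
print(aliased_heading)  # Name

# Both queries return the same rows; only the heading differs.
plain_rows = plain.fetchall()
aliased_rows = aliased.fetchall()
print(plain_rows == aliased_rows)  # True
```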

sequential

Sequential color schemes are used to highlight ordered data such as income, temperature, elevation or infection rates. A well designed sequential color scheme ranges from a light color (representing low attribute values) to a dark color (representing high attribute values).

what is show me menu in tableau

Show Me is used to apply a required view to the existing data in the worksheet. Those views can be a pie chart, scatter plot or a line chart.

Crosstabs

Statistical technique that establishes an interdependent relationship between two tables of values but does not identify a causal relationship. We see them in movie times, prices on a menu, store catalogs, sports scores, reference tables, phone lists, etc. Really good for a list of precise values that can be looked up. Not good for seeing trends and patterns.

In Tableau, which of the following best describes a tooltip? A. Used to limit what data is displayed on the sheet. B. Text boxes that appear when hovering over a mark on a sheet to give more information. C. Where you determine which variables will go on what axis. D. Controls most of the visual elements in a sheet. Allows you to switch between different chart types (bar, line, symbol, filled map, and so), change colors and sizes, add labels, and change the level of detail.

Text boxes that appear when hovering over a mark on a sheet to give more information.

full outer join / full join explain what is happening... SELECT Customers.CustomerName, Orders.OrderID FROM Customers FULL OUTER JOIN Orders ON Customers.CustomerID=Orders.CustomerID ORDER BY Customers.CustomerName;

The FULL OUTER JOIN keyword returns all records from both tables, whether or not the other table has a match. So, if there are rows in "Customers" that do not have matches in "Orders", or rows in "Orders" that do not have matches in "Customers", those rows are listed as well, with NULL in the columns from the table that has no match.
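The behavior can be sketched with Python's `sqlite3` module. Because older SQLite versions do not support FULL OUTER JOIN directly, this example emulates it with a LEFT JOIN plus the unmatched rows from the other table; that emulation is a substitution made here for portability, not the query from the card. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER, CustomerName TEXT);
    CREATE TABLE Orders (OrderID INTEGER, CustomerID INTEGER);
    INSERT INTO Customers VALUES (1, 'Ana'), (2, 'Ben');  -- Ben has no orders
    INSERT INTO Orders VALUES (10, 1), (11, 3);           -- order 11 has no customer
""")

# Emulated full outer join: all customers (matched or not),
# plus orders whose customer is missing.
rows = set(conn.execute("""
    SELECT c.CustomerName, o.OrderID
    FROM Customers c
    LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
    UNION ALL
    SELECT NULL, o.OrderID
    FROM Orders o
    WHERE NOT EXISTS (
        SELECT 1 FROM Customers c WHERE c.CustomerID = o.CustomerID
    )
""").fetchall())

# NULL comes back as None in Python.
print(rows == {("Ana", 10), ("Ben", None), (None, 11)})  # True
```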

data element

The smallest or basic unit of information

trend lines

Trend lines show correlation between two variables. When enabled they show the formula for the line that best describes the data (also referred to as regression)
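To make the "formula for the line" concrete, here is a minimal least-squares sketch in plain Python. It assumes a simple linear model with one predictor; `fit_line` is a hypothetical helper and the points are made-up sample data, so this is an illustration of the idea and not how Tableau computes it internally.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Sample points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # 2.0 1.0
```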

difference between data warehouse and a data lake

Unlike a data warehouse, a data lake is a centralized repository for all data, including structured, semi-structured, and unstructured. A data warehouse requires that the data be organized in a tabular format, which is where the schema comes into play. The tabular format is needed so that SQL can be used to query the data.

bar charts

Used when data is divided into categories (discrete data). The bars are separated to show different categories. Encodes data using the height or length of each bar and shows categorical comparisons. Shows the relative size or value of two or more discrete items. Bars can also be horizontal.

F Pattern for Visualizations

Users first read in a horizontal movement, usually across the upper part of the content area. This initial element forms the F's top bar. Next, users move down the page a bit and then read across in a second horizontal movement that typically covers a shorter area than the previous movement. This additional element forms the F's lower bar. Finally, users scan the content's left side in a vertical movement. Sometimes this is a slow and systematic scan that appears as a solid stripe on an eyetracking heatmap. Other times users move faster, creating a spottier heatmap. This last element forms the F's stem.

color - marks menu

When multiple discrete fields are added to color on the Marks card, then Tableau Desktop will automatically nest the colors. The first discrete field on Color is assigned a different color (blue, orange, green, etc...) and then every value for the next discrete field is given a different shade of those colors.

detail - marks menu

When you drop a dimension on Detail on the Marks card, the marks in a data view are separated according to the members of that dimension. Unlike dropping a dimension on the Rows or Columns shelf, dropping it on Detail on the Marks card is a way to show more data without changing the table structure.

size - marks menu

When you place a discrete field on Size on the Marks card, Tableau separates the marks according to the members in the dimension, and assigns a unique size to each member. Because size has an inherent order (small to big), categorical sizes work best for ordered data like years or quarters.

labels - marks menu

You can add labels to the data points in your visualization. For example, in a view that shows product category sales over time as a line, you can label sales next to each point along the lines.

record

a collection of related data elements

data lake

a storage repository that holds a vast amount of raw data in its original format until the business needs it

explain a many-to-many relationship (n:m)

A student is in multiple classes, and a class can have more than one student. Requires a third table with two foreign keys that together form a unique combination (i.e., the primary key). Another example: pets can have many owners, and owners can have many pets.
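The student/class example can be sketched with Python's `sqlite3` module. All table names, column names, and rows are invented for illustration. The third table, `enrollments`, holds the two foreign keys, and together they form its primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (studentID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE classes (classID INTEGER PRIMARY KEY, title TEXT);
    -- Junction table: one row per (student, class) pairing.
    CREATE TABLE enrollments (
        studentID INTEGER REFERENCES students(studentID),
        classID INTEGER REFERENCES classes(classID),
        PRIMARY KEY (studentID, classID)
    );
    INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO classes VALUES (100, 'ITM'), (200, 'ECON');
    INSERT INTO enrollments VALUES (1, 100), (1, 200), (2, 100);
""")

# One student, many classes.
classes_for_ana = [r[0] for r in conn.execute("""
    SELECT c.title FROM enrollments e
    JOIN classes c ON c.classID = e.classID
    WHERE e.studentID = 1 ORDER BY c.title
""")]

# One class, many students.
students_in_itm = [r[0] for r in conn.execute("""
    SELECT s.name FROM enrollments e
    JOIN students s ON s.studentID = e.studentID
    WHERE e.classID = 100 ORDER BY s.name
""")]

print(classes_for_ana)  # ['ECON', 'ITM']
print(students_in_itm)  # ['Ana', 'Ben']
```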

asymmetric encryption - confidentiality

A type of cryptography based on algorithms that require two keys: one secret (private) and one public (freely known to others). Encrypt with the receiver's public key; decrypt with the receiver's private key.
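A toy sketch of the public/private key idea, using textbook RSA with tiny primes. This is for illustration only and offers no real security; the primes, exponent, and message are arbitrary choices made here, and real systems rely on vetted libraries, large keys, and padding. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
# Receiver's key generation (toy-sized numbers, assumed for the example).
p, q = 61, 53
n = p * q                      # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent: inverse of e modulo phi

public_key = (e, n)            # freely known to others
private_key = (d, n)           # kept secret by the receiver


def encrypt(message: int, key) -> int:
    """Sender encrypts with the receiver's public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(cipher: int, key) -> int:
    """Receiver decrypts with their own private key."""
    exponent, modulus = key
    return pow(cipher, exponent, modulus)


message = 65
cipher = encrypt(message, public_key)
recovered = decrypt(cipher, private_key)
print(recovered == message)  # True
```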

Attribute (field, column)

the data elements associated with an entity (each column in a table holds one attribute of the entity)

Information Granularity

The extent of detail within the information (fine and detailed or coarse and abstract). Low level of granularity = high level of detail; high level of granularity = low level of detail.

picture superiority effect

The fact that concepts learned as pictures are more memorable than concepts learned as words. Subjects who reviewed over 600 photos remembered 98%.

Descriptive Analytics

The use of data to understand past and current business performance and make informed decisions. Company reports tracking inventory, workflow, sales, and revenue are all examples of descriptive analytics. Other examples include KPIs and metrics used to measure the performance of specific aspects of the business or the company overall.

alert

to get attention

