Compsci - Data Analytics

What are the limits of computation?

"Imponderable" Questions like what is the answer to life.

What are the 5 phases computers have gone through since their creation in the 1900s?

1) 1900-1935: Electromechanical Computers (Punch Card) 2) 1935-1940: Relay 3) 1942-1960: Vacuum Tube 4) 1960-1973: Transistors 5) 1973-Present: Integrated Circuit on Logarithmic Plots

What are the 8 steps in the data pipeline?

1) Collect or Create 2) Validate 3) Filter 4) Clean 5) Store 6) Combine 7) Analyze 8) Visualize

What are the 2 ways we use programming languages and what does each mean?

1) Compiled - Programs are written in high-level languages and translated into machine code or another low-level language by a compiler (a type of program). Examples: Java and C. 2) Interpreted - An interpreter (a type of program) reads the program and carries out the instructions in it. Examples: Ruby and Python.

What are the 2 types of quantitative data? Explain them.

1) Continuous - Numerical values that can take any of infinitely many values within a range 2) Discrete - Countable numerical values, with finitely many possible values within a range

What categories can we use to tell if an algorithm is good and what does each factor mean? (2)

1) Correctness - Does the algorithm execute properly for the inputs it is given? 2) Time Complexity - How many basic steps or operations does the algorithm perform, relative to the input size? Normally we care about the worst case.

What does the Collect or Create step of the data pipeline consist of? (3)

1) Databases 2) Sensors/Measurements 3) Transactions

What are the two steps you need to complete when using variables?

1) Declare the variable - declare the symbolic variable name and the data type 2) Define the variable - set a value to the variable

Where does Data come from (4)?

1) Measurements 2) Transactions 3) Computations 4) Databases

What are the 2 types of qualitative data? Explain them.

1) Nominal - Values in categories with no order 2) Ordinal - Values in a category with an ordering

What are the 4 kinds of data discussed in class?

1) Numbers/Values 2) Audio/Sound 3) Text/Words 4) Images/Pictures

What are the two large categories for data types? Explain them.

1) Quantitative - Numerical values that can be measured and ordered 2) Categorical - Values (normally labels/text) belonging to the same category.

What are the parts of a function?

1) The function name (SUM, AVG, etc.) 2) The parameters (number1, number2, etc.) 3) The arguments, which are the specific entries passed in (A1, 32, 5.124, B$5, $C$6)

What are some of the pros of text files? (3)

1) They are easy to construct and to read 2) Easy for people to check and debug 3) Used for both human readable files and files for programs to read.

What are the two main aspects of cells?

1) What they contain 2) How it looks

How do humans represent data? (4)

1) Writing 2) Drawings 3) Speech 4) Body Language

What does the unsigned integer 1101 (base 2) equate to? How do you arrive at this conclusion?

13 in base 10. Multiply each bit by the power of 2 for its position, counting positions from 0 at the right, and add the results: 1 x 2^3 + 1 x 2^2 + 0 x 2^1 + 1 x 2^0 = 8 + 4 + 0 + 1 = 13.
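
The place-value calculation can be sketched in Python. This is only an illustration; the function name unsigned_binary_to_decimal is made up for this example and does not come from the course.

```python
def unsigned_binary_to_decimal(bits: str) -> int:
    """Convert a string of 0s and 1s to its base-10 value using place values."""
    total = 0
    # Walk the bits from right to left so the rightmost bit has position 0.
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total


print(unsigned_binary_to_decimal("1101"))  # 8 + 4 + 0 + 1 = 13
```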

How many bytes are in a pixel?

3 bytes or 24 bits

Give an example of a color coded into a pixel

79 base 10, 38 base 10, 131 base 10 = purple
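
As a rough sketch, the three bytes of a pixel can be written as the hexadecimal color code commonly used for colors. The helper name rgb_to_hex is an assumption made for this example.

```python
def rgb_to_hex(red: int, green: int, blue: int) -> str:
    """Format three one-byte channel values as a hexadecimal color code."""
    for channel in (red, green, blue):
        # Each channel is one byte, so it must be between 0 and 255.
        if not 0 <= channel <= 255:
            raise ValueError("each channel must fit in one byte (0-255)")
    # :02X prints each byte as two uppercase hexadecimal digits.
    return f"#{red:02X}{green:02X}{blue:02X}"


print(rgb_to_hex(79, 38, 131))  # #4F2683, the purple from the card above
```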

How is an IF statement written?

=IF(condition, return_if_true, return_if_false)

How is the INDEX function written? What is the range, what is the index, and what is the column number?

=INDEX(range, index, [column_number]) range = the range of cells that will be searched. index = the position in the range of the cell whose value will be returned. column_number = the column of the range to be searched if the range spans multiple columns (is 2D).

How is the MATCH function written? What are the range and how_to_match arguments?

=MATCH(value_to_find, range, how_to_match) range is the range of cells that will be searched. how_to_match is the search mode: 0 = find the first value equal to value_to_find; 1 = find the largest value <= value_to_find; -1 = find the smallest value >= value_to_find.
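
The three search modes can be imitated in Python. This is a simplified sketch, not Excel's actual implementation: the function name match and the use of None for "no match" are assumptions (Excel reports an error instead), and unlike Excel this version scans every cell, so it does not depend on the data being sorted.

```python
def match(value_to_find, cells, how_to_match=1):
    """Return the 1-based position of the matching cell, or None if no cell matches."""
    numbered = list(enumerate(cells, start=1))  # pairs of (position, cell value)
    if how_to_match == 0:
        # Mode 0: first cell exactly equal to the value.
        for position, cell in numbered:
            if cell == value_to_find:
                return position
        return None
    if how_to_match == 1:
        # Mode 1: largest cell value that is <= the value.
        candidates = [(cell, position) for position, cell in numbered if cell <= value_to_find]
        return max(candidates)[1] if candidates else None
    if how_to_match == -1:
        # Mode -1: smallest cell value that is >= the value.
        candidates = [(cell, position) for position, cell in numbered if cell >= value_to_find]
        return min(candidates)[1] if candidates else None
    raise ValueError("how_to_match must be 0, 1, or -1")


print(match(30, [10, 20, 30, 40], 0))   # 3
print(match(25, [10, 20, 30, 40], 1))   # 2 (20 is the largest value <= 25)
print(match(25, [40, 30, 20, 10], -1))  # 2 (30 is the smallest value >= 25)
```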

What is a program?

A sequence of instructions stored in a computer's memory.

What does the Boolean logic function do? What responses does it give? What are the main operations used in Boolean functions?

A Boolean logic function evaluates its arguments to either true or false, and that result determines what is displayed. The only values it uses are true and false. The main operators are AND, OR, and NOT. The values displayed for true or false can be numbers.

What is the date data type?

A calendar date.

What type of file is used during the store part of the data pipeline? What other part is this file used in?

A database. It is also used in the combine, analyze, and visualize steps of the data pipeline.

What are _IF functions?

A number of Excel functions have an _IF variant that takes a criterion to determine whether each value should be included in the calculation.
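
A loose Python analogue of the _IF idea follows. It is a sketch under assumptions: the names countif, sumif, and averageif are borrowed from the spreadsheet functions, and the criterion is passed as a Python function rather than as Excel criterion text such as ">10".

```python
def countif(values, criterion):
    """Count the values that satisfy the criterion."""
    return sum(1 for value in values if criterion(value))


def sumif(values, criterion):
    """Add up only the values that satisfy the criterion."""
    return sum(value for value in values if criterion(value))


def averageif(values, criterion):
    """Average only the values that satisfy the criterion."""
    selected = [value for value in values if criterion(value)]
    return sum(selected) / len(selected)


donations = [5, 12, 8, 20]
over_ten = lambda value: value > 10  # hypothetical criterion: greater than 10

print(countif(donations, over_ten))    # 2
print(sumif(donations, over_ten))      # 32
print(averageif(donations, over_ten))  # 16.0
```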

What general trend in regards to the number of transistors has circuit complexity followed since the 70's?

A positive upwards trend, meaning there have been progressively more and more transistors in a circuit.

What is a range, how is it displayed and how is it useful?

A range is a block of cell coordinates. It is written with a colon, like so: B3:B6, where B3 is the starting cell, B6 is the ending cell, and every cell in between is included. It is useful because it is easier than giving each cell as its own argument.

What is the assignment of result to return?

A statement in the body of the function that assigns a value to the function's own name. The function name goes on the left of the = sign, and the value computed on the right is what the function returns.

What is an algorithm?

A step by step description of how to solve a problem.

What is a subroutine? How is it denoted in VBA?

A subroutine is code that does not return a specific result. It is denoted with the keyword Sub in VBA.

What is a variable?

A variable is a named storage location in a computer program.

What is computation with physical objects?

Performing operations such as addition by measuring and grouping objects.

What does the SUMIF function do?

Adds the values that meet the given criteria.

How is data represented in files? Is this different from memory? How so?

Also in bits, so the same way as memory but the storage and interpretation may be different.

What is the difference between an Algorithm and a Program?

An algorithm is a general procedure to compute the solution to a problem. Given an input it ALWAYS produces an output. A program is a particular set of instructions in some language.

What kind of revolution is the world undergoing right now? Is this an influential revolution? If so, what is it comparable to?

An Information Revolution. Yes, it is influential, and it is comparable to the Industrial Revolution.

What follows the = sign in a cell?

An expression follows

What does the analyze part of the data pipeline consist of?

Analyzing the data to find trends and other useful information

What is a programming language?

Artificial languages for giving instructions that people can understand. Like Java, VBA, Python, C, C++ etc.

How is an image encoded?

As pixels

Why do we need Data Analytics?

Because data is constantly being generated but not effectively used. Society needs people who can make sense of this data and visualize it for others.

Why do we use programming languages?

Because they can contain more abstract ideas and more powerful tools.

What are Text files? Are text files always machine readable?

Bits encoded as ASCII, UTF or Unicode. No they are not always machine readable.

What must occur for an AND Boolean function to return true? When can this operator be useful?

Both values given to AND must be true for the result to be true. It can be useful for checking whether a number is in a range.

How does one begin to add a program in VBA?
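
A minimal Python sketch of the range check, assuming a made-up helper name in_range and inclusive bounds:

```python
def in_range(number, low, high):
    """True only when both comparisons are true, which is what AND requires."""
    return number >= low and number <= high


print(in_range(7, 1, 10))   # True: 7 >= 1 and 7 <= 10
print(in_range(12, 1, 10))  # False: 12 <= 10 fails, so the AND fails
```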

By creating a Module, which opens the IDE

In what ways can a cell be formatted?

By fonts, sizes, styles, colors and more.

How are column names given?

By letter

How are row names given?

By number

How does one make a cell reference absolute?

By putting a $ in front of the column name, the row number, or both. For instance, to keep a formula always using cell A1, you would write the reference as $A$1.

How does one tell the cell to use a formula?

By starting the contents of the cell with an = sign

How does one use the program made in VBA on a spreadsheet?

By writing =name_of_function(B1,B2,B3).

What can the contents of a function be?

Constant values, cell references, or cell ranges.

What does the clean part of the data pipeline consist of?

Cleaning the data up so it is easy to use

What does the combine part of the data pipeline consist of?

Combining useful data into one document

What is a CSV file?

Comma-Separated Values file

What can XML files represent?

Complex data objects

What is electronic computation?

Computation via electronics.

What is a spreadsheet?

Computer program to organize and analyze data in tabular form

When is the best time to use conditional statements?

Conditional statements are best used when there are multiple paths to follow, or when parts of the code should run only some of the time.

What do coordinates refer to on a spreadsheet? How do they help us refer to specific cells?

Coordinates are the name for a cell. A cell can be referred to by its column name and row number.

How does the COUNTIF function work?

Counts the number of cells in a range that match the given criteria.

What do cells contain?

Data or Computed Values

Can data values be presented in a small amount of ways or a lot?

Data values may be presented many ways. (currency, raw data, text, percentage etc.)

What does the validate part of the data pipeline consist of?

Deleting invalid or untrusted data

What does the filter part of the data pipeline consist of?

Deleting unneeded and redundant data

What is dim short for?

Dim is short for dimension.

What do the ROW and COLUMN functions do?

They return the row and column of the current cell if no arguments are given, or the row and column of another cell if that cell is given as an argument.

End of Lecture 1

What is Data?

Facts and statistics collected together for reference or analysis

What does the CORREL function do?

Finds the correlation between two sets of data and reports it as the Pearson product-moment correlation coefficient (a value from -1 to 1).
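
For reference, the Pearson coefficient can be computed by hand as below. This is a sketch of the standard formula, not Excel's code; the function name pearson_correlation is an assumption, and the sketch does not handle data with no variation (which would divide by zero).

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Pearson product-moment correlation of two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of the deviations from each mean.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations of each list.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)


rising = pearson_correlation([1, 2, 3], [2, 4, 6])   # close to 1: move together
falling = pearson_correlation([1, 2, 3], [6, 4, 2])  # close to -1: move oppositely
print(round(rising, 6), round(falling, 6))
```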

What does the PERCENTILE function do?

Finds the k-th percentile of the values in a range.

What do the MAX and MIN functions do?

Find the maximum number or the minimum number in a range of cells.

What do the PV and FV functions do?

Find the present value (PV) or future value (FV) of an investment.

What does the sqrt function do?

Finds the square root of a number

What is Moore's law and who was it established by?

Gordon E. Moore noticed in 1965 that the number of components (transistors) on an integrated circuit had been doubling roughly every year for the past several years.

What is Hello in base 10?

H = 72 base 10, e = 101 base 10, l = 108 base 10, l = 108 base 10, o = 111 base 10
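
These codes can be checked with Python's built-in ord and chr functions. Note that the codes are case-sensitive: 101 is lowercase e, while uppercase E is 69.

```python
word = "Hello"

# ord() gives the numeric code of a character; chr() converts a code back.
codes = [ord(character) for character in word]
print(codes)  # [72, 101, 108, 108, 111]

decoded = "".join(chr(code) for code in codes)
print(decoded)  # Hello
```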

What is data in base 16 called?

Hexadecimal

In logic, when is a set of questions considered "decidable"?

If there is an algorithm that correctly returns true or false for all questions in the set.

What does the IF function do?

If the condition is true, it returns one value; otherwise it returns the other.

What are some examples of image, video and audio file types that are binary? What additional file type is also binary?

Images: PNG, JPG, GIF, BMP Video: MP4, AVI, FLV, WMV Audio: MP3, WAV, AAC, FLAC Additionally, databases are also binary files.

In what form does the data present in cells come?

In numerics or text

What does IDE stand for?

Integrated Development Environment

Does using copy and paste copy the format and contents of a cell or only one?

It can copy and paste both the contents and the formatting of cells from one place to another.

What are the 3 factors that make a solution an algorithm?

It is a sequence of steps that is: 1) Unambiguous - No assumptions are required to execute the algorithm and the algorithm uses precise instructions. 2) Executable - The algorithm can be carried out in practice 3) Terminating - The algorithm will eventually come to an end, stop or halt.

What does a variable do?

It is used in programs to store values

What does the square bracket in a function explanation mean?

It means that parameter is optional

What does the INDEX function do?

It returns the value of a cell at a given index in a range.

What is a variable?

Items that can hold different values of the same "type"

What does HLOOKUP do?

Like VLOOKUP but searches rows rather than columns.

What is the long data type?

Like an integer but has a longer range

How is data represented by computers? in Location, Representation and Interpretation?

Location: in memory or in files. Representation: bits. Interpretation: binary.

What does LOOKUP do?

Looks up data in one column and returns a value from another. Only works with data sorted in ascending order.

What are Human Computations?

Mental or pencil and paper arithmetic.

What are nested functions?

Nested functions occur when the return value of one function is used directly as an argument of another function.

Are variable names case-sensitive?

No, and by default the interpreter adjusts the names of all variables with the same letters so that their case matches that of the variable declaration.

Is the MATCH function very useful on its own?

Not really.

How many hours of video are uploaded to YouTube every second? Why is this an issue?

Now it is closer to 5 hours per second. This is an issue because we don't have the manpower to review it all.

What can expressions involve? (5)

Numbers, cell names, arithmetic operators, ranges of values, or function calls (like LOG).

How can one put a range of cells in a function?

One can either manually type the range in or drag and highlight all the required cells with their pointer.

What is a boolean data type?

Only has two values true and false

What is the double data type?

Positive and negative numbers that may have decimal points.

What is the integer data type?

Positive and negative whole numbers without decimal points.

How does a spreadsheet present data and what kinds of data can be placed in the spreadsheet?

Presents a grid of cells, each of which can contain text or numerical data, or results of a computation.

What are the pros (4) and cons (3) of binary files?

Pros 1) Store data more compactly 2) Less space required to store data 3) Faster for computers to read and write 4) Faster to send over network Cons 1) Unforgiving format 2) Harder to program and read (for humans) 3) Usually specific to one program or family of programs

What are the pros (3) and cons (2) of XML files?

Pros 1) Structured data 2) Easy for humans to read and work with 3) Machine readable Cons 1) More space required than binary or text files 2) Parsing (reading by computer) can be slower

What does the visualize part of the data pipeline consist of?

Putting the data in a form that makes the valuable information in it easy to interpret

When pasting formulas, are cell references relative or absolute?

Relative

What does each byte represent? Which specific ones?

Each byte represents the intensity of one color: red, green, or blue.

What does the OFFSET function do?

Returns a reference to a range that is a specified number of rows and columns away from a starting cell or range.

What do the LARGE and SMALL functions do?

They return the k-th largest (LARGE) or k-th smallest (SMALL) value in a range.
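
A small Python sketch of the same idea, assuming hypothetical function names large and small and a 1-based k as in the spreadsheet:

```python
def large(values, k):
    """Return the k-th largest value (k = 1 gives the maximum)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    return sorted(values, reverse=True)[k - 1]


def small(values, k):
    """Return the k-th smallest value (k = 1 gives the minimum)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    return sorted(values)[k - 1]


scores = [3, 9, 1, 7]
print(large(scores, 2))  # 7, the second largest
print(small(scores, 2))  # 3, the second smallest
```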

What does the rank function do?

Returns the rank of a number in a list of numbers.

What does VLOOKUP do?

Similar to LOOKUP but works with a whole table of data (multiple columns)

What is a function on an Excel spreadsheet? What is the difference between a function in math and a function in Excel?

Similar to, but not the same as, functions in math. There are fewer rules for functions in computer science.

What is a TSV file?

Tab-Separated Values

What does the end of the function denote?

Tells VBA that this is the end of our function.

Give some examples of file formats discussed in lecture one (3)

Text Binary XML

What is the string data type?

Text and strings of characters.

What does a relative cell reference mean?

That the row and column are taken as offsets from where the original formula was placed.

What is the function of & in a formula?

The & symbol joins strings together. So if we put ="Wednes" & "day" in a cell, it would result in Wednesday.

What is the table for converting numbers to characters called?

The American Standard Code for Information Interchange (ASCII)

What function makes the MATCH function significantly more useful?

The INDEX function.

What was the first punch-card computer and when was it invented?

The Jacquard Loom in 1804.

What does the MATCH function do?

The MATCH function looks for a value and returns its index (its position in the row or column).

Within the parameters of a function, what are the two parts?

The parameter name (a, b, ra, etc.) and the parameter type.

What was the first phase of computers and how long was it used for?

The Punch Card phase, which went from the 1800s to the 1980s.

What significant advancement was made in computing during the 1940's to 1960's?

The Vacuum Tube

What is Data Analytics

The collection and analysis of data and casting results in usable form.

What requirement applies when using the 1 and -1 how_to_match modes in the MATCH function?

The data must be sorted in the correct order for modes 1 and -1.

What are the parts of a function?

The header: describes the function and its parameters. The Function keyword: tells VBA we are making a new function. The function name: the name that will refer to this function. The parameters: the inputs the function will take, for example (ra As Double, d As Double, P As Double). The return type: states the data type of the value the function returns.

What is the function body

The part underneath the function statement that describes the steps, operations and calculations that the function will take

What is programming and what does it involve?

The process of creating an executable computer program to address a problem. Programming involves converting an algorithm into a set of precise instructions that a computer can read.

What is the most recent and major computer technological advancement and when was it created?

The transistor, in the 1950s

What is digital electronic computing?

The use of circuits where a voltage threshold is met or not. Represents 0 or 1. Voltages on many lines can together represent numbers.

What do If-Then-Else Statements allow us to do?

They allow us to make decisions based on the outcome of a logical statement (Boolean logic)

What are _IFS (KEY: the added S) functions? What kind of arguments do these kinds of functions have? What are the restrictions to these functions?

They are _IF functions but with more criteria. The arguments in _IFS functions are (range, range1, criterion1, range2, criterion2, ...). The restrictions are that all of the ranges must be the same size, and a value is selected only when all of the criteria are satisfied.

What are the values encoded into a pixel converted into?

They are converted into light and mixed together to form a single color

What are binary files?

They are files that define their own format/coding. They are not human readable.

What is the primary difference between ASCII and UTF/Unicode?

UTF/Unicode are newer systems that offer many more characters (other languages, math symbols, emoji, and more), but they can take up more space (bits).

What are XML Files?

They are text files that are both human and machine readable

What do the _A variants of functions do?

They are variants of functions like AVERAGE, COUNT, MIN, and MAX that allow arguments to be non-numbers. For instance, if you were finding an average donation and typed "none" into a cell, the AVERAGEA function would treat the text as zero, whereas the AVERAGE function would ignore the text.

What primary commonality is there between both CSV and TSV files?

They can both be read by Excel, but they contain no formatting or formulas, just data.
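
The difference can be sketched in Python. This is a simplification under assumptions: the function names average and averagea are borrowed from the spreadsheet functions, and only numbers and text are modeled (empty cells and true/false values are left out).

```python
def is_number(cell):
    """True for int/float cells; booleans are deliberately excluded in this sketch."""
    return isinstance(cell, (int, float)) and not isinstance(cell, bool)


def average(cells):
    """Like AVERAGE: text cells are ignored entirely."""
    numbers = [cell for cell in cells if is_number(cell)]
    return sum(numbers) / len(numbers)


def averagea(cells):
    """Like AVERAGEA: text cells count as zero and still count toward the total."""
    return sum(cell if is_number(cell) else 0 for cell in cells) / len(cells)


donations = [10, 20, "none", 30]
print(average(donations))   # 20.0 -> (10 + 20 + 30) / 3
print(averagea(donations))  # 15.0 -> (10 + 20 + 0 + 30) / 4
```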
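
As an illustration that such files hold plain data only, Python's standard csv module can read both kinds. The sample text and column names below are invented for the example.

```python
import csv
import io

# Hypothetical file contents: one header row and two data rows.
csv_text = "name,amount\nAna,10\nBen,25\n"
tsv_text = "name\tamount\nAna\t10\nBen\t25\n"

# io.StringIO lets the csv module read from a string as if it were a file.
csv_rows = list(csv.reader(io.StringIO(csv_text)))
tsv_rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))

print(csv_rows)  # [['name', 'amount'], ['Ana', '10'], ['Ben', '25']]
print(csv_rows == tsv_rows)  # True: same data, different separator
```

Every value comes back as text; nothing in the file says how a cell was formatted or which formula produced it.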

What restriction do variables have in regards to what they can be named?

They cannot be named after reserved keywords. Also, the name must start with a letter, but the following characters can include numbers or an underscore (_).

In what form must programs be written?

Programs can be hand-written in assembly code, which gives a human-readable form of machine instructions, or written in higher-level languages that get translated into machine code.

How can the value of a variable be changed?

Through the execution of the program

What are the purpose of the invisible characters in text files?

To denote things like spaces and line breaks

What is the goal of data analytics?

To gain knowledge and communicate conclusions drawn from data.

What can user-defined functions be used to do?

To perform complex calculations and return results for use in the worksheet. It also can be used to perform almost any action that you would do by hand on the worksheet.

What are XML files designed for?

To represent and transport data

How does one declare a variable?

To tell VBA that you wish to use a variable, you write a Dim declaration that gives the variable a name and a data type.

Will Moore's law hold true forever?

Unlikely due to physics related issues.

What are unsigned integers and how are they represented?

Unsigned integers are whole numbers that cannot be negative (zero or positive only). They are represented in binary (base 2).

What does the IF statement allow us to do?

Use different formulas or values based on the results of a boolean statement.

What are natural computations?

Computation using ants, DNA, or molecules (optional reading).

What is VBA? What does it stand for?

VBA is the programming language used within Excel to develop functions, subroutines and macros. It stands for Visual Basic for Applications.

When does an OR function result in a true?

When at least one of the values given to OR is true.

When does a NOT function result in a true?

When the value given to NOT is false. NOT effectively inverts the Boolean value.

When is a question considered undecidable? What is a classic example of an unsolvable problem?

When it has been proven that no algorithm exists to decide it. The classic example is the halting problem: given an arbitrary program and a finite input, decide whether the program finishes running or will run forever.

What are easier to learn and work with, XML files or Binary files?

XML Files

Do XML files have structured or unstructured data and what does this mean?

XML files have structured data, which means that there is a strict, specified format for the data.

Can the way a cell looks be adjusted? What would the purpose of editing the look of a cell be?

Yes to help the reader follow the logic of the spreadsheet or to be purely decorative.

Since finding Moore's law has this principle remained true?

Yes! But now it has become more so a self-fulfilling prophecy.

Do encoding systems other than ASCII exist? What are they?

Yes, UTF and Unicode

When should one use MATCH vs. INDEX?

You should use MATCH when you know the value you are looking for. You should use INDEX when you know the position/location.
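
The two functions are often combined: MATCH finds the position of a known value, and INDEX reads the cell at that position in another range. A Python sketch with made-up data and helper names (match_exact and index are assumptions, using 1-based positions like the spreadsheet):

```python
def match_exact(value_to_find, cells):
    """Return the 1-based position of the first cell equal to the value."""
    # list.index raises ValueError when the value is absent.
    return cells.index(value_to_find) + 1


def index(cells, position):
    """Return the value of the cell at a 1-based position."""
    return cells[position - 1]


names = ["Ana", "Ben", "Cy"]   # hypothetical lookup column
scores = [88, 92, 75]          # hypothetical result column

position = match_exact("Ben", names)  # we know the value, so MATCH gives 2
score = index(scores, position)       # we know the position, so INDEX gives 92
print(position, score)
```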

What kind of file does the worksheet need to be saved as in order for macros to be enabled?

as .xlsm files

What does the AVERAGEIF function do?

Averages the values that meet the criteria.

What is the average function?

Finds the average of the values in a range of cells.

Under what common base are colors often represented?

hexadecimal

How many programming languages are there?

hundreds

What do the STDEV, STDEV.S, and STDEV.P functions do?

STDEV = standard deviation of a range of cells, treated as a sample (the older name for STDEV.S). STDEV.S = standard deviation of a sample. STDEV.P = standard deviation of a population.
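
Python's standard statistics module has the same sample/population split, which makes the difference easy to see. The data values below are invented for the example.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements, mean 5

# Sample standard deviation divides the squared deviations by n - 1.
sample_sd = statistics.stdev(data)
# Population standard deviation divides by n.
population_sd = statistics.pstdev(data)

print(round(sample_sd, 3))      # 2.138
print(round(population_sd, 3))  # 2.0
```

The sample version is always a little larger for the same data, because it divides by a smaller number.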

