MISY 160 Exam 3 University of Delaware

How to classify data as big data

5 Vs: Volume, Velocity, Variety, Veracity, and Value

Veracity

Accuracy and trustworthiness of the data

Skill a data analyst needs to have

Analytical skills

What does a database administrator do?

Architecting how data is shared

Velocity

Big data is generated at a very high speed

Frameworks to store and process big data

Cassandra, Hadoop, Spark

Second phase in data science project

Data acquisition, which is gathering and scraping data from multiple sources like web servers, log files, databases, APIs, online

Roles/positions of a data scientist

Data analyst; Machine learning engineer; Deep learning engineer; Data engineer; Data scientist

Fifth phase in data science project

Data modeling

Third phase in data science project

Data preparation, which involves data cleaning and data transformation

What data cleaning is

Dealing with inconsistent data types, misspelled attributes, missing and duplicate values
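
The four problems named above can be illustrated with a minimal sketch in plain Python. The records, the field names, and the `ATTRIBUTE_FIXES` mapping are all made up for illustration, and dropping rows with a missing value is only one possible policy (filling them in is another).

```python
# Hypothetical raw records; field names and values are invented for this example.
raw_records = [
    {"name": "Ann", "age": "34", "city": "Newark"},   # age stored as text
    {"name": "Bob", "agee": 41, "city": "Dover"},     # misspelled attribute
    {"name": "Cid", "age": None, "city": "Newark"},   # missing value
    {"name": "Ann", "age": 34, "city": "Newark"},     # duplicate of the first row
]

# Assumed mapping from misspelled attribute names to the intended ones.
ATTRIBUTE_FIXES = {"agee": "age"}


def clean(records):
    """Fix attribute names, make ages integers, and drop rows that are
    duplicates or have a missing age."""
    seen = set()
    cleaned = []
    for record in records:
        # Misspelled attributes: rename keys using the assumed mapping.
        row = {ATTRIBUTE_FIXES.get(key, key): value for key, value in record.items()}
        # Missing values: drop the row (one possible policy).
        if row.get("age") is None:
            continue
        # Inconsistent data types: store every age as an integer.
        row["age"] = int(row["age"])
        # Duplicate values: keep only the first copy of each row.
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(row)
    return cleaned


cleaned_records = clean(raw_records)
```

Here `cleaned_records` keeps one row for Ann and one for Bob; the row with the missing age and the duplicate row are dropped.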

Seventh phase in data science project

Deploying and maintaining the model

What does a software developer do?

Develop applications

Values of data science

Discovering the best shipping time and route for logistics companies like DHL and FedEx, and predicting employee attrition

What MapReduce is

Doing parallel processing: Lengthy task is broken into smaller tasks, one machine takes up one task, multiple machines complete
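
The idea of splitting a long task into smaller ones can be sketched as a word count. This is only a single-computer simulation: worker threads stand in for separate machines, and the function names (`split_into_tasks`, `map_task`, `word_count`) are hypothetical, not part of Hadoop or any MapReduce library.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_tasks(lines, n_tasks):
    """Break the lengthy task (all lines) into n_tasks smaller tasks."""
    return [lines[i::n_tasks] for i in range(n_tasks)]


def map_task(chunk):
    """Map step: one worker counts the words in its own chunk only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def word_count(lines, n_workers=3):
    chunks = split_into_tasks(lines, n_workers)
    # Each worker thread plays the role of one machine taking one task.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(map_task, chunks))
    # Reduce step: combine the partial results into one final answer.
    return reduce(lambda left, right: left + right, partial_counts, Counter())


lines = ["big data is big", "data is generated fast", "big data has value"]
totals = word_count(lines)
```

With these three lines, `totals["big"]` and `totals["data"]` are both 3, no matter how the lines were divided among the workers.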

Example of structured data

Excel records

Fourth phase in data science project

Exploratory data analysis

How the Hadoop Distributed File System (HDFS) works

A file is broken into smaller chunks, which are copied and stored on various machines
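
The chunking and copying described above can be sketched in a few lines. This is a toy model, not the real HDFS API: the machine names, the 4-byte chunk size, and the round-robin placement rule are all assumptions chosen to keep the example small. Real HDFS uses much larger blocks (128 MB by default) and keeps three copies of each block by default.

```python
def split_into_chunks(data: bytes, chunk_size: int):
    """Break a file's contents into fixed-size chunks (the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_replicas(chunks, machines, replicas=2):
    """Assign each chunk index to `replicas` different machines, round-robin.

    Assumes replicas <= len(machines) so the copies land on distinct machines.
    """
    placement = {machine: [] for machine in machines}
    for index in range(len(chunks)):
        for copy in range(replicas):
            machine = machines[(index + copy) % len(machines)]
            placement[machine].append(index)
    return placement


data = b"abcdefghij"                          # stand-in for a file's contents
chunks = split_into_chunks(data, 4)           # b"abcd", b"efgh", b"ij"
machines = ["node-a", "node-b", "node-c"]     # hypothetical machine names
placement = place_replicas(chunks, machines, replicas=2)
```

Because every chunk lives on two machines here, losing any single machine still leaves one copy of each chunk available.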

What does a quality assurance analyst do?

Testing applications to find problems, checking them against what the system is supposed to accomplish

Benefits of big data

Improving user experience in the game industry, predicting a hurricane landfall earlier in weather forecasting and disaster

Example of semi-structured data

Log files

What does a data analyst do?

Look at data to find meaning to solve real business problems

What does a project manager do?

Managing people, keeping track of schedules, communicating risks

Big data

Massive amounts of data

First phase in data science project

Meeting with clients, asking relevant questions, understanding and defining objectives for the problem that needs to be tackled

Tools for data modeling

Python, R, SAS

Various data types

Structured, semi-structured, unstructured data

Tools for data visualization and communication

Tableau, Power BI, QlikView

Tools for complex data transformation

Talend, Informatica

Value

The benefit from analyzing the data

Skill a software developer needs to have

Understanding programming languages and databases

What does a business analyst do?

Understanding what the business is trying to accomplish, the processes, the stakeholders, the goals, and the current system

Sixth phase in data science project

Visualization and communication

Example of unstructured data

X-ray images

