Intro to Data Science

Similarities and differences between data science and artificial intelligence (AI)

Data science is a broad term that encompasses the entire data processing methodology, while AI includes everything that allows computers to learn how to solve problems and make intelligent decisions. Both AI and data science can involve big data, that is, significantly large volumes of data.

True or False: When data are missing in a systematic way, you can simply extrapolate the data or impute the missing data by filling in the average of the values around the missing data.

False
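
A minimal sketch of why the answer is False, using made-up income figures and the assumption that high earners systematically decline to report. For simplicity it fills each gap with the overall average of the reported values rather than the average of neighboring values; the point is only that the filled-in data set misrepresents the true one.

```python
from statistics import mean

# Hypothetical incomes (in thousands); the values are invented for illustration.
true_incomes = [20, 25, 30, 35, 40, 90, 110, 150]

# Assumption: anything above 80 goes unreported, so the gaps are systematic.
observed = [x if x <= 80 else None for x in true_incomes]

# Naive fix: fill every gap with the average of the values that were reported.
reported = [x for x in observed if x is not None]
fill_value = mean(reported)
imputed = [x if x is not None else fill_value for x in observed]

true_mean = mean(true_incomes)
imputed_mean = mean(imputed)

print("true mean:   ", true_mean)
print("imputed mean:", imputed_mean)
```

Because only low values were observed, the imputed average lands far below the true one, so any conclusion drawn from the filled-in data inherits that bias.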

True or False: The discussion section is where you introduce the research methods and data sources used for the analysis.

False. This is the methodology section.

True or False: Adding a list of references and an acknowledgment section are examples of housekeeping, according to the author.

True

What is unstructured data?

Unstructured data is data that is not tabular, that is, not organized in rows and columns; much of it comes from the web. It is text, and sometimes video and audio, so more sophisticated algorithms are needed to extract information from it. Analysts often spend a great deal of time and effort getting some structure out of unstructured data before analyzing it.
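
As a rough illustration of "getting some structure out of" text, the sketch below pulls a date and a rating out of free-text snippets with regular expressions. The review strings, the date format, and the "Rating: n/5" convention are all assumptions made up for this example.

```python
import re

# Hypothetical free-text snippets standing in for unstructured web text.
reviews = [
    "Ordered on 2023-01-15, arrived late. Rating: 2/5",
    "Great product! Rating: 5/5 (bought 2023-02-03)",
    "No complaints so far.",
]

date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")
rating_pattern = re.compile(r"Rating:\s*(\d)/5")

# Build one row per snippet, with a column for each field we could extract.
rows = []
for text in reviews:
    date_match = date_pattern.search(text)
    rating_match = rating_pattern.search(text)
    rows.append({
        "date": date_match.group(0) if date_match else None,
        "rating": int(rating_match.group(1)) if rating_match else None,
        "text": text,
    })

for row in rows:
    print(row["date"], row["rating"])
```

The result is a list of rows with named columns. Snippets that lack a field simply get an empty cell, which is typical of structure recovered from text.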

The 5 V's of Big Data

Velocity, Volume, Variety, Veracity, Value

What is deep learning?

A specialized subset of machine learning that uses layered neural networks to simulate human decision-making. Deep learning algorithms can label and categorize information and identify patterns. It is what enables AI systems to continuously learn on the job and improve the quality and accuracy of results by determining whether decisions were correct.

What is Machine Learning?

A subset of AI that uses computer algorithms to analyze data and make intelligent decisions based on what it has learned, without being explicitly programmed. Machine learning algorithms are trained with large sets of data and they learn from examples. They do not follow rules-based algorithms. Machine learning is what enables machines to solve problems on their own and make accurate predictions using the provided data.
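
One way to picture "learning from examples" rather than following hand-written rules is a 1-nearest-neighbor classifier, sketched below. The technique is chosen here only for illustration, and the measurements and labels are hypothetical. No rule about what counts as "small" or "large" is coded anywhere; the prediction comes entirely from the labeled examples.

```python
import math

# Hypothetical labeled examples: (height_cm, weight_kg) -> label.
training_data = [
    ((150, 50), "small"),
    ((155, 55), "small"),
    ((180, 85), "large"),
    ((185, 90), "large"),
]

def predict(point):
    """Return the label of the closest training example (1-nearest neighbor)."""
    nearest = min(training_data, key=lambda example: math.dist(example[0], point))
    return nearest[1]

print(predict((152, 52)))
print(predict((182, 88)))
```

Changing the training examples changes the predictions without touching the code, which is the sense in which the behavior is learned rather than programmed.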

What are three important qualities of a data scientist?

Curious, judgmental, argumentative

In the final deliverable, a common role of a ____________ is to use analytics insights to build a narrative to communicate findings to stakeholders.

data scientist

What is Big Data?

Data sets that are so massive, so quickly built, and so varied that they defy traditional analysis methods, such as those you might perform with a relational database. Big Data is often described in terms of the 5 V's.

In the report structure, regardless of the length of the final deliverable, the author recommends that it includes a cover page, table of contents, executive summary, a methodology section, and a what?

discussion section

What is a Neural Network?

In AI, a neural network is a collection of small computing units called neurons that take incoming data and learn to make decisions over time. Neural networks are often many layers deep, and they are the reason deep learning algorithms become more efficient as data sets increase in volume, whereas other machine learning algorithms may plateau as data increases.
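
To make "neurons that take incoming data and learn to make decisions over time" concrete, here is a minimal sketch of a single neuron trained with the classic perceptron rule to reproduce logical AND. It is one unit, not a layered network, and the function names, learning rate, and epoch count are choices made for this illustration.

```python
# Training examples for logical AND: (inputs, target output).
EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

def fire(weights, bias, inputs):
    """Output 1 if the weighted sum of the inputs exceeds zero, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def train(examples, epochs=50, learning_rate=0.1):
    """Nudge the weights and bias after every wrong decision."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - fire(weights, bias, inputs)
            for i, x in enumerate(inputs):
                weights[i] += learning_rate * error * x
            bias += learning_rate * error
    return weights, bias

weights, bias = train(EXAMPLES)
for inputs, target in EXAMPLES:
    print(inputs, "->", fire(weights, bias, inputs), "(target:", target, ")")
```

The neuron starts out answering 0 for everything and corrects itself each time it is wrong. Deep learning stacks many such units in layers and uses a more general training procedure.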

What are Artificial Neural Networks?

Artificial neural networks, often referred to simply as neural networks, take inspiration from biological neural networks, although they work quite a bit differently.

The ______ of a data mining exercise largely depends on the quality of the data.
a. input
b. output
c. results
d. difficulty

output

In the report structure, an introductory section is always helpful in doing what for the reader who might be new to the topic?

setting up the problem

What are the ten main components of a report that would be delivered at the end of a data science project?

1. cover page - includes the title, authors, affiliations, contacts, date, etc.
2. table of contents - provides a roadmap of where this report will lead the reader. This consists of topics, sub-topics, charts, graphs, etc.
3. abstract, or executive summary - provides a brief summary of the report. This section should be short, but can vary in length depending on how long the report is.
4. introductory section - provides a background for the reader.
5. methodology section - introduces the research methods and data sources used during the analysis and throughout the report. In this section, it is also important to refer to literature that supports the methods and data.
6. results section - empirical findings are presented by using descriptive statistics and attention-grabbing, illustrative graphics.
7. discussion - expands on the main components and communicates the results and thesis to the reader.
8. conclusion - where everything ties together, concluding the outcomes and any future developments.
9. references - the sources used throughout the report.
10. acknowledgements.

What is structured data?

Structured data is tabular data, like what you are familiar with from Microsoft Excel: it is organized in rows and columns.

What is a data scientist?

Someone who finds solutions to problems by analyzing data using appropriate tools and then tells stories to communicate their findings to the relevant stakeholders. Data scientists use data analysis to add to the knowledge of the organization by investigating data and exploring the best way to use it to provide value to the business. They can use powerful data visualization tools to help stakeholders understand the nature of the results and the recommended action to take. They analyze structured and unstructured data from many sources, and depending on the nature of the problem, they can choose to analyze the data in different ways. Using multiple models to explore the data reveals patterns and outliers; sometimes this will confirm what the organization suspects, but sometimes it will be completely new knowledge, leading the organization to a new approach. When the data has revealed its insights, the role of the data scientist becomes that of a storyteller, communicating the results to the project stakeholders.

In the final deliverable, the ultimate purpose of analytics is to communicate findings to what people to formulate policy or strategy?

stakeholders

The results section is where you present...

the empirical findings

What is data science?

The field of exploring, manipulating, and analyzing data, and using data to answer questions or make recommendations. The art of uncovering the insights and trends that are hiding behind data. A field concerned with the processes and systems used to extract data from its various forms, whether unstructured or structured. The study of large quantities of data, which can reveal insights that help organizations make strategic choices. It is used to discover optimum solutions to existing problems.

Data Science

The process and method for extracting knowledge and insights from large volumes of disparate data. It's an interdisciplinary field involving mathematics, statistical analysis, data visualization, machine learning, and more. It's what makes it possible for us to appropriate information, see patterns, find meaning in large volumes of data, and use it to make decisions that drive business. Data Science can use many AI techniques to derive insight from data. For example, it could use machine learning algorithms and even deep learning models to extract meaning and draw inferences from data.

What is Data Mining?

The process of automatically searching and analyzing data to discover previously unrevealed patterns. It involves preprocessing the data to prepare it and transforming it into an appropriate format. Once this is done, insights and patterns are mined and extracted using various tools and techniques, ranging from simple data visualization tools to machine learning and statistical models.
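
The preprocess, transform, then mine sequence can be sketched with a toy market-basket example. The transaction strings are hypothetical, and counting co-occurring pairs is just one simple mining technique picked for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw transactions with inconsistent casing and spacing.
raw_transactions = [
    "Bread, Milk",
    "bread, butter, milk",
    "Milk , Eggs",
    "BREAD, milk, eggs",
]

# Preprocessing and transformation: clean each record into a sorted list of item names.
transactions = [
    sorted({item.strip().lower() for item in line.split(",")})
    for line in raw_transactions
]

# Mining: count how often each pair of items appears in the same transaction.
pair_counts = Counter()
for items in transactions:
    pair_counts.update(combinations(items, 2))

most_common_pair, count = pair_counts.most_common(1)[0]
print(most_common_pair, count)
```

Without the cleaning step, "Bread" and "BREAD" would be counted as different items and the most frequent pair would be missed, which illustrates why the quality of the data matters so much to the output.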

