DP-203 quiz
Azure data store services - relational database management
1. Azure SQL Database 2. Azure Database for MySQL 3. Azure Database for PostgreSQL 4. Azure Database for MariaDB 5. Oracle
Azure data store services - object storage
1. Azure Blob Storage (Azure storage account) 2. Azure Data Lake Storage Gen2 (Azure storage account)
Azure data store services - column-family databases
1. Azure Cosmos DB Cassandra API 2. HBase in HDInsight
Azure data store services - key/value stores
1. Azure Cosmos DB Table API 2. Azure Cache for Redis 3. Azure Table storage (Azure storage account)
Azure data store services - graph database
1. Azure Cosmos DB Gremlin API 2. SQL Server
Azure data store services - data analytics
1. Azure Synapse Analytics 2. Azure Data Lake Storage Gen2 (Azure storage account)
With a default Azure storage account, what are the different services that you can use
1. Blob service 2. Table service 3. File service 4. Queue service
What aspects do we look at when it comes to data engineering
1. Data storage 2. Data processing 3. Visualizing your data
You could have data in various formats
1. JSON-based file 2. CSV-based file 3. Parquet-based file 4. Avro-based file 5. ORC-based file 6. Videos, images, etc.
How you will use your data
1. Operations 2. Performance
What are the classifications/formats of data
1. Structured data 2. Semi-structured data 3. Unstructured data
Azure data store services - document database
Azure Cosmos DB SQL API
Azure data store services - shared files
Azure Files (Azure storage account)
Azure data store services - search engine databases
Azure Search
Azure data store services - time series database
Azure Time Series Insights
The aggregate functions
COUNT function, MAX function, MIN function, AVG function, SUM function
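As a minimal sketch of the five aggregate functions, the snippet below uses Python's built-in sqlite3 module as a stand-in for a SQL database. The `orders` table and its rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical table and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)", [(10.0,), (20.0,), (30.0,)]
)

# One query that applies all five aggregate functions to the same column.
row = conn.execute(
    "SELECT COUNT(*), MAX(amount), MIN(amount), AVG(amount), SUM(amount) "
    "FROM orders"
).fetchone()
print(row)
```

Each aggregate collapses the three rows into a single value, so the query returns exactly one row.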
JOIN clause
Use it to get information from tables that are linked by foreign keys, that is, tables that have a common column between them.
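A join on a common column might look like the sketch below. It runs on Python's built-in sqlite3 module rather than Azure SQL, and the `customers` and `orders` tables are made-up examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: orders.customer_id points at customers.customer_id.
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    amount REAL
);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (100, 1, 25.0), (101, 2, 40.0), (102, 1, 15.0);
""")

# Match each order to its customer through the shared customer_id column.
rows = conn.execute("""
    SELECT c.name, o.order_id, o.amount
    FROM orders AS o
    JOIN customers AS c ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)
```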
SELECT * FROM
Use it to return all the columns of information from a table.
SELECT Statement
If you want to see the data within a table, you can execute the SELECT statement.
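To make the difference between selecting every column and selecting named columns concrete, here is a small sketch on Python's built-in sqlite3 module. The `courses` table is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with a single row.
conn.execute("CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT, price REAL)")
conn.execute("INSERT INTO courses VALUES (1, 'DP-203', 99.0)")

# SELECT * returns every column of the table.
all_columns = conn.execute("SELECT * FROM courses").fetchall()

# Naming columns returns only those columns.
some_columns = conn.execute("SELECT title, price FROM courses").fetchall()
print(all_columns, some_columns)
```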
COUNT function
Use it to see the number of rows in the table that the query returns.
ORDER BY clause
It can be used to sort the result set, that is, the data returned from the table by your SELECT statement, in ascending or descending order. Note that by default, records are sorted in ascending order.
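The default ascending order and the explicit DESC keyword can be compared with a short sketch. It uses Python's built-in sqlite3 module and a hypothetical `scores` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, inserted in no particular order.
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)", [("b", 70), ("a", 90), ("c", 80)]
)

# No direction given: ascending is the default.
ascending = [r[0] for r in conn.execute("SELECT score FROM scores ORDER BY score")]

# DESC reverses the sort order.
descending = [
    r[0] for r in conn.execute("SELECT score FROM scores ORDER BY score DESC")
]
print(ascending, descending)
```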
What is Azure SQL Database
It is the Microsoft SQL Server database engine deployed as a platform as a service (PaaS) on the Azure cloud. It is used for storing application data.
What is the Blob service
It is object storage used to upload objects such as audio files, videos, and images. Within an Azure storage account, the Blob service maps onto containers; in the same account we can also create file shares, queues, and tables.
Data processing
It is the use of computer programs to process and organize data, typically large volumes of numerical data. It can be used for handling, analyzing, measuring, sorting, and storing data.
DFS
It is the endpoint used with an Azure Data Lake Gen2 storage account when using Power BI to view your data. It gives Power BI the ability to access files in the account, such as log CSV files.
Azure Data Factory
It can be used, for example, to convert a CSV file into a Parquet-based file.
GROUP BY clause
It is used to group rows into summary rows, and it can be used along with the aggregate functions.
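A sketch of grouping combined with aggregate functions, run on Python's built-in sqlite3 module. The `sales` table and its values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sales rows spread over two regions.
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)", [("east", 10), ("west", 5), ("east", 20)]
)

# One summary row per region, with a count and a total for each group.
summary = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM sales "
    "GROUP BY region ORDER BY region"
).fetchall()
print(summary)
```

Three input rows become two summary rows because the two "east" rows fall into the same group.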
Foreign key
It is used to link multiple tables together. A foreign key references a primary key in another table.
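The effect of a foreign key can be sketched with Python's built-in sqlite3 module. Two assumptions apply: the `departments` and `employees` tables are hypothetical, and SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, which is specific to SQLite and not part of the SQL shown elsewhere in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement is off unless enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: employees.dept_id references departments.dept_id.
conn.execute("CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE employees (
        emp_id INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES departments(dept_id)
    )
""")
conn.execute("INSERT INTO departments VALUES (1, 'Data')")
conn.execute("INSERT INTO employees VALUES (10, 1)")  # department 1 exists

try:
    # Department 99 does not exist, so the reference is invalid.
    conn.execute("INSERT INTO employees VALUES (11, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```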
Azure Data Lake Gen2 storage account
It is the service to use when you have a lot of data available: you are working with larger data sets, with data arriving in large volumes and at a very fast rate. You need a service that is capable of hosting something known as a "data lake". The entire idea of having a data lake in place is the ability to store large amounts of data in its native, raw format. The service itself is built on top of Azure Blob Storage.
Primary key
A primary key is used to uniquely identify each row within a particular table.
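Uniqueness of a primary key can be shown with a minimal sketch on Python's built-in sqlite3 module. The `students` table is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table whose rows are identified by student_id.
conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO students VALUES (1, 'Ada')")

try:
    # A second row with the same key value violates the primary key.
    conn.execute("INSERT INTO students VALUES (1, 'Grace')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```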
What is structured data
This is data that is represented by rows and columns in a database.
What is semi-structured data
This is data that resides in another format rather than in the rows and columns of a database. For example, JavaScript Object Notation (JSON).
What is unstructured data
This is information that either does not have a predefined data model or is not organized in a predefined manner. Examples are binary objects such as audio files, videos, and images.
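To illustrate why JSON counts as semi-structured, the sketch below parses a small JSON document with Python's built-in json module. The document content is invented; note that the two records do not share the same set of fields, which a fixed table schema would not allow.

```python
import json

# Hypothetical JSON document: the second record has no "tags" field.
raw = '[{"id": 1, "name": "Ada", "tags": ["admin"]}, {"id": 2, "name": "Grace"}]'

records = json.loads(raw)

# Each record carries its own field names instead of following one schema.
fields_per_record = [sorted(record.keys()) for record in records]
print(fields_per_record)
```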
What is the File service
These are file shares in the Azure storage account that can be connected to different systems.
What is an Azure storage account
This is a resource that acts as a container that groups all the data services from Azure Storage. It is a service on the Azure platform that allows storage of information in the cloud.
Hierarchical namespace
This feature, complemented by the Data Lake Gen2 endpoint, enables file and directory semantics, accelerates big data analytics workloads, and enables access control lists (ACLs). Once you enable the hierarchical namespace, the storage account behaves as an Azure Data Lake Gen2 storage account.
What is the Queue service
This is used when messages need to be sent between different components of a distributed application.
What is the Table service
This is where you can store structured NoSQL data.
What to know about an Azure Data Lake Gen2 storage account
When you create an Azure Data Lake Gen2 storage account, you are basically creating an Azure storage account that has the data lake feature enabled. This gives an enterprise the ability to host a data lake on the Azure platform. You also get a feature called hierarchical namespace on top of Azure Blob Storage. This helps you organize your objects and files into a hierarchy of directories for efficient data access.
To assign a column name
You can use the AS keyword to assign a name (an alias) to a column in the result set.
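A short sketch of the AS keyword, using Python's built-in sqlite3 module. The `items` table and the alias `total_rows` are hypothetical names chosen for the example; the alias is read back from the cursor's `description` attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with two rows.
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany("INSERT INTO items (label) VALUES (?)", [("x",), ("y",)])

# AS gives the computed column a readable name in the result set.
cursor = conn.execute("SELECT COUNT(*) AS total_rows FROM items")
column_name = cursor.description[0][0]
total = cursor.fetchone()[0]
print(column_name, total)
```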
Explain the internals of a database engine
Your data is stored on disk in its raw format, and it is the database engine that manages the data. The engine understands which data is part of which table, and so on. With the help of Structured Query Language (SQL), issued through a client tool, we can then work with the data.
WHERE condition
It can be used to extract only those records from a table that fulfill a specified condition.
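Filtering on a condition might look like the following sketch on Python's built-in sqlite3 module. The `products` table and the price threshold are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical product list.
conn.execute("CREATE TABLE products (name TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("pen", 2), ("book", 15), ("lamp", 40)],
)

# Only rows that satisfy the condition are returned.
expensive = conn.execute(
    "SELECT name FROM products WHERE price > 10 ORDER BY name"
).fetchall()
print(expensive)
```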
HAVING clause
It is used to filter information even further. The WHERE clause first filters the rows that are returned from the table. Once the GROUP BY clause is in place, if you want to filter further on what the aggregate function returns, you can use the HAVING clause.
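The order of the two filters can be seen in a small sketch on Python's built-in sqlite3 module. The `sales` table, its values, and both thresholds are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sales rows for three regions.
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10), ("east", 20), ("west", 5), ("west", 50), ("north", 3)],
)

# WHERE drops individual rows first (the north row of 3 is removed),
# then GROUP BY builds the groups, then HAVING drops whole groups
# whose aggregate does not pass (east totals 30, which is not above 30).
result = conn.execute("""
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount >= 5
    GROUP BY region
    HAVING SUM(amount) > 30
""").fetchall()
print(result)
```

A condition on `SUM(amount)` cannot go in the WHERE clause, because the sum does not exist until the rows have been grouped.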