SQL Interview Questions
What is a table called, if it has neither cluster nor non-cluster index?
Heap or unindexed table.
How do you get data from a database on another server?
If you need to have a consistent link with another database, then it is best to use the OPENQUERY function. If you need to only do it once, then it is best to use the OPENDATASOURCE function. Those methods only work for T-SQL queries. Otherwise, you will have to use the import/export data wizard.
What is the maximum size per database for SQL Server Express?
10GB
What is a covered index?
A covered index is a non-clustered index that already has all the columns requested by a query for a single table. This saves sql the hassle of actually looking up the table itself. It only has to look at the index, and that saves a lot of time and processing power.
What is a fact table?
A fact table contains business information, and is located in the center of a star schema. It contains data that is waiting to be analyzed by the dimension tables surrounding it.
What is a FOREIGN KEY?
A foreign key is a column within a table that is the primary key of another table. It is used to connect tables to each other. The table that contains that uses the column for its primary key is called the parent table, and the table containing the foreign key is the child table.
What is the difference between a local temporary table and a global temporary table?
A local temporary table is visible only to its creator during that instance of SQL. Once that instance is terminated and disconnected, that temporary table is deleted. A global temporary table, once created, is visible to any user on any connection. The table is not deleted until every user accessing the table disconnects from their SQL instance.
What is bidirectional transactional replication?
A method to keep two separate servers in sync with one another. If I add data to server A, the change will be made to server B automatically. If I add data to server B, data will be added to server A. This is done without creating duplicates or accidental deletions because of loopback detection, which stops the synchronization process from being mirrored incorrectly.
What is PIVOT and UNPIVOT?
A pivot basically rotates a table based off of one column. It takes a column, and for every unique value in that column it creates another column. It essentially turns rows into columns. UNPIVOT does the opposite by putting all of those new columns back into one column, thus re-rotating the table by making columns back into rows.
What is Service Broker?
A service broker adds asynchronous transactional message processing to SQL Server. If I send a task from database 1 to database 2, usually I would need to wait for confirmation from database 2 that the task is completed before I could proceed. But with a service broker, I don't need to wait for that confirmation, because data can now be processed asynchronously. Messages can now be queued up. This is great because it allows developers to use sequel server with applications that are distributed, such as with cloud computing or applications across multiple machines.
What is the difference between a HAVING clause and a WHERE clause?
Aggregates cannot be included in the WHERE clause, but they can be included in the HAVING clause. A WHERE clause is used before any groups are created, and is used as a sort of filter by assigning conditions to each individual row. On the other hand, if you want to do any aggregate functions on groups of rows, you will have to use the HAVING clause. The HAVING clause acts on groups. The WHERE clause acts on rows individually.
Why does the LOGON trigger fire multiple times during a single login in SSMS?
Because SSMS logs in for the query window, object explorer, and intellisense.
Why can there be only one clustered index per table and not more?
Because a clustered index contains the information on how a data table is ordered. A data table can only be ordered one way, so you can only have one clustered index per table. For example, my clustered index could state I want all my rows in my table ordered by Student ID number. I could not have a second cluster index that orders by name, because that would conflict with the previous index that orders by Student ID number.
Why can't I run TRUNCATE TABLE on a published table?
Because the truncate command does not include any logs nor does it use any triggers.
What is SQL Profiler?
Breadcrumbs for sequel server. By creating "traces", you can use it to keep track of all the changes that are going on, as well as log everything going on under the hood. You can then go back and see what happened if something goes wrong, and try to isolate the problem using the logs.
What is change Data Capture (CDC) in SQL Server 2008?
CDC is used to track changes you make to your data within sql server (specifically INSERTs, UPDATEs, and DELETEs). It accomplishes this by creating relational change tables that contain the columns of the original table in addition to the metadata that was created by the changes. Essentially it is used to record and keep track of changes to your data.
What is collation?
Collation controls how string values are sorted. For example, chlorine comes before color in alphabetical order. It can be changed to consider different characteristics such as capitalization or accents.
Which TCP/IP port does the SQL Server run on? How can it be changed?
Default port: 1433. To change it, you use SQL Server Configuration Manager. Go to console pane, go to SQL Server Network Configuration, Protocols, then TCP/IP. Select the IP address you want to configure, then type the port number you want to use. Restart sequel server.
What is transparent data encryption?
Encryption for data and log files at rest. No matter how securely you design your database with firewalls around the servers, somebody could still steal the actual hard drives, plug them in, and access secure data. The con is that the extra encryption will cause a slight decrease in performance.
What is ETL?
Extract, transform, load. The extract refers to obtaining a subset of data from a database. Transform manipulates and changes the data into a desirable state. This is accomplished using rules or combining data with other data. Load refers to putting that correctly transformed data into your target database.
What is Filestream in SQL Server?
Filestream allows you to incorporate large unorganized file structures into sequel server, such as videos, images, or text documents. It accomplishes this by storing the data in a varbinary(max) datatype.
What is the difference between a user-defined function (UDF) and a stored procedure?
Functions cannot change the data within sequel server, whereas a UDF can update or alter the data within sequel server. Functions must return a value, whereas a UDF does not have to. Stored procedures are "stored" so that they can be accessed over and over again across multiple programming languages, which adds a level of convenience. UDFs can also be utilized in the WHERE/HAVING/SELECT clause, but stored procedures cannot.
What are GROUPING SETS?
GROUPING SETS is an operator that can be used to expand the functionality of the GROUP BY clause. It allows you to see different combinations of the groups that you have made. Normally with GROUP BY, you can only perform aggregations on that one group. GROUPING SETS allows you to perform aggregations on multiple groups at once.
What is the use of data-tier application (DACPAC)?
It allows you to save all your sequel server data into a file that can then be transferred to another sequel server. It allows developers to make changes to a database, pack it up with all its tables, views, objects, etc. into a file, hand that off to a database administrator to deploy it to other databases.
Define HIERARCHYID datatypes
It is a datatype that is stored based on hierarchy. You can use it to create a sort of tree diagram within sequel server. For instance, this person at the top is the CEO, and these managers underneath report to the CEO. Further below are employees that report to the managers. The HIERARCHYID will use a VARBINARY to represent what position the data is in.
What is PowerPivot for Excel?
It is an add-in for Excel that dramatically increased Excel's functionality. It adds business intelligence to Excel. It makes Excel a much more useful business intelligence tool because it can process greater amounts of data more quickly and show more business solutions.
What is PAD_INDEX?
It turns index padding on or off. Padding is off by default. However, if index padding is turned on, the fillfactor percentage is used to determine how much space should be left open for the index to grow as new data is added. If it is off, intermediate pages are filled up to but not including the final row.
What is log shipping?
Log shipping allows you to "ship" or transfer transaction log files from one sequel server instance to another. It is an automated process, and allows changes you make in one database to be made in another backup database periodically. It is a safeguard and used for disaster recovery, so that if and when you screw something up in your primary database, you have backup databases ready to go.
What is the maximum row size for a table?
No limit, except for your hard drive storage capacity.
Can we insert data if the clustered Index is disabled?
No. If you disable the clustered index, the data can no longer be accessed. Why? Because the clustered index IS the ordered table, and disabling the index disables access to the table also. Therefore, it is impossible to insert data when the clustered index is disabled.
What is the CHECKPOINT process in SQL Server?
Periodically, SQL Server will write all the "dirty data file pages", or data file pages that have been changed or modified, from the buffer to the disc. Transaction log records are also transferred to the physical disc. The goal is to make recovery from failure quicker and easier. Each time the pages and logs are transferred, that creates a "checkpoint" you can recover from.
How is SQLCMD different form OSQL?
SQLCMD has everything that OSQL has to offer. In addition, SQLCMD contains more features because it is newer. SQLCMD communicates with the database using OLE DB, and OSQL uses ODBC.
What is a scheduled job and what is a scheduled task?
Scheduled jobs/tasks allow you to automate processes that need to be done over and over again. This can save a user from having to type queries over and over each day.
What is OLTP (Online Transaction Processing)?
Software that is specifically designed to handle large volumes of transactions on web applications. Industries with high volumes of customers and transactions rely on this software, such as banks or airports. The goal for OLTP is to be quick, and also to maintain data integrity (you can imagine the importance of data consistency in shopping carts and airplane ticketing).
What is the use of the Dedicated Admin Connection (DAC)?
The DAC is a sort of back door for administrators to troubleshoot problems with the server, even if the server is down and inaccessible for other non-administrative users.
What is the NOLOCK hint?.
The NOLOCK hint is a query that allows you to bypass any "locks" on the data you are wanting to access. When data is going through a process or being updated, it is locked so that no other processes can use the data until it is finished working on it. If you use NOLOCK, it is like eating cookies before they are through cooking... you grab the data before its ready. This can lead to terrible complications, since you are grabbing whatever data is available on the disc rather than data that has been properly processed (dirty reads).
What is Business Intelligence (BI)?
The collection and analyzation of data in order to make better business decisions. In other words, turning raw data into something useful that a business can actually use to make informed and sound decisions.
Where are SQL Server usernames and passwords stored in SQL Server?
The master database.
What is de-normalization?
The point of de-normalization is to speed up data processing. It accomplishes this by adding redundant data. In a relational database system that is normalized, data redundancy is avoided by using many separate tables and relationships between tables. This is great for organization. However, this can cause very slow processing speeds. The risk with de-normalization is that your redundant data can come out of sync, putting data integrity at risk. Database designers can use constraints to make sure that de-normalization does not create inconsistent data.
What is XPath?
The query language for XML documents. It is used to select which nodes are to be processed in XML. The selections it makes for the data to be processed is called a "node set". Node sets are most often selected using the location path expression.
What are DMVs and DMFs used for?
They are used for diagnosing issues within your SQL server instance. They can be used to monitor performance (sessions, memory, etc) and find bottlenecks. They are convenient because before they were introduced in 2005, this information was only available if you were good at hacking.
What is the difference between VARCHAR and VARCHAR (MAX) datatypes?
Varchar(n) can store a maximum of 8000 characters, whereas VARCHAR(MAX) can store a maximum of 2,147,483,647. This is possible because VARCHAR(MAX) allows the data to spill on to separate pages within SQL Server as a TEXT datatype. A pointer is created, and the pointer to that large TEXT page replaces the old data in the row. It is generally cleaner and simpler to use VARCHAR(n) if you know your variables will never exceed 8,000 characters, and perhaps slightly faster.
What is a view?
Various ways that data can be pulled up according to the needs of the user. Views do not hold the data in and of themselves, rather they act as a liaison between the user and the data tables by creating virtual views or tables. Views are created by using queries. The result of the query creates a view.
What is the maximum number of columns a table can have?
Wide table - 30,000 Non-wide table - 1,024
Can we rewrite sub-queries into simple select statements using joins and CTEs?
Yes. You can essentially use a CTE as a temporary view of a result set that lasts only for the duration of the query.
How can you validate a backup copy of your database?
You use the RESTORE VERIFYONLY command will check if the backup you have is valid, or that it is working. If it is working, you will receive a message telling you your backup is valid and useable. This does not GUARANTEE anything, it is possible that under the hood an error still occurred, but the likelihood is much lower.
What is a deadlock? How can you identify and resolve a deadlock?
a. A deadlock occurs when 2 processes cannot be completed because they are dependent on one another, and they are both locked. This occurs when two separate sessions lock some data, then proceed to try and lock each other's data. But since that data is already locked, the session cannot be completed. It is like two people going outside, locking their house doors behind them, and then trying to get into each other's houses, but they cannot because the doors are locked. This results in an endless cycle of waiting. To resolve deadlock, SQL should automatically kill one of the processes randomly. However, you still may to have SQL create an error log to show you where deadlock is happening if you activate the correct trace flags. You can also set rules, i.e. if a session attempts to request data and is unable to retrieve it in under ten seconds, the data must be locked so the session automatically cancels.
What is a subquery? Explain the properties of a subquery?
a. A subquery is a query the is nested within an original query. Subqueries are necessary for performing multiple aggregate functions, one after the other. For example, if I wanted to add all the values in a table and then afterwards take the average, I would need to do a subquery. This works because the inner subquery first comes up with an answer, then the outer query uses that answer within its own function. Subqueries must be enclosed with parentheses. They are most often found under the WHERE clause. They must also be placed on the right side of a comparison operator, i.e. > or <. Subqueries may NOT contain an ORDER-BY clause, and finally, only one subquery may be nested under a query at a time.
What is RDBMS?
a. Relational Database Management System. This is a method of managing a database using the relational model. This system relies on relations (entities, tables) consisting of tuples (rows) and attributes (columns). Tables are related to each other via primary keys that uniquely identify each row, and foreign keys (when a key visits another table). The RDBMS is advantageous because it does all the complex relationship management under the hood, allowing the user to easily see tables related to one another. It also allows the user to create queries that pull meaningful information from the tables.
How do you rebuild the master database?
a. The best thing to do would be to restore from a backup from the same version of sequel server. You simply restore the backup onto another sequel server instance. If you don't have a backup... b. First you insert the installation media that came with SQL server into your disc drive. Next, go to command prompt and enter the following command. - Setup /QUIET /ACTION=REBUILDDATABASE /INSTANCENAME=InstanceName /SQLSYSADMINACCOUNTS=accounts [ /SAPWD= StrongPassword ] [ /SQLCOLLATION=CollationName] Check that everything went well using the Summary.txt log file.
