Introduction to SQL Server PolyBase

Learn SQL with Udemy

For an instructor lead, in-depth look at learning SQL click below.


Structured Query Language (SQL) is a must-know language for any data analyst or database developer. One of the powerful features of SQL Server is PolyBase, a technology designed to process queries that access both semi-structured and relational data stored in Hadoop or Azure Blob Storage.

What is Polybase?

PolyBase allows you to use Transact-SQL statements to access data stored in Hadoop or Blob Storage, diverting processing to Hadoop. In other words, you can manipulate data stored in Hadoop or Azure Blob Storage just like the data in your SQL Server instances.

Working with PolyBase

To start working with PolyBase, you will need to install it during the SQL Server installation process. You then need to configure and enable PolyBase, which we will outline below.

Creating an External Data Source

The first step to using PolyBase is to create an external data source. Below is an example of the T-SQL necessary to create an external data source.

Creating an External File Format

After the data source is created, you will need to create an external file format if one does not already exist. This format specifies the type of data stored in the external data source.

Creating an External Table

The final step to accessing external data is to create an external table that maps to the data on your external data source.

With these steps, you will have the necessary structure to connect to and query data stored in an external source through SQL Server’s PolyBase feature.

Leave a Comment