SQL Server: PolyBase

PolyBase was added to SQL Server in version 2016. At that time, it allowed querying data stored in Hadoop (HDFS, for example in Hortonworks or Cloudera distributions) and in Azure Blob Storage using standard T-SQL queries.
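
A minimal sketch of what querying HDFS data through T-SQL might look like, assuming SQL Server 2016 to 2019 with the PolyBase feature installed and Hadoop connectivity already configured. The names HadoopSource, CsvFormat and dbo.SensorReadings, the name node address and the directory path are hypothetical placeholders, not values taken from these notes.

```sql
-- Hypothetical external data source pointing at a Hadoop name node.
-- TYPE = HADOOP is the form used for Hadoop sources in SQL Server 2016 to 2019.
CREATE EXTERNAL DATA SOURCE HadoopSource
WITH (
    TYPE     = HADOOP,
    LOCATION = 'hdfs://10.10.10.10:8020'   -- placeholder address
);

-- Describes how the files are laid out (here: comma separated text).
CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (
    FORMAT_TYPE    = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);

-- The external table only stores metadata; the rows stay in HDFS.
CREATE EXTERNAL TABLE dbo.SensorReadings (
    sensor_id INT         NOT NULL,
    location  VARCHAR(50) NULL,
    reading   FLOAT       NULL
)
WITH (
    LOCATION    = '/data/sensors/',        -- placeholder directory in HDFS
    DATA_SOURCE = HadoopSource,
    FILE_FORMAT = CsvFormat
);

-- From here on, the external table is queried like any other table.
SELECT sensor_id, location, reading
FROM   dbo.SensorReadings
WHERE  reading > 100.0;
```
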
SQL Server 2019 extended the capabilities of PolyBase with connectors to query data from

Push down optimization

Push down optimization tries to execute as much work as possible on the source system rather than on the SQL Server instance, because this can reduce the amount of data that needs to be transmitted over the network. This notably includes operations such as
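
As a hedged illustration of push down, the following sketch uses the EXTERNALPUSHDOWN query hints to force or disable it for a single query. It assumes that an external table named dbo.SensorReadings (a hypothetical name) already exists and that the external source is set up to accept pushed down computation; if it is not, the forced variant is expected to fail rather than silently fall back.

```sql
-- Ask the optimizer to evaluate the filter and aggregation on the source,
-- so that only the aggregated rows travel over the network.
SELECT   sensor_id, AVG(reading) AS avg_reading
FROM     dbo.SensorReadings            -- hypothetical external table
WHERE    reading > 100.0
GROUP BY sensor_id
OPTION  (FORCE EXTERNALPUSHDOWN);

-- The same query with push down switched off: the rows are transferred
-- first and then filtered and aggregated by the SQL Server instance.
SELECT   sensor_id, AVG(reading) AS avg_reading
FROM     dbo.SensorReadings
WHERE    reading > 100.0
GROUP BY sensor_id
OPTION  (DISABLE EXTERNALPUSHDOWN);
```

Comparing the execution plans and run times of the two variants is one way to check whether push down actually helps for a given source.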

Linked Server vs PolyBase

Linked server | PolyBase
Instance wide | Database wide
Requires an OLE DB provider | Uses ODBC
Read/write operations | Read-only (limitation might be lifted in future)
Single threaded | ?
Separate configuration required for each instance in Always On Availability Group | No separate configuration required
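
To make the difference in scope more concrete, here is a rough sketch that accesses the same remote table once through a linked server and once through PolyBase. It assumes SQL Server 2019 (for the sqlserver:// connector) and a remote SQL Server instance; RemoteSrv, RemoteSql, RemoteCred, SalesDb, dbo.Orders, the host name, the login and the passwords are all hypothetical placeholders.

```sql
-- Linked server: defined once per instance and visible from every database.
EXEC sp_addlinkedserver
     @server     = N'RemoteSrv',
     @srvproduct = N'',
     @provider   = N'MSOLEDBSQL',              -- OLE DB provider for SQL Server
     @datasrc    = N'remote-host.example.com'; -- placeholder host

-- Remote objects are addressed with four-part names.
SELECT TOP (10) order_id, amount
FROM   RemoteSrv.SalesDb.dbo.Orders;

-- PolyBase: the objects below live inside the current database only.
-- A database master key is needed to protect the credential
-- (skip this statement if the database already has one).
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password here>';

CREATE DATABASE SCOPED CREDENTIAL RemoteCred
WITH IDENTITY = 'remote_login', SECRET = '<remote password here>';

CREATE EXTERNAL DATA SOURCE RemoteSql
WITH (
    LOCATION   = 'sqlserver://remote-host.example.com:1433',
    CREDENTIAL = RemoteCred,
    PUSHDOWN   = ON
);

-- Column definitions must be compatible with the remote table.
CREATE EXTERNAL TABLE dbo.RemoteOrders (
    order_id INT            NOT NULL,
    amount   DECIMAL(10, 2) NULL
)
WITH (
    LOCATION    = 'SalesDb.dbo.Orders',
    DATA_SOURCE = RemoteSql
);

-- The external table is then queried with an ordinary two-part name.
SELECT TOP (10) order_id, amount
FROM   dbo.RemoteOrders;
```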

See also

PolyBase might make traditional ETL obsolete, because data virtualization allows querying the data where it is stored instead of copying it first.
