Assume that you create a PolyBase external table in Microsoft SQL Server 2016 that uses a PARQUET file as its data source. The PARQUET data is split across multiple files in Hadoop Distributed File System (HDFS), and each file is larger than the HDFS block size. In this situation, a query against the external table may return duplicate rows.
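One way to confirm the symptom is to count how often each returned row occurs. The following is a minimal sketch, not part of the original article: it assumes the query results have already been fetched into a list of tuples, and that the underlying data contains no legitimate duplicates. The function name `find_duplicate_rows` and the sample rows are hypothetical.

```python
from collections import Counter


def find_duplicate_rows(rows):
    """Return a dict mapping each row that occurs more than once to its count."""
    # Convert each row to a tuple so it is hashable and can be counted.
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical rows, standing in for the result of a query on the external table.
sample_rows = [
    (1, "alpha"),
    (2, "beta"),
    (1, "alpha"),
    (3, "gamma"),
]

duplicates = find_duplicate_rows(sample_rows)
print(duplicates)
```

An empty result suggests the query is not affected. A non-empty result only indicates this problem if the source files are known to contain each row once.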
Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the "Applies to" section.
Article ID: 4019840 - Last Review: May 15, 2017 - Revision: 6