Drill Introduction

Drill is an Apache open-source SQL query engine for Big Data exploration. Drill is designed from the ground up to support high-performance analysis on the semi-structured and rapidly evolving data coming from modern Big Data applications, while still providing the familiarity and ecosystem of ANSI SQL, the industry-standard query language. Drill provides plug-and-play integration with existing Apache Hive and Apache HBase deployments.

What's New in Apache Drill 1.3 and 1.4

These releases fix issues and add a number of enhancements, including the following:

Enhanced Amazon S3 support
Heterogeneous types: support for columns that evolve from one data type to another over time
Text file headers
Sequence file support
Enhancements related to querying Hive tables, MongoDB collections, and Avro files

What's New in Apache Drill 1.2

This release of Drill fixes many issues and introduces a number of enhancements, including the following:

Support for JDBC data sources, such as MySQL, through a new JDBC Storage plugin
Improvements to the Drill JDBC driver, including the addition of Javadocs and better application dependency compatibility
Enhancements to Avro file formats
- Support for complex data types, such as UNION and MAP
- Optimized Avro file processing (block-wise)
Partition pruning improvements
A number of new SQL window functions
- NTILE
- LAG and LEAD
- FIRST_VALUE and LAST_VALUE
HTTPS support for Web Console operations
Performance improvements for querying HBase, including the use of ordered byte encoding
Optimized reads of Parquet-backed Hive tables
Read support for the Parquet INT96 type, plus a new TIMESTAMP_IMPALA type that, when used with the CONVERT_FROM function, decodes a timestamp from Hive or Impala
Parquet metadata caching to improve query performance on a large number of files
DROP TABLE command
Improved correlated subqueries
Union Distinct
Improved LIMIT processing
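
The window functions named in the list above (NTILE, LAG, LEAD, FIRST_VALUE, and LAST_VALUE) are standard SQL, so their behavior can be sketched without a Drill cluster. The example below is a minimal illustration that uses Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in engine; it is not Drill, and the `sales` table and its columns are hypothetical. The same SELECT statement should be close to what you would run in Drill, but check the Drill SQL reference before relying on it.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for a SQL engine here; this is not Drill.
conn = sqlite3.connect(":memory:")

# Hypothetical sample table, invented for illustration only.
conn.execute("CREATE TABLE sales (region TEXT, month INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", 1, 100), ("east", 2, 150), ("east", 3, 120),
        ("west", 1, 200), ("west", 2, 180),
    ],
)

query = """
SELECT region, month, amount,
       -- value from the previous / next row within the same region
       LAG(amount)  OVER (PARTITION BY region ORDER BY month) AS prev_amount,
       LEAD(amount) OVER (PARTITION BY region ORDER BY month) AS next_amount,
       -- first value in the region's window
       FIRST_VALUE(amount) OVER (PARTITION BY region ORDER BY month) AS first_amount,
       -- LAST_VALUE needs an explicit frame to see the whole partition
       LAST_VALUE(amount) OVER (
           PARTITION BY region ORDER BY month
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS last_amount,
       -- split each region's rows into 2 ordered buckets
       NTILE(2) OVER (PARTITION BY region ORDER BY month) AS half
FROM sales
ORDER BY region, month
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each output row carries the original columns followed by the five computed values. LAG and LEAD return NULL (None in Python) at the edges of a partition, and LAST_VALUE is given an explicit frame because the default frame ends at the current row.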

What's New in Apache Drill 1.1

Apache Drill 1.1 includes many enhancements, among them the following key features:

SQL window functions
Partitioning data using the new PARTITION BY clause in the CTAS command
Delegated Hive impersonation
Support for UNION and UNION ALL, and better-optimized plans that include UNION

What's New in Apache Drill 1.0

Apache Drill 1.0 offers the following new features:

Many performance improvements in query planning and execution.
Updated Drill shell now formats query results.
Query audit logging for getting the query history on a Drillbit.
Improved connection handling.
New Errors tab in the Query Profiles UI that facilitates troubleshooting and distributed storing of profiles.
Support for a new storage plugin input format: Avro

In this release, Drill disables the DECIMAL data type, including casting to DECIMAL and reading DECIMAL types from Parquet and Hive. You can enable the DECIMAL type, but this is not recommended.

Apache Drill Key Features

Key features of Apache Drill are:

Low-latency SQL queries
Dynamic queries on self-describing data in files (such as JSON, Parquet, text) and HBase tables, without requiring metadata definitions in the Hive metastore.
ANSI SQL
Nested data support
Integration with Apache Hive (queries on Hive tables and views, support for all Hive file formats and Hive UDFs)
BI/SQL tool integration using standard JDBC/ODBC drivers
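
To make the idea of querying self-describing, nested data without a predeclared schema more concrete, here is a small conceptual sketch in plain Python. It is not Drill's reader or planner, and the sample records and the helper names `get_path` and `select` are hypothetical. It only illustrates that each record carries its own structure, that fields can be missing or change type between records, and that nested fields can be addressed by path.

```python
import json

# Hypothetical newline-delimited JSON: every record describes its own structure.
RAW = """\
{"id": 1, "name": "ann", "address": {"city": "Oslo"}}
{"id": 2, "name": "bob", "tags": ["x", "y"]}
{"id": "3", "address": {"city": "Lima", "zip": "15001"}}
"""


def get_path(record, path):
    """Follow a dotted path such as 'address.city'; return None if a step is missing."""
    value = record
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def select(lines, paths):
    """Project the requested paths from each record; no schema is declared up front."""
    for line in lines.splitlines():
        if line.strip():
            record = json.loads(line)
            yield tuple(get_path(record, p) for p in paths)


result = list(select(RAW, ["id", "address.city"]))
for row in result:
    print(row)
```

The second record has no `address` field, so its city comes back as None, and the third record stores `id` as a string rather than a number. A schema-on-read engine has to tolerate both cases while scanning the data.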

Quick Links

If you've never used Drill, visit these links to get a jump start:
