Oracle Big Data SQL - Key Features

Wednesday, October 1, 2014

Oracle Big Data SQL delivers unprecedented integration of Big Data and Oracle Database


Access Big Data anywhere with a single SQL query: one query can read data in both Hadoop and your data warehouse.


Minimize Big Data Movement and run queries faster


Integrate Hadoop data with your existing applications


Oracle Big Data SQL lets you use SQL queries to seamlessly access data stored in Hadoop, relational databases, and NoSQL stores.


Oracle Big Data SQL is an Oracle innovation available only on Oracle Big Data Appliance. It is a new architecture for SQL on Hadoop that seamlessly integrates data in Hadoop and NoSQL with data in Oracle Database. Using Oracle Big Data SQL, organizations can:


  • Combine data from Oracle Database, Hadoop and NoSQL in a single SQL query

  • Query and analyze data in Hadoop and NoSQL

  • Integrate big data analysis into existing applications and architectures

  • Extend security and access policies from Oracle Database to data in Hadoop and NoSQL

  • Maximize query performance on all data using Smart Scan


Oracle Big Data SQL radically simplifies integrating and operating in the big data domain through two powerful features: newly expanded External Tables and Smart Scan functionality on Hadoop.


New external table types expose data in Hadoop and NoSQL to Oracle Database users. Once defined, these tables automatically discover Hive metadata, including data location and data parsing requirements (i.e., SerDes and StorageHandlers). SQL queries can therefore access the data in its existing format, using its native parsing constructs.
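As a rough illustration of what defining such a table might look like, the Python sketch below only assembles the DDL text and does not connect to any database. It assumes the ORACLE_HIVE access driver and the com.oracle.bigdata.tablename access parameter; verify both against the documentation for your release. The helper name, the table web_logs, its columns, and the Hive table default.web_logs are all hypothetical.

```python
def hive_external_table_ddl(table_name, columns, hive_table):
    """Render CREATE TABLE text for an external table backed by a Hive table.

    Hypothetical helper for illustration only: it builds a string and
    never talks to Oracle Database or Hive.
    """
    column_list = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE {table_name} (\n"
        f"  {column_list}\n"
        ")\n"
        "ORGANIZATION EXTERNAL (\n"
        "  TYPE ORACLE_HIVE\n"                      # assumed access driver name
        "  DEFAULT DIRECTORY DEFAULT_DIR\n"
        "  ACCESS PARAMETERS (\n"
        f"    com.oracle.bigdata.tablename={hive_table}\n"  # assumed parameter
        "  )\n"
        ")\n"
        "REJECT LIMIT UNLIMITED"
    )


# Hypothetical table and column names.
ddl = hive_external_table_ddl(
    "web_logs",
    [("cust_id", "NUMBER"), ("page", "VARCHAR2(200)")],
    "default.web_logs",
)
print(ddl)
```

Note that the statement names only the Hive table. Data location and parsing details are not spelled out, which reflects the metadata discovery described above.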


Oracle’s unique Smart Scan capability brings the proven storage processing innovations of Oracle Exadata to Oracle Big Data Appliance. The biggest performance penalties in data processing are typically the result of excess data movement. Instead of sending all scanned data to the compute resources, Smart Scan on Hadoop radically minimizes data movement to the compute nodes by applying the following techniques at the storage level:


  • Data-local scans

    • Hadoop data is read using native operators at the source

  • Column projection

    • Only relevant columns are returned from the source

  • Predicate evaluation

    • Only relevant rows are returned from the source

  • Complex function evaluation

    • SQL operators on JSON and XML types applied at the source

    • Model scoring and analytical operators evaluated at the source
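The effect of column projection and predicate evaluation on data movement can be sketched with a toy model. This is not Oracle code and says nothing about how Smart Scan is implemented. The row data, the function names, and the "cells shipped" measure are hypothetical, chosen only to show why filtering at the source reduces what reaches the compute nodes.

```python
# Toy stand-in for rows held on a storage node (hypothetical data).
ROWS_ON_STORAGE = [
    {"cust_id": 1, "country": "US", "amount": 120, "comment": "first order"},
    {"cust_id": 2, "country": "DE", "amount": 300, "comment": "bulk order"},
    {"cust_id": 3, "country": "US", "amount": 75, "comment": "repeat order"},
    {"cust_id": 4, "country": "FR", "amount": 40, "comment": "sample"},
]


def naive_scan(rows):
    """Ship every row and every column to the compute node."""
    return [dict(row) for row in rows]


def filtered_scan(rows, columns, predicate):
    """Apply the predicate and the projection at the source, then ship."""
    return [
        {column: row[column] for column in columns}  # column projection
        for row in rows
        if predicate(row)                            # predicate evaluation
    ]


def cells_shipped(result):
    """Count the values sent over the network (a crude cost measure)."""
    return sum(len(row) for row in result)


shipped_all = naive_scan(ROWS_ON_STORAGE)
shipped_filtered = filtered_scan(
    ROWS_ON_STORAGE,
    columns=["cust_id", "amount"],
    predicate=lambda row: row["country"] == "US",
)

print("naive scan:   ", cells_shipped(shipped_all), "cells")
print("filtered scan:", cells_shipped(shipped_filtered), "cells")
```

In this toy case the query needs two of four columns and two of four rows, so the filtered scan ships a quarter of the values. The savings in a real system depend on how selective the query is.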


Smart Scan coexists with other Hadoop services and does not require any changes to Hadoop itself, thus staying in line with the open environment Oracle Big Data Appliance provides.