CDH 5.3 Hadoop cluster using VirtualBox and QuickStart VM : Hive DB Table and Query

note

This chapter continues from CDH 5.3 Hadoop cluster using VirtualBox and QuickStart VM and QuickStart VMs for CDH 5.3 II - Testing with wordcount. Here we'll explore Hue and Hive:

Hue

Open a browser and go to the Hue UI:

What Is Hive?

With Hive, we write SQL-like queries in HiveQL (HQL), and Hive runs them as one or more MapReduce jobs in the background.

"Hive is a data warehousing infrastructure based on Hadoop. Hive is designed to enable easy data summarization, ad-hoc querying and analysis of large volumes of data. It provides a simple query language called Hive QL, which is based on SQL and which enables users familiar with SQL to do ad-hoc querying, summarization and data analysis easily. At the same time, Hive QL also allows traditional map/reduce programmers to be able to plug in their custom mappers and reducers to do more sophisticated analysis that may not be supported by the built-in capabilities of the language."

-from Hive Tutorial

What Is Hive Not?

"Hadoop is a batch processing system and Hadoop jobs tend to have high latency and incur substantial overheads in job submission and scheduling."

"As a result - latency for Hive queries is generally very high (minutes) even when data sets involved are very small (say a few hundred megabytes). As a result it cannot be compared with systems such as Oracle where analyses are conducted on a significantly smaller amount of data but the analyses proceed much more iteratively with the response times between iterations being less than a few minutes. Hive aims to provide acceptable (but not optimal) latency for interactive data browsing, queries over small data sets or test queries. Hive is not designed for online transaction processing and does not offer real-time queries and row level updates. It is best used for batch jobs over large sets of immutable data (like web logs)."

-from Hive Tutorial

Hive QL

While based on SQL, HiveQL does not strictly follow the full SQL-92 standard. HiveQL offers extensions not in SQL, including multitable inserts and create table as select, but only offers basic support for indexes.

HiveQL also lacks support for transactions and materialized views, and offers only limited subquery support. Support for insert, update, and delete with full ACID functionality was made available with release 0.14.

Internally, a compiler translates HiveQL statements into a directed acyclic graph of MapReduce, Tez, or Spark jobs, which are submitted to Hadoop for execution.

- from http://en.wikipedia.org/wiki/Apache_Hive.

Creating Hive DB table

We'll create a table with two columns of data: life expectancy and country. The syntax looks like this:

create table expectancy_country (expectancy int, country string) row format delimited fields terminated by ',';

Press the Execute button. The log shows that the statement completed with OK:

15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO parse.ParseDriver: Parsing command: create table expectancy_country (expectancy int, country string) row format delimited fields terminated by ','
15/03/21 12:42:03 INFO parse.ParseDriver: Parse Completed
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
15/03/21 12:42:03 INFO parse.SemanticAnalyzer: Creating table expectancy_country position=13
15/03/21 12:42:03 INFO ql.Driver: Semantic Analysis Completed
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
15/03/21 12:42:03 INFO ql.Driver: EXPLAIN output for queryid hive_20150321124242_9ce37a9a-009f-4a14-8a1f-e7a69f6fc2a5 : ABSTRACT SYNTAX TREE:
  
TOK_CREATETABLE
   TOK_TABNAME
      expectancy_country
   TOK_LIKETABLE
   TOK_TABCOLLIST
      TOK_TABCOL
         expectancy
         TOK_INT
      TOK_TABCOL
         country
         TOK_STRING
   TOK_TABLEROWFORMAT
      TOK_SERDEPROPS
         TOK_TABLEROWFORMATFIELD
            ','


STAGE DEPENDENCIES:
  Stage-0 is a root stage [DDL]

STAGE PLANS:
  Stage: Stage-0
      Create Table Operator:
        Create Table
          columns: expectancy int, country string
          field delimiter: ,
          input format: org.apache.hadoop.mapred.TextInputFormat
          output format: org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat
          serde name: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
          name: expectancy_country


15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO lockmgr.DummyTxnManager: Creating lock manager of type org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
15/03/21 12:42:03 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=quickstart.cloudera:2181 sessionTimeout=600000 watcher=org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager$DummyWatcher@2658e945
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO ql.Driver: Starting command: create table expectancy_country (expectancy int, country string) row format delimited fields terminated by ','
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO ql.Driver: OK
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO ZooKeeperHiveLockManager:  about to release lock for default
15/03/21 12:42:03 INFO log.PerfLogger: 
15/03/21 12:42:03 INFO log.PerfLogger:
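
To make the "row format delimited fields terminated by ','" clause more concrete, here is a minimal Python sketch of how one text line could be mapped onto the (expectancy int, country string) schema at read time. It only illustrates the general idea and is not Hive's actual LazySimpleSerDe code; the function name parse_row and the sample country "Examplia" are hypothetical.

```python
from typing import Optional, Tuple

FIELD_DELIMITER = ","  # corresponds to: fields terminated by ','


def parse_row(line: str) -> Tuple[Optional[int], Optional[str]]:
    """Split one text line into (expectancy, country), applying the schema on read."""
    fields = line.rstrip("\n").split(FIELD_DELIMITER)
    raw_expectancy = fields[0]
    # A missing second field is treated as NULL (None) in this sketch.
    country = fields[1] if len(fields) > 1 else None
    try:
        expectancy = int(raw_expectancy)
    except ValueError:
        # A value that cannot be read as an int is treated as NULL.
        expectancy = None
    return expectancy, country


if __name__ == "__main__":
    print(parse_row("80,Examplia"))   # (80, 'Examplia')
    print(parse_row("n/a,Examplia"))  # (None, 'Examplia')
```

The point of the sketch is that the table definition does not change the file: the delimiter and column types are applied when each line is read.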

CSV data

Here is our CSV data:

loading table

Now we want to load the data into the Hive table:

load data local inpath '/tmp/expectancy_country.csv' overwrite into table expectancy_country;

Click Execute. We can also check the input file from the UI. The load produces the following log:

15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO parse.ParseDriver: Parsing command: load data local inpath '/tmp/expectancy_country.csv' overwrite into table expectancy_country
15/03/21 14:13:47 INFO parse.ParseDriver: Parse Completed
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO common.FileUtils: Creating directory if it doesn't exist: hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/.hive-staging_hive_2015-03-21_14-13-47_408_2193979710479077128-1
15/03/21 14:13:47 INFO ql.Driver: Semantic Analysis Completed
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
15/03/21 14:13:47 INFO ql.Driver: EXPLAIN output for queryid hive_20150321141313_ff34b34f-bdfd-4d15-8865-da66bd71aec9 : ABSTRACT SYNTAX TREE:
  
TOK_LOAD
   '/tmp/expectancy_country.csv'
   TOK_TAB
      TOK_TABNAME
         expectancy_country
   local
   overwrite


STAGE DEPENDENCIES:
  Stage-0 is a root stage [COPY]
  Stage-1 depends on stages: Stage-0 [MOVE]
  Stage-2 depends on stages: Stage-1 [STATS]

STAGE PLANS:
  Stage: Stage-0
    Copy
      source: file:/tmp/expectancy_country.csv
      destination: hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/.hive-staging_hive_2015-03-21_14-13-47_408_2193979710479077128-1/-ext-10000

  Stage: Stage-1
    Move Operator
      tables:
          replace: true
          source: hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/.hive-staging_hive_2015-03-21_14-13-47_408_2193979710479077128-1/-ext-10000
          table:
              input format: org.apache.hadoop.mapred.TextInputFormat
              output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
              properties:
                COLUMN_STATS_ACCURATE true
                bucket_count -1
                columns expectancy,country
                columns.comments
                columns.types int:string
                field.delim ,
                file.inputformat org.apache.hadoop.mapred.TextInputFormat
                file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                location hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country
                name default.expectancy_country
                numFiles 1
                numRows 0
                rawDataSize 0
                serialization.ddl struct expectancy_country { i32 expectancy, string country}
                serialization.format ,
                serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                totalSize 100
                transient_lastDdlTime 1426972319
              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
              name: default.expectancy_country

  Stage: Stage-2
    Stats-Aggr Operator


15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO lockmgr.DummyTxnManager: Creating lock manager of type org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
15/03/21 14:13:47 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=quickstart.cloudera:2181 sessionTimeout=600000 watcher=org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager$DummyWatcher@1d064f96
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO ql.Driver: Starting command: load data local inpath '/tmp/expectancy_country.csv' overwrite into table expectancy_country
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO exec.Task: Copying data from file:/tmp/expectancy_country.csv to hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/.hive-staging_hive_2015-03-21_14-13-47_408_2193979710479077128-1/-ext-10000
15/03/21 14:13:47 INFO common.FileUtils: Creating directory if it doesn't exist: hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/.hive-staging_hive_2015-03-21_14-13-47_408_2193979710479077128-1/-ext-10000
15/03/21 14:13:47 INFO exec.Task: Copying file: file:/tmp/expectancy_country.csv
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO exec.Task: Loading data to table default.expectancy_country from hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/.hive-staging_hive_2015-03-21_14-13-47_408_2193979710479077128-1/-ext-10000
15/03/21 14:13:47 INFO common.FileUtils: deleting  hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/expectancy_country.csv
15/03/21 14:13:47 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
15/03/21 14:13:47 INFO common.FileUtils: Moved to trash: hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/expectancy_country.csv
15/03/21 14:13:47 INFO metadata.Hive: Replacing src:hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/.hive-staging_hive_2015-03-21_14-13-47_408_2193979710479077128-1/-ext-10000/expectancy_country.csv, dest: hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/expectancy_country.csv, Status:true
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO exec.StatsTask: Executing stats task
15/03/21 14:13:47 INFO exec.Task: Table default.expectancy_country stats: [numFiles=1, numRows=0, totalSize=100, rawDataSize=0]
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO ql.Driver: OK
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO ZooKeeperHiveLockManager:  about to release lock for default/expectancy_country
15/03/21 14:13:47 INFO ZooKeeperHiveLockManager:  about to release lock for default
15/03/21 14:13:47 INFO log.PerfLogger: 
15/03/21 14:13:47 INFO log.PerfLogger:
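
The plan in the log above lists a COPY stage followed by a MOVE stage with "replace: true". As a rough illustration of what those two stages amount to, here is a Python sketch that uses local directories as a stand-in for HDFS. The function name load_overwrite, the ".staging" directory name, and the file contents are hypothetical, and the sketch simply deletes old files, whereas the log shows Hive moving the old file to the HDFS trash.

```python
import shutil
import tempfile
from pathlib import Path


def load_overwrite(local_file: Path, table_dir: Path) -> None:
    """Imitate the COPY and MOVE stages of 'load data local inpath ... overwrite'."""
    staging = table_dir / ".staging"
    staging.mkdir(parents=True, exist_ok=True)

    # Stage-0 (COPY): copy the local file into a staging directory.
    staged = staging / local_file.name
    shutil.copy(local_file, staged)

    # Stage-1 (MOVE, replace: true): remove existing data files,
    # then move the staged file into the table directory.
    for old in list(table_dir.iterdir()):
        if old.is_file():
            old.unlink()
    shutil.move(str(staged), str(table_dir / local_file.name))
    staging.rmdir()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        table_dir = root / "warehouse" / "expectancy_country"
        table_dir.mkdir(parents=True)
        (table_dir / "old_data.csv").write_text("1,Stale\n")

        local = root / "expectancy_country.csv"
        local.write_text("80,Examplia\n")

        load_overwrite(local, table_dir)
        print(sorted(p.name for p in table_dir.iterdir()))  # ['expectancy_country.csv']
```

Without the overwrite keyword, the new file would be added next to the existing ones instead of replacing them.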

query hadoop data

Now we can query the data in HDFS:

select * from expectancy_country;

Another query for a list of countries with expectancy > 75:

select * from expectancy_country where expectancy > 75;

Hive compiled our query into a MapReduce job and executed it on the cluster. The log shows how the job ran, with one mapper and no reducers:

15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO parse.ParseDriver: Parsing command: select * from expectancy_country where expectancy > 75
15/03/21 14:36:56 INFO parse.ParseDriver: Parse Completed
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
15/03/21 14:36:56 INFO parse.SemanticAnalyzer: Completed phase 1 of Semantic Analysis
15/03/21 14:36:56 INFO parse.SemanticAnalyzer: Get metadata for source tables
15/03/21 14:36:56 INFO parse.SemanticAnalyzer: Get metadata for subqueries
15/03/21 14:36:56 INFO parse.SemanticAnalyzer: Get metadata for destination tables
15/03/21 14:36:56 INFO common.FileUtils: Creating directory if it doesn't exist: hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/.hive-staging_hive_2015-03-21_14-36-56_062_722336335773199805-1
15/03/21 14:36:56 INFO parse.SemanticAnalyzer: Completed getting MetaData in Semantic Analysis
15/03/21 14:36:56 INFO parse.SemanticAnalyzer: Set stats collection dir : hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/.hive-staging_hive_2015-03-21_14-36-56_062_722336335773199805-1/-ext-10002
15/03/21 14:36:56 INFO ppd.OpProcFactory: Processing for FS(20)
15/03/21 14:36:56 INFO ppd.OpProcFactory: Processing for SEL(19)
15/03/21 14:36:56 INFO ppd.OpProcFactory: Processing for FIL(18)
15/03/21 14:36:56 INFO ppd.OpProcFactory: Pushdown Predicates of FIL For Alias : expectancy_country
15/03/21 14:36:56 INFO ppd.OpProcFactory: 	(expectancy > 75)
15/03/21 14:36:56 INFO ppd.OpProcFactory: Processing for TS(17)
15/03/21 14:36:56 INFO ppd.OpProcFactory: Pushdown Predicates of TS For Alias : expectancy_country
15/03/21 14:36:56 INFO ppd.OpProcFactory: 	(expectancy > 75)
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO physical.MetadataOnlyOptimizer: Looking for table scans where optimization is applicable
15/03/21 14:36:56 INFO physical.MetadataOnlyOptimizer: Found 0 metadata only table scans
15/03/21 14:36:56 INFO parse.SemanticAnalyzer: Completed plan generation
15/03/21 14:36:56 INFO ql.Driver: Semantic Analysis Completed
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO exec.ListSinkOperator: Initializing Self 22 OP
15/03/21 14:36:56 INFO exec.ListSinkOperator: Operator 22 OP initialized
15/03/21 14:36:56 INFO exec.ListSinkOperator: Initialization Done 22 OP
15/03/21 14:36:56 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:expectancy_country.expectancy, type:int, comment:null), FieldSchema(name:expectancy_country.country, type:string, comment:null)], properties:null)
15/03/21 14:36:56 INFO ql.Driver: EXPLAIN output for queryid hive_20150321143636_a87c2c0b-db60-478c-845e-df0d8aca6810 : ABSTRACT SYNTAX TREE:
  
TOK_QUERY
   TOK_FROM
      TOK_TABREF
         TOK_TABNAME
            expectancy_country
   TOK_INSERT
      TOK_DESTINATION
         TOK_DIR
            TOK_TMP_FILE
      TOK_SELECT
         TOK_SELEXPR
            TOK_ALLCOLREF
      TOK_WHERE
         >
            TOK_TABLE_OR_COL
               expectancy
            75


STAGE DEPENDENCIES:
  Stage-1 is a root stage [MAPRED]
  Stage-0 is a root stage [FETCH]

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: expectancy_country
            GatherStats: false
            Filter Operator
              isSamplingPred: false
              predicate: (expectancy > 75) (type: boolean)
              Select Operator
                expressions: expectancy (type: int), country (type: string)
                outputColumnNames: _col0, _col1
                File Output Operator
                  compressed: false
                  GlobalTableId: 0
                  directory: hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/.hive-staging_hive_2015-03-21_14-36-56_062_722336335773199805-1/-ext-10001
                  NumFilesPerFileSink: 1
                  Stats Publishing Key Prefix: hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/.hive-staging_hive_2015-03-21_14-36-56_062_722336335773199805-1/-ext-10001/
                  table:
                      input format: org.apache.hadoop.mapred.TextInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                      properties:
                        columns _col0,_col1
                        columns.types int:string
                        escape.delim \
                        hive.serialization.extend.nesting.levels true
                        serialization.format 1
                        serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                  TotalFiles: 1
                  GatherStats: false
                  MultiFileSpray: false
      Path -> Alias:
        hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country [expectancy_country]
      Path -> Partition:
        hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country 
          Partition
            base file name: expectancy_country
            input format: org.apache.hadoop.mapred.TextInputFormat
            output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
            properties:
              COLUMN_STATS_ACCURATE true
              bucket_count -1
              columns expectancy,country
              columns.comments
              columns.types int:string
              field.delim ,
              file.inputformat org.apache.hadoop.mapred.TextInputFormat
              file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
              location hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country
              name default.expectancy_country
              numFiles 1
              numRows 0
              rawDataSize 0
              serialization.ddl struct expectancy_country { i32 expectancy, string country}
              serialization.format ,
              serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
              totalSize 100
              transient_lastDdlTime 1426972427
            serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
          
              input format: org.apache.hadoop.mapred.TextInputFormat
              output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
              properties:
                COLUMN_STATS_ACCURATE true
                bucket_count -1
                columns expectancy,country
                columns.comments
                columns.types int:string
                field.delim ,
                file.inputformat org.apache.hadoop.mapred.TextInputFormat
                file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                location hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country
                name default.expectancy_country
                numFiles 1
                numRows 0
                rawDataSize 0
                serialization.ddl struct expectancy_country { i32 expectancy, string country}
                serialization.format ,
                serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                totalSize 100
                transient_lastDdlTime 1426972427
              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
              name: default.expectancy_country
            name: default.expectancy_country
      Truncated Path -> Alias:
        /expectancy_country [expectancy_country]

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink


15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO lockmgr.DummyTxnManager: Creating lock manager of type org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
15/03/21 14:36:56 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=quickstart.cloudera:2181 sessionTimeout=600000 watcher=org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager$DummyWatcher@57cc44e7
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO ql.Driver: Starting command: select * from expectancy_country where expectancy > 75
15/03/21 14:36:56 INFO ql.Driver: Total jobs = 1
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO ql.Driver: Launching Job 1 out of 1
15/03/21 14:36:56 INFO exec.Task: Number of reduce tasks is set to 0 since there's no reduce operator
15/03/21 14:36:56 INFO ql.Context: New scratch dir is hdfs://quickstart.cloudera:8020/tmp/hive-hive/hive_2015-03-21_14-36-56_062_722336335773199805-4
15/03/21 14:36:56 INFO mr.ExecDriver: Using org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
15/03/21 14:36:56 INFO mr.ExecDriver: adding libjars: file:///usr/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.3.0.jar,file:///usr/lib/hbase/hbase-hadoop-compat.jar,file:///usr/lib/hbase/hbase-client.jar,file:///usr/lib/hbase/hbase-hadoop2-compat.jar,file:///usr/lib/hbase/hbase-protocol.jar,file:///usr/lib/hbase/hbase-server.jar,file:///usr/lib/hbase/lib/htrace-core.jar,file:///usr/lib/hbase/lib/htrace-core-2.04.jar,file:///usr/lib/hbase/hbase-common.jar
15/03/21 14:36:56 INFO exec.Utilities: Processing alias expectancy_country
15/03/21 14:36:56 INFO exec.Utilities: Adding input file hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country
15/03/21 14:36:56 INFO exec.Utilities: Content Summary not cached for hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country
15/03/21 14:36:56 INFO ql.Context: New scratch dir is hdfs://quickstart.cloudera:8020/tmp/hive-hive/hive_2015-03-21_14-36-56_062_722336335773199805-4
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO exec.Utilities: Serializing MapWork via kryo
15/03/21 14:36:56 INFO log.PerfLogger: 
15/03/21 14:36:56 INFO client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/127.0.0.1:8032
15/03/21 14:36:56 INFO client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/127.0.0.1:8032
15/03/21 14:36:56 INFO exec.Utilities: No plan file found: hdfs://quickstart.cloudera:8020/tmp/hive-hive/hive_2015-03-21_14-36-56_062_722336335773199805-4/-mr-10004/fca63f1a-a74c-4c4d-821b-ec569dca189c/reduce.xml
15/03/21 14:36:56 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/03/21 14:36:57 INFO log.PerfLogger: 
15/03/21 14:36:57 INFO io.CombineHiveInputFormat: CombineHiveInputSplit creating pool for hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country; using filter path hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country
15/03/21 14:36:57 INFO input.FileInputFormat: Total input paths to process : 1
15/03/21 14:36:57 INFO input.CombineFileInputFormat: DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 0
15/03/21 14:36:57 INFO io.CombineHiveInputFormat: number of splits 1
15/03/21 14:36:57 INFO log.PerfLogger: 
15/03/21 14:36:57 INFO mapreduce.JobSubmitter: number of splits:1
15/03/21 14:36:57 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1426957621065_0003
15/03/21 14:36:57 INFO impl.YarnClientImpl: Submitted application application_1426957621065_0003
15/03/21 14:36:57 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1426957621065_0003/
15/03/21 14:36:57 INFO exec.Task: Starting Job = job_1426957621065_0003, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1426957621065_0003/
15/03/21 14:36:57 INFO exec.Task: Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1426957621065_0003
15/03/21 14:37:12 INFO exec.Task: Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
15/03/21 14:37:12 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
15/03/21 14:37:12 INFO exec.Task: 2015-03-21 14:37:12,490 Stage-1 map = 0%,  reduce = 0%
15/03/21 14:37:21 INFO exec.Task: 2015-03-21 14:37:21,396 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.69 sec
15/03/21 14:37:23 INFO exec.Task: MapReduce Total cumulative CPU time: 690 msec
15/03/21 14:37:23 INFO exec.Task: Ended Job = job_1426957621065_0003
15/03/21 14:37:23 INFO exec.FileSinkOperator: Moving tmp dir: hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/.hive-staging_hive_2015-03-21_14-36-56_062_722336335773199805-1/_tmp.-ext-10001 to: hdfs://quickstart.cloudera:8020/user/hive/warehouse/expectancy_country/.hive-staging_hive_2015-03-21_14-36-56_062_722336335773199805-1/-ext-10001
15/03/21 14:37:23 INFO log.PerfLogger: 
15/03/21 14:37:23 INFO log.PerfLogger: 
15/03/21 14:37:23 INFO ql.Driver: MapReduce Jobs Launched: 
15/03/21 14:37:23 INFO ql.Driver: Stage-Stage-1: Map: 1   Cumulative CPU: 0.69 sec   HDFS Read: 347 HDFS Write: 69 SUCCESS
15/03/21 14:37:23 INFO ql.Driver: Total MapReduce CPU Time Spent: 690 msec
15/03/21 14:37:23 INFO ql.Driver: OK
15/03/21 14:37:23 INFO log.PerfLogger: 
15/03/21 14:37:23 INFO ZooKeeperHiveLockManager:  about to release lock for default/expectancy_country
15/03/21 14:37:23 INFO ZooKeeperHiveLockManager:  about to release lock for default
15/03/21 14:37:23 INFO log.PerfLogger: 
15/03/21 14:37:23 INFO log.PerfLogger:
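
The log above reports "number of mappers: 1; number of reducers: 0", because a plain filter needs no shuffle or aggregation. The following Python sketch shows, under simplifying assumptions, what such a map-only job does with each input line: scan, apply the predicate (expectancy > 75), and emit the selected columns. The function name map_filter and the sample rows are hypothetical and are not taken from the actual CSV file.

```python
from typing import Iterable, Iterator, Optional, Tuple

Row = Tuple[int, Optional[str]]


def map_filter(lines: Iterable[str], threshold: int = 75) -> Iterator[Row]:
    """Map-only pass: scan lines, keep rows where expectancy > threshold."""
    for line in lines:
        fields = line.rstrip("\n").split(",")
        country = fields[1] if len(fields) > 1 else None
        try:
            expectancy = int(fields[0])
        except ValueError:
            # An unreadable value acts like NULL, and NULL > 75 is not true,
            # so the row is filtered out.
            continue
        if expectancy > threshold:
            yield expectancy, country


if __name__ == "__main__":
    sample = ["80,Examplia\n", "70,Samplestan\n", "76,Testland\n"]  # hypothetical rows
    for row in map_filter(sample):
        print(row)
    # (80, 'Examplia')
    # (76, 'Testland')
```

Each row is handled independently of the others, which is why no reduce phase is needed for this query.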