About

Edit photo

Monday, April 4, 2016

Managed vs External tables in Hive


Click here for Part 1

Managed Tables: Called as Internal Tables.

Default table in hive, all managed tables data is getting stored in hive default location
hdfs --> /user/hive/warehouse/<tablename>/<data>

How to enter into Hive?

Type "hive" and hit enter button to enter into Hive Shell.

How to create a table in Hive?

create table <table name>(<schema-def>)
row format delimited
fields terminated by <delimiter type>
lines terminated by <delimiter>
stored as <file-format>;


NOTE (Difference): in SQL you can insert data after creating table, but in hive, you can insert data before creating table, this can be done vice versa.
External Table: they are external to the HIVE warehouse system, framework will not provide the responsibility to the tables.

create external <table name>(<schema-def>)
row format delimited
fields terminated by <delimiter type>
lines terminated by <delimiter>
stored as <file-format>
location "< HDFS location>";


NOTE: if table deleted in managed, both data and table is deleted but in external, only table will delete from warehouse, but not from the HDFS. Better to use external table.


How to load data into table?

Can be loaded in two ways, i.e. local (LFS) and HDFS

local-
From LFS path to default HIVE warefouse location
hive> load data local inpath <file path> into table <table name>;

HDFS-
Data will be taken from HDFS path and once it's loaded into table, that will be part of new HDFS HIVE warehouse path.
hive> load data inpath <filepath> into table <table name>;


Click here for examples.

                                                                                                               Posted By: SsaiK