Skip to content →

How to Determine the S3 Bucket Associated with a Hive Table

At my job, we have a bunch of Hive “databases” and “tables” that are backed by S3. I was wondering how to determine which S3 bucket a table is associated with.

SHOW CREATE TABLE <database>.<table>

SHOW is a command specific to Hive (docs). I have used this command to list all the databases in the Hive metastore (simply, SHOW DATABASES).

But SHOW CREATE TABLE is different. It shows the CREATE TABLE statement that creates a given table, or the CREATE VIEW statement that creates a given view.

The response looks something like this

```CREATE EXTERNAL TABLE `<database>`.`<table>`(`id` STRING, `isdeleted` BOOLEAN, `accountid` STRING, `recordtypeid` STRING, `name` STRING, `description` STRING, `stagename` STRING, `amount` DOUBLE, ... A LOT MORE COLUMNS)
PARTITIONED BY (`dt` STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' WITH SERDEPROPERTIES ( 'serialization.format' = '1' ) STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' LOCATION 's3://com.example.foo.bar.us-east-1.baz/hello/public/<database>/<table>' TBLPROPERTIES ( 'transient_lastDdlTime' = '1592316662' )

The LOCATION part shows the S3 bucket!

Published in Today I Learned