Monday, October 23, 2017

Google BigQuery & Apache Hive

Google BigQuery is a fast, economical, fully managed enterprise data warehouse for large-scale data analytics. The quickstart below walks through querying your own table in BigQuery:

https://cloud.google.com/bigquery/quickstart-web-ui
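
Besides the web UI, a table can also be queried from code. The following is only a minimal sketch in Python, assuming the `google-cloud-bigquery` client library is installed and credentials are already configured. The project, dataset and table names are hypothetical placeholders, and `build_query` and `run_query` are helper names made up for this illustration.

```python
def build_query(project, dataset, table, limit=10):
    """Build a standard SQL query that previews rows from one table.

    project, dataset and table are placeholders for your own names.
    """
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    # Backticks quote the fully qualified table name in BigQuery standard SQL.
    return f"SELECT * FROM `{project}.{dataset}.{table}` LIMIT {limit}"


def run_query(sql):
    """Run the query and return the rows as a list.

    Assumes google-cloud-bigquery is installed and default credentials exist.
    """
    from google.cloud import bigquery  # imported here so build_query works without it

    client = bigquery.Client()
    return list(client.query(sql).result())


# Hypothetical names; replace them with your own project, dataset and table.
SQL = build_query("my-project", "my_dataset", "my_table", limit=5)
```

Calling `run_query(SQL)` would then send the query to BigQuery, which is billed by the amount of data scanned, so selecting only the columns you need is usually cheaper than `SELECT *`.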


The Apache Hive™ data warehouse software facilitates reading, writing, and managing large datasets that reside in distributed storage, using SQL syntax. Built on top of Apache Hadoop™, Hive provides the following features:
  • Tools to enable easy access to data via SQL, thus enabling data warehousing tasks such as extract/transform/load (ETL), reporting, and data analysis.
  • A mechanism to impose structure on a variety of data formats.
  • Access to files stored either directly in Apache HDFS™ or in other data storage systems such as Apache HBase™.
  • Query execution via Apache Tez™, Apache Spark™, or MapReduce.
  • Procedural language with HPL-SQL.
  • Sub-second query retrieval via Hive LLAP, Apache YARN and Apache Slider.
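
The idea of imposing structure on existing data (often called schema-on-read) can be illustrated without Hive at all. The sketch below is plain Python, not Hive code: the schema, the sample data and the `read_with_schema` helper are all made up for illustration. It shows raw tab-delimited text being given column names and types only at the moment it is read, which is similar in spirit to declaring a table over files that already exist.

```python
import csv
import io

# Hypothetical schema: column name paired with the type used to parse the field.
SCHEMA = [("user_id", int), ("url", str), ("duration_s", float)]


def read_with_schema(raw_text, schema, delimiter="\t"):
    """Parse delimited text into dicts, applying the schema only at read time."""
    rows = []
    for fields in csv.reader(io.StringIO(raw_text), delimiter=delimiter):
        if not fields:
            continue  # skip blank lines
        row = {}
        for (name, cast), value in zip(schema, fields):
            try:
                row[name] = cast(value)
            except ValueError:
                # A field that does not fit the declared type becomes None,
                # instead of failing the whole read.
                row[name] = None
        rows.append(row)
    return rows


# Made-up sample data: the second row has an unparseable duration.
RAW = "1\t/home\t2.5\n2\t/about\tn/a\n"
ROWS = read_with_schema(RAW, SCHEMA)
```

The raw text is never modified. A different schema could be applied to the same data later, which is the main appeal of this approach for files that are already sitting in distributed storage.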

More details on getting started: https://cwiki.apache.org/confluence/display/Hive/GettingStarted
