Computes an approximate histogram of a numerical column using a user-specified number of bins. The output is an array of (x, y) pairs, returned as Hive struct objects, where x is a bin center and y is the histogram height at that bin. Although this function creates a histogram with non-uniform bin widths, to some extent its […]
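To make the idea of non-uniform bins concrete, here is a minimal Python sketch of one way to build such a histogram: each value starts as its own bin, and whenever the bin count exceeds the limit, the two closest bin centers are merged into their weighted average. This is an illustration only, not Hive's actual implementation; the function name `approx_histogram` and its parameters are hypothetical. In Hive itself the equivalent call would be along the lines of `SELECT histogram_numeric(col, 10) FROM tbl`, where `col` and `tbl` are placeholder names.

```python
import bisect


def approx_histogram(values, num_bins):
    """Return at most num_bins (x, y) pairs: bin center x and height y.

    Hypothetical helper for illustration; not part of Hive.
    """
    if num_bins < 1:
        raise ValueError("num_bins must be at least 1")

    centers = []  # bin centers, kept in ascending order
    counts = []   # counts[i] is the height of the bin at centers[i]

    for v in values:
        i = bisect.bisect_left(centers, v)
        if i < len(centers) and centers[i] == v:
            # Value lands exactly on an existing center: just bump its height.
            counts[i] += 1
            continue

        # Otherwise open a new bin of height 1 at the sorted position.
        centers.insert(i, float(v))
        counts.insert(i, 1)

        if len(centers) > num_bins:
            # Find the adjacent pair of centers with the smallest gap.
            j = min(range(len(centers) - 1),
                    key=lambda k: centers[k + 1] - centers[k])
            total = counts[j] + counts[j + 1]
            # Merge them into one bin at the count-weighted average center.
            centers[j] = (centers[j] * counts[j]
                          + centers[j + 1] * counts[j + 1]) / total
            counts[j] = total
            del centers[j + 1]
            del counts[j + 1]

    return list(zip(centers, counts))


if __name__ == "__main__":
    for x, y in approx_histogram([1, 2, 3, 10, 11, 12], 2):
        print(x, y)
```

Because bins are merged where the data is densest, the resulting bin widths adapt to the distribution rather than being fixed in advance, and the heights always sum to the number of input values.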