The Apache Hive community has announced the availability of Apache Hive 2.0.0, its largest release so far. It brings exciting improvements in new functionality, performance, optimization, security, and usability. Let us explore the features in detail below. HBase to store Hive metadata – The current metastore […]
Blog 2 – Let’s get started with Scala. Just type scala in your environment to start the Scala interpreter; if everything is fine, you will be prompted with scala>. If you have a problem with the installation, please follow the link, which has step-by-step explanations. So we are good to explore the Scala commands. Now […]
Self-Learn Yourself Scala in 21 Blogs – #1 Blog 1 – Scala the basics. Thanks to communities like LinkedIn, Hadoop, Spark, Apache Software, Yahoo and more… from dataottam. As a new learning and sharing initiative, we, the dataottam team, launched “Self-Learn Yourself Scala in 21 Blogs”. Scala is where object-oriented meets functional, to have the best […]
Self-Learn Yourself Apache Spark in 21 Blogs – #7 Key Concepts of Resilient Distributed Datasets (RDDs) and more… In this blog we look at how to create RDDs and what operations we can perform on them. Have a quick read of the other blogs in this learning series. In simple terms, RDD (Resilient Distributed Dataset); if data in […]
Celebrate the Big Data Problems – #4 What are the possible ways of command-level searching in Linux? The dataottam team has come up with a blog-sharing initiative called “Celebrate the Big Data Problems”. In this series of blogs we will share our big data problems using the CPS (Context, Problem, Solutions) framework. Context: Search in […]
Data Lake Architecture Considerations & Composition. In our last blog we saw the key benefits of a Data Lake; now let’s dive deep into the internals of a Data Lake by discussing the key considerations and composition. Architecture Considerations: As with any solution, it is practically difficult to arrive at a one-size-fits-all architecture; hence […]
TCP/IP Layer-wise IoT Protocols. Hello everyone! Thanks a lot for your valuable response to the previous blog. In this post, I will explain the basics of the TCP (Transmission Control Protocol)/IP (Internet Protocol) stack and the IoT protocols associated with each layer. Anyone who has prior knowledge of the TCP/IP stack can […]
Celebrate the Big Data Problems – #2 How to identify the number of buckets for a Hive table while executing HiveQL DDLs? The dataottam team has come up with a blog-sharing initiative called “Celebrate the Big Data Problems”. In this series of blogs we will share our big data problems using the CPS (Context, […]
Celebrate the Big Data Problems – #1 Every day we face many big data problems in production, PoCs, and other settings. Do we have a common repo to collect and share them? No, as far as we know, we don’t. As always, dataottam is looking forward to sharing the learnings with the community to celebrate their similar, […]
Big Data is a problem statement, and it can be solved with tools like Apache Hadoop. But having Apache Hadoop as the infrastructure for our proofs of concept and proofs of value is a little challenging. Hence we bring 3-click ideas to get your Apache Hadoop installed. What are the prerequisites? Ubuntu 14.04, Internet connection […]
Is Apache Hadoop the only option to implement Big Data? No, Hadoop is not the only option for big data problems; it is one of the solutions. The HPCC (High Performance Computing Cluster) Systems technology is an open source, data-driven and data-intensive processing and delivery platform developed by LexisNexis Risk Solutions. HPCC Systems incorporates […]
Thanks to Zaloni and to Carl Anderson’s Creating a Data-Driven Organization. It is a fantastic, very well narrated book, and I would like to share our learning with our big data & IoT community. Many organizations think that simply because they generate a lot of reports or have many dashboards, they are data-driven. Although those activities […]