Let us look at what kinds of data organizations want to ingest into Hadoop for business or analytics insights. Large volumes of data and unstructured data are strong candidates for Hadoop. Clickstream data: Clickstream data is the stream of clicks someone performs when visiting a website. This information can be used for […]
A First Look at Big Data Apache Flink! There is an abundance of interest in learning how to analyze streaming data in large-scale systems, partly because there are situations in which the time value of data makes real-time analytics so attractive. But gathering in-the-moment insights made possible by very low latency applications is just one of the […]
Top 16 Hadoop Built-in Ingress and Egress Tools ! Hadoop has revolutionized data ingestion, data processing, and enterprise data warehousing, but its explosive growth has come with a large amount of uncertainty, hype, and confusion. With this blog, enterprise decision makers will get short, quick insights on what all 16 Hadoop built-in ingress and […]
The 9 Key steps to implement Big Data DevOps ! Per the Wiki definition: DevOps (a clipped compound of development and operations) is a culture, movement, or practice that emphasizes the collaboration and communication of both software developers and other information-technology (IT) professionals while automating the process of software delivery and infrastructure changes. Per Gene Kim (author of The […]
The Pyramid of Internet of Things (IoT) Alright, what is the Internet of Things (IoT)? How does it differ from the Internet of Everything? What is M2M? All of these questions may be running through your mind if you’re a beginner/newbie to this child protocol. So, the simplest answer is “They are all the same”. […]
The 8th Habit of Highly Effective Big Data Programmers ! Last week I read an interesting book called “The Seven Habits of Highly Effective Big Data Programmers” by Rekha Joshi. I am happy to share with the community what I took away from the book. Let’s first understand what Big Data is. Just by listening to the […]
We should be excited that the Apache Hive community has shipped its largest release and announced the availability of Apache Hive 2.0.0. It brings exciting improvements in new functionality, performance, optimizations, security, and usability. Let us explore the features in detail below. HBase to store Hive metadata – The current metastore […]
Self-Learn Yourself Apache Spark in 21 Blogs – #7 Key Concepts of Resilient Distributed Datasets (RDDs) and more… In this blog we look at how to create RDDs and what operations we can perform on them. Have a quick read of the other blogs in this learning series. In simple terms, RDD (Resilient Distributed Dataset); if data in […]
Data Lake Architecture Considerations & Composition In our last blog we saw the key benefits of a Data Lake; now let’s dive deep into the internals of a Data Lake by discussing the key considerations and composition. Architecture Considerations: For any solution, it is practically difficult to arrive at a one-size-fits-all architecture; hence […]
How to have our basic statistics (Mean, Median, SD, Var, Cor, Cov) computed using R language? The dataottam team has come up with a blog-sharing initiative called “Celebrate the Big Data Problems”. In this series of blogs we will share our big data problems using the CPS (Context, Problem, Solutions) Framework. Context: In statistics Mean, Median, […]
In Blog 5, we will look at the Apache Spark languages with basic hands-on exercises. Click to have a quick read of the other Apache Spark blogs in this learning series. With our cloud setup of Apache Spark in place, we are now ready to develop big data Spark applications. But before getting started with building Spark applications, let’s […]
Celebrate the Big Data Problems – #2 How to identify the no of buckets for a Hive table while executing the HiveQL DDLs ? The dataottam team has come up with a blog-sharing initiative called “Celebrate the Big Data Problems”. In this series of blogs we will share our big data problems using the CPS (Context, […]
The List of 10+ Bot Platform for Developer and Architects!...
Top 150 Big Data & Cloud Computing Terminologies for...
The Bot 101 [ Part 3 ] Dear Bot community members, thanks...
Top 5 Focuses to Improve Cloud-Native Application...