What is the best big data solution for working with all databases from Splunk? The answer is Splunk DB Connect! In this blog we will see how Splunk DB Connect helps us integrate databases with Splunk. Splunk DB Connect is the best solution for working with databases from Splunk. It […]
High-Level Framework of Big Data Graph Databases! In the Big Data world, it was very clear that storing and processing connected data was the first challenge. The first idea was to replace the tabular SQL semantics with a graph-centric model. And then the graph is new to big […]
What is Beyond Classic Hadoop? Is it Spark and Flink? In this blog, we will explore Hadoop's two new big data friends, Apache Spark and Apache Flink. Among the improvements over Hadoop's parallel-processing MapReduce, speed is the very first focus. However, MapReduce was designed and developed for […]
Requirement: To take a backup of our cluster data for disaster recovery. Approach: We are going to use the Glacier storage provided by AWS. About Glacier Storage: Glacier is designed to address the shortcomings of a number of traditional archive solutions, such as tape and disk archiving, none of which is completely satisfactory. Glacier leverages the […]
Here let us see what kinds of data organizations want to ingest into Hadoop for business or analytics insights. Large volumes of data and unstructured data are strong candidates for Hadoop. Clickstream data: Clickstream data is the stream of clicks someone performs when visiting a website. This information can be used for […]
A First Look at Big Data Apache Flink! There is an abundance of interest in learning how to analyze streaming data in large-scale systems, partly because there are situations in which the time-value of data makes real-time analytics so attractive. But gathering in-the-moment insights made possible by very low latency applications is just one of the […]
Top 16 Hadoop Built-in Ingress and Egress Tools! Hadoop has revolutionized data ingestion, data processing and enterprise data warehousing, but its explosive growth has come with a large amount of uncertainty, hype, and confusion. With this blog, enterprise decision makers will get quick insights into all 16 Hadoop built-in ingress and […]
The 9 Key Steps to Implement Big Data DevOps! Per the Wiki definition: DevOps (a clipped compound of development and operations) is a culture, movement or practice that emphasizes the collaboration and communication of both software developers and other information-technology (IT) professionals while automating the process of software delivery and infrastructure changes. Per Gene Kim (author of The […]
The Pyramid of Internet of Things (IoT) Alright, what is the Internet of Things (IoT)? How does it differ from the Internet of Everything? What is M2M? All of these questions will be running through your mind if you’re a beginner to this child protocol. So, the simplest answer is “They are all the same”. […]
The 8th Habit of Highly Effective Big Data Programmers! Last week I read an interesting book called “The Seven Habits of Highly Effective Big Data Programmers” by Rekha Joshi. I am happy to share with the community what I took away from the book. Let’s first understand what Big Data is. Just by listening to the […]
We should be excited that the Apache Hive community has shipped its largest release and announced the availability of Apache Hive 2.0.0. It brings great and exciting improvements in new functionality, performance, optimizations, security, and usability. Let us explore the features in detail below. HBase to store Hive metadata – The current metastore […]
Self-Learn Yourself Apache Spark in 21 Blogs – #7 Key Concepts of Resilient Distributed Datasets (RDDs) and more… In this blog we will see how to create RDDs and what operations we can perform on them. Have a quick read of the other blogs in this learning series. In simple terms, RDD (Resilient Distributed Dataset): if data in […]
The Artistic Guide to Big Data: Hadoop/Spark We love...
Are Docker and Containers the Same? I am thrilled in this new...
25 Free Must-Read Books in New Year 2018 on Open Source,...
Apache Spark is a Superstar, but it’s a Supernova on Azure for...