Big Data Meets Microsoft Azure! For Big Data & Cloud Community members, this post on “Big Data, Meet Azure” is all about doing big data on the Azure public cloud. And sure, we need no definition of Big Data or Cloud Computing, but in a line, I would call them both a Super Nova for […]
How to Ingest Data into HDFS in JSON Format Using Apache Sqoop? by NS Saravanan Our current project uses a lambda architecture, so data from source systems is extracted in two ways: real-time streaming (the speed layer) and batch processing (the batch layer). The speed layer is implemented using Attunity > Kafka > Spark Streaming. The output of the Spark stream […]
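The excerpt describes the speed layer as Attunity > Kafka > Spark Streaming; as a hedged sketch of that leg only, not the post's actual code, here is a minimal PySpark Structured Streaming job that reads a Kafka topic and lands the parsed JSON on HDFS. The broker, topic, schema, and paths are placeholder assumptions, and the spark-sql-kafka connector is assumed to be on the classpath.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, LongType

spark = SparkSession.builder.appName("speed-layer-sketch").getOrCreate()

# Hypothetical schema for the JSON events arriving on the topic.
schema = StructType([
    StructField("id", LongType()),
    StructField("event", StringType()),
])

# Read the Kafka topic; broker and topic names are placeholders.
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "source-events")
       .load())

# Kafka delivers values as bytes: cast to string, then parse the JSON.
events = (raw
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Land the parsed stream on HDFS as JSON files (placeholder paths).
query = (events.writeStream
         .format("json")
         .option("path", "hdfs:///data/events")
         .option("checkpointLocation", "hdfs:///checkpoints/events")
         .start())
query.awaitTermination()
```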
The 4 Key Concepts in the Anatomy of an Apache Spark Job! For Big Data & Cloud Community members: Apache Spark is awesome at handling workloads such as batch, streaming, real-time, and ad hoc. However, to fine-tune and optimize our Apache Spark applications, we need to have a grip on the Apache Spark […]
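The excerpt cuts off before naming the four concepts, but a Spark job's anatomy is conventionally described in terms of jobs, stages, tasks, and shuffle boundaries. Here is a minimal, runnable PySpark sketch of how one action becomes one job split into two stages (the numbers and lambdas are purely illustrative):

```python
from pyspark import SparkContext

sc = SparkContext(appName="job-anatomy-demo")

# Narrow transformations (map, filter) stay inside a single stage.
rdd = sc.parallelize(range(1, 1001), numSlices=4)
pairs = rdd.map(lambda x: (x % 10, x)).filter(lambda kv: kv[1] % 2 == 0)

# reduceByKey needs a shuffle, so Spark places a stage boundary here.
totals = pairs.reduceByKey(lambda a, b: a + b)

# collect() is the action: it submits one job, split into two stages,
# each stage running one task per partition (visible in the Spark UI).
print(totals.collect())
```

Running this and opening the Spark UI shows the job broken into its stages and tasks, which is exactly the anatomy the post digs into.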
The 10 Misconceptions of iPaaS (Integration Platform as a Service) Dear Data Community, last month a friend invited me to an O’Reilly webcast on Integration Platform as a Service (iPaaS) by Leon Stigter, TIBCO Sr. Product Manager and developer and API enthusiast. The webcast covered: What is an iPaaS? Why would […]
The Top 79 Beautiful Lines for Taking Big Data Architecture from Drawing Board to Production! Dear Data Community, instead of titling this blog “The top 79 beautiful lines for taking big data architecture from drawing board to production”, it would be more fitting to call it a book talk, inspired by […]
Getting Started with Google Cloud Platform! Last month I got a chance to attend Bengaluru Google Cloud OnBoard, an instructor-led enablement event for Google Cloud Platform (Big Data). Big Data on GCP is simply superb; you must try it once. Here is the Getting Started with Google Cloud Platform artifact, prepared for our handy reference. Below are the quick […]
Top 10 Reasons to Run Hadoop in the Public Cloud! Running the Hadoop ecosystem in the public cloud means running Hadoop clusters on hardware offered by a cloud service provider, as opposed to the business-as-usual practice of running Hadoop clusters on our own hardware, called on-premises clusters or “on-prem”. But installing […]
Big Data Stack 2.0 and Beyond! The Google File System (GFS), MapReduce, and Bigtable were Google's, and the data industry's, Big Data revolution, and together they constitute Big Data Stack 1.0. Doug Cutting integrated these published concepts into a tool called Hadoop: GFS + MapReduce + Bigtable became HDFS + MapReduce + HBase, which together […]
What Is the Best Big Data Solution for Working with Databases from Splunk? The answer is Splunk DB Connect! In this blog we will see how Splunk DB Connect, the best solution for working with databases from Splunk, helps us integrate our databases with Splunk. It […]
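As a hedged illustration of the integration (not the post's own example), the sketch below calls DB Connect's dbxquery search command through the splunk-sdk Python package; the host, credentials, connection name, and SQL are hypothetical, and it assumes the DB Connect app is installed with a connection already configured.

```python
import splunklib.client as client
import splunklib.results as results

# Placeholder connection details for the Splunk management port.
service = client.connect(
    host="splunk.example.com", port=8089,
    username="admin", password="changeme")

# dbxquery (from the DB Connect app) runs SQL against a named connection;
# "my_db_conn" and the orders table are hypothetical.
spl = '| dbxquery connection="my_db_conn" query="SELECT id, status FROM orders"'
stream = service.jobs.oneshot(spl)

# Iterate the result rows; ResultsReader also yields diagnostic messages.
for item in results.ResultsReader(stream):
    if isinstance(item, dict):
        print(item)
```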
What Is Beyond Classic Hadoop? Is It Spark and Flink? In this blog, we will explore Hadoop's two new big data friends: Apache Spark and Apache Flink. Among the improvements over Hadoop's parallel-processing MapReduce, speed is the very first focus. However, MapReduce was designed and developed for […]
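One concrete way to see the speed focus the excerpt hints at: Spark can cache a dataset in memory and make repeated passes over it, whereas classic MapReduce re-reads its input from disk on every pass. A minimal PySpark sketch (the dataset and thresholds are illustrative):

```python
from pyspark import SparkContext

sc = SparkContext(appName="in-memory-demo")

# cache() keeps the data in executor memory after the first action;
# MapReduce would re-read the input from disk for every pass.
data = sc.parallelize(range(1_000_000)).cache()

# Several passes over the same cached dataset, with no re-reading of input.
for threshold in (10, 100, 1000):
    print(threshold, data.filter(lambda x: x < threshold).count())
```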
The 7 Habits of Successful Big Data and NoSQL Projects, by Ben Lorica! Let's have […]
Big Data: Splunk's Best & Better Practices! Introduction to Splunk: We see servers, devices, apps, logs, traffic, and clouds. We see data, big data, and fat data everywhere. Splunk offers the leading platform for Operational Intelligence. It enables the curious to look closely at what others ignore, known as machine data, and to find […]
The 1-2-3-4-5-6-7-8-9 of Cognitive Computing! Dear Data […]