Comparing Architecture Characteristics in Big Data Context! In this blog we’ll explore the differences between microservices and SOA in terms of the defining characteristics of each architecture pattern. In the Big Data world, Apache Hadoop has come a long way in its relatively short lifespan. From its beginnings as a reliable storage pool with integrated batch […]
Requirement: Take a backup of our cluster data for disaster recovery. Approach: We are going to use the Glacier storage provided by AWS. About Glacier Storage: Glacier is designed to address the shortcomings of a number of traditional archive solutions, like tape and disk archiving, none of which is completely satisfactory. Glacier leverages the […]
Intra Cluster copying using DISTCP. Step 1: Get the namenode information of both clusters using the command below: hdfs getconf -namenodes. Step 2: Verify that HDFS is accessible on both clusters using the commands below: hdfs dfs -ls hdfs://Namenode1:8020/data/file.txt and hdfs dfs -ls hdfs://Namenode2:8020/data/. Once successful, move to Step 3 […]
Here let us see what kinds of data organizations want to ingest into Hadoop for their business or analytics insights. Basically, large volumes of data and unstructured data are strong candidates for Hadoop. Clickstream data: Clickstream data is the stream of clicks someone performs when visiting a website. This information can be used for […]
Self-Learn Yourself Scala in 21 Blogs – #7 Missed the previous blogs? Have a quick look at Self-Learn Yourself Scala in 21 Blogs (#1, #2, #3, #4, #5, #6). In this blog let’s understand evaluation strategies in Scala programming. There are two common evaluation strategies in Scala: call by value and call by name. The […]
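The difference between the two evaluation strategies mentioned in the teaser above can be sketched in a few lines of Scala. This is a minimal illustration, not code taken from the linked post; the names EvaluationStrategies, nextValue, twiceByValue and twiceByName are hypothetical and chosen only for this example.

```scala
object EvaluationStrategies {
  // Counts how many times the argument expression gets evaluated.
  var evaluations = 0

  // Hypothetical helper: a side-effecting expression we can observe.
  def nextValue(): Int = {
    evaluations += 1
    21
  }

  // Call by value: the argument is evaluated once, before the body runs.
  def twiceByValue(x: Int): Int = x + x

  // Call by name: the argument expression is re-evaluated at every use of x.
  def twiceByName(x: => Int): Int = x + x

  def main(args: Array[String]): Unit = {
    evaluations = 0
    val byValue = twiceByValue(nextValue())
    println(s"by value: result=$byValue, evaluations=$evaluations")

    evaluations = 0
    val byName = twiceByName(nextValue())
    println(s"by name: result=$byName, evaluations=$evaluations")
  }
}
```

Both calls return the same result here, but the by-name version evaluates its argument once per use inside the body, which matters when the argument is expensive or has side effects.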
The 9 Key Steps to Implement Big Data DevOps! Per the Wiki definition: DevOps (a clipped compound of development and operations) is a culture, movement or practice that emphasizes the collaboration and communication of both software developers and other information-technology (IT) professionals while automating the process of software delivery and infrastructure changes. Per Gene Kim (author of The […]
Top 11 Apache Hadoop YARN Frameworks Part of the core Hadoop project, YARN is the architectural center of Hadoop that allows multiple data processing engines such as interactive SQL, real-time streaming, data science and batch processing to handle data stored in a single platform, unlocking an entirely new approach to analytics. YARN is the foundation […]
Understand Kappa Architecture in 2 minutes What is Kappa Architecture? Kappa architecture does all data processing in near-real-time or streaming mode. In simple terms, removing the batch layer from Lambda Architecture gives you a Kappa Architecture. For a quick introduction to Lambda Architecture, visit Understand Lambda Architecture in 2 minutes. Evolution […]
Relationship between MapReduce, Spark, YARN, and HDFS! In the Big Data era, Hadoop is the de facto standard for developing big data applications using the MapReduce framework. Hadoop is composed of one or more master nodes and any number of slave nodes, depending on the amount of data. Hadoop simplifies distributed applications by […]
The 8th Habit of Highly Effective Big Data Programmers! Last week I read an interesting book called “The Seven Habits of Highly Effective Big Data Programmers” by Rekha Joshi. I am happy to share with the community what I took away from the book. Let’s first understand what Big Data is. Just by listening the […]
Understand Lambda Architecture in 2 minutes What is Lambda Architecture? Lambda architecture provides a combined solution for real-time data and batch data. What is the need for Lambda Architecture? Lambda Architecture was implemented mainly because of the latency of the MapReduce paradigm, where the batch views were created on […]
Self-Learn Yourself Scala in 21 Blogs – #5 Blog 5 – Does functional programming matter and what are monads? Missed the previous blogs? Have a quick look at Self-Learn Yourself Scala in 21 Blogs (#1, #2, #3, #4). In this blog let’s understand whether functional programming matters for Scala developers and also what is […]
Step By Steps for deploying an Hello World, APP on Google Cloud Platform Container using Docker & Kubernetes
The Bot 101 [ Part 2 ] Thanks for reading and sharing the...
“The Top 10 Container Orchestration tools” This...
The 11 DevOps Misconceptions ! In this blog we’ll have...