By Gurmukh Singh

Key Features

  • Become an expert Hadoop administrator and perform tasks to optimize your Hadoop cluster
  • Import and export data into Hive and use Oozie to manage workflows
  • Practical recipes will help you plan and secure your Hadoop cluster, and make it highly available

Book Description

Hadoop enables the distributed storage and processing of large datasets across clusters of computers. Learning how to administer Hadoop is crucial to exploit its unique features. With this book, you will be able to overcome common problems encountered in Hadoop administration.

The book begins by laying the foundation, showing you the steps needed to set up a Hadoop cluster and its various nodes. You will get a better understanding of how to maintain a Hadoop cluster, especially at the HDFS layer and when using YARN and MapReduce. Further on, you will explore durability and high availability of a Hadoop cluster.

You'll get a better understanding of the schedulers in Hadoop and how to configure and use them for your tasks. You will also get hands-on experience with the backup and recovery options and the performance tuning aspects of Hadoop. Finally, you will get a better understanding of troubleshooting, diagnostics, and best practices in Hadoop administration.

By the end of this book, you will have a proper understanding of working with Hadoop clusters and will also be able to secure and encrypt them and configure auditing for them.

What you will learn

  • Set up the Hadoop architecture to run a Hadoop cluster smoothly
  • Maintain a Hadoop cluster on HDFS, YARN, and MapReduce
  • Understand high availability with ZooKeeper and JournalNode
  • Configure Flume for data ingestion and Oozie to run various workflows
  • Tune the Hadoop cluster for optimal performance
  • Schedule jobs on a Hadoop cluster using the Fair and Capacity schedulers
  • Secure your cluster and troubleshoot it for various common pain points

About the Author

Gurmukh Singh is a seasoned technology professional with 14+ years of industry experience in infrastructure design, distributed systems, performance optimization, and networks. He has worked in the big data domain for the last five years and provides consultancy and training on various technologies.

He has worked with companies such as HP, JP Morgan, and Yahoo.

He has authored Monitoring Hadoop, published by Packt Publishing.

Table of Contents

  1. Hadoop Architecture and Deployment
  2. Maintain Hadoop Cluster - HDFS
  3. Maintain Hadoop Cluster - YARN and MapReduce
  4. High Availability
  5. Schedulers
  6. Backup and Recovery
  7. Data Ingestion and Workflow
  8. Performance Tuning
  9. HBase and RDBMS
  10. Cluster Planning
  11. Troubleshooting, Diagnostics, and Best Practices
  12. Security

Read Online or Download Hadoop 2.x Administration Cookbook PDF

Similar data mining books

Robust Data Mining (SpringerBriefs in Optimization)

Data uncertainty is a concept closely related to most real-life applications that involve data collection and interpretation. Examples can be found in data acquired with biomedical instruments or other experimental techniques. Integration of robust optimization into existing data mining techniques aims to create new algorithms resilient to error and noise.

Data Mining Mobile Devices

With today’s consumers spending more time on their mobiles than on their PCs, new methods of empirical stochastic modeling have emerged that can provide marketers with detailed information about the products, content, and services their customers desire. Data Mining Mobile Devices defines the collection of machine-sensed environmental data pertaining to human social behavior.

Information Security Analytics: Finding Security Insights, Patterns, and Anomalies in Big Data

Information Security Analytics gives you insights into the practice of analytics and, more importantly, how you can use analytic techniques to identify trends and outliers that may not be possible to identify using traditional security analysis techniques. Information Security Analytics dispels the myth that analytics within the information security domain is limited to just security incident and event management systems and basic network analysis.

Big Data Analytics Using Multiple Criteria Decision-Making Models (Operations Research Series)

Multiple Criteria Decision Making (MCDM) is a subfield of Operations Research that deals with decision-making problems. A decision-making problem is characterized by the need to choose one or a few among a number of alternatives. The field of MCDM assumes special importance in this era of Big Data and Business Analytics.
