An easy-to-follow Apache Hadoop administrator’s guide filled with practical screenshots and explanations for each step and configuration. This book is aimed at administrators interested in setting up and managing a large Hadoop cluster. If you are, or want to become, an administrator and are ready to build and maintain a production-level cluster running CDH5, then this book is for you.
What you will learn
Understand the Apache Hadoop architecture and the future of distributed processing frameworks
Use HDFS and MapReduce for all file-related operations
Install and configure CDH to bring up an Apache Hadoop cluster
Configure HDFS High Availability and HDFS Federation to prevent single points of failure
Install and configure Cloudera Manager to perform administrator operations
Implement security by installing and configuring Kerberos for all services in the cluster
Add, remove, and rebalance nodes in a cluster using cluster management tools
Understand and configure the different backup options to back up your HDFS