Rapid - Apache Mahout Clustering designs

Explore clustering algorithms used with Apache Mahout

Publisher: Packt Publishing Limited

Paid access

Jan 2015

E-Book €23.99

Institutions €119.95

Key Features

Book Description

As more organizations discover the value of big data analytics, interest in platforms that provide storage, computation, and analytic capabilities has grown. Apache Mahout addresses this need and makes it practical to implement complex machine learning algorithms that help you analyse your data and draw useful insights from it.
Starting with an introduction to clustering algorithms, this book gives an insight into Apache Mahout and the different algorithms it uses for clustering data. It provides a general introduction to algorithms such as K-Means, Fuzzy K-Means, and StreamingKMeans, and shows how to use Mahout to cluster your data with a particular algorithm. You will study the different types of clustering and learn how to use Apache Mahout with real-world data sets to implement and evaluate your clusters.
The book also discusses cluster improvement and visualization using the Mahout APIs, and explores model-based clustering and topic modelling using the Dirichlet process. Finally, you will learn how to build and deploy a model for production use.

What you will learn

Explore clustering algorithms and cluster evaluation techniques
Learn the different types of clustering and distance measurement techniques
Perform clustering on your data using KMeans clustering
Discover how canopy clustering is used as a preprocessing step for KMeans
Use the Fuzzy KMeans algorithm in Apache Mahout
Implement Streaming KMeans clustering in Mahout
Learn Mahout's implementation of Spectral KMeans clustering

Who this book is for

Table of Contents

Understanding Clustering
Understanding K-Means Clustering
Understanding Canopy clustering using Mahout
Understanding Fuzzy K-Means Algorithm using Mahout
Understanding Model based Clustering
Understanding Streaming KMeans Algorithm
Understanding Spectral Clustering
Improving Cluster Quality
Creating Cluster model for production

PDF ISBN: 978-1-78328-444-3

Publisher: Packt Publishing Limited

Publication date: 2015

Language: English

Pages: 130

Related subjects:

General interest

People also read

Apache Mahout Essentials

Learning Apache Mahout

Cluster Analysis and Data Mining

Instant MapReduce Patterns - Hadoop Essentials How-to

Applied Unsupervised Learning with R

Hadoop Real-World Solutions Cookbook- Second Edition

Optimizing Hadoop for MapReduce

Hadoop: Data Processing and Modelling

Hadoop MapReduce v2 Cookbook - Second Edition: RAW
