Pentaho Business Analytics 1st Edition Review

Just a few weeks ago, Pentaho updated its products (both CE and commercial). Business as usual. The latest version, currently 5.1, comes with the latest developer tools, as usual. You can get them from SourceForge. In summary, the new version includes better integration with MongoDB, R, Weka and YARN.

Packt Publishing has given me the opportunity to review the new Pentaho book, titled “Pentaho Business Analytics”, which is also the new name for the platform. Let’s start with the good news: “Pentaho Business Analytics” by Sergio Ramazzina provides the most up-to-date overview of the Pentaho platform through examples. It’s a quick walkthrough (625 pages!) that covers the main aspects a developer must know about the platform.

Now the bad news: this is not a book about Business Analytics, so don’t confuse the marketing name of the Pentaho platform with the data strategy. You will not find any Business Analytics recipes in this book. Another drawback is that the new capabilities and features are omitted, but this is understandable as covering them is not the objective of the book. They could make a nice chapter for a second edition.

The book starts with the Pentaho User Console (chapter 1) and BA Server instance configuration (chapter 2). After so many books explaining time after time how to install the Pentaho server, it is good to find a fresh approach. At least this one assumes that there is enough public information for the reader to complete these initial steps.

These first two chapters include the most common tasks that a developer or administrator needs to perform to manage a Pentaho server instance. Taking into consideration the huge changes from version 4 to 5, this knowledge will be quite welcome to the reader.

Then it continues with: how to deal with data sources (chapter 3), working with Pentaho Metadata Editor (chapter 4), creating my first report with Pentaho Interactive Reporting (chapter 5), creating my first Mondrian model and OLAP analysis (chapter 6), creating my first report with Pentaho Reporting (chapter 7), creating my first dashboard (chapter 8), how to use scheduling (chapter 9), how to set up and use Pentaho Mobile (chapter 10) and server customization (chapter 11). Many of these topics should already be known by an average developer; if not, this is the right time to start.

As you can imagine, if you want to master one of the developer tools, this is not your book. There are other books for that, such as “Pentaho Reporting 5.0 by Example Beginner’s Guide”, which provides a detailed overview of using Pentaho Report Designer through examples.

Why might this book be interesting for you?

If you are new to Pentaho, you will appreciate a walkthrough that introduces you to the complete platform. If you want to upgrade from version 4 to 5, chapters 1, 2, 9 and 11 will help you become familiar with the new environment and relearn the main tasks needed to manage your platform. However, if you are already working with version 5 and have already mastered the platform, this book will provide you limited value. Moreover, there is a strong focus on the professional flavour of the platform, with only some sections here and there for the community version. It would be nice to add a comparison at the beginning of each chapter.

In summary, a more-than-welcome book about the latest version of the Pentaho BI platform. It tries to provide a complete overview (based on FoodMart data) going through metadata, reporting and OLAP. If you are struggling with users, permissions, folders,… this is your book!

You can buy the book here if you are interested.

“Pentaho Data Integration Cookbook” 2nd edition Review

In 2011, the first edition of “Pentaho Data Integration Cookbook” was published. At the time, the book was interesting enough for a PDI (Pentaho Data Integration) developer, as it provided relevant answers for many of the common tasks that have to be carried out in data warehousing processes.

Two years later, the data market has greatly evolved. Big Data is one of the major trends, and PDI now includes numerous new features to connect to and use Hadoop and NoSQL databases.

The idea behind the second edition is to include some of the brand new tasks required to tame Big Data using PDI and to update the content of the previous edition. Alex Meadows, from Red Hat, has joined the previous authors (María Carina Roldan and Adrián Sergio Pulvirenti) for this second edition. María is the author of four books about Pentaho Data Integration.

What is Pentaho Data Integration?

I’m sure that many of you already know it. For those who don’t, PDI is an open source Swiss army knife of tools to extract, move, transform and load data.

What is this book?

To put it simply, it includes practical, handy recipes for many of the everyday situations a PDI developer faces. All recipes follow the same structure:

  • State the problem
  • Create a transformation or job to solve the problem
  • Explain in detail and provide potential pitfalls

What is new?

One thing that a potential reader may ask is: if I already have the previous edition, is it worth reading this one? If you are a Pentaho Data Integration developer, the easy answer is yes, mainly because the book includes new chapters and sections on Big Data and Business Analytics, technologies that are becoming core corporate capabilities in the information age.

So, in my humble opinion, the most interesting chapters are:

  1. Chapter 3: where the reader will have the chance to learn how to load data into, and get data from, Hadoop, HBase and MongoDB.
  2. Chapter 12: where the reader will be given the opportunity to read data from a SAS data file, to create statistics from a data stream and to build a random data sample for Weka.

What is missing or could be improved?

More screenshots; some readers will probably think the same. To be honest, while I’m happy about chapters 3 and 12, it would be interesting to have more content related to these topics. So, let’s put it this way: I am counting down the days until the next edition.

In summary, an interesting book for PDI and data warehousing practitioners that gives some information about how to use PDI for Big Data and Analytics. If you are interested you can find it here.


Review of “Pentaho Reporting 5.0 by Example Beginner’s Guide” by Mariano García Mattío and Dario R. Bernabeu

A few weeks ago, Pentaho released the new version of its products (both CE and commercial). The latest version, currently 5.0, comes with the latest developer tools. As usual, each new major release means new features. For example, the new version focuses on a better user interface and support for Big Data.

We have a new version of Pentaho Reporting as well. This tool helps to create professional reports with graphics, formulas, subreports, and so on.

If you want to master this tool you have several options: (1) learning the tool yourself by trial and error (and / or searching for information in forums), (2) training (through a certified partner or not) or (3) using a book.

That brings me to the topic I want to talk about in this post. Packt Publishing has given me the opportunity to review the new book on Pentaho Reporting, titled “Pentaho Reporting 5.0 by Example Beginner’s Guide”. This book provides a detailed overview of using Pentaho Report Designer through examples.

The book starts with the usual suspects: what Pentaho Reporting and Pentaho Report Designer (PRD) are, what the main components of PRD are, and how Pentaho Reporting has evolved since 2002. Nothing new for someone who works with Pentaho daily, but still interesting for a newcomer.

Why might this book still be interesting for you? If you are a Pentaho developer, the initial chapters offer nothing new. Chapter 2 is about the installation of PRD, Chapter 3 is about the user interface and Chapter 4 is about your first report. So you are probably going to skip them.

The interesting part starts with Chapter 5. Even if you are a regular developer, it is easy to forget some features or the proper way to do things. Following a step-by-step approach, the book provides numerous examples and helps you improve your skills. Among the topics, it is worth highlighting: how to connect to a database, how to create formulas, how to add a new JDBC driver, how to add a group, how to add parameters, how to add charts, how to add subreports, how to publish your reports to the Pentaho server,…

That is, the main things that a reporting developer should know and master to start working with Pentaho Report Designer.

It’s nice to be able to say that this is one of the earliest books about the newest version of Pentaho. That means the chapter that covers the Pentaho server (Chapter 11) includes screenshots of the new interface.

What if you are a beginner developer? This is your book. No-brainer.

What if you are an expert developer? It may be a nice addition to your library if you don’t have any book on the subject, but you probably already know almost everything it explains (even hyperlinks, sparklines, stylesheets and crosstabs). So, just remember what the title says: it’s for beginners.