Enterprise Data Warehouse Optimization with Hadoop on IBM Power Systems Servers

Author : Scott Vetter
Publisher : IBM Redbooks
Total Pages : 82
Release : 2018-01-31
ISBN-10 : 0738456608
ISBN-13 : 9780738456607
Rating : 4/5 (608 Downloads)

Book Synopsis Enterprise Data Warehouse Optimization with Hadoop on IBM Power Systems Servers by : Scott Vetter

Enterprise Data Warehouse Optimization with Hadoop on IBM Power Systems Servers was written by Scott Vetter and published by IBM Redbooks. The book was released on 2018-01-31 and has 82 pages. Available in PDF, EPUB and Kindle. Book excerpt: Data warehouses were developed for many good reasons, such as providing quick query and reporting for business operations, and business performance. However, over the years, due to the explosion of applications and data volume, many existing data warehouses have become difficult to manage. Extract, Transform, and Load (ETL) processes are taking longer, missing their allocated batch windows. In addition, data types that are required for business analysis have expanded from structured data to unstructured data. The Apache open source Hadoop platform provides a great alternative for solving these problems. IBM® has committed to open source since the early years of open Linux. IBM and Hortonworks together are committed to Apache open source software more than any other company. IBM Power Systems™ servers are built with open technologies and are designed for mission-critical data applications. Power Systems servers use technology from the OpenPOWER Foundation, an open technology infrastructure that uses the IBM POWER® architecture to help meet the evolving needs of big data applications. The combination of Power Systems with Hortonworks Data Platform (HDP) provides users with a highly efficient platform that provides leadership performance for big data workloads such as Hadoop and Spark. This IBM Redpaper™ publication provides details about Enterprise Data Warehouse (EDW) optimization with Hadoop on Power Systems. Many people know Power Systems from the IBM AIX® platform, but might not be familiar with IBM PowerLinux™, so part of this paper provides a Power Systems overview. A quick introduction to Hadoop is provided for those not familiar with the topic.
Details of HDP on Power Reference architecture are included that will help both software architects and infrastructure architects understand the design. In the optimization chapter, we describe various topics: traditional EDW offload, sizing guidelines, performance tuning, IBM Elastic Storage™ Server (ESS) for data-intensive workload, IBM Big SQL as the common structured query language (SQL) engine for Hadoop platform, and tools that are available on Power Systems that are related to EDW optimization. We also dedicate some pages to the analytics components (IBM Data Science Experience (IBM DSX) and IBM Spectrum™ Conductor for Spark workload) for the Hadoop infrastructure.

Enterprise Data Warehouse Optimization with Hadoop on IBM Power Systems Servers Related Books

Enterprise Data Warehouse Optimization with Hadoop on IBM Power Systems Servers
Language: en
Pages: 82
Authors: Scott Vetter
Categories: Computers
Type: BOOK - Published: 2018-01-31 - Publisher: IBM Redbooks

Data warehouses were developed for many good reasons, such as providing quick query and reporting for business operations, and business performance. However, ov
AI and Big Data on IBM Power Systems Servers
Language: en
Pages: 162
Authors: Scott Vetter
Categories: Computers
Type: BOOK - Published: 2019-04-10 - Publisher: IBM Redbooks

As big data becomes more ubiquitous, businesses are wondering how they can best leverage it to gain insight into their most important business questions. Using
Hortonworks Data Platform with IBM Spectrum Scale: Reference Guide for Building an Integrated Solution
Language: en
Pages: 30
Authors: Sandeep R. Patil
Categories: Computers
Type: BOOK - Published: 2018-06-26 - Publisher: IBM Redbooks

This IBM® Redpaper™ publication provides guidance on building an enterprise-grade data lake by using IBM Spectrum™ Scale and Hortonworks Data Platform for pe
IBM Data Engine for Hadoop and Spark
Language: en
Pages: 126
Authors: Dino Quintero
Categories: Computers
Type: BOOK - Published: 2016-08-24 - Publisher: IBM Redbooks

This IBM® Redbooks® publication provides topics to help the technical community take advantage of the resilience, scalability, and performance of the IBM Powe
IBM Information Server: Integration and Governance for Emerging Data Warehouse Demands
Language: en
Pages: 194
Authors: Chuck Ballard
Categories: Computers
Type: BOOK - Published: 2013-07-10 - Publisher: IBM Redbooks

This IBM® Redbooks® publication is intended for business leaders and IT architects who are responsible for building and extending their data warehouse and Bus