Professional Hadoop

The professional's one-stop guide to this open-source, Java-based big data framework. Professional Hadoop is the complete reference and resource for experienced developers looking to employ Apache Hadoop in real-world settings. Written by an expert team of certified Hadoop developers, committers...

Bibliographic Details
Main Authors: Benoy Antony, Konstantin Boudnik, Cheryl Adams, Branky Shao, Cazen Lee, Kai Sasaki
Format: eBook
Language: English
Published: Newark Wiley 2016
John Wiley & Sons, Incorporated
Wrox
Edition: 1
ISBN: 9781119267201, 111926720X, 9781119267171, 111926717X
Table of Contents:
  • Cover -- Title Page -- Copyright -- Contents -- Introduction -- Chapter 1: Hadoop Introduction -- Business Analytics and Big Data -- The Components of Hadoop -- The Distributed File System (HDFS) -- What Is MapReduce? -- What Is YARN? -- What Is ZooKeeper? -- What Is Hive? -- Integration with Other Systems -- The Hadoop Ecosystem -- Data Integration and Hadoop -- Summary -- Chapter 2: Storage -- Basics of Hadoop HDFS -- Concept -- Architecture -- Interface -- Setting Up the HDFS Cluster in Distributed Mode -- Install -- Advanced Features of HDFS -- Snapshots -- Offline Viewer -- Tiered Storage -- Erasure Coding -- File Format -- Cloud Storage -- Summary -- Chapter 3: Computation -- Basics of Hadoop MapReduce -- Concept -- Architecture -- How to Launch a MapReduce Job -- Writing a Map Task -- Writing a Reduce Task -- Writing a MapReduce Job -- Configurations -- Advanced Features of MapReduce -- Distributed Cache -- Counter -- Job History Server -- The Difference from a Spark Job -- Summary -- Chapter 4: User Experience -- Apache Hive -- Hive Installation -- HiveQL -- UDF/SerDe -- Hive Tuning -- Apache Pig -- Pig Installation -- Pig Latin -- UDF -- Hue -- Features -- Apache Oozie -- Oozie Installation -- How Oozie Works -- Workflow/Coordinator -- Oozie CLI -- Summary -- Chapter 5: Integration with Other Systems -- Apache Sqoop -- How It Works -- Apache Flume -- How It Works -- Apache Kafka -- How It Works -- Kafka Connect -- Stream Processing -- Apache Storm -- How It Works -- Trident -- Kafka Integration -- Summary -- Chapter 6: Hadoop Security -- Securing the Hadoop Cluster -- Perimeter Security -- Authentication Using Kerberos -- Service Level Authorization in Hadoop -- Impersonation -- Securing the HTTP Channel -- Securing Data -- Data Classification -- Bringing Data to the Cluster -- Protecting Data in the Cluster -- Securing Applications
  • YARN Architecture -- Application Submission in YARN -- Summary -- Chapter 7: Ecosystem at Large: Hadoop with Apache Bigtop -- Basics Concepts -- Software Stacks -- Test Stacks -- Works on My Laptop -- Developing a Custom-Tailored Stack -- Apache Bigtop: The History -- Apache Bigtop: The Concept and Philosophy -- The Structure of the Project -- Meet the Build System -- Toolchain and Development Environment -- BOM Definition -- Deployment -- Bigtop Provisioner -- Master-less Puppet Deployment of a Cluster -- Configuration Management with Puppet -- Integration Validation -- iTests and Validation Applications -- Stack Integration Test Development -- Validating the Stack -- Cluster Failure Tests -- Smoke the Stack -- Putting It All Together -- Summary -- Chapter 8: In-Memory Computing in Hadoop Stack -- Introduction to In-Memory Computing -- Apache Ignite: Memory First -- System Architecture of Apache Ignite -- Data Grid -- A Discourse on High Availability -- Compute Grid -- Service Grid -- Memory Management -- Persistence Store -- Legacy Hadoop Acceleration with Ignite -- Benefits of In-Memory Storage -- Memory Filesystem: HDFS Caching -- In-Memory MapReduce -- Advanced Use of Apache Ignite -- Spark and Ignite -- Sharing the State -- In-Memory SQL on Hadoop -- SQL with Ignite -- Streaming with Apache Ignite -- Summary -- Glossary -- Index -- EULA