Big Data Track

29 January 2024

Huawei Big Data Certification Training

HCIA-Big Data Training

Training Path

1	HCIA-Big Data Training	Lecture, Practice	5.0 days

Target Audience

Big data practitioners
Practitioners in big data-related industries

Prerequisites

Basic knowledge of Linux
IT project experience
Basic knowledge of Hadoop

Objectives

On completion of this program, the participants will be able to:

Master the principles of big data components
Master the usage of big data components

Training Contents

HCIA-Big Data Training

MapReduce - Distributed Offline Batch Processing and YARN - Resource Negotiator

Introduction to MapReduce and YARN
Functions and Architectures of MapReduce and YARN
Resource Management and Task Scheduling of YARN
Enhanced Features

HBase - Distributed NoSQL Database

Introduction to HBase
Functions and Architecture of HBase
Key Processes of HBase
Huawei Enhanced Features of HBase

HDFS - Hadoop Distributed File System

HDFS Overview and Application Scenarios
Position of HDFS in FusionInsight HD
HDFS System Architecture
Key Features

Streaming - Distributed Stream Computing Engine

Introduction to Streaming
System Architecture
Key Features
Introduction to StreamCQL

Kafka - Distributed Message Subscription System

Introduction to Kafka
Architecture and Functions of Kafka
Key Processes of Kafka

ZooKeeper - Distributed Cluster Coordination Service

Introduction to ZooKeeper
Position of ZooKeeper in FusionInsight
System Architecture
Key Features
Relationship with Other Components

Big Data Industry and Technological Trends

Big Data Era
Big Data Application Scope
Opportunities and Challenges in the Big Data Era
Huawei Big Data Solution

FusionInsight HD Solution Overview

FusionInsight Overview
FusionInsight Features
Success Cases of FusionInsight

Flume - Massive Log Aggregation

Flume Overview and Architecture
Key Characteristics of Flume
Flume Applications

Hive - Distributed Data Warehouse

Introduction to Hive
Hive Functions and Architecture
Basic Hive Operations

Spark2x - In-memory Distributed Computing Engine

Spark Overview
Spark Principles and Architecture
Spark Integration in FusionInsight HD

Loader - Data Transformation

Introduction to Loader
Loader Job Management

Flink - Stream Processing and Batch Processing Platform

Flink Overview
Technical Principles and Architecture of Flink
Flink Integration in FusionInsight HD

Duration

5 working days