Course Outline
Part 1: Data Management in HDFS
- Data formats (JSON/Avro/Parquet)
- Compression schemes
- Data masking
- Lab: Analyzing different data formats; enabling compression
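Part 2: Advanced Pig
- User-defined functions
- Introduction to Pig libraries (ElephantBird/Data-Fu)
- Loading complex structured data with Pig
- Pig tuning
- Lab: Advanced Pig scripting; parsing complex data types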
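Part 3: Advanced Hive
- User-defined functions
- Compressed tables
- Hive performance tuning
- Lab: Creating compressed tables; evaluating table formats and configurations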
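Part 4: Advanced HBase
- Advanced schema modeling
- Compression
- Bulk data import
- Wide tables vs. tall tables
- HBase and Pig
- HBase and Hive
- HBase performance tuning
- Lab: Tuning HBase; accessing HBase data through Pig and Hive; data modeling with Phoenix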
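Minimum Requirements
- Familiarity with the Java programming language (most programming exercises use Java)
- Familiarity with the Linux environment (able to use the Linux command line and edit files with vi/nano)
- Basic knowledge of Hadoop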
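Lab Environment
Zero install: there is no need to install Hadoop software on students' machines! A working Hadoop cluster will be provided for students.
Students will need the following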