Hadoop with Python

  • Thread starter: b10w01f
  • Updated: 23 Jan 2019
Hadoop is mostly written in Java, but that doesn't exclude the use of other programming languages with this distributed storage and processing framework, particularly Python.

With this concise book, you'll learn how to use Python with the Hadoop Distributed File System (HDFS), MapReduce, the Apache Pig platform and Pig Latin script, and the Apache Spark cluster-computing framework.

This book takes you through the basic concepts behind Hadoop, MapReduce, Pig, and Spark.

Then, through multiple examples and use cases, you'll learn how to work with these technologies by applying various Python tools.
  • Use the Python library Snakebite to access HDFS programmatically from within Python applications
  • Write MapReduce jobs in Python with mrjob, the Python MapReduce library
  • Extend Pig Latin with user-defined functions (UDFs) in Python
  • Use the Spark Python API (PySpark) to write Spark programs with Python
  • Learn how to use the Luigi Python workflow scheduler to manage MapReduce jobs and Pig scripts
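
For anyone who has not seen the MapReduce model mentioned in the list above, here is a minimal word-count sketch in plain Python. It is not taken from the book and does not use mrjob, Snakebite, or Hadoop; the names `mapper`, `shuffle`, `reducer`, and `word_count` are made up for illustration, and the shuffle step that a real cluster performs between nodes is simulated in memory.

```python
from collections import defaultdict
from itertools import chain


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group emitted values by key (done by the framework on a real cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(word, counts):
    """Reduce step: sum all counts collected for a single word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of text lines."""
    pairs = chain.from_iterable(mapper(line) for line in lines)
    return dict(reducer(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["hadoop with python", "python with spark"]
    print(word_count(sample))
```

A library such as mrjob lets you write roughly the same mapper and reducer logic and then run it on a Hadoop cluster instead of in a single process; check its documentation for the actual class and method names.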

