
Book: Mastering Spark for Data Science
Authors: Andrew Morgan, Antoine Amend, David George, Matthew Hallett

Summary

In this chapter, we walked through the full setup of an Apache NiFi GDELT ingest pipeline, complete with metadata forks, and briefly introduced ways of visualizing the resulting data. This material is particularly important because GDELT is used extensively throughout the book, and NiFi offers a scalable, modular way to source data.

In the next chapter, we will get to grips with what to do with the data once it has landed, by looking at schemas and formats.