The course offers an overview of the features and challenges of Big Data problems, applications, and systems. Starting from the so-called 5 Vs of Big Data (volume, velocity, variety, variability, and value), the course focuses on the most common framework, Hadoop, and on next-generation systems such as Spark, highlighting the differences between a traditional Database Management System and a Big Data Management System. The course will also introduce a spatial extension of Hadoop.
• Introduction to the course
• The MapReduce programming paradigm and Apache Hadoop
• Apache Spark
• The Hadoop Ecosystem
• SpatialHadoop: a spatial extension to Apache Hadoop
• Advanced Indexing and Partitioning in Hadoop
• DBMSs for Big Data
o Relational and non-relational databases for Big Data
o MongoDB: an example of a NoSQL DBMS
• Challenges in the Big Data Era
The course will cover both theoretical and practical aspects.