Voxxed Days Berlin has ended
Wednesday, January 27 • 09:00 - 17:00
Introduction to Big Data and Apache Spark


Apache Spark (http://spark.apache.org) is currently one of the fastest-growing projects in the Big Data ecosystem. It lets you process large data sets faster and more easily than existing solutions. This workshop will jump-start your work with Spark and help you make the transition from analyst or developer to Big Data engineer.

  • Introduction to Big Data

    • Definition

    • What is Big Data?

    • History of Big Data

    • Big Data problems

  • Apache Spark

    • Introduction

    • History

    • Spark vs Hadoop

    • Resilient Distributed Datasets (RDDs)

    • Architecture

    • Operation variants

    • Administration

  • Spark Core

    • Introduction

    • Java vs Scala vs Python

    • Connecting to cluster

    • Dataset distribution

    • RDD operations

    • Shared variables

    • Execution and testing

  • Spark SQL

    • Introduction

    • Spark SQL vs Hive

    • Basic operation

    • Data and schema

    • Queries

    • Hive integration

    • Execution and testing

Register here for just 100 EUR.


Jakub Nowacki

Jakub is a graduate of the University of Bristol, where he obtained a PhD in Engineering Mathematics. On a daily basis he applies his analytical and development skills in software development. He is mostly interested in distributed processing and analysis of big data sets. Jakub originally...

Wednesday January 27, 2016 09:00 - 17:00 CET
Park Inn Alexanderplatz - Room Virchow 1
