Price
$240
Course Type
Online
Duration
7 hours
Date
Various dates throughout the year
Entry Requirements
All Levels

About this course

Working with Big Data: Infrastructure, Algorithms, and Visualizations LiveLessons presents a high-level overview of big data and how to use key tools to solve your data organization challenges. This introduction to the three areas of big data includes:

  • Infrastructure - how to store and process big data
  • Algorithms - how to integrate algorithms into your big data stack and an introduction to classification
  • Visualizations - an introduction to creating visualizations in JavaScript using D3.js

The goal is not to be exhaustive, but rather to provide a high-level view of big data management and of how all the pieces of a big data architecture work together.

Paul Dix is the author of Service Oriented Design with Ruby and Rails. He is a frequent speaker at conferences and user groups including Web 2.0, RubyConf, RailsConf, The Gotham Ruby Conference, and Scotland on Rails. Paul is the founder and organizer of the NYC Machine Learning Meetup, which has over 2,900 members. In the past he has worked at startups and larger companies like Google, Microsoft, and McAfee. Currently, Paul is a co-founder at Errplane, a cloud-based service for monitoring and alerting on application performance and metrics. He lives in New York City.

Lesson 1:

In Unstructured Storage and Hadoop, you learn how to set up Hadoop, load files into the Hadoop Distributed File System (HDFS), and write your first MapReduce job.
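
To give a feel for what a first MapReduce job does, here is a minimal sketch of the classic word count, run locally in plain Python. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and only illustrate the map, group-by-key, and reduce steps that the framework would distribute across machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Collapse all the counts emitted for one word into a single total."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(word, counts)
                for word, counts in shuffle(pairs).items())
```

Calling `word_count(["big data", "big tools"])` counts "big" twice and the other words once. On a real cluster the input lines would come from files in HDFS rather than a Python list.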

Lesson 2:

In Structured Storage and Cassandra, you will set up Cassandra, learn how to model data in Cassandra's column-oriented storage, use Cassandra from a Ruby library, and write data into Cassandra from a Hadoop MapReduce job.

Lesson 3:

Real Time Processing and Messaging is about real-time processing with messaging systems. Specifically, you will learn about Kafka, an open-source distributed messaging system. You'll install Kafka, read and write data from the messaging server, write data into Hadoop, and learn how to implement highly available and scalable message consumers.

Lesson 4:

Working with Machine Learning Algorithms introduces you to machine learning and the k-nearest neighbors algorithm. In it, you will implement k-nearest neighbors, prepare raw text for use with machine learning algorithms, and make predictions using k-nearest neighbors.
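
As a rough illustration of the idea, the sketch below classifies short texts with k-nearest neighbors. It is not the course's implementation: it assumes a simple bag-of-words representation and cosine similarity, and the names (`tokenize`, `cosine_similarity`, `knn_predict`) and the tiny training set are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Turn raw text into a bag of words: lowercase token -> count."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(training, text, k=3):
    """Return the majority label among the k most similar training texts."""
    query = tokenize(text)
    ranked = sorted(
        training,
        key=lambda example: cosine_similarity(tokenize(example[0]), query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled examples, for illustration only.
TRAINING = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch meeting on monday", "ham"),
]
```

With this data, `knn_predict(TRAINING, "buy cheap pills")` votes among the three closest examples, two of which are labeled "spam". Real text would need more preparation than whitespace splitting, such as removing punctuation and very common words.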

Lesson 5:

In Experimentation and Running Algorithms in Production, you learn how to test the accuracy of machine learning models and how to integrate them into a running big data architecture.
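
One common way to test accuracy, sketched here under assumptions, is to hold out part of the labeled data and measure how often the model's predictions match the held-out labels. The helper names (`train_test_split`, `accuracy`) and the 25% test fraction are illustrative choices, not taken from the course.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples and split it into train and test sets."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def accuracy(predict, test_set):
    """Fraction of (features, label) pairs that predict() labels correctly."""
    if not test_set:
        raise ValueError("test_set must not be empty")
    correct = sum(1 for features, label in test_set
                  if predict(features) == label)
    return correct / len(test_set)
```

Here `predict` is any function that maps one example's features to a label, so the same check works for a k-nearest neighbors classifier or any other model trained on the first part of the split.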

Lesson 6:

Basic Visualizations teaches you about D3, a JavaScript toolkit for creating visualizations. In it, you will write a MapReduce job to prepare raw data for use in visualizations and then use that data to create a bar chart and a time series.

What are the requirements?

  • General beginner to intermediate programming knowledge.

What am I going to get from this course?

  • Over 36 lectures and 7 hours of content!
  • The goal of these LiveLessons is to touch on the various aspects of big data at a high level. For instance, instead of going into detail on the many nuances of Hadoop, we'll just get it set up and use it in conjunction with other tools like Cassandra. Further, we'll go into how to work with and integrate algorithms and close out with tools for visualization. We'll be going through the full stack to see how the different pieces of a big data system fit together.

What is the target audience?

  • Beginners looking for an introduction to the different concepts and technologies in big data.