The Data Day Seattle Schedule

This is the *final* schedule. A grid with room locations and further details will be posted within the next day.

8:00 am

Registration 4th level - Grand Foyer

9:00 am

PLENARY: Maslow's Hierarchy of Needs for Databases - Charity Majors (Honeycomb) 4th level - Grand III

10:00 am

STREAMING KEYNOTE: Large-scale stream processing using Apache Kafka - Jay Kreps (Confluent) 4th level - Grand III

Elevating Your Data Platform - Kurt Brown (Netflix) 4th level - Grand II

NLP KEYNOTE: How Machine Learning is like Cycling - Michelle Casbon (Qordoba) 4th level - Fifth Avenue

Data Infrastructure at a Small Company - Melissa Santos (Big Cartel) 2nd level - Cascade I

How to Visualize Graph Data: A Developer’s Guide - Corey Lanum (Cambridge Intelligence) 2nd level - Cascade II

Lessons learned from deploying the top deep learning frameworks in production - Kenny Daniel (Algorithmia) 3rd level - Vashon

In Search of Database Nirvana – The Challenges of Delivering Hybrid Transaction/Analytical Processing - Rohit Jain (Esgyn) 4th level - Crescent

11:00 am

Scaling Spark with a bonus preview of Spark 2.0: Structured Streaming + ML - Holden Karau (IBM) 4th level - Grand II

TBA (Eric Sammer being his own bad self) - Eric Sammer (Rocana) 4th level - Grand III

What's Your Data Worth? - John Akred (Silicon Valley Data Science) 4th level - Fifth Avenue

word2vec, LDA, and introducing a new hybrid algorithm: lda2vec - Christopher Moody (Stitch Fix) 2nd level - Cascade I

Paths of Learning: The most effective way to learn about learning is to play among lovely graphs - Taylor Martin (O'Reilly) 2nd level - Cascade II

Data Science for the Masses: Can KNIME make the impossible possible? - Michael Berthold (KNIME) 3rd level - Vashon

NLP for the web: augmenting traditional systems with web specific features - Matthew Peters (Moz) 4th level - Crescent

12:00 pm

Lunch in the Grand Ballroom 4th level - Grand I

12:50 pm

Data Pipelines with Kafka and Spark (2 hour workshop - pt 1) - John Akred / Mark Mims / Stephen O'Sullivan (Silicon Valley Data Science) 4th level - Grand II

A Little Cassandra for the Relational Brain - Patrick McFadin (DataStax) 4th level - Grand III

Building better models faster using active learning - Nicholas Gaylord (Crowdflower) 4th level - Fifth Avenue

What does the future hold for Business Analysts in the New World? - Matthew Baird (AtScale) 2nd level - Cascade I

Graphs vs Tables: Ready? Fight. (2 hour deep dive with math) pt 1 of 2 - Denise Gosnell (PokitDok) 2nd level - Cascade II

Governed Self Service analytics at eBay - Alex Liang (eBay) 3rd level - Vashon

1:45 pm

Data Pipelines with Kafka and Spark (2 hour workshop - pt 2) - John Akred / Mark Mims / Stephen O'Sullivan (Silicon Valley Data Science) 4th level - Grand II

Stuff you should know as an Advanced Cassandra user - Patrick McFadin (DataStax) 4th level - Grand III

Visualizing the Model Selection Process - Benjamin Bengfort (District Data Labs) 4th level - Fifth Avenue

Generating personalized travel recommendations from natural language queries - Melanie Tosik (WayBlazer) 2nd level - Cascade I

Graphs vs Tables: Ready? Fight. (2 hour deep dive with math) pt 2 of 2 - Denise Gosnell (PokitDok) 2nd level - Cascade II

Distilling dark knowledge from neural networks - Alex Korbonits (Remitly) 4th level - Crescent

2:40 pm

Introducing Apache Airflow (Incubating) - A Better Way to Build Data Pipelines - Siddharth Anand (Agari) 4th level - Grand II

Catching trains: Iterative model development with Jupyter Notebook - Chloe Mawer (Silicon Valley Data Science) 4th level - Grand III

Transforming Data to Unlock Its Latent Value - Tony Ojeda (District Data Labs) 4th level - Fifth Avenue

Real-time Search on Terabytes of Data Per Day: Lessons Learned - Joey Echeverria (Rocana) 4th level - Crescent

NLP @HomeAway: how to mine reviews and track competition - Brent Schneeman (HomeAway) 2nd level - Cascade I

Graph Database Engine Shoot-out: part 1 of 2 - Josh Perryman (Expero) 2nd level - Cascade II

The Algorithm Economy for Healthcare: best systems practices for data analytics - Sanjay Joshi (EMC Emerging Technologies) 3rd level - Vashon

3:20 pm

Afternoon Break - the bar opens

4:15 pm

Web Scraping in a JavaScript World - Ryan Mitchell (HedgeServ) 4th level - Grand II

How to Observe: Lessons from Epidemiologists, Actuaries and Charlatans - Juliet Hougland (Cloudera) 4th level - Grand III

Finding Key Influencers and Viral Topics in Twitter Networks Related to ISIS and to the 2016 Primary Elections - Steve Kramer (Paragon Science) 4th level - Fifth Avenue

Thinking like Spark: How trying to optimize one algorithm helped me re-think distributed data processing - Rachel Warren (Alpine Data) 2nd level - Cascade I

Graph Database Engine Shoot-out: part 2 of 2 - Josh Perryman (Expero) 2nd level - Cascade II

Open Source Lambda Architecture with Kafka, Samza, Hadoop, and Druid - Fangjin Yang (Imply) 3rd level - Vashon

TORA - eBay’s realtime data processing engine - Alex Liang / Thomas Varghese (eBay) 4th level - Crescent

5:10 pm

Extreme Streaming Processing at Uber - Hien Luu (Uber) 4th level - Grand II

Eric Lubow's famous and scary talk about Cassandra counters - with updates - Eric Lubow (SimpleReach) 4th level - Grand III

SQL and NoSQL on MySQL - Peter Zaitsev (Percona) 4th level - Fifth Avenue

Modernizing the Fashion Industry with Data - Andy Terrel (Fashion Metric) 2nd level - Cascade I

Building Recommendations at Scale: Lessons Learned at Indeed - Preetha Appan (Indeed) 2nd level - Cascade II

Virtualizing Relational Databases as Graphs: a multi-model approach - Juan Sequeda (Capsenta) 3rd level - Vashon

Turning Unstructured Data into Kernels of Ideas - Jason Kessler (CDK Digital Marketing) 4th level - Crescent