Big Mountain Data & SQL Saturday Sessions


Data Technologies for Developers

Big Data Introduction

"Learn the Lingo" of Data Warehousing. Big Data employees are typically paid 20-30% more than their peers. Explore the technical realities and challenges within Big Data. We will discuss Data Warehouses, Star Schemas, Data Repository Architecture, Slowly Changing Data and several common terms utilized in Big Data discussions to have you speaking intelligently about Big Data.

Level 100 - Introduction
Duration: Hour
Presenter: John Weeks


NoSQL Introduction

Understand what NoSQL is and what it is not. Why would you want to use NoSQL in your project, and which NoSQL database would you choose? Explore the relationships between NoSQL and RDBMSs. Understand how to select among an RDBMS (MySQL, PostgreSQL), a document database (MongoDB), a key-value store, a graph database, and columnar databases, or combinations of the above.

Level 200 - (Beginner): Introductory / fast moving
Duration: Hour
Presenter: John Weeks


Build Reactive Applications with Akka

Akka is a toolkit and runtime for building highly concurrent, distributed, and fault-tolerant event-driven applications on the JVM.
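Akka itself runs on the JVM (Scala/Java); as a language-neutral sketch of the actor idea it builds on, here is a minimal Python analogue: one mailbox drained by one thread, so the actor's state is only ever touched sequentially. The class and message names are hypothetical, not Akka API.

```python
import queue
import threading

# Minimal actor: a mailbox (queue) drained by a single thread, so state
# is mutated by one thread at a time -- the core idea behind Akka actors.
class CounterActor:
    def __init__(self):
        self.count = 0
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            msg = self._mailbox.get()
            if msg == "stop":
                break
            if msg == "increment":
                self.count += 1

    def tell(self, msg):
        """Fire-and-forget message send, in the spirit of Akka's tell."""
        self._mailbox.put(msg)

    def join(self):
        self._mailbox.put("stop")
        self._thread.join()

actor = CounterActor()
for _ in range(1000):
    actor.tell("increment")
actor.join()
# actor.count == 1000, with no locks in user code
```

Because the mailbox serializes all messages, no lock around `count` is needed; Akka extends this same guarantee across a cluster.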

Level 100 - Introduction
Duration: Hour
Presenter: Garrett Smith


Handling database evolution with Liquibase

We always keep versions of our code in a source control system, but what about the database? How can we reduce the pain of updating the database model when working on a team, or even know when we need to update it? What can we do if we want to roll back to a previous version of the database? Liquibase is a pretty awesome library that will help you update, maintain, and control your database model, just like any other asset in your project.

Level 100 - Introduction
Duration: Hour
Presenter: Andres Arias


How to get up and running in the "cloud" in less than 20 minutes

This session will cover how to be up and running in the "cloud" (OpenShift) in less than 20 minutes. I will show how easy it is to create nodes, set up continuous integration, and scale, all in a few minutes using OpenShift.

Level 100 - Introduction
Duration: Half Hour
Presenter: Abhishek Andhavarapu


Beyond Batch: Stream-based Processing in Hadoop

It is quite common to think of Hadoop and associate it with batch processing. While batch processing certainly has a place in Big Data solution architectures, it's not the only trick the elephant can do. Businesses are demanding faster results from their data, and therefore the need for stream-based processing is greater than ever. In this session, we will examine common reasons for stream-based processing and explore how tools like Kafka, Samza, and Storm (and maybe Spark) can teach your batch-oriented elephant new tricks.
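The contrast between the two shapes can be sketched without any of those tools: below, a Python generator stands in for a Kafka-style topic, and the streaming function updates state per message the way a Storm bolt or Samza task would (their real APIs differ; this is only an illustration of the batch-versus-stream shape).

```python
from collections import Counter

# Batch shape: collect everything, then compute once at the end.
def batch_word_count(lines):
    return Counter(word for line in lines for word in line.split())

# Stream shape: update state per event and emit incrementally, so
# downstream consumers see fresh (partial) results long before the end.
def stream_word_count(line_stream):
    counts = Counter()
    for line in line_stream:        # each line plays the role of one message
        counts.update(line.split())
        yield dict(counts)

topic = ["big data", "big elephants", "stream the data"]
final_batch = batch_word_count(topic)
snapshots = list(stream_word_count(iter(topic)))
# The last streaming snapshot equals the batch answer; the earlier
# snapshots are the partial results a batch job could never give you.
```

The trade-off the session explores is exactly this: identical totals, but streaming pays continuous bookkeeping cost to buy low latency.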

Level 300 - (Intermediate): Basic knowledge of subject matter is suggested
Duration: Hour
Presenter: Ryan Plant


Cloud-based machine learning journey with AzureML

Enjoy seeing an end-to-end demo: from getting data, to developing a predictive model and experiment, to visualizing the results. Background: AzureML Public Preview (previously called Project Passau) is a collaborative visual development environment that enables you to build, test, and deploy predictive and social analytics solutions that operate on your data. The service and development environment is cloud-based, provides compute resource and memory flexibility, and offers zero setup and no installation woes because you work through your web browser.

Level 200 - (Beginner): Introductory / fast moving
Duration: Hour
Presenter: Norman Warren


Get Value from your investment in Big Data; a deep dive into analytic & visualization tools (Microsoft PowerBI, Tableau, D3)

Get a very fast demo of visualization tools used to do one of three things: 1. Change a process or behavior. 2. Change a product. 3. Move a metric. Then find out when and why you want to use these tools: Tableau, Microsoft PowerBI (Power View, PowerPivot), and D3.

Level 200 - (Beginner): Introductory / fast moving
Duration: Hour
Presenter: Norman Warren


Deep Dive Into Pig and Lipstick

Starting from where we left off at the UHUG meeting, we'll take the Lipstick tool and use it to dive into some real-world problem solving in such areas as joins, UDFs, large scripts, and difficult data. Pig can be complex, and you may find yourself fighting yourself, the query optimizer, or your data; knowing which problem to tackle will enable you to write more performant code quickly. We'll also discuss how this tool can be utilized by other members of the team, including QA and production staff.

Level 300 - (Intermediate): Basic knowledge of subject matter is suggested
Duration: Hour
Presenter: Matt Davies


PMML - Separating Development and Deployment

Interesting data problems of real business value often include unique challenges which require custom predictive modeling solutions. These solutions typically include significant data munging and feature engineering before any predictive model can actually be built. This becomes a challenge when it is time to deploy the models to production, as they may involve custom code for data pre- and post-processing. PMML provides a standard by which common machine learning models can be represented, along with a vast array of transformations to handle most feature engineering needs. Using PMML in the predictive modeling pipeline allows organizations to separate development and deployment: data scientists can use their tools of choice to build custom models, while production maintains only one framework, free of any custom code.
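The separation the abstract describes can be illustrated in miniature. PMML itself is an XML standard; the toy dict below merely stands in for the idea: the data scientist exports a declarative description of the transformations and model, and a generic scoring engine evaluates it in production with no model-specific code. Every field name here is hypothetical, not PMML syntax.

```python
import math

# Stand-in for a PMML document: a declarative description of one
# feature transformation plus a linear model, produced at "development
# time" by whatever tool the data scientist prefers.
model_doc = {
    "transform": {"field": "income", "op": "log"},
    "regression": {"intercept": -1.0, "coef": 0.5},
}

def score(doc, record):
    """Generic engine: interprets the document, knows nothing about
    this particular model, and so never needs redeploying per model."""
    x = record[doc["transform"]["field"]]
    if doc["transform"]["op"] == "log":
        x = math.log(x)
    r = doc["regression"]
    return r["intercept"] + r["coef"] * x

score(model_doc, {"income": math.e ** 4})  # log(e^4) = 4 -> -1.0 + 0.5*4 = 1.0
```

Swapping in a retrained model means shipping a new document, not new code, which is the operational win PMML aims for.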

Volunteers: Mason Victors



Machine Learning

Deep Learning

Google spent over $400 million on a deep learning talent acquisition: http://www.technologyreview.com/news/524026/is-google-cornering-the-market-on-deep-learning/ Learn how to get started using deep learning for prediction. Examples shown will include text prediction and digit recognition. Ben Taylor is HireVue's principal data scientist and has spent the last year working on digital interview prediction.

Level 300 - (Intermediate): Basic knowledge of subject matter is suggested
Duration: Hour
Presenter: Ben Taylor


Unstructured Data & Deep Learning

Many companies are still trying to figure out just how sophisticated they need to be about their data, while companies like Narrative Science ($29M raised), InsideSales.com ($143M raised), and many others aim to bring advanced analytics as a service, deeply integrated into their products. Together we'll take a look at the cost/benefit of going from a simple classifier to a more complex deep learning approach to keyword relevance, in a simple proof of concept around predicting and optimizing responses in email and tweets.

Level 300 - (Intermediate): Basic knowledge of subject matter is suggested
Duration: Hour
Presenter: David Gonzalez



Data Analytics

Three Classic Mistakes of Data Analysis

Most folks start analyzing data and drawing inferences the moment they have some data available. Statistical tools can be very valuable in providing new information and insight; however, any one of these three mistakes can completely mislead teams, leading them astray and toward the wrong conclusions. So, what are the three mistakes, and how does knowing them help? That will be discussed in this session, with a limited amount of interactivity.
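The abstract keeps its three mistakes secret, but one classic way data can mislead a team, Simpson's paradox, is easy to demonstrate: a treatment can win in every subgroup yet lose in the aggregate when group sizes differ. The numbers below are the well-known kidney-stone teaching example, not data from this session.

```python
# Simpson's paradox: treatment A has the better success rate in both
# subgroups, yet B has the better overall rate, because the subgroup
# sizes are badly unbalanced between the two treatments.
trials = {
    "A": {"small": (81, 87), "large": (192, 263)},   # (successes, total)
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, total):
    return successes / total

for group in ("small", "large"):
    a = rate(*trials["A"][group])
    b = rate(*trials["B"][group])
    assert a > b                      # A is better within each group...

overall = {t: rate(sum(s for s, _ in g.values()), sum(n for _, n in g.values()))
           for t, g in trials.items()}
assert overall["B"] > overall["A"]    # ...yet B looks better in aggregate
```

A team that only looks at the aggregate table would pick the worse treatment, which is exactly the kind of mistake the session warns about.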

Level 300 - (Intermediate): Basic knowledge of subject matter is suggested
Duration: Half Hour
Presenter: Rai Chowdhary


Resume modeling

This session will cover the full flow of resume prediction, from candidates to scoring resumes algorithmically. This talk should help inspire individuals who are looking at doing text prediction from unstructured formats, and should inspire broader interest in unstructured-data prediction in general. Ben is HireVue's principal data scientist and has been responsible for much of the machine learning behind digital interview prediction.

Level 300 - (Intermediate): Basic knowledge of subject matter is suggested
Duration: Hour
Presenter: Ben Taylor



Professional Development

Building the Big Data Community in Utah

What are the strategic and tactical steps that have been taken to grow the Big Data Community in Utah? What will be the path going forward and how do we alter the trajectory we're on to reach seemingly unachievable levels of success and become the premier Big Data Hub for the country and the world? Overview, thought discussion, and open conversation about the next steps to community stardom!

Level 200 - (Beginner): Introductory / fast moving
Duration: Hour
Presenter: Nick Baguley


Data Driven Culture for Business Users

In big data companies, business intelligence problems are tackled by three major groups: data scientists, data analysts, and business analysts. With big data technology developing at a rapid pace, data scientists and data analytics experts are being asked to focus more and more on predictive and machine learning business questions. Meanwhile, business analysts, who often rely on data analytics support, are left without adequate support to solve the business problems that will impact their bottom line. Sara and Raquel will present a case study of how eBay's "Shared Purpose" has fostered a culture of professional learning opportunities for business analysts to grow foundational technology skills. As business analysts become increasingly conversant in data discovery technologies, and even basic SQL, we can bridge the gap between the business side and the analytic side -- driving increased business productivity overall.

Volunteers: Sara Jones


Dynamics of Mob Programming

Taking Agile development to the next level: what it means for Developers, Project Managers, and Business Analysts, and what value it adds to the Client.

Level 200 - (Beginner): Introductory / fast moving
Duration: Half Hour
Presenter: Pamela Bradford


Intrapreneurism and the Giant Hairball

How to escape the gravitational pull of the corporate status quo by employing an innovative approach to R & D, using a combination of entrepreneurial project management, tertiary-level analysis, and mob programming.

Level 200 - (Beginner): Introductory / fast moving
Duration: Half Hour
Presenter: Pamela Bradford



Architecture

Emerging Patterns in BigData Solutions

What do the Big Data solution architectures being implemented today have in common? Regardless of industry, business context, operating environment, or technology selection, what patterns are emerging in organizations developing Big Data solutions? In this session, we will cover five solution patterns in Big Data and provide some implementation guidance on how they may or may not apply to your efforts in providing Big Data solutions.

Level 300 - (Intermediate): Basic knowledge of subject matter is suggested
Duration: Hour
Presenter: Ryan Plant


Real time analytics using Hadoop and Elasticsearch

This session will cover how to use a batch processing engine (Hadoop) and a speed layer (Elasticsearch) for near-real-time analytics. Our master record store is SQL Server. We use Hadoop to perform all the heavy computations, ETL processes, etc., and Elasticsearch as the speed layer that the reporting APIs hit. This session will also cover how to import data into Hadoop, perform the necessary computation, and export the data to an external data source.
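The query-time merge at the heart of this batch-plus-speed-layer design can be sketched in a few lines. Here plain dicts stand in for the batch view (Hadoop output in the session) and the speed layer (Elasticsearch there); the page names and counts are hypothetical.

```python
# Lambda-style merge: the batch layer produces a complete-but-stale
# view, the speed layer holds only the deltas that arrived since the
# last batch run, and the reporting API combines the two per query.
batch_view = {"page_a": 1000, "page_b": 400}   # recomputed nightly by batch
speed_deltas = {"page_a": 7, "page_c": 3}      # events since last batch run

def query(page):
    """Near-real-time answer: stale batch total plus recent deltas."""
    return batch_view.get(page, 0) + speed_deltas.get(page, 0)

query("page_a")  # 1007: batch total plus the speed layer's delta
query("page_c")  # 3: a page the batch layer has not seen yet
```

When the next batch run lands, its output replaces `batch_view` and the absorbed deltas are dropped from the speed layer, keeping that layer small.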

Level 200 - (Beginner): Introductory / fast moving
Duration: Hour
Presenter: Abhishek Andhavarapu


Battle of the Big Data Bands

From a technical architecture point of view, big data has so many choices and yet so little time to try them out. We've put them to the test to see how they stack up against each other. See how the following perform head-to-head: Amazon Redshift, SQL Server Tabular, on-premise MPP (Actian Matrix for this demo), Google BigQuery, and Hadoop/Hive.
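A head-to-head comparison like this needs a harness that treats every contender the same way. The sketch below is not the session's methodology, just one reasonable shape: run each engine's query function several times and compare median latency. The two lambdas are hypothetical stand-ins for calls made through each engine's real driver.

```python
import statistics
import time

# Tiny benchmark harness: identical query, several runs per contender,
# median latency reported so one outlier run doesn't skew the result.
def bench(fn, runs=5):
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return statistics.median(times)

contenders = {
    "engine_x": lambda: sum(range(10_000)),    # hypothetical workload
    "engine_y": lambda: sum(range(100_000)),   # hypothetical workload
}
results = {name: bench(fn) for name, fn in contenders.items()}
# Compare medians only when warm-up, data volume, and query are identical.
```

Median over repeated runs matters even more against cloud engines like Redshift or BigQuery, where a single run can be dominated by queueing noise.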

Level 200 - (Beginner): Introductory / fast moving
Duration: Hour
Presenter: Peter Nettesheim


Data Modeling on NoSQL

I'd like to do a session on modeling OLTP data on NoSQL solutions, highlighting how modeling is different on NoSQL than it is on relational data stores, and showcasing some tips and tricks for taking advantage of the strengths (and embracing the weaknesses) of NoSQL as your primary data storage engine.
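The central modeling difference can be shown with one order record. In a document store like MongoDB you can embed the line items inside the order (one read, no joins, playing to NoSQL's strengths) or reference them by id as a relational design would (extra lookups). The documents below are hypothetical Python dicts, not tied to any driver.

```python
# Option 1: embed -- denormalized, the whole order is one document.
embedded = {
    "_id": "order-1",
    "customer": "Ada",
    "items": [
        {"sku": "widget", "qty": 2},
        {"sku": "gadget", "qty": 1},
    ],
}

# Option 2: reference -- relational habit carried over; each line item
# lives elsewhere and must be fetched separately.
referenced_order = {"_id": "order-1", "customer": "Ada",
                    "item_ids": ["li-1", "li-2"]}
line_items = {"li-1": {"sku": "widget", "qty": 2},
              "li-2": {"sku": "gadget", "qty": 1}}

def total_qty_embedded(order):
    return sum(i["qty"] for i in order["items"])            # one document read

def total_qty_referenced(order, items):
    return sum(items[i]["qty"] for i in order["item_ids"])  # N extra lookups
```

Embedding wins when the items are always read with the order; referencing wins when the same item must be shared or updated across many parents, which is the trade-off the session digs into.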

Volunteers: Bryce Cottam
Want to volunteer? I Can Present This!


Intro to Dimensional Modeling

Most BI tools require a dimensional model, but how do you go about designing that kind of model? What's a star schema? What are fact and dimension tables? In this session we will start with the basics of dimensional modeling, including a retail dimensional model example. Then we will take a look at building the model in Oracle SQL Developer Data Modeler. This free tool from Oracle has basically all of the same features as licensed data modeling tools. By the end of this session, you should be comfortable building a simple dimensional model in Oracle SQL Developer Data Modeler.
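The star-schema shape can be sketched before opening any modeling tool: a fact table of measurements keyed to small dimension tables, and a query that joins fact to dimension to aggregate. This toy retail example uses hypothetical names and numbers, with Python dicts standing in for tables.

```python
# Dimension tables: descriptive attributes, one row per member.
dim_product = {1: {"name": "widget", "category": "tools"},
               2: {"name": "gadget", "category": "toys"}}
dim_date = {20150321: {"year": 2015, "month": 3}}

# Fact table: one row per sale, holding only keys and measures.
fact_sales = [
    {"date_key": 20150321, "product_key": 1, "amount": 10.0},
    {"date_key": 20150321, "product_key": 2, "amount": 4.0},
    {"date_key": 20150321, "product_key": 1, "amount": 6.0},
]

def sales_by_category(facts, products):
    """The shape of a typical BI query: join fact to dimension, group,
    and sum a measure."""
    out = {}
    for row in facts:
        cat = products[row["product_key"]]["category"]   # join to dimension
        out[cat] = out.get(cat, 0.0) + row["amount"]
    return out

sales_by_category(fact_sales, dim_product)
```

Keeping measures in the fact table and descriptions in dimensions is what lets BI tools generate these group-by queries automatically.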

Level 200 - (Beginner): Introductory / fast moving
Duration: Hour
Presenter: Michelle Kolbe


Sustained Growth: From Seed to Tree Lessons Learned

The hype around "Big Data" often creates a feeling of urgency, which causes half-baked ideas to spawn projects, consume resources, and produce questionable results. Just because "Big Data" solutions are mainly open source does not mean there are no risks involved which must be mitigated. These may include costs, freshness of the tools, finding development resources, adjustments to development processes, and management approvals. Many enterprises find themselves not knowing where to start, or well down the path of tackling too much. We'll talk about some strategies which should help an organization mitigate these risks, and how to take the seedling idea to mature product (and on to the forest) in a sustained, growth-centric model. If you are in architecture, development, production support, or, especially, management, you'll want to attend.

Level 300 - (Intermediate): Basic knowledge of subject matter is suggested
Duration: Hour
Presenter: Matt Davies


Bigger Buses and More Complexity: Simplicity with Apache Kafka

We'll take a look at Apache Kafka and how it may be a fit for enterprises. We'll talk about a few of the canonical examples, with associated experiences, as well as some novel configurations. Whether it is messaging, log aggregation, stream processing, or something else beneficial in your use case, at the end you'll be able to decide intelligently whether or not this technology has a place in your architecture.
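What lets one Kafka topic serve messaging, log aggregation, and stream processing at once is its core abstraction: an append-only log that each consumer reads at its own offset. The in-memory sketch below illustrates that idea only; it omits partitions, persistence, and replication, and its class and method names are hypothetical, not Kafka's API.

```python
# Append-only log with per-consumer offsets -- the simplification at
# the heart of Kafka's design. Producers only ever append; consumers
# track their own read positions independently.
class TopicLog:
    def __init__(self):
        self._log = []               # append-only, like a Kafka partition
        self._offsets = {}           # consumer id -> next position to read

    def produce(self, msg):
        self._log.append(msg)

    def consume(self, consumer_id, max_msgs=10):
        start = self._offsets.get(consumer_id, 0)
        batch = self._log[start:start + max_msgs]
        self._offsets[consumer_id] = start + len(batch)
        return batch

topic = TopicLog()
for event in ("login", "click", "logout"):
    topic.produce(event)

topic.consume("billing")        # ['login', 'click', 'logout']
topic.consume("analytics", 2)   # ['login', 'click'] -- independent offset
topic.consume("billing")        # [] -- this consumer is already caught up
```

Because consumers never remove messages, a new system can attach to the same topic later and replay history, which is what makes the "bigger bus" simple rather than more complex.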

Level 200 - (Beginner): Introductory / fast moving
Duration: Hour
Presenter: Matt Davies



Business of Data

It Takes Three Brains to Analyze Data

It is quite important to understand how the brain functions and what impact that has when it comes to using data to make decisions. The world is replete with examples of bad decisions being made despite data suggesting otherwise. Participants will gain insight into why we make decision mistakes despite the availability of factual data.

Level 100 - Introduction
Duration: Half Hour
Presenter: Rai Chowdhary


Understanding Your Processes Through Operational Intelligence

Business processes can get large and complex, so much so that no one person can understand each piece of the puzzle that makes the entire process work. This talk will cover how we put in place ways to capture the events of these processes so that we can more effectively understand and troubleshoot the system. This also allows us to perform further analysis: drilling down into details of the process, spotting time patterns to see trends in how long a certain action takes, root cause analysis, and processing load analysis. This can also help people understand the many steps that go into a business process.
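The event-capture idea reduces to something small: each process step emits a timestamped event, and analysis turns the raw events into per-step durations so slow stages stand out. The event tuples and timestamps below are hypothetical (seconds for a single order, already sorted by time), not the session's actual system.

```python
# Each step of one business process emits (entity_id, step, timestamp).
events = [
    ("order-9", "received",  0.0),
    ("order-9", "validated", 1.5),
    ("order-9", "shipped",   7.5),
]

def step_durations(evts):
    """Duration of a step = gap until the next event for the same entity.
    Assumes the events are for one entity and sorted by timestamp."""
    durations = {}
    for (_, step, t), (_, _, t_next) in zip(evts, evts[1:]):
        durations[step] = t_next - t
    return durations

step_durations(events)  # validation was quick; shipping took 6.0s after it
```

Aggregating these per-step durations over thousands of entities is what surfaces the time patterns and load trends the talk describes.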

Level 200 - (Beginner): Introductory / fast moving
Duration: Hour
Presenter: Scott Heffron


Taking the "So What" Out of Hadoop

Hadoop and other "Big Data" technologies are gaining more traction within enterprises as time progresses, but often these very enterprises find upper management asking tough questions like: "How did the project increase my bottom line?" "How did the project save me money?" "How will this project add to my business capabilities?" "We have it now, so what?!" During this session we'll discuss how these tools should be viewed as a means to an end rather than one more box to check off on the shiny technologies list. We'll discuss some interesting outcomes and how to really justify this new capability with stakeholders, so all may look back and change the "so what" questions to "what's next?"

Level 300 - (Intermediate): Basic knowledge of subject matter is suggested
Duration: Hour
Presenter: Matt Davies


Do I Have a "Big Data" Problem?

A discussion about what exactly qualifies as a "Big Data" problem. Am I getting caught up in the hype? Am I throwing good money after bad? Is this some shiny new trinket I can play with, or do I actually _need_ the capabilities of Big Data tools? There is no clear-cut answer as to what qualifies as a "Big Data" problem, but we'll talk about how to identify certain characteristics which lean toward, and away from, using Big Data. We'll take some of your use cases and discuss how these tools may help, or whether some other tool would be better.

Level 100 - Introduction
Duration: Half Hour
Presenter: Matt Davies



Data Visualization / Story Telling

There are no presentations in this track.

