Wednesday, October 17 • 9:00am - 9:20am
Designing automated pipelines for unseen custom data

Machine learning applications at Salesforce use a wide variety of customer data that is highly customizable. In this talk, I discuss some challenges of designing automated machine learning pipelines that can deal with custom user data they have never seen before, as well as some of our solutions. Examples include statistical tests between training and scoring data sets to help with the cold start problem, algorithms to throw out features that are "too good" because they are derived from the label we're trying to predict, and data-dependent feature engineering steps such as automatically determining buckets for numeric variables and detecting categorical variables encoded as other data types.
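
As a rough illustration of one idea from the abstract, discarding features that are "too good", the sketch below flags numeric features whose correlation with the label is suspiciously close to perfect. This is not the method presented in the talk: the Pearson-correlation criterion, the 0.95 threshold, and every name here (`flag_leaky_features`, `stage_code`, `employees`) are assumptions made only for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence is constant, since the
    correlation is undefined in that case.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def flag_leaky_features(features, label, max_abs_corr=0.95):
    """Return the names of features that track the label too closely.

    `features` maps a feature name to its list of numeric values.
    The threshold is an assumed default, not a recommended value.
    """
    return sorted(
        name
        for name, values in features.items()
        if abs(pearson(values, label)) > max_abs_corr
    )


# Hypothetical toy data: `stage_code` is simply 3 * label, so it leaks the label.
label = [0, 1, 0, 1, 1, 0, 1, 0]
features = {
    "stage_code": [0, 3, 0, 3, 3, 0, 3, 0],
    "employees": [10, 12, 50, 7, 30, 45, 8, 22],
}

flagged = flag_leaky_features(features, label)
print(flagged)  # ['stage_code']
```

A single correlation cutoff is only a starting point. It misses nonlinear or categorical leakage, and a genuinely strong predictor can be flagged by mistake, so in practice the flagged features would be candidates for review.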

Kevin Moore

Sr. Data Scientist, Salesforce
Kevin is a senior data scientist at Salesforce, where he works on automated machine learning pipelines to generate and deploy customized models for a wide variety of customers and use cases. He has a PhD in astrophysics, and prior to becoming a data scientist he worked on modeling how...

Wednesday October 17, 2018 9:00am - 9:20am
Horace Mann

