The Oracle Data Science Service is a fully managed, self-service platform for data science teams to build, train, and manage machine learning (ML) models in Oracle Cloud Infrastructure. This lab introduces the Accelerated Data Science (ADS) SDK and shows how it can speed up your workflow and make you more productive. In this module, we will build a binary classification model to predict employee attrition. Using the ADS SDK, we will perform an exploratory data analysis (EDA) to understand the nature and distribution of the data. We will visualize the data and assess the correlation between predictors. The Oracle AutoML tools will be used to train and automatically tune LightGBM (Light Gradient Boosting Machine), XGBoost, Random Forest, and Logistic Regression classifiers. These models will be evaluated and compared using the ADS model evaluation tools. Once the best model is selected, we will use the machine learning explainability (MLX) tools to explain the global and local behavior of the model. That is, we will see which features are important in the model using feature permutation importance, partial dependence plots (PDP), individual conditional expectation (ICE) plots, and several other methods that show why the model made a particular prediction.
Workshop Duration: 4 hours
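
To give a feel for feature permutation importance before the lab uses the MLX tools, here is a minimal, standard-library-only sketch of the idea. It is not the ADS or MLX API: the `permutation_importance` helper, the synthetic two-feature dataset, and the threshold `predict` function are all hypothetical stand-ins chosen for illustration. The technique shuffles one feature column at a time and measures how much the model's accuracy drops; a large drop suggests the model relies on that feature.

```python
import random


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Return (baseline accuracy, mean accuracy drop per feature).

    Hypothetical helper for illustration; not part of ADS or MLX.
    """
    rng = random.Random(seed)

    def accuracy(rows):
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)

    baseline = accuracy(X)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(shuffled))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Synthetic data (an assumption, not the workshop dataset):
# feature 0 fully determines the label, feature 1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(500)]
y = [int(row[0] > 0.5) for row in X]


def predict(row):
    """Toy 'model' that looks only at feature 0."""
    return int(row[0] > 0.5)


baseline, importances = permutation_importance(predict, X, y)
print("baseline accuracy:", baseline)
print("importance of feature 0:", importances[0])
print("importance of feature 1:", importances[1])
```

Because the toy model ignores feature 1, shuffling it leaves accuracy unchanged, while shuffling feature 0 degrades accuracy substantially. A real workflow would apply the same idea to a trained classifier and a held-out dataset.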