Beyond Hadoop, Apache Spark has emerged as the Big Data analytics platform of choice for many companies. And while Spark is available on Azure HDInsight as a specialized cluster type, a new Spark service from Microsoft and Databricks (the company founded by Spark's creators) has arrived.
That service, Azure Databricks (ADB), is in public preview as of this writing and may well be generally available by the time of this session. Geared towards both analytics and machine learning/AI, ADB lets developers work in notebooks, either offline or interactively with running clusters, and lets those notebooks execute as production jobs on a schedule, starting up Spark clusters on demand and shutting them down when the work is done. This session covers the concepts, service mechanics, and code (in Python, R, and/or Scala) you need to do analytics, create dashboards, and train machine learning models on Azure Databricks.
You will learn:
- About the fundamentals of Apache Spark, Spark SQL and Spark MLlib
- How to use Databricks notebooks and dashboards
- How to manage clusters, serverless pools and jobs
- How to integrate Azure Databricks with blob storage and other Azure services
- How to write Python and R code for analytics and machine learning