DATABRICKS CENTER OF EXCELLENCE
SQL to Databricks Quickstart
![](png/sqlserverlogo.png)
Benefits
- Accelerate high-value analytics initiatives and gain experience with Databricks.
- Get up and running on Databricks in an accelerated timeline.
- Complete one (1) SQL workload migration and analyze with SQL analytics and BI tools.
- Blueprint accelerators speed time to value and deliver actionable insights.
Quickstart Overview
- Introduce: a data team to analytics on Databricks
- Prepare: Databricks & data services in a non-production environment
- Complete: 1 SQL workload migration
- Analyze: with SQL analytics and BI tools
Quickstart Timeline
Stage 1
Lakehouse 101
- Data acquisition
- Simple data transformations (sketched below)
- Organizing data
- Security
- BI & reporting
- Cost optimization & management
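To make Stage 1 concrete, here is a minimal PySpark sketch of the acquire-transform-organize flow. It assumes a Databricks notebook (where `spark` is predefined); the storage path, catalog, and table names are illustrative placeholders, not Quickstart deliverables.

```python
# Stage 1 in miniature: acquire a raw file, apply a simple transformation,
# and organize the result as a governed Delta table. The storage path and
# table names below are hypothetical placeholders.
from pyspark.sql import functions as F

# Data acquisition: read a raw CSV landed in cloud storage (hypothetical path)
raw = (spark.read
       .option("header", "true")
       .option("inferSchema", "true")
       .csv("abfss://landing@examplestorage.dfs.core.windows.net/sales/"))

# Simple data transformation: typed date column, exact duplicates removed
clean = (raw
         .withColumn("order_date", F.to_date("order_date"))
         .dropDuplicates())

# Organizing data: persist as a Delta table the whole team can query
(clean.write
 .format("delta")
 .mode("overwrite")
 .saveAsTable("main.quickstart.sales_orders"))
```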
Stage 2-3
Up & running
- Implementation → Powered by Infra-as-Code
- Security Config → Blueprint security rapid configuration
- Lakehouse optimization → Blueprint Lakehouse Optimizer
Stage 4-8
Data pipelines
- Identify data sources
- Historical & current data
- Data transformations
- Data quality (see the pipeline sketch below)
- Scheduled dataset creation
- Tables in the Lakehouse, ready!
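The shape of that one-workload migration pipeline can be sketched as follows. Server, secret scope, table, and column names are hypothetical; recent Databricks Runtime versions bundle the Microsoft SQL Server JDBC driver, and credentials are read from a Databricks secret scope rather than hard-coded.

```python
# One SQL workload migration in outline: read a table from SQL Server over
# JDBC, apply a basic data-quality rule, and land it in Delta.
# Hostname, secret scope, and object names are hypothetical placeholders.
jdbc_url = "jdbc:sqlserver://sql-prod.example.com:1433;databaseName=Sales"

orders = (spark.read
          .format("jdbc")
          .option("url", jdbc_url)
          .option("dbtable", "dbo.Orders")
          .option("user", dbutils.secrets.get("quickstart", "sql-user"))
          .option("password", dbutils.secrets.get("quickstart", "sql-password"))
          .load())

# Data quality: quarantine failing rows instead of dropping them silently
valid = orders.filter("order_id IS NOT NULL AND amount >= 0")
rejects = orders.exceptAll(valid)
rejects.write.format("delta").mode("append").saveAsTable("main.quickstart.orders_rejects")

# Historical & current data: full load shown here; incremental runs would
# filter on a watermark column instead of overwriting
valid.write.format("delta").mode("overwrite").saveAsTable("main.quickstart.orders")
```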
Stage 8-10
BI & analysis, optimization, & roadmap
- Power BI or Tableau
- DBSQL for ad hoc analysis (example below)
- Dashboards & reports
- Utilization management
- Roadmap the future
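For the ad hoc analysis step, a query like the following would slice the hypothetical orders table from the earlier sketch; it is shown via `spark.sql`, but the same SQL runs unchanged in a DBSQL query or behind a Power BI / Tableau dataset.

```python
# Ad hoc analysis of the migrated (hypothetical) orders table:
# monthly order counts and revenue, ready for a dashboard or report.
monthly = spark.sql("""
    SELECT date_trunc('month', order_date) AS month,
           count(*)                        AS orders,
           sum(amount)                     AS revenue
    FROM main.quickstart.orders
    GROUP BY 1
    ORDER BY 1
""")
display(monthly)  # Databricks notebook rendering; use monthly.show() elsewhere
```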
![](png/databricks-flow.png)
Deployment journey
AWS or Azure readiness
- Account with resource-creation rights
- Terraform installed
Quickstart data sources
- Cloud-accessible services identified
- Read-only account
Scripted build
- Resource / Admin groups
- Storage, Databricks workspace, clusters
- Unity Catalog / Metastore / Permissions (sketched below)
- Blueprint Lakehouse Optimizer
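The Quickstart provisions these objects with Terraform. As a rough illustration of what the permissions step amounts to, the equivalent Unity Catalog SQL can be issued from a notebook; the catalog, schema, and group names are placeholders.

```python
# Equivalent Unity Catalog SQL for the permissions portion of the scripted
# build (normally applied by Terraform). Names are hypothetical placeholders.
for stmt in [
    "CREATE CATALOG IF NOT EXISTS quickstart",
    "CREATE SCHEMA IF NOT EXISTS quickstart.bronze",
    "GRANT USE CATALOG ON CATALOG quickstart TO `data-analysts`",
    "GRANT USE SCHEMA, SELECT ON SCHEMA quickstart.bronze TO `data-analysts`",
]:
    spark.sql(stmt)
```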
Data ingestion pipeline
- 1 SQL workload migrated
Workflows active
- Jobs/workflows enabled and scheduled (see the SDK sketch below)
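As an illustration, a scheduled workflow can be enabled with the Databricks SDK for Python (`pip install databricks-sdk`); the notebook path, cluster ID, and cron expression below are placeholders.

```python
# Sketch: create a nightly scheduled job with the Databricks SDK for Python.
# Notebook path, cluster ID, and schedule are hypothetical placeholders.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()  # reads host/token from the environment or a config profile

job = w.jobs.create(
    name="quickstart-nightly-ingest",
    tasks=[
        jobs.Task(
            task_key="ingest_orders",
            notebook_task=jobs.NotebookTask(notebook_path="/Workspace/quickstart/ingest_orders"),
            existing_cluster_id="1234-567890-abcdefgh",
        )
    ],
    schedule=jobs.CronSchedule(
        quartz_cron_expression="0 0 2 * * ?",  # 02:00 daily, Quartz syntax
        timezone_id="UTC",
    ),
)
print(f"Created job {job.job_id}")
```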
Databricks is LIVE!
- SQL analysis workshop
- Power BI / Tableau
- Engineering & workflow demos
Lakehouse optimized
- Monitor jobs
- Understand costs (see the system-tables sketch below)
- Identify orphaned workloads
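The Quickstart relies on Blueprint Lakehouse Optimizer for this stage; as a raw point of comparison, cost signals can also be pulled directly from Databricks system tables, assuming system tables are enabled for the workspace.

```python
# Cost visibility from Databricks system tables: DBU consumption by day and
# SKU. Assumes Unity Catalog system tables are enabled in the workspace.
usage = spark.sql("""
    SELECT usage_date,
           sku_name,
           sum(usage_quantity) AS dbus
    FROM system.billing.usage
    GROUP BY 1, 2
    ORDER BY 1 DESC
""")
usage.show(truncate=False)
```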
Deliverables
Build a net-new use case and validate the cost, performance, and usability of the Databricks platform.
- TCO and performance projection report
- Established Databricks Lakehouse environment
- Data ingestion pipeline
- Platform utilization monitoring app
- Working end-to-end use case/business process deployed to a non-production environment
- Results report and demonstration