Background
In late 2021, a telecommunications company with a market cap measured in billions of dollars retained my employer to help it make better use of an anomaly detection service. The service identifies anomalous results in previously ingested data. The client's team used Databricks as a central component of their analytics platform and asked the consulting team to use it as well.
Solution
My team designed a process that accomplished the following:
- Automated ingestion of data via structured API calls
- Data transformation in PySpark, executed as a Databricks job
- Data loading to external storage for consumption by a Power BI dashboard
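As an illustration of the first step only, here is a minimal sketch of how structured, parameterized API calls might be built for ingestion. It is not the code used on the project: the endpoint URL, the query parameter names (metric, start, end), the bearer-token authentication, and the helper names are all hypothetical, since the actual service and its API are not described here. The sketch only constructs the requests and does not send them.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; the real service URL is not given in this writeup.
BASE_URL = "https://anomaly.example.com/v1/results"


@dataclass(frozen=True)
class IngestWindow:
    """One unit of ingestion: a metric and a half-open date range."""
    metric: str
    start: date
    end: date


def daily_windows(metric: str, first_day: date, days: int) -> list[IngestWindow]:
    """Split a date range into consecutive one-day windows."""
    return [
        IngestWindow(
            metric,
            first_day + timedelta(days=i),
            first_day + timedelta(days=i + 1),
        )
        for i in range(days)
    ]


def build_request(window: IngestWindow, token: str) -> Request:
    """Build (but do not send) a GET request for one window.

    Bearer-token auth is an assumption; other schemes would change
    only the headers set here.
    """
    query = urlencode(
        {
            "metric": window.metric,
            "start": window.start.isoformat(),
            "end": window.end.isoformat(),
        }
    )
    return Request(
        f"{BASE_URL}?{query}",
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )


if __name__ == "__main__":
    for w in daily_windows("dropped_calls", date(2021, 11, 1), 3):
        print(build_request(w, "dummy-token").full_url)
```

Keeping request construction separate from sending makes the parameterization easy to check without network access, and confines any change of authentication scheme to a single function.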
Lessons Learned
- Parameterization of API calls can work, depending on the authentication method
- Databricks is a very powerful tool and something to focus on as I begin my career
- There is a very thin line that separates
Process
Outcomes
Skills Used
- Statistical Pedagogy
- Software Engineering
- Networking and Signal Processing
- Integration of Feedback