Clue Digital and CompileCrew:
Highly available and scalable data science infrastructure
5 minute read | Automated DevOps
Clue Digital, an ad-tech platform that helps marketers interpret their ad data, approached CompileCrew to build a highly available and scalable JupyterHub setup. The goal was to let their users run, scale, and share data science workloads efficiently. By leveraging Dask Hub, a managed service that provides a scalable environment for parallel computing with Dask and JupyterHub, Clue gave its users shared notebook access and made collaborative data science work easier. The project also integrated HashiCorp Vault to manage secrets, encryption keys, and other sensitive data securely. Continuous integration and code quality assessment were handled through Jenkins and SonarQube.
The ClueDev project provides a way to create and manage multi-user Jupyter notebook environments. With ClueDev, teams, classes, or organizations can set up shared computing environments to collaborate on data analysis, visualization, and machine learning projects. The setup improves productivity and collaboration while remaining scalable, secure, and cost-efficient.