Kubernetes for Data Scientists

Learn Kubernetes basics that are helpful for data scientists.


You are a data scientist, not an infrastructure engineer. You think you don’t have time to learn Kubernetes. You just want to get your work done and focus on data science, not on infrastructure. I used to feel the same. Then I learned Kubernetes and found that:

  • You don’t need to become an expert or learn all of Kubernetes to enjoy a massive impact on your productivity.
  • You can learn the most important concepts for data science projects in under 10 hours.

In this blog post, we outline why we think learning the fundamentals of Kubernetes (K8s) is highly beneficial. Unfortunately, most courses on Kubernetes focus on operating Kubernetes as an administrator. Those courses are not designed for data scientists. We are going to change that.

We will teach you K8s from the perspective of what you need to know as a data scientist, so you can focus on getting value quickly. Our background as data scientists turned infrastructure engineers allows us to pick out and teach the concepts and skills that matter most.

Course Instructors

Hamel Husain is a data scientist and software engineer. He is currently a fastai core contributor and has been involved with several open-source data infrastructure projects such as Jupyter, Metaflow, and Kubeflow. Hamel has built data science tools and infrastructure at companies such as Airbnb, GitHub, and DataRobot.

Jeremy Lewi is the co-founder and lead engineer of Kubeflow, a popular machine learning workflow system built on Kubernetes. Jeremy also worked on the YouTube recommendation system at Google, where he honed his skills in systems for machine learning. Jeremy is currently incubating his own applied ML startup.

Hamel and Jeremy will lead the instruction of this course with the assistance of the following people:

Michał Jastrzębski is a software and infrastructure engineer at VantAI, where he enables several teams of scientists to perform computational biology on vast amounts of data. Michał is a former OpenStack Kolla project technical lead.

Zander Matheson is the CEO of Bytewax. Bytewax maintains an open-source Python framework for building real-time applications with streaming data, which runs on Kubernetes.

