Home | Learning Snippets
This is the start of a series of posts about two up-and-coming technologies: HTTP/2 and gRPC. The two are closely related and are slowly superseding HTTP/1.1 and conventional REST APIs, respectively. In this first post, I would like to give a brief introduction to HTTP/2 and show how we can enable Golang clients and servers to talk HTTP/2 over both encrypted and unencrypted connections.
When does a pod get evicted, and when does a pod get OOMKilled?
Recently I was working with configmaps to make dynamic configuration changes that would then affect the pod at runtime without restarting it. For this purpose, I mounted the configmap as a volume and used a filewatcher to register changes. It was then that I noticed that k8s configmaps store their data in an interesting fashion.
Webhooks are a way to “intercept” requests to the k8s apiserver on their way from the apiserver’s HTTP handler to persistence in etcd. There are two types of webhooks:
- Admission webhooks (includes mutating and validating webhooks)
- Conversion webhooks
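To make the validating case a bit more concrete, here is a minimal sketch of the decision a validating admission webhook sends back to the apiserver. It only builds the AdmissionReview response body; the HTTP server and TLS setup are left out. The function name review_pod and the policy itself (every object must carry a "team" label) are made up for illustration.

```python
def review_pod(admission_review):
    """Build an AdmissionReview response for a validating webhook.

    The policy (the object must carry a "team" label) is a made-up example.
    """
    request = admission_review["request"]
    labels = request["object"].get("metadata", {}).get("labels") or {}

    allowed = "team" in labels
    # The response must echo the uid of the request it answers.
    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        # The message is shown to the user whose request was rejected.
        response["status"] = {"message": "missing required label: team"}

    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

A validating webhook can only allow or reject a request; a mutating webhook would additionally return a patch in its response.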
In my previous post Writing Controllers for Kubernetes Custom Resources I explored the development process using pure client-go and how it differs from using controller-runtime (and kubebuilder). In this post, I explore how controller-runtime (v0.7.0) uses concepts we know from client-go.
Kubernetes has become omnipresent. Whether you’re part of a development team looking to deploy highly available apps or part of a data science team looking to run machine learning workloads in a scalable way - Kubernetes is often the platform of choice. The ecosystem around Kubernetes has grown considerably, and last year I used a project called Kubeflow a lot. (Kubeflow offers features such as distributed training and workflow orchestration, all running on top of Kubernetes.)
Sometimes, I would take a peek into the cluster to see what was going on, and one of the things I noticed was Kubeflow’s Custom Resource Definitions (CRDs) and their respective controllers. For example, when you create a recurring Kubeflow Pipeline job, you actually create a custom resource of type ScheduledWorkflow under API group/version kubeflow.org/v1beta1 (you can see this easily with kubectl get scheduledworkflow.v1beta1.kubeflow.org). All changes made to this resource are observed by a controller, which is basically a control loop running on Kubernetes that reacts to these changes.
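The control loop idea can be sketched in a few lines. This is a toy illustration, not how Kubeflow's controller or client-go is implemented: the reconcile function and the dictionaries standing in for cluster state are assumptions made for the example. The core of it is comparing the desired state (what the custom resources say) with the observed state (what actually exists) and deriving the actions that close the gap.

```python
def reconcile(desired, observed):
    """Return the actions needed to move the observed state toward the desired state."""
    actions = []
    for name, spec in desired.items():
        if name not in observed:
            actions.append(("create", name))
        elif observed[name] != spec:
            actions.append(("update", name))
    for name in observed:
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical state: one scheduled job is declared, nothing exists yet.
desired = {"nightly-run": {"cron": "0 0 * * *"}}
observed = {}
actions = reconcile(desired, observed)  # [("create", "nightly-run")]
```

A real controller runs this comparison every time it is notified of a change to the resource, rather than once.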
Before an algorithm can be applied for sentiment classification, texts need to be vectorized. There are several different ways to do so - in this post, I try out TF-IDF.
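As a rough illustration of what TF-IDF vectorization does, here is a minimal sketch using only the standard library. It assumes the plain textbook variant (term frequency = count / document length, inverse document frequency = ln(N / df)) and whitespace tokenization; libraries usually add smoothing and normalization on top of this, so their numbers will differ. The sample sentences are made up.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = ln(N / df),
    an unsmoothed textbook variant.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up sample texts.
reviews = [
    "great food great service",
    "terrible food",
    "great atmosphere",
]
vectors = tf_idf(reviews)
```

In the second text, "terrible" receives a higher weight than "food" because it occurs in only one document, which is the property that makes TF-IDF useful for picking out distinctive words.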
Two weeks ago, I attended PyCon.DE. It was an amazing experience; the talks and the people I met there have become an additional source of motivation and inspiration for me. From the people who worked as data scientists, I heard a lot about containerization and about working with open source projects like Dask or Arrow. Training models in Jupyter Notebook is only a tiny part of a data scientist’s daily work - handling infrastructure requirements is often more important. Below is a list of talks about topics I would like to try out in the future or read about in more detail:
The first talk introduced the NLP library spaCy, which I chose to try out in my next project. Coincidentally, the Yelp company booth was advertising their Yelp Dataset Challenge, and one of their available datasets contained user reviews. And so, my first NLP project was born: sentiment analysis based on Yelp review data.
My colleagues organized a darts tournament at around the same time I watched the CS231n lecture about reinforcement learning. Of course, this could only end in one way - in a small reinforcement learning toy example.
Part 14 of Stanford’s CS231n lecture series was about Reinforcement Learning. The content of the slides was very dense, so I looked up a couple of the concepts to understand them better.