Four minute papers (inspired by fourminutebooks.com) aims to condense computing white papers down to a four minute summary.

Here is a four minute summary of the white paper on Gorilla, Facebook’s time series database.

Premise for the paper

Back in 2013 Facebook’s monitoring system, an HBase time series db (TSDB), was not scaling sufficiently. Reads from the system were becoming too slow. Since there wasn’t a TSDB solution on the market that addressed their need of storing massive amounts of data in real-time, Facebook developed Gorilla, an in-memory TSDB.

What’s unique about Gorilla

The attributes that set Gorilla apart from other TSDBs were that it’s an in-memory TSDB that functions…


Recently I merged a pull request into Prometheus that adds support to create past data for new recording rules. This work was for Prometheus Issue 11 which was created back in January 2013! Only 8 years later and this issue is finally implemented.

This post explains how to use this new capability.

The term “backfill” means to fill in missing time series data for a specific time range.

Overview

Recording rules allow you to precompute frequently needed or computationally expensive expressions and save their result as a new set of time series (ref: docs). When recording rules are created, their data…


I came across this tweet that asked what are the top 3 tools every Kubernetes cluster should have…

I liked this question a lot; however, I think there are more than 3 that every Kubernetes cluster should have. Here is what I think are the basic tools that every Kubernetes cluster should be running …

Top Tools to Run in Every Kubernetes Cluster

  1. cert-manager
  2. external-dns
  3. cluster autoscaler
  4. metrics server
  5. nginx ingress controller
  6. Datadog Agent for monitoring, alerting, and log aggregation

Details About Each Tool

cert-manager

cert-manager is amazing! If you have it running in a k8s cluster, it will create/renew free TLS certs for any services that show up in the cluster.


I started using CockroachDB about four months ago and it’s been really great; I love the product. Here is a story about how I recently migrated a large database (~400GB) from PostgreSQL over to CockroachDB. This blog is a recap of my process and also of some of the tricky parts I encountered.

Setup Details

CockroachDB version: 19.2.6, Cockroach Cloud running in Google Cloud (GCP), 3 node cluster with 1.5TB capacity

Postgres version: 9.6, GCP CloudSQL

Overview, tl;dr

CockroachDB has documentation on the topic of migrating from Postgres:

The docs are helpful to show the options for…


Four minute papers (inspired by fourminutebooks.com) aims to condense computing white papers down to a four minute summary.

Here goes nothing…four minute paper for MapReduce (original white paper).

High level what is MapReduce: A programming model and an implementation for processing large data sets.

The problem MapReduce solves: Handle the complex problem of processing large datasets that are too big for one machine.

How MapReduce solved that problem: MapReduce is a relatively simple framework for programmers to accomplish the complex task of scaling work in parallel over thousands of machines.

Implementation Details: End users only need to implement two functions…
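To make the two-function idea concrete, here is a minimal single-machine sketch using word counting as the example. The names map_fn, reduce_fn, and run_mapreduce are hypothetical and chosen for illustration only; a real MapReduce runtime would distribute the map and reduce calls across many machines and handle shuffling and failures, none of which is modeled here.

```python
from collections import defaultdict


def map_fn(doc_name, text):
    """User-supplied map: emit an intermediate (word, 1) pair per word."""
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """User-supplied reduce: merge all values emitted for one key."""
    return sum(counts)


def run_mapreduce(inputs, mapper, reducer):
    """Toy in-process driver (an assumption, not the real runtime).

    It applies the mapper to every input pair, groups the intermediate
    values by key, then calls the reducer once per key.
    """
    groups = defaultdict(list)
    for key, value in inputs.items():
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    return {key: reducer(key, values) for key, values in groups.items()}


docs = {"a.txt": "the quick fox", "b.txt": "the lazy dog the end"}
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
```

Only map_fn and reduce_fn are what an end user would write; everything in run_mapreduce stands in for the work the framework does on their behalf.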


There are a bunch of different databases out there. For example, check out this database of databases created by Carnegie Mellon University. At the time of writing it shows 665 different DBMS! When a new database is created, often it’s an iteration of something that already exists, but with a spin to solve some specific problem. I think it’s fun to look back in history and learn the lessons from the past.

A database family tree

There are so many different databases and I’ve struggled keeping them all organized. I wanted a way to organize the options out there and better understand their historical…


Many people ask “What is the difference between a VM and a container?”, but in my opinion a more interesting question is …

“What is the difference between a process and a container?”

When containers started gaining popularity back in 2013, it was common to hear “Containers are like mini VMs”. This statement made sense because people were using containers instead of VMs, but in my opinion a more technically appropriate statement would be to say that a container is a process.

This post will describe what a process is, what a container is, and also what a VM…


Using GKE and Stackdriver Metrics

Autoscaling deployments in Kubernetes is more exciting since HorizontalPodAutoscaler can scale on custom and external metrics instead of simply CPU and memory like before.

In this post I will go over how HPAs work, what’s up with the custom and external metrics APIs, and then go through an example where I configure autoscaling for an application based on external Nginx metrics.

Background

How the Horizontal Pod Autoscaler Works

HPAs are implemented as a control loop. This loop makes a request to the metrics api to get stats on current pod metrics every 30 seconds. Then it calculates if the current pod metrics…
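As a rough sketch of the calculation one pass of that control loop performs, the function below follows the scaling formula given in the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance of 1.0. The function name and parameters are hypothetical, the 0.1 tolerance is the documented default rather than something stated in this post, and min/max replica limits are left out for brevity.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     tolerance=0.1):
    """Return the replica count one HPA evaluation would ask for.

    Assumption: a single metric, already averaged across pods.
    """
    ratio = current_metric / target_metric
    # Skip scaling when the metric is close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


scale_up = desired_replicas(4, current_metric=200, target_metric=100)
no_change = desired_replicas(4, current_metric=105, target_metric=100)
scale_down = desired_replicas(4, current_metric=50, target_metric=100)
```

With 4 pods and a metric at twice its target, the function asks for 8 pods; at half the target, it asks for 2; within the tolerance band, it leaves the count alone.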


Did you ever dream of the day where there would be free TLS certs that were automatically created and renewed when a new service shows up? Well, that day has arrived. If you’ve jumped on the cool train and are running Kubernetes in production, then cert-manager is a must-have. cert-manager is a service that automatically creates and manages TLS certs in Kubernetes and it is as cool as it sounds.

Here are the steps I took to get cert-manager up and running.

Overview of setting up cert-manager

To get this set up in a Kubernetes cluster, there are 3 main moving…


AWS + Kops

Intro

Almost all communication in the Kubernetes (k8s) cluster goes through the k8s API server. In order to access the k8s API server, you first need to authenticate. There are a number of different authentication methods to choose from. Here I will review the steps I took to authenticate with the OpenID Connect Token method with Google as the Identity Provider (IdP). After successful authentication, RBAC (role based access control) is used for authorization to specify what actions the user can perform.

Overview:

Here are the steps I took to get authentication set up with Google OIDC and…

Jessica G

Live simply. Program stuff.
