-
Logging With AWS Kubernetes EKS Cluster
Logs
EKS is the managed Kubernetes offering from AWS that saves you the stress of running your own control plane, with the trade-off of giving up some visibility into what goes on inside it. Control plane logging was not available when the service went GA, but it has since been added. Here are the kinds of logs that it provides:
- API server component logs: You know that component of your cluster that validates requests, serves the REST API endpoints, and so on? These are the logs from the apiserver, and they are critical when diagnosing things like why your pods are not being created, admission controller issues, etc.
```
E0523 03:27:22.258958 1 memcache.go:134] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
```
- Audit logs: People make changes in your cluster and you want to know who did what and when. These logs give you the ability to do that.
```json
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1beta1",
  "metadata": {
    "creationTimestamp": "2019-05-23T02:08:34Z"
  },
  "level": "Request",
  "timestamp": "2019-05-23T02:08:34Z",
  "auditID": "84662c40-8d4f-4d3e-99b2-0d4005e44375",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/services/kubernetes",
  "verb": "get",
  "user": {
    "username": "system:apiserver",
    "uid": "2d8ad7ed-25ed-4f37-a2f0-416d2af705e9",
    "groups": [
      "system:masters"
    ]
  },
  "sourceIPs": [
    "::1"
  ],
  "userAgent": "kube-apiserver/v1.12.6 (linux/amd64) kubernetes/d69f1bf",
  "objectRef": {
    "resource": "services",
    "namespace": "default",
    "name": "kubernetes",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "metadata": {},
    "code": 200
  },
  "requestReceivedTimestamp": "2019-05-23T02:08:34.498973Z",
  "stageTimestamp": "2019-05-23T02:08:34.501446Z",
  "annotations": {
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": ""
  }
}
```
- Authenticator logs: EKS uses a component called aws-iam-authenticator to, you guessed it, authenticate against the EKS cluster using AWS credentials and IAM roles. These logs contain events from those activities.
```
time="2019-05-16T22:19:48Z" level=info msg="Using assumed role for EC2 API" roleARN="arn:aws:iam::523447765480:role/idaas-kubernetes-cluster-idauto-dev-masters-role"
```
- Controller manager logs: For those familiar with Kubernetes objects such as Deployments, ReplicaSets, etc., these are managed by controllers that ship with the Kubernetes controller manager. To see what these controllers are doing under the hood, you need these logs.
```
E0523 02:07:55.486872 1 horizontal.go:212] failed to compute desired number of replicas based on listed metrics for Deployment/routing/rapididentity-default-backend: failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
```
- Scheduler logs: This component of the control plane does what its name says: it puts pods on the right node after factoring in a number of constraints and the resources available. To see how this component makes its decisions, check these logs.
Enabling Logs
You can easily enable these logs in the EKS console, and AWS updates your cluster so that the logs ship to CloudWatch. The corresponding CloudWatch log group will be displayed in your console. For those using Terraform to provision their cluster, you can simply pass in the types of logs that you want to enable and also create the log group to ship them to.
resource "aws_eks_cluster" "my_cluster" { depends_on = ["aws_cloudwatch_log_group.eks_log_group"] enabled_cluster_log_types = ["api", "audit"] name = "${var.cluster_name}" # ... other configuration ... } - API server component logs: You know that component of your cluster that validates requests, provides api rest endpoint and so on? These are the logs from the apiserver which are very critical when trying to diagnose things like why your pods are not creating, admission controller issues etc.
-
Setting Up Jenkins As Code
OK, so our goal here is to deploy Jenkins with the click of a button, with our jobs configured and all. Our secret sauce for this will be the Jenkins Configuration as Code plugin (JCasC), which allows you to define your Jenkins setup in a YAML file or folder. The problem is, we want to use JCasC to configure Jenkins, but we need the JCasC plugin installed ahead of time for it to do that for us. Thankfully, there is a solution: we will use Jenkins' built-in process to install plugins.
Install plugins
```
workflow-aggregator:latest
blueocean:latest
pipeline-maven:latest
configuration-as-code-support:latest
job-dsl:latest
```
For those installing Jenkins using Kubernetes, you will need to update your Helm values file accordingly. Now, let's crank things up;
```
#plugins.txt
workflow-aggregator:2.6
blueocean:1.16.0
pipeline-maven:3.6.11
configuration-as-code-support:1.14
job-dsl:1.74
workflow-job:2.32
credentials-binding:1.18
git:3.10.0
```

Build and Configure
```yaml
jenkins:
  systemMessage: "I did this using Jenkins Configuration as Code Plugin \n\n"
tool:
  git:
    installations:
      - home: "git"
        name: "Default"
  maven:
    installations:
      - name: "Maven 3"
        properties:
          - installSource:
              installers:
                - maven:
                    id: "3.5.4"
jobs:
  - script: >
      pipelineJob('pipeline') {
          definition {
              cpsScm {
                  scriptPath 'Jenkinsfile'
                  scm {
                      git {
                          remote { url 'https://github.com/mkrzyzanowski/blog-001.git' }
                          branch '*/docker-for-mac'
                          extensions {}
                      }
                  }
              }
          }
      }
```
These are the plugins that we are trying to install as well as how we want our Jenkins set up. Here is the Docker image build that takes care of installing them for us.
```
#Dockerfile
FROM jenkins/jenkins:lts
COPY plugins.txt /usr/share/jenkins/ref/plugins.txt
RUN /usr/local/bin/install-plugins.sh < /usr/share/jenkins/ref/plugins.txt
```
Once the image has been built, we need a way to let JCasC know the location of our configuration file (named jenkins.yaml in most cases). There are a few ways to do this:
- Copy the jenkins.yaml file to /var/jenkins_home/; JCasC looks for it there by default
- Use the CASC_JENKINS_CONFIG environment variable to point to the configuration location, which can be any of these (see the sketch after this list);
- A file path (/my/path/jenkins.yaml)
- A folder path (/my/path/jenkins_casc_configs/)
- A configuration file URL (https://example.com/git/jenkins.yaml)
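For example, if you go the environment variable route with Docker, it could look something like this; the image name and the mounted path are just illustrative placeholders:

```bash
# point JCasC at a folder of config files via CASC_JENKINS_CONFIG
# (my_jenkins_image and the host path are hypothetical)
docker run --name jenkins -d -p 8081:8080 \
  -e CASC_JENKINS_CONFIG=/var/jenkins_conf/ \
  -v $(pwd)/casc_configs:/var/jenkins_conf \
  my_jenkins_image
```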
For this example, I will mount the jenkins.yaml to /var/jenkins_home with docker
```
$ docker run --name jenkins -d -p 8081:8080 -v $(pwd):/var/jenkins_home my_jenkins_image
Running from: /usr/share/jenkins/jenkins.war
webroot: EnvVars.masterEnvVars.get("JENKINS_HOME")
May 08, 2019 12:00:19 AM org.eclipse.jetty.util.log.Log initialized
INFO: Logging initialized @612ms to org.eclipse.jetty.util.log.JavaUtilLog
May 08, 2019 12:00:19 AM winstone.Logger logInternal
INFO: Beginning extraction from war file
May 08, 2019 12:00:40 AM org.eclipse.jetty.server.handler.ContextHandler setContextPath
WARNING: Empty contextPath
May 08, 2019 12:00:40 AM org.eclipse.jetty.server.Server doStart
INFO: jetty-9.4.z-SNAPSHOT; built: 2018-08-30T13:59:14.071Z; git: 27208684755d94a92186989f695db2d7b21ebc51; jvm 1.8.0_212-8u212-b01-1~deb9u1-b01
May 08, 2019 12:00:47 AM org.eclipse.jetty.webapp.StandardDescriptorProcessor visitServlet
INFO: NO JSP Support for /, did not find org.eclipse.jetty.jsp.JettyJspServlet
May 08, 2019 12:00:47 AM org.eclipse.jetty.server.session.DefaultSessionIdManager doStart
INFO: DefaultSessionIdManager workerName=node0
May 08, 2019 12:00:47 AM org.eclipse.jetty.server.session.DefaultSessionIdManager doStart
INFO: No SessionScavenger set, using defaults
May 08, 2019 12:00:47 AM org.eclipse.jetty.server.session.HouseKeeper startScavenging
INFO: node0 Scavenging every 660000ms
Jenkins home directory: /var/jenkins_home found at: EnvVars.masterEnvVars.get("JENKINS_HOME")
May 08, 2019 12:00:50 AM org.eclipse.jetty.server.handler.ContextHandler doStart
INFO: Started w.@a50b09c{Jenkins v2.164.2,/,file:///var/jenkins_home/war/,AVAILABLE}{/var/jenkins_home/war}
May 08, 2019 12:00:50 AM org.eclipse.jetty.server.AbstractConnector doStart
INFO: Started ServerConnector@5a38588f{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
May 08, 2019 12:00:50 AM org.eclipse.jetty.server.Server doStart
INFO: Started @31513ms
May 08, 2019 12:00:50 AM winstone.Logger logInternal
INFO: Winstone Servlet Engine v4.0 running: controlPort=disabled
May 08, 2019 12:00:53 AM jenkins.InitReactorRunner$1 onAttained
INFO: Started initialization
May 08, 2019 12:02:20 AM hudson.ClassicPluginStrategy createClassJarFromWebInfClasses
WARNING: Created /var/jenkins_home/plugins/job-dsl/WEB-INF/lib/classes.jar; update plugin to a version created with a newer harness
May 08, 2019 12:02:36 AM jenkins.InitReactorRunner$1 onAttained
INFO: Listed all plugins
May 08, 2019 12:02:58 AM jenkins.InitReactorRunner$1 onAttained
INFO: Prepared all plugins
May 08, 2019 12:02:58 AM jenkins.InitReactorRunner$1 onAttained
INFO: Started all plugins
May 08, 2019 12:03:09 AM jenkins.InitReactorRunner$1 onAttained
INFO: Augmented all extensions
May 08, 2019 12:03:10 AM io.jenkins.plugins.casc.impl.configurators.DataBoundConfigurator tryConstructor
INFO: Setting class hudson.plugins.git.GitTool.name = Default
May 08, 2019 12:03:10 AM io.jenkins.plugins.casc.impl.configurators.DataBoundConfigurator tryConstructor
INFO: Setting class hudson.plugins.git.GitTool.home = git
May 08, 2019 12:03:10 AM io.jenkins.plugins.casc.impl.configurators.DataBoundConfigurator tryConstructor
INFO: Setting class hudson.tasks.Maven$MavenInstallation.name = Maven 3
May 08, 2019 12:03:10 AM io.jenkins.plugins.casc.impl.configurators.DataBoundConfigurator tryConstructor
INFO: Setting class hudson.tasks.Maven$MavenInstaller.id = 3.5.4
May 08, 2019 12:03:10 AM io.jenkins.plugins.casc.impl.configurators.DataBoundConfigurator tryConstructor
INFO: Setting class hudson.tools.InstallSourceProperty.installers = [{maven={}}]
May 08, 2019 12:03:10 AM io.jenkins.plugins.casc.impl.configurators.DataBoundConfigurator tryConstructor
INFO: Setting class hudson.tasks.Maven$MavenInstallation.properties = [{installSource={}}]
May 08, 2019 12:03:11 AM io.jenkins.plugins.casc.Attribute setValue
INFO: Setting hudson.model.Hudson@4fbfd7e4.systemMessage = I did this using Jenkins Configuration as Code Plugin
Processing provided DSL script
May 08, 2019 12:03:15 AM javaposse.jobdsl.plugin.JenkinsJobManagement createOrUpdateConfig
INFO: createOrUpdateConfig for pipeline
May 08, 2019 12:03:16 AM io.jenkins.plugins.casc.impl.configurators.DataBoundConfigurator tryConstructor
INFO: Setting class hudson.plugins.git.GitTool.name = Default
May 08, 2019 12:03:16 AM io.jenkins.plugins.casc.impl.configurators.DataBoundConfigurator tryConstructor
INFO: Setting class hudson.plugins.git.GitTool.home = git
May 08, 2019 12:03:16 AM io.jenkins.plugins.casc.Attribute setValue
INFO: Setting hudson.plugins.git.GitTool$DescriptorImpl@7d18607f.installations = [GitTool[Default]]
....
```
Here is a screenshot of our newly configured Jenkins.

Happy Automation!!!
-
Kubernetes Performance And CPU Manager
So you have a workload that is CPU sensitive and you want to optimize things by providing better CPU performance to it; CPU Manager can help. Now, how exactly does it help you? Before we can talk about that, let's try to understand some CFS (Completely Fair Scheduler) slang.
CFS Share
No, this is not like a stock market share; we are talking CPU here. Think about a fixed amount of time that everyone is trying to take a slice of. CPU shares simply express how much of the system's CPU time you have access to.
- CPU Share: This determines your weight when competing for a CPU core under excess load. Let's say two processes (A and B) land on a CPU core and both get allocated 1024 shares each (the default allocation unless you change things); they carry the same weight in terms of time allocation, with each getting 1/2 of the CPU core's time. Now, if we make things interesting and update process B's share to 512, B gets (512/(1024+512)) = 1/3 of the CPU time. One more thing to remember: if process A goes idle, process B can use some of that idle CPU time, provided we only have A and B on the core.
- CPU Period: This is part of CFS bandwidth control and it defines what a period means to the CPU. What is a period? Think of it as the time slice that represents one CPU cycle, usually 100ms (100,000µs) on most systems, and it is expressed as cfs_period_us.
- CPU Quota: A process with a 20ms (20,000µs) quota will get 1/5 of the time during a 100ms CPU period. So the quota is basically how much of the time slice you get to use. You see this variable expressed as cfs_quota_us. You can peek at these values on a node, as shown right after this list.
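These knobs are plain files in the cgroup filesystem, so on a cgroup v1 host you can read the defaults straight off the root cpu cgroup. The exact mount path can vary by distro; treat this as a sketch:

```bash
# root cgroup defaults on a typical cgroup v1 host (path may differ on your system)
cat /sys/fs/cgroup/cpu,cpuacct/cpu.cfs_period_us   # usually 100000 (100ms)
cat /sys/fs/cgroup/cpu,cpuacct/cpu.cfs_quota_us    # -1 means no quota/limit
cat /sys/fs/cgroup/cpu,cpuacct/cpu.shares          # defaults to 1024
```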
OK, enough of the jargon. How does Kubernetes translate a container requesting 100m (0.1 CPU) into shares and quota? You can see the answer below: this Kubernetes Go code explains it all for those interested in how it is done.
```go
// milliCPUToShares converts milliCPU to CPU shares
func milliCPUToShares(milliCPU int64) int64 {
	if milliCPU == 0 {
		// Return 2 here to really match kernel default for zero milliCPU.
		return minShares
	}
	// Conceptually (milliCPU / milliCPUToCPU) * sharesPerCPU, but factored to improve rounding.
	shares := (milliCPU * sharesPerCPU) / milliCPUToCPU
	// for example, 100m gives shares = (100 * 1024) / 1000 = 102
	if shares < minShares {
		return minShares
	}
	return shares
}

// milliCPUToQuota converts milliCPU to CFS quota and period values
func milliCPUToQuota(milliCPU int64) (quota int64, period int64) {
	// CFS quota is measured in two values:
	//  - cfs_period_us=100ms (the amount of time to measure usage across)
	//  - cfs_quota=20ms (the amount of cpu time allowed to be used across a period)
	// so in the above example, you are limited to 20% of a single CPU
	// for multi-cpu environments, you just scale equivalent amounts
	if milliCPU == 0 {
		return
	}
	// we set the period to 100ms by default
	period = quotaPeriod
	// we then convert your milliCPU to a value normalized over a period
	// e.g. 100m gives quota = (100 * 100000) / 1000 = 10000µs = 10ms out of every 100ms
	quota = (milliCPU * quotaPeriod) / milliCPUToCPU
	// quota needs to be a minimum of 1ms.
	if quota < minQuotaPeriod {
		quota = minQuotaPeriod
	}
	return
}
```

CPU Manager and Scheduling
Nice, we have gone through all these terms. The question remains: how does CPU Manager help with my CPU-sensitive workload? Basically, it uses the cpuset feature in Linux to place containers on specific CPUs. It takes a slice of CPUs equal to the requests (or limits) specified in the container, sets it apart, and assigns it to your container, thereby preventing context switching and noisy-neighbour issues. Let's look under the covers; it creates different pools of CPUs as shown below:
- Shared Pool: This is the pool of CPUs that every scheduled container gets assigned to until a decision is made to move it elsewhere.
- Reserved Pool: Remember that your kubelet can reserve CPU, right? Yes, those are these guys. Simply put, CPUs in the shared pool that you cannot touch (you can see the effect of this reservation on a node, as shown after this list).
- Assignable: This is where containers that meet the exclusivity requirement get their CPUs from. They are taken from the CPUs remaining after removing the kubelet's reserved pool. Once assigned to a container, they are removed from the shared pool.
- Exclusive Allocations: This pool contains the cpusets that have been assigned to containers.
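The reserved pool comes from the kubelet's own reservation settings (the systemReserved/kubeReserved values you will see in the kubelet config further down). A quick way to get a feel for it is to compare a node's capacity with what it reports as allocatable; a small sketch, with the node name as a placeholder:

```bash
# Allocatable = node capacity minus what the kubelet reserves for system/kube daemons
# (my-node is a hypothetical node name)
kubectl describe node my-node | grep -A 6 -i allocatable
```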
The next question: who qualifies for the assignable pool? Any Guaranteed container (requests = limits) with an integer CPU count. Yes, integer! Containers like the one below:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
  namespace: mem-example
spec:
  containers:
  - name: memory-demo-ctr
    image: polinux/stress
    resources:
      limits:
        cpu: 1
        memory: "200Mi"
      requests:
        cpu: 1
        memory: "200Mi"
    command: ["stress"]
```
OK, I mentioned that everyone gets assigned to the shared pool at first. What moves them to the exclusive pool? Well, the kubelet does what we call a resync (a configurable kubelet option): it checks the containers in the shared pool every so often and moves those that qualify to the exclusive pool. That means your pod could sit in the shared pool until the next resync. Also, please note that it is possible for the kubelet or system processes to end up running on the exclusive CPU set, because the manager only guarantees exclusivity among pods; other processes on the system are not the kubelet's business.

### Show me the money

Enough theory, let's get dirty. How do we enable this feature on our kubelet? Just enable the CPUManager feature gate and pass in the static policy. I used this in my test kubeadm setup:
```yaml
kind: KubeletConfiguration
featureGates:
  CPUManager: true
cpuManagerPolicy: static
systemReserved:
  cpu: 500m
  memory: 256M
kubeReserved:
  cpu: 500m
  memory: 256M
```
Let's go ahead and create this pod as an example:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  containers:
  - name: myapp-container
    image: busybox
    resources:
      requests:
        memory: "24Mi"
        cpu: "150m"
      limits:
        memory: "28Mi"
        cpu: "160m"
    command: ['sh', '-c', 'echo Hello Kubernetes! && sleep 3600']
```
After creating the pod on a cluster with the CPU Manager feature gate enabled, it gets scheduled onto a node. At this point, it is a Burstable pod with 153 CPU shares (150/1000 * 1024) and a CPU quota of 16000 (160/1000 * 100,000). You can confirm this by looking at the container's cgroup.
```
cat /sys/fs/cgroup/cpu,cpuacct/kubepods/burstable/podf19e6b4b-6eb0-11e9-898e-062ad3dc4fe4/138208e13ba73882fc0a5c06862b7b0bc7f6d3f43116d61ecf2488fae11d6004/cpu.shares
153
cat /sys/fs/cgroup/cpu,cpuacct/kubepods/burstable/podf19e6b4b-6eb0-11e9-898e-062ad3dc4fe4/138208e13ba73882fc0a5c06862b7b0bc7f6d3f43116d61ecf2488fae11d6004/cpu.cfs_quota_us
16000
```
The above pod is a Burstable pod, but to test CPU Manager we need a Guaranteed pod with a whole-number CPU. Once such a pod is scheduled, the kubelet should configure our container runtime to run it on a particular core (or cores) using cpuset.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-myapp-pod
  labels:
    app: myapp
spec:
  containers:
  - name: myapp-container
    image: busybox
    resources:
      requests:
        memory: "38Mi"
        cpu: "1"
      limits:
        memory: "38Mi"
        cpu: "1"
    command: ['sh', '-c', 'echo Hello Kubernetes! && sleep 3600']
```
We can see this in the container's docker inspect output as well as in the cgroup.
```
# docker inspect 59964c06d765 | grep -i cpu
            "CpuShares": 1024,
            "NanoCpus": 0,
            "CpuPeriod": 100000,
            "CpuQuota": 100000,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "1",
            "CpusetMems": "",
            "CpuCount": 0,
            "CpuPercent": 0,

# cat /sys/fs/cgroup/cpuset/kubepods/pod273e9d00-706c-11e9-a529-062ad3dc4fe4/59964c06d7657face0585c9db375d8773dcb1b351a2d7e87204e89a2e47c2b97/cpuset.effective_cpus
1
```
If you look at other burstable containers, they will be restricted to the shared cores.
```
# docker inspect 6ca5ea998f0d | grep -i cpu
            "CpuShares": 153,
            "NanoCpus": 0,
            "CpuPeriod": 100000,
            "CpuQuota": 16000,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "0,2-3",
            "CpusetMems": "",
            "CpuCount": 0,
            "CpuPercent": 0,

# cat /sys/fs/cgroup/cpuset/kubepods/burstable/podd024aac6-706b-11e9-a529-062ad3dc4fe4/539b9b1c4ec3e49fa49b84b22ff4a06058a1c0bd6db57667cb30786927d3a380/cpuset.cpus
0,2-3
```
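Beyond docker inspect and the cgroup filesystem, the static policy also records its decisions in a checkpoint file on the node, which shows the shared (default) cpuset and any exclusive assignments in one place. This assumes the default kubelet root dir, and the JSON layout varies by Kubernetes version:

```bash
# CPU Manager checkpoint kept by the kubelet (path assumes the default --root-dir)
cat /var/lib/kubelet/cpu_manager_state
```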
Helpful Links

- https://stupefied-goodall-e282f7.netlify.com/contributors/design-proposals/node/cpu-manager/
- https://goldmann.pl/blog/2014/09/11/resource-management-in-docker/#_cpu
- https://software.intel.com/en-us/blogs/2018/08/07/cpu-manager-for-performance-sensitive-applications
- https://medium.com/@betz.mark/understanding-resource-limits-in-kubernetes-cpu-time-9eff74d3161b
- https://medium.com/@mcastelino/kubernetes-resource-management-deep-dive-b337ba15359c