Tuesday, April 23, 2019

Debugging awslabs' aws-service-operator with go delve on vscode

Currently, I'm doing a lot of work in Kubernetes, especially around operators. One operator in particular that I am working on is aws-service-operator from awslabs. We ran into a bug with the default behavior around the dynamodb CR: the default cloudformation template hard-codes the RangeAttributeType to String, even though the operator supports string, number, and bytes.


I know this is a bug; the highlighted text from the click-through clearly shows it. But how do I verify the bug? My environment is a MacBook Pro with vscode and all the Go tools extensions.

So let's set up the debug environment:

First I need to set up the repo itself:
mkdir -p $GOPATH/src/github.com/awslabs
cd $GOPATH/src/github.com/awslabs
git clone git@github.com:awslabs/aws-service-operator.git



Now let's follow the development guideline and build the environment outside of vscode (getting dep and everything working)



$> code aws-service-operator   # the code command is installed by vscode so you can launch it from the command line

Click the Debug menu, click Add Configuration, and paste the following.

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Launch",
            "type": "go",
            "request": "launch",
            "mode": "debug",
            "remotePath": "",
            "port": 2345,
            "host": "127.0.0.1",
            "program": "${workspaceRoot}/cmd/aws-service-operator",
            "env": {},
            "args": ["server","--kubeconfig=/Users/dathan.pattishall/.kube/config", "--region=us-west-2", "--cluster-name=dathan-eks-cluster", "--resources=s3bucket,dynamodb,sqs", "--bucket=wek8s-dathan-aws-service-operator", "--default-namespace=system-addons"],
            "showLog": true
        }
    ]
}


Click the Debug menu and click Start Debugging. This assumes that you're using SAML for AWS auth and that your role is an admin with at least the IAM EKSWorkerNodeRole. If you are using AWS-Admin like I am, you are good.


Now let's start debugging. Put a breakpoint at line 101 of pkg/helpers/helpers.go and step into:

resource, err := clientSet.CloudFormationTemplates(cNamespace).Get(cName, metav1.GetOptions{})

You'll see that the application makes a call to itself to try to get the cloudformation templates you installed. If you didn't install any cloudformation template called dynamodb the default will be used:

https://s3-us-west-2.amazonaws.com/cloudkit-templates/dynamodb.yaml


This is where the bug is. The cloudformation yaml has a bug where it does not ref the Hash or Range Attribute Types and the workaround is to install a cloudformation CR.


apiVersion: service-operator.aws/v1alpha1
kind: CloudFormationTemplate
metadata:
  name: dynamodb
output:
  url: "https://s3-us-west-2.amazonaws.com/a-temp-public-test/dynamodb.yaml"

output.url points to the CloudFormationTemplate, with a data field that defines the cloudformation template. I can only surmise that, to keep common code paths, they make extra API calls for reusability: even though the aws-service-operator already has the CloudFormationTemplate CR, it still needs to fetch the template remotely because of how the code is constructed, which makes for redundant calls. You'll see this in the debugger: it makes an API call to itself, parses the YAML, then fetches the YAML from a remote endpoint.

So what we see here is that the operator needs to pull the CR from a REST or HTTP endpoint even though it is already defined in K8s itself.

The fix to the bug is as follows.

From:



          AttributeDefinitions:
            -
              AttributeName: !Ref HashAttributeName
              AttributeType: "S"
            -
              AttributeName: !Ref RangeAttributeName
              AttributeType: "S"


To:

          AttributeDefinitions:
            -
              AttributeName: !Ref HashAttributeName
              AttributeType: !Ref HashAttributeType
            -
              AttributeName: !Ref RangeAttributeName
              AttributeType: !Ref RangeAttributeType



In addition to this, awslabs uses N as a value. Unquoted, this value means false in YAML, because YAML 1.1 treats y/n, yes/no, and on/off as booleans. Thus, in the yaml passed to create a dynamodb table, you need to quote it.

So in the end, I need the following yaml to create the dynamodb table which I use to test the operator:

kind: DynamoDB
metadata:
  name: sample-tablename
spec:
  hashAttribute:
    name: AuthorizationCode
    type: "S"
  rangeAttribute:
    name: CreatedAt
    type: "N"
  readCapacityUnits: 10
  writeCapacityUnits: 10



Notice that the "S" is quoted along with the "N"; otherwise, N equates to false.
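The quoting rule can be checked with a small Go sketch. It uses no YAML library; it just lists the plain scalars that the YAML 1.1 bool type resolves to true or false and checks the three DynamoDB key types against that list. The names yaml11Bools and resolvesToBool are made up for illustration. It also assumes the parser in the path follows YAML 1.1 for single letters, which matches what I saw; not every parser does.

```go
package main

import "fmt"

// yaml11Bools lists the unquoted scalars that the YAML 1.1 bool type
// resolves to a boolean, mapped to the value they resolve to.
var yaml11Bools = map[string]bool{
	"y": true, "Y": true, "yes": true, "Yes": true, "YES": true,
	"true": true, "True": true, "TRUE": true,
	"on": true, "On": true, "ON": true,
	"n": false, "N": false, "no": false, "No": false, "NO": false,
	"false": false, "False": false, "FALSE": false,
	"off": false, "Off": false, "OFF": false,
}

// resolvesToBool reports whether an unquoted scalar would be read as a
// boolean under YAML 1.1 and, if so, which boolean.
func resolvesToBool(scalar string) (value bool, isBool bool) {
	value, isBool = yaml11Bools[scalar]
	return value, isBool
}

func main() {
	// S, N and B are the attribute types allowed for DynamoDB key attributes.
	for _, t := range []string{"S", "N", "B"} {
		if v, ok := resolvesToBool(t); ok {
			fmt.Printf("%s: unquoted it parses as %v, so quote it\n", t, v)
		} else {
			fmt.Printf("%s: stays a string\n", t)
		}
	}
}
```

Only N is caught by the list, but quoting all of the type values keeps the manifest consistent.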






In conclusion: Delve is awesome, the operator has a bug, and I was able to figure it out with this debugging method and produce this issue: https://github.com/awslabs/aws-service-operator/issues/181
