Kubernetes Changelog from Audit log

Often we want to ask “what exactly changed about this resource?”, especially during or after an incident.
The usual answer is “check the audit log”.
But the audit log is very verbose and hard to scan, so here is a Ruby rake task that parses the audit log and prints a readable diff for each change. (Customize it to read from the log source of your choice.)

require 'uri'
require 'cgi'
require 'time'
require 'json'
require 'hashdiff' # gem install hashdiff
require 'kennel' # gem install kennel

class Logs
  class << self
    # does not flatten arrays, but we don't need this here
    def flatten_hash(hash)
      hash.each_with_object({}) do |(k, v), h|
        if v.is_a? Hash
          flatten_hash(v).each do |h_k, h_v|
            h["#{k}.#{h_k}".to_sym] = h_v
          end
        else
          h[k] = v
        end
      end
    end

    def clean_for_diff(object, ignore_status:)
      # datadog turns labels like metadata.labels.foo.bar into a nested foo: bar hash
      object.replace flatten_hash object

      # general
      object.delete :"metadata.annotations.deployment.kubernetes.io/revision"
      object.delete :"metadata.annotations.kubectl.kubernetes.io/last-applied-configuration"
      object.delete :"metadata.generation"
      object.delete :"metadata.managedFields"
      object.delete :"metadata.resourceVersion"
      object.delete :"spec.template.metadata.creationTimestamp"

      # status
      if ignore_status
        object.delete_if { |k, _| k.start_with? "status" }
      else
        object.delete :"status.observedGeneration"
      end

      # always return the cleaned hash, Hash#delete alone would return the deleted value
      object
    end
  end
end

namespace :logs do
  desc "show change history for a given resource by parsing the audit log CLUSTER= RESOURCE= [NAMESPACE=] NAME= [DAYS=7] [STATUS=ignore|include]"
  task :changelog do # task name is an assumption, pick whatever fits your Rakefile
    cluster = ENV.fetch("CLUSTER")
    resource = ENV.fetch("RESOURCE")
    name = ENV.fetch("NAME")
    namespace = ENV["NAMESPACE"]
    ignore_status = ((ENV["STATUS"] || "ignore") == "ignore")
    days = Integer(ENV["DAYS"] || "7")

    # get current version to be able to diff the latest update
    result = `kubectl --context #{cluster} get #{resource} #{name} #{"-n #{namespace}" if namespace} -o json --ignore-not-found`
    raise "kubectl failed with exit status #{$?.exitstatus}" unless $?.success?
    if result == ""
      warn "Resource not found, assuming it was deleted"
      current = nil
    else
      current = Logs.clean_for_diff(JSON.parse(result, symbolize_names: true), ignore_status:)
    end

    # build log url
    url = <whatever your log system is>

    # say what we are looking at
    warn "Inspecting #{days} days of logs #{ignore_status ? "ignoring" : "including"} status changes."
    warn url

    # produce diff from logs
    verb_colors = { "update" => :yellow, "delete" => :red, "patch" => :cyan, "create" => :green }
    printer = Kennel::AttributeDiffer.new
    list_logs(url) do |line| # build this method for whatever your log system is
      # skip requests that did not succeed
      status = line.dig(:attributes, :http, :status_code)
      next if status >= 300

      # print what happened
      verb = line.dig(:attributes, :verb)
      time = line.dig(:attributes, :requestReceivedTimestamp).sub(/\..*/, "")
      user = line.dig(:attributes, :user, :username)
      puts(Kennel::Console.color(verb_colors.fetch(verb), "#{time} #{verb} by #{user}"))
      next if verb == "delete"

      # print diff
      previous = Logs.clean_for_diff(line.dig(:attributes, :responseObject), ignore_status:)
      unless current # support looking at deleted resources
        current = previous
        next
      end
      diff = Hashdiff.diff(previous, current, use_lcs: false, strict: false, similarity: 1)
      diff.each { |l| puts printer.format(*l) }
      current = previous
    end
  end
end
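
The task leaves `list_logs` up to you. As a minimal sketch, here is one way it could look if the audit events were exported to a local newline-delimited JSON file. The file-based source, the `path` argument, and the oldest-first file order are all assumptions for illustration; the only thing taken from the task is that each yielded line is a hash with symbol keys. Because the loop starts from the live object and assigns `current = previous` after every event, it appears to expect events newest first, so the sketch reverses the file.

```ruby
require 'json'

# Hypothetical list_logs: reads one JSON audit event per line from a local file.
# Assumes the file is written oldest-first and yields the events newest-first.
def list_logs(path)
  File.readlines(path, chomp: true).reverse_each do |raw|
    next if raw.strip.empty? # tolerate blank lines in the export
    yield JSON.parse(raw, symbolize_names: true)
  end
end
```

With a real log backend you would page through its API instead, but the contract stays the same: yield one parsed event at a time.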

And you get a nice diff like this
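
The exact rendering depends on Kennel's printer. As a rough, dependency-free illustration of what the loop computes (not the author's output), the sketch below walks made-up events backwards and diffs each event's `responseObject` against the state seen in the previous iteration. `simple_diff` and `walk_backwards` are hypothetical helpers standing in for `Hashdiff.diff` and the task body, and the sample data is invented.

```ruby
# Same idea as Logs.flatten_hash in the task: nested hashes become dotted symbol keys.
def flatten_hash(hash)
  hash.each_with_object({}) do |(k, v), h|
    if v.is_a?(Hash)
      flatten_hash(v).each { |h_k, h_v| h[:"#{k}.#{h_k}"] = h_v }
    else
      h[k] = v
    end
  end
end

# Hypothetical, simplified stand-in for Hashdiff.diff: [key, before, after] per changed key.
def simple_diff(before, after)
  (before.keys | after.keys).sort.filter_map do |key|
    [key, before[key], after[key]] unless before[key] == after[key]
  end
end

# Mirrors the loop in the task: diff each event against the state from the
# previous iteration, then step one event further back in time.
def walk_backwards(current, events)
  current = flatten_hash(current)
  events.map do |event|
    previous = flatten_hash(event[:responseObject])
    diff = simple_diff(previous, current)
    current = previous
    [event[:verb], diff]
  end
end

# Made-up data: the live object plus three events, newest first.
live = { metadata: { labels: { team: "a" } }, spec: { replicas: 3 } }
events = [
  { verb: "patch",  responseObject: { metadata: { labels: { team: "a" } }, spec: { replicas: 3 } } },
  { verb: "update", responseObject: { metadata: { labels: { team: "a" } }, spec: { replicas: 2 } } },
  { verb: "create", responseObject: { spec: { replicas: 1 } } }
]

walk_backwards(live, events).each do |verb, diff|
  puts verb
  diff.each { |key, before, after| puts "  #{key}: #{before.inspect} -> #{after.inspect}" }
end
```

Flattening first keeps every change addressable by a single dotted key, which is what makes the per-attribute output easy to scan.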
