Day 6 - Parsing Unique Characters

Solution code on GitHub

All 2022 Problems

Subscribe to Monday Morning Haskell!

Problem Overview

Full Description

With today's problem, we're looping through a string and searching for the first sequence of a certain length with unique characters. For part 1, we have to find the index where our 4 most recent characters are all unique. For part 2, this number gets bumped to 14.

Relevant Utilities

This will be the first time we use an Occurrence Map (OccMap) this year. A lot of problems rely on counting the occurrences of particular values. So I added a few wrappers and helpers to make this easy. So by using incKey, we can bump up the stored value up by 1.

type OccMap a = OccMapI a Word
type OccMapI a i = Map a i

emptyOcc :: OccMap a
emptyOcc = M.empty

incKey :: (Ord a, Integral i) => OccMapI a i -> a -> OccMapI a i
incKey prevMap key = addKey prevMap key 1

decKey :: (Ord a, Integral i) => OccMapI a i -> a -> OccMapI a i
decKey prevMap key = case M.lookup key prevMap of
  Nothing -> prevMap
  Just 0 -> M.delete key prevMap
  Just 1 -> M.delete key prevMap
  Just x -> M.insert key (x - 1) prevMap

addKey :: (Ord a, Integral i) => OccMapI a i -> a -> i -> OccMapI a i
addKey prevMap key count = case M.lookup key prevMap of
    Nothing -> M.insert key count prevMap
    Just x -> M.insert key (x + count) prevMap

In this solution, we'll also use decKey. Note that we delete the key if the count gets down to 0. This will be important in our problem!

Solution Approach and Insights

When I initially approached this problem, I made a custom data type and stored the different characters as individual elements. This worked fine for 3 characters, but it was cumbersome for 14. So I rewrote the solution more generically. We track the most recent characters we've seen in two different structures simultaneously.

First, we use a sequence to track the order we received them, so that with each iteration, we'll drop one character from the front and add a new one to the back.

We'll also use an occurrence map to track the counts for each character type in the last 4 (or 14). We'll increment a character's key when it is added, and decrement when it is removed from the front. If at any point we have 14 keys in our occurrence map, we're done!

Parsing the Input

Today we're only parsing a string:

mjqjpqmgbljsphdztnvjfqwrcgsmlb

So the parser is trivial:

type InputType = String

parseInput :: (MonadLogger m) => ParsecT Void Text m InputType
parseInput = some letterChar

Getting the Solution

To solve our problem, we're going to need one primary function to process the characters. We'll parameterize this by the number of characters we need for a unique code. If we don't have enough characters to reach the unique number, we'll log an error and return the max integer.

processChars :: (MonadLogger m) => Int -> [Char] -> m Int
processChars numCharsNeeded input = if length input < numCharsNeeded
  then logErrorN "Not enough chars!" >> return maxBound
  else ...

Now we need to initialize our structures. We'll split the input string into its first part (up to the number of unique characters) and the rest. The first characters will go into a sequence as well as our occurrence map.

processChars :: (MonadLogger m) => Int -> [Char] -> m Int
processChars numCharsNeeded input = if length input < numCharsNeeded
  then logErrorN "Not enough chars!" >> return maxBound
  else do
    let (firstChars, rest) = splitAt (numCharsNeeded - 1) input
        seq = Seq.fromList firstChars
        occ = foldl incKey emptyOcc firstChars
    ...

Now we need our recursive helper. This function will also be parameterized by the number of characters needed. The "state" for the helper will have an Int for the current index we're at in the string. We'll also have the current queue of characters, as well as the occurrence map for the counts of each character.

Now for implementation, starting with the "base" case. This function should never reach the end of the input. If it does, we'll handle this error case in the same way as above.

processTail :: (MonadLogger m) => Int -> (Int, Seq.Seq Char, OccMap Char) -> [Char] -> m Int
processTail _ _ [] = logErrorN "No remaining chars!" >> return maxBound
processTail numCharsNeeded (count, seq, occ) (c : cs) = ...

Now break off the first piece of the sequence using Seq.viewl so that we'll be able to modify the sequence later. We have another error case that should never be tripped.

processTail numCharsNeeded (count, seq, occ) (c : cs) = case Seq.viewl seq of
  Seq.EmptyL -> logErrorN "Sequence is empty!" >> return maxBound
  (first Seq.:< rest) -> do
    ...

Here's where we do the calculation. First, increment the value for our new character c. At this point, we can check the size of our occurrence map. If it equals the number of characters we need, we're done! We can return the current count value.

Otherwise we'll recurse. We add the new character to the end of the queue and we decrement the occurrence map for the character we removed.

processTail :: (MonadLogger m) => Int -> (Int, Seq.Seq Char, OccMap Char) -> [Char] -> m Int
processTail _ _ [] = logErrorN "No remaining chars!" >> return maxBound
processTail numCharsNeeded (count, seq, occ) (c : cs) = case Seq.viewl seq of
  Seq.EmptyL -> logErrorN "Sequence is empty!" >> return maxBound
  (first Seq.:< rest) -> do
    let occ' = incKey occ c
    if M.size occ' == numCharsNeeded
      then return count
      else processTail numCharsNeeded (count + 1, rest Seq.|> c, decKey occ' first) cs

And now we just plug in the call to this helper into our original function!

processChars :: (MonadLogger m) => Int -> [Char] -> m Int
processChars numCharsNeeded input = if length input < numCharsNeeded
  then logErrorN "Not enough chars!" >> return maxBound
  else do
    let (firstChars, rest) = splitAt (numCharsNeeded - 1) input
        seq = Seq.fromList firstChars
        occ = foldl incKey emptyOcc firstChars
    processTail numCharsNeeded (numCharsNeeded, seq, occ) rest

Answering the Question

Now answering the questions is quite easy. We parameterize the calls with the different length values.

processInputEasy :: (MonadLogger m) => InputType -> m EasySolutionType
processInputEasy = processChars 4

processInputHard :: (MonadLogger m) => InputType -> m HardSolutionType
processInputHard = processChars 14

solveEasy :: FilePath -> IO (Maybe Int)
solveEasy fp = runStdoutLoggingT $ do
  input <- parseFile parseInput fp
  Just <$> processInputEasy input

solveHard :: FilePath -> IO (Maybe Int)
solveHard fp = runStdoutLoggingT $ do
  input <- parseFile parseInput fp
  Just <$> processInputHard input

And we're done!

Video

YouTube Link

Previous
Previous

Day 7 - File System Shaving

Next
Next

Day 5 - Crate Stacks