Thread safety operations

Intro

Asynchronous programming is difficult. Reading from or writing to an object from within an asynchronous function is a common source of bugs and failures. Some concepts help with this, such as a struct in Swift or a value class in Kotlin, which use value semantics (and, for Swift’s standard collections, copy-on-write), but you’ll still often find yourself in situations where multiple threads reading or writing the same value cause bugs.

Upcoming proposals and existing frameworks don’t solve this issue either. Coroutines, which exist in Kotlin now and are coming to Swift later with async/await, allow you to write code which looks synchronous but is in fact asynchronous. This is fantastic for code readability, but it can make it even trickier to spot potentially thread-unsafe reads and writes to an object. Swift’s concurrency roadmap (https://forums.swift.org/t/swift-concurrency-roadmap/41611) has a lot of interesting ideas to mitigate the issue outlined in this post, such as Actors and the eventual full Actor isolation model, but that’s a while away. Instead, let’s focus on now, and the best way I can think of to picture this is through a common example.

Token Refreshing

Perhaps the most common example of this thread-safety requirement is refreshing an access token and retrying the failed requests. Networking stacks will (hopefully) use a group of threads to accept and handle network responses, since there’s a high chance you’ll receive multiple responses at very close intervals. But this presents a problem: how do you block all the threads that require a new access token until one is retrieved? Sure, you can try blocking the thread that received the unauthorized response until you get a token, but a number of other threads are still running. If 4 responses come in, each on its own thread, and you block each thread’s execution independently, you’ll still refresh the token 4 times. Instead, we need a way to block across all those threads. This is where the idea of a mutex comes into play. You might have already seen this with the NSLock or NSRecursiveLock classes from Foundation, or the Mutex type from kotlinx.coroutines in Kotlin.

The idea is this: When a thread begins to execute code on an object, you lock the mutex. When another thread attempts to execute that same piece of code on that same object instance, it won’t be able to continue execution until that mutex is unlocked. Kotlin’s Mutex works well with its coroutines and provides a withLock { } convenience function which automatically locks and unlocks the mutex for you.
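To make the locking idea concrete, here is a minimal Kotlin sketch. It assumes the kotlinx.coroutines library is on the classpath, and the Counter class and runCounter function are made up for illustration. Several coroutines running on a thread pool bump a shared counter, and withLock makes sure only one of them touches it at a time.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock

// Hypothetical shared state: a plain Int is not safe to mutate from several threads.
class Counter {
    private val mutex = Mutex()
    var count = 0
        private set

    // withLock acquires the mutex, runs the block, and releases the mutex
    // afterwards, even if the block throws.
    suspend fun increment() {
        mutex.withLock { count += 1 }
    }
}

// Starts `workers` coroutines on a thread pool; each one increments `times` times.
fun runCounter(workers: Int, times: Int): Int = runBlocking {
    val counter = Counter()
    coroutineScope {
        repeat(workers) {
            launch(Dispatchers.Default) {
                repeat(times) { counter.increment() }
            }
        }
    } // coroutineScope only returns once every child coroutine has finished
    counter.count
}

fun main() {
    println(runCounter(workers = 4, times = 1000)) // prints 4000
}
```

Without the mutex, some increments could be lost whenever two threads read the same old value before either writes back.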

Example:

Let’s say you have this code (this will be a mix of Swift and Kotlin since it shouldn’t be taken as code you can copy, just pseudocode):

let currentToken = repo.currentAccessToken
let tokenFromRequest = request.authToken

if currentToken != tokenFromRequest {
    // The access token has changed, so just use that
    request.replaceAuth(withNewToken: currentToken)
    request.retry()
} else {
    // An async function which stores the result itself
    repo.refreshToken() { refreshedToken in
        request.replaceAuth(withNewToken: refreshedToken)
        request.retry()
    }
}

To break this down, we’re just getting the token the app currently has stored, checking if the token on the failed request is different (if they’re different, then the token that the app has was refreshed by another request, so replace the Authorization header with the token the app has), and retrying. If they’re the same, then try to refresh the token the app has stored, store that refreshed token, and retry the request.

If 4 threads run this code, we’ll refresh 4 times, which leaves us with 3 lingering access tokens that we’ve lost any reference to (at best a mild security issue, depending on what API access the token grants). A bad actor could potentially use those tokens to access a user’s account until they expire.
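The repeated refresh can be reproduced in a small Kotlin sketch, again assuming kotlinx.coroutines. UnsafeTokenRepo and countUnguardedRefreshes are hypothetical names, and the 50 ms delay stands in for the network call. To keep the result reproducible, the coroutines here all run on the single runBlocking thread and interleave at the suspension point. With real threads the same interleaving happens, just less predictably.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for `repo`; nothing here guards against concurrent refreshes.
class UnsafeTokenRepo {
    var currentAccessToken = "token-0"
    var refreshCount = 0

    suspend fun refreshToken(): String {
        delay(50) // pretend network call; other coroutines run while this one is suspended
        refreshCount += 1
        currentAccessToken = "token-$refreshCount"
        return currentAccessToken
    }
}

// Simulates `requests` failed requests that all carried the same stale token.
fun countUnguardedRefreshes(requests: Int): Int {
    val repo = UnsafeTokenRepo()
    runBlocking {
        val staleToken = repo.currentAccessToken
        repeat(requests) {
            launch {
                // Every coroutine passes this check before any refresh has finished.
                if (repo.currentAccessToken == staleToken) repo.refreshToken()
            }
        }
    } // runBlocking waits for all launched coroutines
    return repo.refreshCount
}

fun main() {
    println(countUnguardedRefreshes(4)) // prints 4
}
```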

So, let’s block execution of this code so that a refresh only happens once. We’ll add a Mutex (or NSRecursiveLock — the ideas are the same), lock it, execute our code and then unlock.

runBlocking {
    mutex.withLock {
        let currentToken = repo.currentAccessToken
        let tokenFromRequest = request.authToken

        if currentToken != tokenFromRequest {
            // The access token has changed, so just use that
            request.replaceAuth(withNewToken: currentToken)
            request.retry()
        } else {
            // A suspending call which stores the result itself. We wait for it here,
            // so the lock is held until the new token is in place
            let refreshedToken = repo.refreshToken()
            request.replaceAuth(withNewToken: refreshedToken)
            request.retry()
        }
    }
}

Here, runBlocking blocks the calling thread until the block finishes, and mutex.withLock locks the mutex before any of our code is executed and unlocks it afterwards. When another thread attempts to run this same code, it is forced to wait until the lock is released by the other thread, so long as every thread is using the same mutex instance. Now we can break down what will happen:

You get 4 requests which fail with an unauthorized response. The first request acquires the mutex’s lock, sees the token in the request is the same as the token the app has, and so refreshes the token. After this, it releases its lock on the mutex. The next thread can now access the mutex and lock it. It will see that the token on the request is different (since the first thread just refreshed it) and so will just replace the auth header rather than refreshing the token. This will be the same for each subsequent failed response.
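The walkthrough above can be simulated with a short Kotlin sketch, assuming kotlinx.coroutines. TokenRepo, Authenticator and simulateFailures are hypothetical stand-ins rather than a real networking API, and the refresh is modelled as a suspending function so the mutex stays locked until the new token is stored.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock

// Hypothetical stand-in for `repo`. It is only mutated while the mutex below is held.
class TokenRepo {
    var currentAccessToken = "token-0"
        private set
    var refreshCount = 0
        private set

    // Pretend network refresh: suspends, then stores and returns the new token.
    suspend fun refreshToken(): String {
        delay(50)
        refreshCount += 1
        currentAccessToken = "token-$refreshCount"
        return currentAccessToken
    }
}

class Authenticator(private val repo: TokenRepo) {
    private val mutex = Mutex()

    // Returns the token a failed request should be retried with.
    suspend fun tokenForRetry(tokenFromRequest: String): String = mutex.withLock {
        val currentToken = repo.currentAccessToken
        if (currentToken != tokenFromRequest) {
            currentToken // someone else already refreshed, so reuse their token
        } else {
            repo.refreshToken() // the lock is held until the refresh completes
        }
    }
}

// Simulates `requests` unauthorized responses arriving on a thread pool at once.
fun simulateFailures(requests: Int): Pair<Int, Set<String>> = runBlocking {
    val repo = TokenRepo()
    val auth = Authenticator(repo)
    val staleToken = repo.currentAccessToken
    val tokens = List(requests) {
        async(Dispatchers.Default) { auth.tokenForRetry(staleToken) }
    }.awaitAll()
    repo.refreshCount to tokens.toSet()
}

fun main() {
    val (refreshes, tokens) = simulateFailures(4)
    println("refreshes = $refreshes, tokens used = $tokens") // refreshes = 1, tokens used = [token-1]
}
```

Whichever request wins the lock performs the only refresh, and the other three pick up the token it stored.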

This concept of locking across threads is vital to understand in a world where you have to opt in to asynchronous programming. There are many other tools for working with asynchronous code (e.g. DispatchQueue, AtomicBoolean, etc.) that are needed less often, but are vital to know about when you do need them.
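As one illustration of those other tools, here is a minimal Kotlin sketch using java.util.concurrent.atomic.AtomicBoolean. The OneShotGuard class and countWinners function are hypothetical. Instead of making threads wait as a mutex does, compareAndSet lets exactly one thread claim a piece of work while the others carry on immediately.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean
import java.util.concurrent.atomic.AtomicInteger
import kotlin.concurrent.thread

// Hypothetical guard that lets exactly one caller start a piece of work.
class OneShotGuard {
    private val started = AtomicBoolean(false)

    // compareAndSet flips false to true atomically, so only one thread gets `true` back.
    fun tryStart(): Boolean = started.compareAndSet(false, true)
}

// Races `threads` threads against the guard and counts how many of them won.
fun countWinners(threads: Int): Int {
    val guard = OneShotGuard()
    val winners = AtomicInteger(0)
    val workers = List(threads) {
        thread { if (guard.tryStart()) winners.incrementAndGet() }
    }
    workers.forEach { it.join() } // wait for every thread to finish
    return winners.get()
}

fun main() {
    println(countWinners(4)) // prints 1
}
```

The trade-off is that the losing threads are not told when the winner has finished, so this fits "do it once" flags better than the token refresh above, where the other requests need to wait for the new token.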
