Does ActionCable Smell Like Rails?

Not long ago, Rails got ActionCable. ActionCable is an interface to WebSockets and (potentially) other methods of turning a normal sent-to-browser web page into a two-way connection that can keep exchanging data. There have been a lot of these attempts over the years - WebSockets, Server-Sent Events, Comet and Server Push (HTTP1 and HTTP2) are all protocols to do that. There have been many Ruby implementations of these. Good old AJAX was probably the first widespread attempt. I'm sure there are some I missed. It's been the "near future" for as long as there's been a present.

Do you already know why we want WebSockets, and how bad most solutions are? Scroll down to "ActionCable and Convenience" below. Or heck, scroll wherever you like - I'm not the boss of you.

Yeah, But Why?

One big question is always: do we need this? What good does it do us?

Let's look at one-way updates (server-to-browser) separately from two-way.

The "hello, world" of one-way interactive web pages is the page update. When Twitter or Facebook shows you how many updates are waiting, or pops up new posts, those are interactive web pages. In order to tell you what's waiting for you, or to deliver it, they have to interact with a page that's already in your browser. You can also show site news banners that change, or pop up an alert like "site going down for maintenance at midnight!" The timing is often not that important, so a little delay is usually fine. AJAX is the usual way to do this, or Server-Sent Events can be a bit more efficient at the cost of some complexity.

See that "99+" up there? For you to know how many notifications are waiting, Twitter has to tell your browser somehow.

See that "99+" up there? For you to know how many notifications are waiting, Twitter has to tell your browser somehow.

The "hello, world" of two-way interactive web pages is the chatbox. From a simple "everybody sees everything" single-room chat to something more complex with sub-areas and filtering and private messages, you have to have two-way interaction to make chat work. If you put intercom.io or similar customer service chatbots on your site, they have to use two-way interaction of some kind. Any kind of document collaboration (e.g. Google Docs) or multiplayer game needs it. Two-way interaction starts to make sense quickly if you're doing something that people update together - auction sites, for instance, or fast-updating financial markets (stock trading, futures markets, in-game auction houses.) You can do two-way interaction with AJAX but it starts to get inefficient quickly, and the latency can be high. Long-polling (e.g. Comet, SSE) can help, or you can work around the browser with plugins (e.g. Flash or Unity.) This is the situation WebSockets (and thus ActionCable) were really designed for.

There are one-way updates in the browser-to-server direction, but forms and AJAX handle those just fine. They are often things like form submit, error reporting or analytics where the user takes action, but there's only a simple acknowledgement -- "yup, you did that, the server saw it and you can get back to what you were doing."

So What's Different With Interactivity?

In old-style HTTP, the browser gets an HTTP page. It may already be cached, so the browser may not use the network at all! Some pages can work entirely offline. The browser can keep requesting pages or resources from the server when and if it wants to. That may fail - the network connection isn't guaranteed. But the server can't interfere with most of this. The server can't send anything it wasn't asked for. The server may not even know that this is all going on. If the HTML page came from cache, the server probably has no idea that any of this is happening. In a normal web framework, the request gets served and the server moves on to other requests without a backward glance.

If you want interactivity with a constant connection to the server, that changes everything. The server needs to sit and spin, holding the connection open. It needs to see what other connections are doing and figure out what to send to everybody else every time you do something. If Bob sends a chat message, it may go out to 250 other people who have a certain page open.

The Programming Model

One big difficulty with two-way interaction is that web servers and frameworks aren't usually designed to handle it. If you keep a connection open all the time and you can send to it at random times... How does that look to the programmer? How does it work? That's new for Rails... And Sinatra. And nearly any Ruby framework that isn't using EventMachine. So... nearly all of them.

It's not just WebSockets that have this problem, incidentally. HTTP/2 adds server push, which also screws up all the frameworks... But not quite as much as WebSockets does.

In fact, very few application servers (e.g. Puma, Passenger, Unicorn, Thin, WEBrick) are designed for pushing from the server either. You can hack long-polling or server push on top of NGinX or Apache, but... that's not how nearly anybody uses them. You really want an evented server. There are some experimental Ruby evented webservers. But mostly, that's not how you write Ruby web applications. Or nearly any other web applications, in any language. It's an inefficient match for HTTP1 apps, but it's the only reasonable match for HTTP/2 apps with much two-way interactivity.

Node.js folks can laugh at the rest of us here. They do evented web applications all the time. And even they have a bit of a rough road to HTTP/2 and WebSocket support, because their other tools (reverse proxies, caches, etc) are designed in the traditional way, not the HTTP/2 way.

The whole reason for HTTP's weird, hacked-on model of sessions and cookies is that it can't keep a connection to the server or repeatedly identify individual clients. What would we do if it could? Maybe we'll find out.

But the fact remains that it's weird and hard to combine a long-running stateful server with lots of connections (WebSockets) to a respond-and-forget stateless web server (NGinX, Apache, Puma, Passenger, etc.) in the current day and age. They're not designed to work that way.

I only know of one web server that was ever designed with this in mind: Zed Shaw's Mongrel2. It's a powerful, interesting model for a server architecture. It's also rough around the edges, very raw and not used in production by anybody I know of. I've tried to get it running with Ruby web apps, and mostly you rapidly discover that nobody else designs anything that way, so it's hard to use with anything you don't write from scratch. There are bespoke frameworks inside big tech companies (e.g. LinkedIn, Facebook) that do the same thing. That makes sense. They'd have to, even with their current architecture. The rest of us have a long road to get there.

ActionCable and Convenience

So: Rails bundled WebSocket support into recent versions. Problem solved, right? Rails is awesome about making things convenient, so we're golden?

Yes... and no.

I've worked with the old solutions to this like Faye and Juggernaut. There is absolutely no question that ActionCable is a smoother experience. It starts the extra server automatically. It includes all the extra pieces of software. Secure sockets work approximately out-of-the-box, maybe, with a bunch of ugly caveats that aren't the server's fault. But they're ordinary HTTPS caveats, not really new ones for WebSockets. Code reloading works, kind of. When you hit "reload" in your browser the classes reload at least, like, 80% of the time. I mean, unless you have connections from multiple browsers or something else unreasonable.

Let's be clear: for hooking WebSockets up to a traditional web framework this is a disturbingly smooth experience. That's how bad the old ways of doing this are. Many old problems like "have I restarted both servers?" are nearly 100% solved in development, and are only painful in production. This is a huge step up.

The fact that code reloading works at all, ever, is surreal. You can nearly always get it to work by restarting the server and then hitting reload in the browser. Even that wasn't really reliable with older frameworks (ew, browser caching.) Thank you, Rails asset pipeline! And there's only one server in development mode, so you don't have to bring down multiple server-side processes and reload your browser.

This stuff used to be incredibly bad. ActionCable brings the pain level down to something a smartphone-app programmer would call tolerable.

(I'm an old phone operating system programmer. We had to flash ROMs every time. You kids don't know how good you have it. Get off my lawn!)

But... Does It Smell Like Rails?

Programming in Rails is a distinctive experience. From the controller actions to the views to the config files, you can just glance at a chunk of Rails code to figure out: yup, this is Ruby on Rails.

For an example of "it's not the web, but it smells like Rails," just look at ActionMailer. It may be about sending email, but it uses the same controller actions and the views feel like Rails. Or look at the now-defunct ActiveResource, which brings an ActiveRecord-style API to remote procedure calls. Yeah, okay, it was kind of a bad idea, but it really looks like Rails.

I've been trying to figure out for awhile: is there a way to make ActionCable look more like Rails? I'll show you what I've found so far and I'll let you decide.

ActionCable kind of wants to be structured like Rails controllers. I'll use examples from the ActionCable Rails Guide, just to make sure I'm not misrepresenting it.

Every user, when they connect to your site, gets a single ApplicationCable::Connection object:

# app/channels/application_cable/connection.rb
module ApplicationCable
  class Connection < ActionCable::Connection::Base
    identified_by :current_user
 
    def connect
      self.current_user = find_verified_user
    end
 
    private
      def find_verified_user
        if verified_user = User.find_by(id: cookies.encrypted[:user_id])
          verified_user
        else
          reject_unauthorized_connection
        end
      end
  end
end

You can think of it as being a bit like ApplicationController. Of course, all your Channels inherit from ApplicationCable::Channel, which is also a bit like ActionController. In the Guide there's not much in it. In fact, it can be very problematic to have many methods in it. I'll talk about why a bit later.

Then you have actual Channels inherited from ApplicationCable::Channel, which correspond roughly to pub/sub subscriptions, and then you can subscribe to one or more streams on each channel, which are sort of subchannels. This is explained poorly, but it's not too complicated once you figure it out. ActionCable wants one more level of channel-ness than most pub/sub libraries do, but it still boils down to "listen on channel names, get messages for those channels." ActionCable's streams are what every other Pub/Sub library would call a channel, approximately, and ActionCable's Channels are a namespace of streams.

It's clearly trying to look like a Rails controller:

# app/channels/chat_channel.rb
class ChatChannel < ApplicationCable::Channel
  def subscribed
    stream_from "chat_#{params[:room]}"
  end
 
  def receive(data)
    ActionCable.server.broadcast("chat_#{params[:room]}", data)
  end
end

But then that starts breaking down...

Where ActionCable Smells Funny

ActionCable actions are a bit like Rails controllers, and they use a somewhat different API than every other Pub/Sub library to get it. That's not necessarily a problem. In fact, that's fairly Rails-like, as far as it goes.

An ActionCable action can only be called from a browser, though. So there's two fairly distinct forms of sending: browsers send actions, while the server sends "broadcasts", which can also be sent from the server, including from actions. That's a bit like Rails controller actions, though it's a bit unintuitive. And then the server sends back JSON data, like a JSON API.

Then, JavaScript receives the JSON data and renders it. Here's what that looks like on the client side:

# app/assets/javascripts/cable/subscriptions/chat.coffee
# Assumes you've already requested the right to send web notifications
App.cable.subscriptions.create { channel: "ChatChannel", room: "Best Room" },
  received: (data) ->
    @appendLine(data)
 
  appendLine: (data) ->
    html = @createLine(data)
    $("[data-chat-room='Best Room']").append(html)
 
  createLine: (data) ->
    """
    <article class="chat-line">
      <span class="speaker">#{data["sent_by"]}</span>
      <span class="body">#{data["body"]}</span>
    </article>
    """

This should look like exactly what Rails encourages you NOT to do. It's ignoring everything Rails knows about HTML in favor of a client-side framework, presumably separate from whatever rendering you do on the server side. You can't easily use Rails view rendering to produce HTML, and no Rails controller is instantiated, so you can't use things like render_to_string. There's probably some way to patch it in, but that doesn't seem to be an expected choice. It's certainly not done by default, and at a minimum requires including Rails' internal modules into your Channels.

Also: please don't include random modules into your Channels, because every public method on every Channel object becomes publicly callable via JavaScript, so that's a really unsafe thing to do. So I would absolutely not recommend you include the Rails modules for view rendering into your Channels, because I would not recommend including anything you don't control into your channels. You need near-absolute control of your set of public methods for security reasons. Exposing public calls to whatever Rails has exposed from internal modules as a public API isn't a good idea even if you love Rails views (and I do.)

I'm not saying it's hard to do it in a more Rails-like way. You can build a simple template renderer using Erubis for this if you want. I'm saying ActionCable won't do this for you, or particularly encourage you to do it that way. The documentation and examples all seem to think that ActionCable should ship JSON around and your JavaScript/CoffeeScript client code should do the rendering. To be fair, that's likely to be more space-efficient than shipping around HTML, especially because WebSocket compression is fairly primitive (e.g. permessage-deflate). But compromising the programmer API for the sake of bandwidth fails the "smells like Rails" sniff test, in my opinion. Especially because it's entirely possible to have the examples mostly use the server-rendering-capable flavor and mention that you can convert to JSON and client rendering rather than vice-versa.

Here's what a very simple version of that Erubis template rendering might look like. It's not difficult. It's just not there by default:

def replace_html_with_template(elt_selector, template_name, locals: {})
  unless template_name["."]
    template_name += ".html.erb"
  end
  filename = File.join("app/views", template_name)
  tmpl = Erubis::Eruby.new(File.read filename)
  replace_html(elt_selector, tmpl.evaluate(locals))
end

(The replace_html method above needs to be a broadcast to the client, in a format the client recognizes. Also, don't make this a public method on the Channel object - I actually wrap the Channel object with a second object to avoid that problem.)

In general, ActionCable feels simple and "raw" compared to most Rails APIs. Want HTML escaping? Ask for it explicitly. Curious who is subscribed to what? Feel free to track that yourself. Wonder what some of the APIs do (e.g. identified_by)? Read the source code.

That may be because it's an immature API. ActionCable is fairly recent, as Rails APIs go. Or it may be the ActionCable isn't going to be as polished. WebSockets are a thing you add to an existing app for better performance, as a rule. So maybe they're being treated as an advanced feature, and the polish isn't wanted. It's hard to tell.

A Few More Details on How It Works

Before wrapping up, let's talk a bit more about how ActionCable does what it does.

If you're going to have a bunch of connections held open, you're going to need network sockets for it and some way to do work for those connections. ActionCable handles this with a thread pool, with four workers by default. If you're messing with the database in your ActionCable code, that means you should increase your database connection pool by that number of workers to avoid running out.

ActionCable eats a ton of memory to do its thing. That's fine - it's the best Ruby environment for WebSocket development, and you didn't start using Rails because it was the fastest or smallest - if you wanted that, you'd be writing bespoke C binaries for CGI scripts, or maybe assembly language.

But when you're looking at deploying to a production environment with a reasonable number of users, you may want to look for a more efficient, API-compatible solution like AnyCable. You can find a RubyKaigi talk by Vlad Dementyev about that and how it compares, if you'd like more information.

While ActionCable uses only a single server in development mode, it's assumed you'll want multiple in production. 

So What's the Takeaway?

ActionCable is a great way to reduce the pain of WebSockets development compared to the older solutions out there. It provides a relatively clean development environment. The API is unusual - not quite Rails, not quite standard Pub/Sub. But it's decently clean and usable once you get used to it.

A Rails API is often carefully polished. They tend to be hard to misuse with strong indicators of how to do things The Rails Way and excellent documentation. ActionCable isn't there yet, if that's where they're headed. You'll need more sophistication about what ActionCable builds on to avoid security problems - more like old-style Rails routing or controllers, less like heavily-secured Rails APIs like forms, HTML escaping or forgery protection.