Happy Eyeballs: Structured Concurrency's Hello World

sustrik · February 10, 2019, 10:11am

When explaining structured concurrency to newbies, but as well when the different ways to implement it Happy Eyeballs protocol (RFC 8305) has roven to be very instructional. At the moment there’s at least Trio and libdill version out there.

The nice thing is that it exhibits all the aspects of structured concurrency (cancelation, error handling, different kind of async tasks etc.) which still remining relatively simple.

Another nice feature is that it comes with “easy” and “hard” mode. When proving your initial implementation of structured concurrency you can start with with a stripped down version which does does the DNS query synchronously, then does actual connecting in asynchronous manner. When successful you can switch to the hard mode, where the DNS queries are done asynchronously, as described in the RFC.

This post is not meant to propose anything specific. It just to point out that in Happy Eyeballs we have a kind of common gauge, an example problem we can use to compare different ideas: How easy it is to implement Happy Eyeballs with each proposed solution?

alanb · February 13, 2019, 8:27pm

Happy Eyeballs is indeed good algorithm to try to implement.

The javadoc for the FiberScope API in Project Loom includes a number of code samples, one of which is described as:

“The method is called with an array of socket addresses and returns a SocketChannel connected to one of the addresses. The connection attempts are staggered. A fiber is scheduled to connect to the first socket address and if the connection is not established within a certain time then another fiber is scheduled to try to connect to the next socket address. The staggered attempts continue until a connection is established, none of the connections succeed, or the deadline is reached. In the event that several connections are established then all but one are closed so that the method does not leak resources.”

This may be another interesting example to explore.

sustrik · February 13, 2019, 10:00pm

Yes, that sounds very much like the simple version of Happy Eyeballs. Can you link to the sample? I would like to put the three implementations side by side and see whether anything interesting could be learned from that.

mauijerry · August 11, 2019, 1:55am

The Happy Eyballs RFC wasnt linked find it here
Nathaniel noted it was updated in 2018

It would be good to have a link to a simple python-trio implementation.

njs · August 11, 2019, 2:39am

In my PyCon talk last year I walked through a Trio implementation of Happy Eyeballs, which I think is where the idea of using it as an example comes from: https://www.youtube.com/watch?v=oLkfnc_UMcE

You can also look at Trio’s source to see a slightly more “production ready” version, that’s also had some simplifications since the talk: https://github.com/python-trio/trio/blob/master/trio/_highlevel_open_tcp_stream.py

jab · October 9, 2019, 6:23pm

Recently came across https://cljdoc.org/d/missionary/missionary/b.11/doc/readme/guides/happy-eyeballs which (links to Nathaniel’s live coding demo! and) offers a more functional approach to solving this problem, using the author’s “missionary” library. (More discussion about missionary here: https://clojureverse.org/t/missionary-new-release-with-streaming-support-design-notes/)

Seemed worth a link from this thread. Would be interested to see what people here think of the functional approach

lerencao · January 19, 2020, 2:03pm

I implements a simple version in rust based on task_scope.
Happy eyeballs is indeed a good example to demonstrate structured-concurrency.

adamw · February 4, 2020, 8:27am

I’ll add my version using Scala & ZIO: https://blog.softwaremill.com/happy-eyeballs-algorithm-using-zio-120997ba5152

leonoel · February 5, 2020, 10:32am

Your solution is concise, pure, type-safe, and wrong. When two sockets concurrently succeed to connect, the loser of the race is discarded and stays opened for an undefined duration.

adamw · February 5, 2020, 4:47pm

As we talked on twitter - yes, there was a problem with unclosed sockets. Fixed now, the solution isn’t very elegant, but I think works fine. Thanks for spotting the error.

leonoel · February 5, 2020, 5:23pm

I actually spent a fair amount of time myself trying to model this algorithm declaratively. The solution I eventually preferred involves a shared dataflow variable to model the race explicitly, such that each attempt can know if the socket must be kept open. I’m not sure it’s the most elegant solution so I’m definitely interested knowing if you end up with something better with ZIO !

Moral of the story : happy eyeballs is really an exciting problem.

adamw · February 6, 2020, 8:00am

I think I’ve come up with a better solution: embracing the fact that multiple sockets might end up being open. We run the original HE algorithm as-is (which manages the timing, concurrency, races), but modify the effects that are run so that they collect all successfully produced values.

Then, on top of this we build a second stage, which creates a queue for all successfully produced values, runs HE, and closes all but one (which is returned):

object ReleasableHappyEyeballs {
  def apply[R, T](
      tasks: List[ZIO[R, Throwable, T]],
      delay: Duration,
      releaseExtra: T => ZIO[R, Nothing, Unit]
  ): ZIO[R with Clock, Throwable, T] =
    for {
      successful <- Queue.bounded[T](tasks.size)
      enqueueingTasks = tasks.map { t =>
        t.onExit {
          case Success(value) => successful.offer(value)
          case Failure(_)     => ZIO.unit
        }
      }
      _ <- HappyEyeballs(enqueueingTasks, delay)
      // there has to be at least one, otherwise HE would fail
      first :: other <- successful.takeAll
      _ <- ZIO.foreach(other)(releaseExtra)
    } yield first
}

Note that .onExit is guaranteed to be run if a value is produced, which is crucial taking into account possible interruption.

njs · February 6, 2020, 8:35am

I enjoyed your post; thanks for sharing! And I was also wondering about socket cleanup, so it’s nice to see that discussion. Though to be fair, I also skipped this detail in my talk, for time reasons

Your revised version here is conceptually similar to how Trio’s real implementation handles this: we keep a collection of all the open sockets, and then at the end close all the ones that are still open:

github.com

python-trio/trio/blob/ae557cc16a893cce0505da8a0b30122a727470f2/trio/_highlevel_open_tcp_stream.py#L287


        winning_socket = sock
        nursery.cancel_scope.cancel()
    except OSError as exc:
        # This connection attempt failed, but the next one might
        # succeed. Save the error for later so we can report it if
        # everything fails, and tell the next attempt that it should go
        # ahead (if it hasn't already).
        oserrors.append(exc)
        attempt_failed.set()


with close_all() as open_sockets:
    # nursery spawns a task for each connection attempt, will be
    # cancelled by the task that gets a successful connection
    async with trio.open_nursery() as nursery:
        for *sa, _, addr in targets:
            # create an event to indicate connection failure,
            # allowing the next target to be tried early
            attempt_failed = trio.Event()


            nursery.start_soon(attempt_connect, sa, addr, attempt_failed)

The version that’s currently in your post is interesting:

    // the interrupt might have happened when the socket is already 
    // established. 
    // In this case, we need to cleanup and close the socket.
    effectBlocking {
      socket = new Socket(address, port)
      socket
    }.onInterrupt(closeSocket)

Does this work? I.e., do ZIO’s semantics guarantee that if two raced effects with onInterrupt handlers complete simultaneously, then at least one of their onInterrupt handlers will be called? Or is it possible to have a sequence like:

left effectBlocking resolves successfully
left onInterrupt resolves successfully
right effectBlocking resolves successfully
right onInterrupt resolves successfully
race detects that the left fiber has completed, and interrupts the right fiber, but the right fiber has already resolved so onInterrupt isn’t called

?

I peeked at the implementation of race but it didn’t help as much as I was hoping In Trio the equivalent of that sequence would be possible though, which is why we can’t handle the final socket cleanup inside the individual tasks, but instead do it at the end.

leonoel · February 6, 2020, 8:58am

That strategy looks right to me, and nicely captures the problem’s essential complexity.

adamw · February 6, 2020, 10:43am

Your revised version here is conceptually similar to how Trio’s real implementation handles this: we keep a collection of all the open sockets, and then at the end close all the ones that are still open:

Maybe it’s an essential part of any solution then

I peeked at the implementation of race but it didn’t help as much as I was hoping In Trio the equivalent of that sequence would be possible though, which is why we can’t handle the final socket cleanup inside the individual tasks, but instead do it at the end.

Yes I think you are correct, that implementation would also be wrong. But still this might be made to work. There’s a variant of race, raceWith, which receives both the value that won and the loosing fiber.

We would be able then to interrupt the fiber (calling Fiber.interrupt); as a result, we would get either an error (meaning that the fiber was indeed interrupted), or the computed value (if the fiber was already completed). In the first case releasing would have already happened, in the second we would need to call release on the computed value (which lost).

But that would only increase complexity so I think the layered solution is better so I converted the example to use that

Thanks for the comment and pointing out another possible bug!

Topic		Replies	Views
Structured Concurrency in Ruby	9	1954	July 16, 2020
Structured Concurrency Kickoff Structured concurrency	22	10717	February 15, 2019
Zio (Scala library) Structured concurrency	4	3033	February 15, 2019
Discussion: "Notes on structured concurrency, or: Go statement considered harmful" Structured concurrency	24	7962	May 12, 2025
Structured concurrency resources Structured concurrency	12	6690	February 27, 2024

Happy Eyeballs: Structured Concurrency's Hello World

Related topics