
Go Worker Pools

Introduction to the Worker Pool Design Pattern in Go

A man of great wisdom recently told me, “The only incorrect amount of concurrency is unbounded concurrency.”

Concurrency in Go is one of the harder parts to get your head around, particularly if approaching Go from a background in a language where concurrency works differently, or isn’t an option. Here is a description of the Worker Pool design pattern in Go, which you can use to fan-out a particular kind of task to multiple concurrent workers.

The Worker Pool Pattern is a popular design pattern in Go for managing concurrency. The pattern efficiently distributes tasks among a fixed number of goroutines, which can help manage resource utilisation and improve performance.

The only incorrect amount of concurrency is unbounded concurrency

Understanding the Worker Pool Pattern

The worker pool pattern involves creating a specific number of workers (goroutines), each responsible for executing tasks. These workers listen on a common channel that dispatches tasks to them. As tasks arrive, any free worker can pick up a task and execute it. Once a worker completes a task, it returns to listening for more tasks. This model helps in managing a large number of tasks without overloading the system with goroutines, thus optimizing resource usage.

Worker Pool Example

// foo stands in for whatever processing each task actually needs
func foo(task int) int {
    return task * 2
}

func worker(tasks <-chan int, results chan<- int) {
    // Keep receiving tasks from the queue until it is closed, then the loop finishes
    for task := range tasks {
        // Do something useful here
        results <- foo(task)
    }
    // After exiting the loop, the worker goroutine just dies at the end of this function
}

func runWorkers() {

    // Set the number of workers to an appropriate level by experimentation
    workers := 100
    // In this example we have 1000 tasks
    taskCount := 1000

    // Make buffered channels for tasks and their results
    tasks := make(chan int, taskCount)
    results := make(chan int, taskCount)

    // Spin up the workers
    for w := 1; w <= workers; w++ {
        go worker(tasks, results)
    }

    // Send tasks to the queue for workers to handle
    for j := 1; j <= taskCount; j++ {
        tasks <- j
    }

    // Closing the 'tasks' channel stops the workers once they've read everything: see worker() above
    close(tasks)

    // You can receive values from the 'results' channel and use them as they come in
    for a := 1; a <= taskCount; a++ {
        res := <-results
        fmt.Printf("Hello this is a result %d\n", res)
    }
}

Key Components of the Worker Pool in Go

Task Channel: This is a channel through which tasks are sent to the workers. It’s usually buffered to hold multiple tasks waiting to be processed.

Worker Function: Each worker runs a specific function, typically in an infinite loop, listening for tasks on the task channel. When a task is received, the function processes the task and then returns to listening.

Dispatcher: A dispatcher function is responsible for feeding tasks into the task channel. It ensures that tasks are distributed among the available workers.

Results Channel: Optionally, if task results need to be communicated back, a results channel can be used where workers send back the results of processed tasks.

How Many Workers Do I Need?

The Worker Pool Pattern offers you controlled concurrency, so you can vary the number of workers. If their input channel is fed by an I/O bottleneck upstream, increase the number of workers until further increases stop improving performance. If they're doing something computationally expensive, increase your number of workers until CPU becomes the bottleneck and your resources are being well utilised. If a downstream process is writing to a disk or something similarly slow, increase the number until the output channel is making good use of its buffer. Basically, mess around with that number until stuff runs faster.

Best Libraries for Working with Worker Pools in Go

Here are some popular libraries to be aware of, if you’d like to explore further:

  • Ants - Automated management and recycling of very large numbers of goroutines, with an extensive API, periodic purging of goroutines, and efficient memory use.

  • Tunny - A comparatively lightweight library for spawning and managing a goroutine pool

  • Workerpool - A simple Worker Pool implementation that limits the concurrency of task execution, without blocking task submission.

Conclusion

The worker pool design pattern is a powerful tool in Go for managing concurrency, particularly when dealing with a high volume of tasks. It optimises resource usage and improves application performance by distributing tasks evenly among a pool of workers. By understanding and implementing this pattern, developers can build more efficient and robust concurrent applications.

This post is licensed under CC BY 4.0 by the author.