A thread pool is not a new concept. It’s basically a gang of worker threads to which tasks are handed for execution. Why thread pools? Because otherwise the program starts threads whenever it sees fit and soon hits the maximum number of threads. Simply put, thread pools allow us to limit the number of threads our program spawns. Trust me, you don’t want your code going to town spawning threads. It comes back to bite you in the behind sooner than you think.

Programming languages usually provide built-in libraries that implement thread pools; however, Python 2.7 doesn’t seem to have an obvious pooling strategy for threads (concurrent.futures and its ThreadPoolExecutor only arrived in the standard library with Python 3.2). It does have a process pool concept, where a function can be submitted to a set of worker processes, but that involves more complexity than threads do (for example, the function has to be picklable and unpicklable). It also uses processes, which behave very differently from threads when what you actually need is multi-threading, such as sharing state in memory.
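To make the pickling point concrete, here is a small sketch using only the standard library. The names can_pickle and make_task are made up for this illustration. It shows that a closure, which a thread can run directly, can’t be serialized the way a process pool would need it to be.

```python
import pickle
import threading


def can_pickle(obj):
    """Return True if obj can be serialized, as a process pool would require."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type differs between Python versions.
        return False


def make_task(results):
    # A closure over `results`: handy with threads, but not picklable.
    def task():
        results.append("done")
    return task


results = []
task = make_task(results)

# A process pool would have to pickle the task to ship it to a worker process.
picklable = can_pickle(task)

# A thread simply calls the task in the same memory space, no pickling involved.
worker = threading.Thread(target=task)
worker.start()
worker.join()

print("picklable:", picklable)
print("results:", results)
```

The thread appends to the shared list without any serialization, while the same closure is rejected by pickle. That is the kind of friction I wanted to avoid.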

However, it’s fairly easy in Python to write a simple thread pool implementation. All we really need is a thread-safe blocking queue, a task interface, and a thread implementation that waits for tasks to appear on the blocking task queue. That is exactly why I decided to pack all of that into a single Python library called BreadPool.
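Those three pieces can be sketched in a few dozen lines. This is a minimal illustration built on the standard library, not BreadPool’s actual code. The class and method names (Task, FunctionTask, Worker, SimpleThreadPool, enqueue, terminate) are assumptions made up for the example.

```python
import threading
import traceback

try:
    import queue  # Python 3
except ImportError:
    import Queue as queue  # Python 2.7


class Task(object):
    """Hypothetical task interface: subclasses implement execute()."""

    def execute(self):
        raise NotImplementedError


class FunctionTask(Task):
    """Wraps a plain callable and its arguments as a Task."""

    def __init__(self, func, *args, **kwargs):
        self.func, self.args, self.kwargs = func, args, kwargs

    def execute(self):
        return self.func(*self.args, **self.kwargs)


class Worker(threading.Thread):
    """Waits on the shared blocking queue and runs tasks until stopped."""

    def __init__(self, tasks, polling_timeout):
        super(Worker, self).__init__()
        self.daemon = True
        self.tasks = tasks
        self.polling_timeout = polling_timeout
        self.stop_requested = threading.Event()

    def run(self):
        while not self.stop_requested.is_set():
            try:
                task = self.tasks.get(timeout=self.polling_timeout)
            except queue.Empty:
                continue  # nothing queued; go back and re-check the stop flag
            try:
                task.execute()
            except Exception:
                traceback.print_exc()  # a failing task must not kill the worker
            finally:
                self.tasks.task_done()


class SimpleThreadPool(object):
    """Fixed number of workers sharing one thread-safe task queue."""

    def __init__(self, size, polling_timeout=1):
        self.tasks = queue.Queue()
        self.workers = [Worker(self.tasks, polling_timeout) for _ in range(size)]
        for worker in self.workers:
            worker.start()

    def enqueue(self, task):
        self.tasks.put(task)

    def join(self):
        self.tasks.join()  # block until every queued task has been processed

    def terminate(self):
        for worker in self.workers:
            worker.stop_requested.set()
        for worker in self.workers:
            worker.join()


squares = []
squares_lock = threading.Lock()


def record_square(n):
    with squares_lock:
        squares.append(n * n)


pool = SimpleThreadPool(3, polling_timeout=0.1)
for n in range(10):
    pool.enqueue(FunctionTask(record_square, n))
pool.join()
pool.terminate()
print(sorted(squares))
```

No matter how many tasks are queued, only three threads ever run them. The polling timeout is there so that an idle worker wakes up regularly and notices when it has been asked to stop.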

In the past, there were several occasions where I needed a proper thread pool implementation but didn’t have the time to write one from scratch. That resulted in several ad hoc thread pool implementations scattered everywhere. It’s better to have a thread pool implementation that is a mere pip install away.

BreadPool can be installed from PyPI and used immediately.

pip install breadpool

from breadpool.pool import ThreadPool

thread_pool = ThreadPool(5, "CustomThreadPool", polling_timeout=1)

This makes sure that we have a set of worker threads numbering no more than 5. You can find more documentation in the project’s README on GitHub.

BreadPool also includes a scheduled task executor, which submits a given task to a given thread pool repeatedly, with a given time interval in between. It’s meant to be a thread-safe way to schedule a task without writing your own scheduled executor for Python 2.7. It’s designed with Java’s ScheduledExecutorService in mind, but still lacks a few features.
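The general idea behind such an executor can be sketched with a single scheduler thread. This is an illustration under my own assumptions, not BreadPool’s implementation. RepeatingSubmitter and its parameters are hypothetical names, and submit can be any callable that accepts a task, such as a pool’s enqueue method or, as here, a queue’s put.

```python
import threading

try:
    import queue  # Python 3
except ImportError:
    import Queue as queue  # Python 2.7


class RepeatingSubmitter(threading.Thread):
    """Hypothetical helper: hands `task` to `submit` every `interval` seconds."""

    def __init__(self, submit, task, interval):
        super(RepeatingSubmitter, self).__init__()
        self.daemon = True
        self.submit = submit
        self.task = task
        self.interval = interval
        self.cancel_event = threading.Event()

    def run(self):
        # Event.wait() returns True once cancel() has been called, which ends
        # the loop; otherwise it times out after `interval` and we submit again.
        while not self.cancel_event.wait(self.interval):
            self.submit(self.task)

    def cancel(self):
        self.cancel_event.set()


submitted = queue.Queue()
scheduler = RepeatingSubmitter(submitted.put, "heartbeat", interval=0.05)
scheduler.start()
first = submitted.get(timeout=5)  # blocks until the first scheduled submission
scheduler.cancel()
scheduler.join()
print(first)
```

Waiting on an Event instead of sleeping means a cancel takes effect immediately rather than after the current interval has run out. The scheduler thread only submits; the task itself still runs on whatever the pool provides.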

BreadPool doesn’t depend on anything other than the Python 2.7 standard library, and I will try to keep it that way in the future. So it doesn’t drag in anything unexpected.

The released version is 0.0.5, licensed under the Apache License v2.0. Feel free to download and use BreadPool. The code base is small, and the work was short, but I figured it would save you some time when it comes to worrying about thread pools.

Originally published at chamilad.github.io on December 10, 2015.


Written on December 10, 2015 by chamila de alwis.

Originally published on Medium