Python Bits — Using Threads
This is the second in the series of Python blog posts I’m writing; you can find the first one here. In this one we’ll add threads to our Imgur album downloader, hopefully making it a bit faster than before.
First off, you may be wondering whether threads are even useful in Python, given the infamous GIL (in CPython, of course). Luckily, in our case the threads will spend most of their time waiting on network activity, and CPython releases the GIL while a thread is blocked on I/O, so other threads get to run instead of everything stalling behind one of them. So threads are actually beneficial here, since most of the time we’ll be doing either network I/O (while downloading the images) or file I/O (while writing the images to disk).
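To make the GIL point concrete, here is a small sketch that compares doing four waits one after another with doing them in four threads. The function names are made up for illustration, and time.sleep stands in for the network wait, on the assumption that it behaves like blocking I/O as far as the GIL is concerned.

```python
import threading
import time


def fake_download(delay):
    # time.sleep stands in for waiting on the network. Like real blocking
    # I/O, it releases the GIL so other threads can run in the meantime.
    time.sleep(delay)


def run_sequential(n, delay):
    # Do the n waits one after another and return the elapsed time.
    start = time.perf_counter()
    for _ in range(n):
        fake_download(delay)
    return time.perf_counter() - start


def run_threaded(n, delay):
    # Do the n waits in n threads and return the elapsed time.
    start = time.perf_counter()
    threads = [
        threading.Thread(target=fake_download, args=(delay,))
        for _ in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


sequential_time = run_sequential(4, 0.2)
threaded_time = run_threaded(4, 0.2)
print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

The threaded run should take roughly as long as a single wait, because the four waits overlap. CPU-bound work would not see this kind of gain under the GIL.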
Since this is Python, using threads is quite simple: we just need to import the
threading module, create a Thread instance, and then start it. While creating
the instance we can tell it what function to run and, if needed, pass arguments
to that function through the thread we have created. We then call the join
function from the main thread so that we can wait for our thread to finish.
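The steps above can be sketched in a few lines. The greet function and its arguments are placeholders invented for this example; only threading.Thread, start and join come from the standard library.

```python
import threading

results = []


def greet(name, punctuation="!"):
    # The function the thread will run. It receives the arguments that
    # were handed to Thread through args and kwargs.
    results.append(f"Hello, {name}{punctuation}")


worker = threading.Thread(
    target=greet,
    args=("Imgur",),                # positional arguments, as a tuple
    kwargs={"punctuation": "?"},    # keyword arguments, as a dict
)
worker.start()  # begin running greet in the new thread
worker.join()   # block the main thread until greet has returned

print(results)
```

Note that it is start, not run, that launches the new thread; calling run directly would simply execute the function in the calling thread.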
We’ll just call our download_img function from each thread, telling each one to
download a different picture. One problem we might face now is with the progress
bar: since the threads run concurrently with our main thread, our for loop will
finish as soon as it has launched all the threads, and thus the progress bar
would reach 100% before all the images have finished downloading.
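The problem is easy to reproduce. In this sketch download_img is a hypothetical stand-in that just sleeps (the real one from the previous post is not shown here), and the URLs are made up.

```python
import threading
import time

finished = []


def download_img(url):
    # Hypothetical stand-in for the real download_img: the sleep plays
    # the role of the network transfer.
    time.sleep(0.2)
    finished.append(url)  # list.append is thread-safe


# Made-up URLs, for illustration only.
urls = [f"https://example.com/img{i}.jpg" for i in range(5)]

threads = []
for url in urls:
    t = threading.Thread(target=download_img, args=(url,))
    t.start()
    threads.append(t)

# The loop is done, but the downloads are still in flight.
done_when_loop_ended = len(finished)

for t in threads:
    t.join()
done_after_join = len(finished)

print(f"{done_when_loop_ended}/{len(urls)} done when the loop ended")
print(f"{done_after_join}/{len(urls)} done after join")
```

A progress bar that is advanced once per loop iteration would be counting launched threads here, not finished downloads.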
To counter this, we’ll update the progress bar manually as each thread completes.
This is how we do it:
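Here is a minimal sketch of that idea. It assumes a tiny stand-in ProgressBar class with a max_value argument and an update method, a hypothetical download_img that only sleeps, and made-up URLs; a real progress bar library would take the place of the stand-in.

```python
import threading
import time


class ProgressBar:
    """Tiny stand-in for a real progress bar (hypothetical, for illustration)."""

    def __init__(self, max_value):
        self.max_value = max_value
        self.percent = 0.0

    def update(self, value):
        # value == max_value corresponds to 100%.
        self.percent = 100.0 * value / self.max_value
        print(f"\r{self.percent:5.1f}%", end="", flush=True)


# Made-up URLs, for illustration only.
urls = [f"https://example.com/img{i}.jpg" for i in range(10)]

bar = ProgressBar(max_value=len(urls))
completed = 0            # shared counter, only touched while holding the lock
lock = threading.Lock()


def download_img(url):
    global completed
    time.sleep(0.05)  # placeholder for the actual download and file write
    # Only one thread at a time may bump the counter and redraw the bar.
    with lock:
        completed += 1
        bar.update(completed)


threads = [threading.Thread(target=download_img, args=(url,)) for url in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()
print()  # move past the progress bar line
```

Using the lock as a context manager means it is released even if the code inside the block raises an exception.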
The max_value tells the progress bar how many items we have; when the count
reaches that number, the bar should be at 100%. To update the progress bar, we
take a lock, increment a counter variable, and use that variable to update the
bar. The lock is necessary to prevent multiple threads from updating the same
global variable simultaneously and messing up the whole thing.
Phew, that was quite some work, with locks and all. In the next post, we’ll move from these messy threads to the new and shiny async-await style for writing asynchronous code.