Description
Hmmmm, how can I make this faster? I have an idea: I'll just run it in parallel.
Luckily I am working with Python, and we have PEP 20:
There should be one— and preferably only one —obvious way to do it.
So what is the obvious way to do it?
There are at least four popular packages to do this: multiprocessing, subprocess, threading, gevent
:FacePalm:
This talk will cover the main concurrency paradigms, show you the pros and cons of each, and give you a framework for picking the right solution for your project.
Objectives
Attendees will learn the main concurrency options in both Python 2.7 and Python 3, and will leave with a framework for determining which approach is best for them.
Detailed Abstract
Concurrency is hard. For the average developer there is a lot of ramping up to do just to solve what seem like simple problems:
“I want to check the status of 1000 URLs.”
“How can I run my test suite in parallel?”
“I have millions of jobs on a queue — what is the best way to spawn workers to process them?”
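To make the first problem concrete, here is a minimal sketch of checking many URLs with a pool of worker threads pulling from a shared queue. The `check_url` function is a hypothetical stand-in for a real HTTP request (e.g. via `urllib.request`), so the sketch stays self-contained:

```python
import queue
import threading

def check_url(url):
    # Hypothetical stand-in for a real HTTP request; a real version
    # would fetch the URL and return its actual status code.
    return url, 200

def worker(tasks, results):
    while True:
        try:
            url = tasks.get_nowait()
        except queue.Empty:
            return  # no work left; let the thread exit
        results.put(check_url(url))

urls = ["http://example.com/%d" % i for i in range(100)]
tasks = queue.Queue()
for url in urls:
    tasks.put(url)
results = queue.Queue()

# Ten workers drain the queue concurrently.
threads = [threading.Thread(target=worker, args=(tasks, results))
           for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

statuses = dict(results.get() for _ in range(results.qsize()))
```

Because the bottleneck here is I/O, threads work well despite the GIL.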
With Python you have many options, and each one does a certain thing well. Here we will explain the tools in our toolbelt so you can pick the right tool for the problem you are trying to solve.
threading: an interface for threads, mutexes, and queues
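As a small taste of those primitives, a sketch of a `threading.Lock` (a mutex) guarding a shared counter across several threads:

```python
import threading

counter = 0
lock = threading.Lock()  # mutex guarding `counter`

def increment(n):
    global counter
    for _ in range(n):
        with lock:  # without the lock, updates could interleave and be lost
            counter += 1

threads = [threading.Thread(target=increment, args=(10000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter is now exactly 4 * 10000
```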
multiprocessing: similar to threading, but offers local and remote process-based concurrency (with some gotchas)
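A minimal sketch of that threading-like interface, using a process pool to map a function across inputs in separate processes (which sidesteps the GIL for CPU-bound work):

```python
from multiprocessing import Pool

def square(n):
    # Runs in a separate worker process.
    return n * n

def parallel_squares(values, workers=4):
    with Pool(workers) as pool:
        return pool.map(square, values)

if __name__ == "__main__":
    print(parallel_squares(range(10)))
```

The `__main__` guard is one of the gotchas: on platforms that spawn rather than fork, worker processes re-import the module, so unguarded top-level code would run again in every child.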
subprocess: allows you to spawn new processes, with minimal support for memory sharing, but great for a lot of things
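A minimal sketch of spawning an external command and capturing its output with `subprocess.run` (Python 3.7+ for `capture_output`; assumes a Unix-style `echo` is on the PATH):

```python
import subprocess

# Run a command, capture stdout/stderr as text, and inspect the result.
result = subprocess.run(["echo", "hello"], capture_output=True, text=True)
output = result.stdout.strip()  # "hello"
```

Since each command runs in its own process with its own memory, communication is limited to pipes, exit codes, and files, which is exactly why it shines for shelling out to existing tools.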
gevent: a coroutine-based Python networking library that uses greenlets
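A minimal sketch of spawning greenlets with gevent (a third-party package, `pip install gevent`). `monkey.patch_all()` swaps blocking stdlib calls like `time.sleep` for cooperative versions, so the greenlets yield to each other instead of blocking:

```python
from gevent import monkey
monkey.patch_all()  # patch the stdlib before other imports

import gevent
import time

def task(n):
    time.sleep(0.01)  # cooperative after patch_all: yields to other greenlets
    return n * 2

jobs = [gevent.spawn(task, i) for i in range(5)]
gevent.joinall(jobs)
results = [job.value for job in jobs]  # [0, 2, 4, 6, 8]
```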
Outline
- Background — This is hard
- Threads
- Processes
- Pipes
- GIL
- Subprocesses
- How to use
- Joins
- Pipes
- Good use cases
- Multiprocessing
- How to use
- Sharing Memory (SyncManager)
- Handling Interrupts
- Good use cases
- Gevent
- How to use
- Monkey Patching
- Good Use Cases
- Threading
- How to use
- Locks, Conditions, Timers
- Good Use Cases
- Summary
- “Do not cross the streams”
- Decision Framework
- What about Tulip (asyncio)?