tushman.io

The musings of an insecure technologist

PyCon Proposal - Pragmatic Concurrency

Description

Hmmmm, how can I make this faster? I have an idea: I’ll just run it in parallel.

Luckily I am working with Python, and we have PEP 20:

There should be one— and preferably only one —obvious way to do it.

So what is the obvious way to do it?

There are at least four popular packages for this: multiprocessing, subprocess, threading and gevent.

:FacePalm:

This talk will cover the main concurrency paradigms, show you the pros and cons of each, and give you a framework for picking the right solution for your project.

Objectives

Attendees will learn the main concurrency options in both Python 2.7 and Python 3, and will leave with a framework for determining which approach is best for them.

Detailed Abstract

Concurrency is hard. As a lay-developer there is a lot of ramping up to figure out how to solve what would seem like simple problems:

“How do I check the status of 1000 URLs?”

“How can I run my test suite in parallel?”

“I have millions of jobs on a queue — what is the best way to spawn workers to process them?”

With Python you have many options, and each one does a certain thing well. Here we will explain the tools in our toolbelt so you can pick the right tool for the problem you are trying to solve.

threading: an interface for threads, mutexes and queues

multiprocessing: has an API similar to threading, but runs work in separate processes and offers both local and remote concurrency (with some gotchas)

subprocess: lets you spawn new processes, with minimal support for sharing memory, but it is great for a lot of things

gevent: a coroutine-based Python networking library that uses greenlets
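To make the first and third of these concrete, here is a rough sketch rather than a recommended pattern. It assumes Python 3.5+ (for subprocess.run), and the function names square_with_threads and square_with_subprocess are made up for illustration. The threading version shares a dict between workers and guards it with a lock; the subprocess version shares nothing and only talks to the child over a pipe.

```python
import queue
import subprocess
import sys
import threading


def square_with_threads(numbers, num_workers=4):
    """Square each number using worker threads fed by a shared queue.

    Hypothetical helper for illustration only.
    """
    tasks = queue.Queue()
    results = {}
    results_lock = threading.Lock()  # mutex guarding the shared dict

    def worker():
        while True:
            n = tasks.get()
            if n is None:  # sentinel: no more work for this worker
                break
            with results_lock:
                results[n] = n * n

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for n in numbers:
        tasks.put(n)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()  # wait for every worker to finish
    return results


def square_with_subprocess(n):
    """Square a number in a separate Python process, reading its stdout pipe.

    Hypothetical helper for illustration only.
    """
    child_code = "import sys; print(int(sys.argv[1]) ** 2)"
    completed = subprocess.run(
        [sys.executable, "-c", child_code, str(n)],
        stdout=subprocess.PIPE,
        check=True,  # raise if the child exits with a non-zero status
    )
    return int(completed.stdout.decode().strip())
```

Squaring numbers is obviously not worth parallelizing; the point is only the shape of each approach. Threads share memory and need a lock, while a subprocess is isolated and you pay for that isolation by serializing everything through pipes.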

Outline

  1. Background — This is hard
    1. Threads
    2. Processes
    3. Pipes
    4. GIL
  2. Subprocesses
    1. How to use
    2. Joins
    3. Pipes
    4. Good use cases
  3. Multiprocessing
    1. How to use
    2. Sharing Memory (SyncManager)
    3. Handling Interrupts
    4. Good use cases
  4. Gevent
    1. How to use
    2. Monkey Patching
    3. Good Use Cases
  5. Threading
    1. How to use
    2. Locks, Conditions, Timers
    3. Good Use Cases
  6. Summary
    1. “Do not cross the streams”
    2. Decision Framework
    3. What about tulip?

Additional Notes:

  • My parallelized version of lettuce is open sourced here
  • I have other open-source libraries; you can find them here
  • This is my first time speaking at PyCon. I have spoken at Boston Python. My slides for that talk are here
  • I sometimes write about Python. My blog is here
