Python has a GIL, and lots of complainers

nkurz · on July 9, 2010

Wow, that's quite a self-referential post! I don't know what Gustavo's role is within Python development, but I really don't think he understands the motivation of the core developers like Brett.

Here's from Brett's original:

  I have a finite amount of time to volunteer in helping 
  make Python what it is. Having people push upon me that   
  they either think I am failing because I can't find a 
  solution to a problem (the "complainers") or that the 
  solution that I help create is not good even though it 
  has not been implemented and proven to be a failure (the 
  "what about this?" folks) just sucks away that much more 
  time, preventing me from helping make Python better for 
  YOU. It gets to the point that I sometimes wonder why I 
  put in so much time and effort when so many people gripe 
  about the volunteer work that I put in to produce this 
  thing that is given to the world for free.

And then Gustavo:

  I apologize, but I have a very hard time reading this and 
  not complaining. 
  ...
  I know this is just yet another complaint, though. I 
  honestly cannot fix the problem either, and rather just 
  talk about it in the hope that someone who’s able to do 
  it will take care of it.

Disgusting. Could you at least make an effort to convince the developer that they personally would benefit from the massive amount of work you are demanding of them, instead of just trying to bully them into it?

More from Gustavo:

  I can’t provide ideas or solutions.. I can’t fix the 
  problem.. they don’t even care about the problem. Why am 
  I using this thing at all?

Go a little farther: how does the developer benefit from your use of this software? If you can figure out how to answer this question in the positive, you've probably figured out a much better strategy for convincing others to solve your problems for you. If you can't, perhaps there is some other software that you could buy? If the developers don't care about the 'problem', maybe it's not their problem after all?

nailer · on July 9, 2010

I think it's OK to have an opinion on anything you like. You shouldn't expect anyone to fix it for free though.

I've never written a VM, and I don't think I'd be very good at it. I've written useful modules other people use, spoken at a Python conference, and have other OSS code in various places. I'm not a succubus, at least I hope I'm not. I don't even mind multithreading in Python, since I use Threads and Queues and they work just fine for my purposes. But I'm sure, other more cooler things would happen if CPython had the same kind of multithreaded coolness that Jython and IronPython have. I don't think it makes me a bad person if I say that.

nkurz · on July 10, 2010

I agree: it's good to have and to express opinions. But as you point out there is a real difference between "wouldn't it be cool if..." and "you really should be working on this problem of mine" instead of doing whatever it is you currently think is more important.

I don't even really know who is right about a Python GIL. When working with C, I have a strong preference for separate processes and mmaps. When working with Perl (which I use a lot more than Python), I've never really felt much urge to use threading. Aesthetically, although I haven't really used it, I think Erlang's message passing approach is much better than native threads. But maybe Gustavo's right --- better threading would indeed allow lots of cool things to be done.

That aside, what saddens me about the exchange between Gustavo and Brett is the misconnect. It strikes me as obvious that the key for Gustavo here is to show that it's in Brett's self-interest to improve threading in Python. I didn't see any attempt of this. Instead, he seems to be taking exactly the tack Brett just finished saying was ruinous to his morale.

jnoller · on July 10, 2010

Except Gustavo is wrong in that he has to convince Brett that free threading is a Good Thing. Part of Brett's post was we already know that - but we're not willing to sacrifice backwards compatibility for it, and we do not have/have not thought of any good solutions to fix it.

So having people say "this sucks you guys should fix it" 100x a day, with increasing levels of "you owe me" or "you suck" tone in their voices is depressing and frustrating.

j_baker · on July 10, 2010

I don't disagree with you, but that's just part of working on a popular project. There are plenty of armchair language designers who could of course do everything better than you.

jnoller · on July 10, 2010

Oh, I agree with you! But I think we both know that while we all end up growing thick skins (or quitting) it can still get under your nails and make you sad sometimes.

Especially when you've been at it awhile. Guido must have skin like a bloody rhinoceros.

silentbicycle · on July 10, 2010

> I've never written a VM, and I don't think I'd be very good at it.

It's really not that hard. You don't even have to use C, you can do it in Python. Can you write a brainfuck interpreter? Congratulations, you now have a virtual machine with eight bytecodes. Build from there.

Kein-Hong Man's "No-Frills Guide" to the Lua VM documents one set of reasonably general bytecodes. (http://luaforge.net/docman/view.php/83/98/ANoFrillsIntroToLu...) There's a good section on virtual machines in the last chapter of _The Practice of Programming_, too.

jnoller · on July 10, 2010

The trick to writing a good interpreter is not in the writing of it; it's in the part wherein you manage to completely avoid making any decisions you might regret in 10 years.

silentbicycle · on July 10, 2010

Oh, sure. :) But a lot of people seem to find them impossibly intimidating, one of those things that are "perceived to be magical artifacts, carefully crafted by the wizards, and unfathomable by the mere mortals."* They're really not that tricky at an implementation level, until/unless you're going for extremely high performance.

Keeping reverse compatibility is a huge pain, though. No argument there!

* From Abdulaziz Ghuloum's excellent "An Incremental Approach to Compiler Construction" (http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf).

jacobolus · on July 10, 2010

It makes you an annoying person if you say it without any research, and without even reading any of the decade of accumulated identical discussions, on the mailing list devoted to python development, distracting contributors from actually fruitful discussions and developments.

tomjen3 · on July 10, 2010

>I've never written a VM, and I don't think I'd be very good at it

Don't be so sure, there are a lot of research in this area and it is not black magic.

aarongough · on July 10, 2010

I agree! I'm actually in the process of writing my first VM right now, and it's not as hard as you would think...

mkramlich · on July 10, 2010

All projects are easy at the start. See I just open this empty file here and start typing...

aarongough · on July 10, 2010

Whoever said I was at the start? The prototype is nearly done, and already working in the meantime.

mkramlich · on July 10, 2010

"The prototype is nearly done."

you're still pretty early in the project's lifecycle. software projects generally move fastest and have the least difficult challenges early on, as a general rule, in my experience. once you have a lot of users, and legacy dependencies, and have to work right on N platforms, and handle edge cases, and be performant, and the codebase becomes so big you no longer remember all the details, and maybe a few years have passed since you were working on it full-time, etc., then you'll find that velocity and agility decreases.

JabavuAdams · on July 10, 2010

Disgusting. Could you at least make an effort to convince the developer that they personally would benefit from the massive amount of work you are demanding of them, instead of just trying to bully them into it?

Come on, it's 2010. Multi-core is here to stay. It's disgusting that a general-purpose language like Python has such a horrible default properties on multi-core.

rbanffy · on July 10, 2010

Care to explain what real-world problems you had with the GIL?

I am usually the first guy to say the GIL must go, but, quite frankly, I have learned to make Zope+Plone and Django stacks scale without messing with it. If you have a 32-core server and you want to serve, say, a Zope site, just use ZEO (ZEORaid, perhaps) and 31+ ZEO clients behind your Varnish cache.

The GIL must go, but the fact the very clever people who have tried to make threading nicer have tried to get rid of it and failed indicates this is not an easy problem.

It's not about getting rid of the GIL. It's about getting rid of it without breaking compatibility with the mountain of modules written in C there are - and that experience little or no GIL problems by themselves.

And, mind you, we are talking about CPython, the canonical implementation. IIRC, neither Jython nor PyPy have the same kind of multi-core performance problem. I am not sure, but if you write your performance-critical module in RPython and then compile it into C, you won't have many GIL problems either.

While the GIL is the biggest problem I can see in CPython, for most usage scenarios, it's not a big one.

JabavuAdams · on July 10, 2010

At one point, I started writing a 3D game engine in Python, with the understanding that I'd profile and if necessary re-write the necessary parts.

I've also used Panda3D, which has a lot of Python-wrapped C++ code.

Now, I don't have specific benchmarks to point to, but it's important not "prematurely pessimize", just as it's important not to optimize prematurely. In high-performance 3D games, you'd switch compilers for say a 10% gain in optimized code performance.

Ever since I've become aware of how deep the GIL problems go, I can't really justify starting to write an engine in Python, if I'm just going to have to re-write critical and error-prone parts in C++ anyway.

There's a more general principle here that applies to languages and runtimes, as well as library design. You want to be able to get started with a new component or language / runtime easily, trading ease of use for performance, BUT as the project progresses you want a way to be sure that you could beat-down or recover from any problems in a way that doesn't essentially require a re-write.

These GIL issues remove that confidence in Python, for me. It's not a win if I can only write the easy parts of my code in Python. I specifically want to use Python to write the hard parts of my code.

I agree that there doesn't seem to be an easy, or even moderately difficult solution to this problem. That makes it even worse from the point of view of advocating Python use on a project.

To summarize, it's reasonable to say "don't optimize until you know there's a problem", but you can come at it the other way from a risk-management perspective. "What is the risk that there's some dead-end due to this language / runtime that we just won't be able to overcome (because the smartest dudes who know the language inside-out, can't fix it either)."

JabavuAdams · on July 10, 2010

It's sad, because I'm a huge fan of higher-level languages, but even dealing with other people's broken or retarded bindings to C or C++ libraries is a huge time sink.

At a certain point it just becomes more productive to write in C++, never mind the FUD about programmer productivity.

That said, for my current project I'm using Unity, and writing script code in Javascript and C#, but this is more about having a toolset that Just Works out of the box. It's not clear that I would choose that approach if I had to write it from scratch.

fauigerzigerk · on July 10, 2010

Yes, the question is whether Python intends to support the growing number of usage scenarios for which the GIL is a problem. You could write something like Cassandra or Lucene using a combination of Python and C, but the GIL prevents that.

rbanffy · on July 10, 2010

You could write something like Cassandra or Lucene in Python. Python is a language (much like Java, BTW). You could then run it under Jython or PyPy. The GIL is unique do CPython, AFAIK.

fauigerzigerk · on July 10, 2010

PyPy and Jython lag far behind the current version of Python, which is defined by the implementation of CPython. Also, as I said, I would like to use a combination of Python and C. That's for a reason. The JVM does not have structured value types (structs in C). All structured types in Java have to be referenced, which is the reason why Java software eats so much more memory than software implemented in C, C++, C# or Go. Maybe PyPy can work some day, if they can catch up. That day is definately not today.

rbanffy · on July 10, 2010

PyPy and Jython may lag behind, but that's not a huge problem. I would love to use dict comprehensions and {} sets like I can in CPython 2.7, but I know I can't always deploy against the latest version. And, more than that, I know I can work around those features. The code may look uglier, but it's usually as fast as it would be with the newest features.

And you are right - CPython is the canonical implementation of the language. The fact we can use a reference implementation in production environments is great.

You probably wouldn't be able to write something like Cassandra in Python 2.7 and expect it to run unmodified under Jython or IronPython, but you could write it in Python 2.5 and run it under both 2.7 CPython (with the expected multi-core annoyances) or under current Jython.

And you can't have painless C modules in Java either.

j_baker · on July 10, 2010

Actually, I believe PyPy has a GIL as well (unless they've removed it recently).

rbanffy · on July 10, 2010

Oops... May have gotten that one wrong, but the point stands. Even if Jython is the only GIL-free implementation of Python, you can run your thread-heavy code on it.

There is no such thing as "You wouldn't be able to do [cool-multi-core-product] in Python because of the GIL"

fuzzyman · on July 10, 2010

IronPython is another GIL free Python implementation.

jnoller · on July 10, 2010

Are you sure you couldn't write it?

fauigerzigerk · on July 10, 2010

I could write it, but it would either be dog slow or take 10 times as much time to write. The only way of using multiple cores would be to implement everything on top of shared memory. That implies using no pointers in any of the data structures and implementing a custom garbage collector.

jnoller · on July 10, 2010

Alright, so it's not the optimal language for something with a large amount of mutable shared state being hacked at by threads. No one ever tried to bill it as the "one true language" - maybe it's better for glue in this case, or APIs or clients, etc. Maybe it needs some jython, etc.

This is why multiple languages (a diverse toolbox) and multiple interpreters each with their pros and cons is also a good thing. Each one will have its shortcomings, and it's almost impossible to achieve all of the goals of all of the modern programming languages in a single language.

So, sure - I'd love to see python used for this problem - and all problems but thats me being irrational, that will never be true. What is true is that eventually the GIL will go away, or a non cPython interpreter will rise above cPython as the defacto standard.

fauigerzigerk · on July 10, 2010

I don't see why it is irrational to wish that a general purpose programming language be able to run data intensive, parallel server code. It's not exactly asking for HTML to be modified so you can write signal processing code in it.

The sad thing is that a Python + C combination would actually be _more_ suitable for the task than any of the JVM based language implementations, because the JVM has the unfortunate feature of eating memory like crazy due to the lack of structured value types.

This type of application is all about conserving memory. The core data structures must be extremely tight. But they make up only like 5% of the code. So Python + C would be way superior to Java for something like Cassandra if it wasn't for the GIL.

I guess I have to practice meditation for a few years so I can one day be as relaxed about these things as you are ;-)

rbanffy · on July 11, 2010

It's not irrational. But, since you are willing to rely on C in addition to Python, why not code the parallelizable parts in C and leave the rest in Python.

And you can always write a proof-of-concept in Python, prove the algorithm is right and rewrite the performance-sensitive parts in C. If your tests are thorough, all is good.

fauigerzigerk · on July 11, 2010

That's unfortunately not an option because all requests would have to go through that single threaded bottleneck to get to the parallelized C code, which is only acceptable for long running operations. For a request/response type of scenario like Cassandra, Lucene or the kind of analytics systems I'm working on it would make the whole parallelization pointless.

The realistic option is to use a separate C based process behind the usual multi-process mod_python setup. That works but is more complicated and error prone than an in-process solution. So I think that's why so many new projects that have these kinds of scenarios (NoSQL for instance) go with Java in spite of its flaws.

jnoller · on July 10, 2010

Well, if it was I/O bound - the GIL shouldn't bother it too much ;)

Don't mistake me for calm though, it drives me nuts :)

shadowsun7 · on July 10, 2010

nkurz isn't arguing about multi-core (and various benefits thereof). He's arguing about convincing the developer that there are benefits of adding such support, instead of 'just complaining'.

On a personal note, I must say here that I've never considered the differences between the two, and I'm taking this to heart.

ahk · on July 10, 2010

Here's an analogy I thought up: You go to a shanty town and announce "free apartments for everybody", and the people are excited and help you build the new apartments. When nearly complete, you announce "oops looks like we didn't design for toilets here. Guess you'll all have to go back to the shanty town for all that". I think it's understandable that people would be pissed. Pointing out that you got all that other stuff built isn't really going to cut it.

swannodette · on July 9, 2010

It's interesting that people clamor for a language feature that doesn't work well if you don't have:

  1) Cheap immutable data structures

  2) Excellent language support for managing mutable state

  3) A sophisticated VM designed around concurrency (parallel GC)

Python has none of these. Getting rid of GIL won't make parallel programming in Python any easier. It'll just show you how hard it really is.

The other option is to adopt a language that does provide these things. I can think of a new Lisp that celebrates pragmatism over purism that fits the bill pretty well...

alnayyir · on July 9, 2010

People don't want to learn, work, strive, experiment, or otherwise do anything that doesn't involve having the world handed to them on a platter.

A lot of people use Python because it was batteries included, because everything you ever wanted was already done and made for you. I work in Python on a daily basis, but I cringe regularly at the #python channel on Freenode and the kinds of people the language attracts.

Python, quite simply, attracts people who don't want to code or learn anything.

Those aren't the kinds of people who are going to learn Clojure, sadly.

Rubyists, god bless them, are a little more adventurous, for better or worse.

nostrademons · on July 10, 2010

That's not really fair. A lot of Pythonistas are also quite strong Haskell, Scheme, or even Clojure programmers. On their own time. But the nice thing about Python is that they can also get paid to write it.

The part about the GIL that I hate is that it directly affects my ability to use Python for work. There's a good chance that a Python-based system that I love will be replaced with Java because the GIL simply isn't workable in a multithreaded environment, and there're a bunch of other engineering reasons why the system really should be multithreaded. I don't suffer any GIL-related problems in my personal programming, because my hobby projects simply don't get to the scale where it matters. But most of my hobby hacking is in Haskell anyway.

jnoller · on July 10, 2010

What's the use case at work that's making it unworkable - just curious.

Ixiaus · on July 9, 2010

The same, if not worse, can be said for PHP - where I came from. I've been a casual Python user for some time and a serious Python programmer for the last six months; the Python community is a godsend in comparison with that of PHP's.

Granted, there are helpful and genius people in both communities but there is a larger share of "muggles" in the PHP camp due it's low barrier to entry (everyone wants a website!).

tomjen3 · on July 10, 2010

Downvoters, yeah he is harse but he still has a point - Python is famous for coming with batteries included so it is not unreasonably assume that the people whom it attracts are those who want a language like that.

But he is wrong with those people not wanting to learn Clojure: since it runs on the JVM, you can use any Java library to do what you want. I doubt there are as many libraries for Python as there are for Java.

l0nwlf · on July 9, 2010

"Python, quite simply, attracts people who don't want to code or learn anything." should be rephrased as "Python, quite simple, attracts people who don't want language to stand in their way." What you are commenting is harsh and does not really apply in general.

clemesha · on July 9, 2010

This type of generalization are what language wars are made of.

alnayyir · on July 10, 2010

Actively answer all questions that you can in #python for two weeks. 6-pack if you can do it without groaning to yourself.

ori_b · on July 10, 2010

Actively answer all questions you can in #popular-project for two weeks. 6-pack if you can do it without groaning to yourself.

shadowsun7 · on July 10, 2010

That's a tenuous link you're making there, about Python being batteries included, and so therefore attracting people who suck.

A perhaps more credible generalization would be - as a language becomes more popular, more and more people that 'you' cannot stand would appear. Attributing such phenomena to the language's innate features ... well you'd probably need to spend more time justifying that claim before we may accept it as true.

andraz · on July 9, 2010

Brett in the comments points out that they would accept a patch doing away with GIL if it keeps C module compatibility.

However the GIL cannot be done away without breaking compatibility with existing C modules while keeping the performance up. That's impossible requirement. Doing away with GIL means doing away with refcounting, or the performance penality is unavoidable. Doing away with refcounting means major changes of third party modules C API.

The issue people have is that Python 3.0 is introducing a bunch of stuff that makes code incompatible in different ways - mostly for aesthetical reasons, but does not deliver stuff that people are calling for again and again - performance and dropping the GIL.

Oh, and for people saying "use multiple processes" have never really had a problem where the process has multiple gigabytes of state that it needs to do its work. The state that can be easily shared in multithreaded mode, but can be only shared with a great deal of manual work in multiprocess situation.

Yes, you then do distribution over multiple machines too, but that doesn't solve your problem that you cannot afford running a separate python process on each of 12 cores, each process taking 10gb of memory. Because you only have 32gb.

However complaining won't help. Maybe Unladen Swallow or PyPy eventually resolve the issue, but it doesn't look like it's on their priority list either. Well, though luck for us lazy programmers who are getting all this stuff for free.

So thank you Brett, Guido and others working relentlessly on Python. It's great. You know what bothers us, but even if you don't ever fix it, I'll still be grateful for all the hard work you did!

jnoller · on July 10, 2010

Actually, the good news is that Python 3 already has an improved GIL - it's much better than the 2.x version modulo one or two issues, and Unladen-Swallow's jit work is already well, well under way in the Python SVN tree. I'm hoping the latter will hit Python 3.2.

All told, removing the GIL without throwing all of the C extensions and other work people have done through the years onto a bonfire is really, really hard. We're getting incremental speed improvements, and if we're lucky, unladen-swallow will allow us to jump ahead performance wise very quickly.

As a side note; I agree with you - threads have their place, many of my apps make heavy, heavy use of python threads. I also use multiprocessing when I know it's CPU bound. You have to know which is the right tool, and when.

andraz · on July 12, 2010

Do not be fooled, right now Python 3 does _nothing_ to increase parallelism of two python codepaths being able to be executed at the same time.

It just fixes some pathological cases where GIL behaved even worse than it should.

Unladen doesn't try to fix GIL either and it is doubtful than it will be able to do so, PyPy has a greater chance of achieving that, but it's still not on the roadmap.

aaronblohowiak · on July 10, 2010

"Oh, and for people saying "use multiple processes" have never really had a problem where the process has multiple gigabytes of state that it needs to do its work. The state that can be easily shared in multithreaded mode, but can be only shared with a great deal of manual work in multiprocess situation."

Doesn't python have fork with copy-on-write semantics?

thwarted · on July 10, 2010

The copy-on-write semantics are a function of fork, an OS call (which is why it's available in python as os.fork), not a function of python. So if your OS supports COW for forked processes, python will have it.

Check the man pages for fork(2) and vfork(2) for fork related COW information. There's some interesting stuff in there.

As a side note, there's an interesting write-up on the portability of fork between UNIX and win32 in the perlfork man page (forgive me for mentioning perl in a python related thread), and the hoops that needed to be jumped through to emulate fork on win32 and keep it compatible with real fork from the visibility of the perl script itself.

unshift · on July 10, 2010

copy-on-write doesn't write back to the original process, which may be desired. there are situations where a single process space are preferred (multiple threads sharing a file descriptor, etc) so "just use processes" isn't always the right answer. it can be done, but it's not always the nicest solution.

andraz · on July 12, 2010

You do know that every temporary storage of a an object in python causes an increase of refcount which is a write? Each of those writes causes a 4kb block to be copied, and they are eventually scattered around enough to matter.

What we've done is mmaping multi-gb files into memory by separate processes. It works, but it's a cumbersome workaround which makes you do way more work than you would ever wish to be able to use multiple cores.

reitzensteinm · on July 10, 2010

"If you’re tired of hearing the same arguments again and again for 10 years, from completely different people, there’s a pretty good chance that there’s an actual issue with your project, and your users are trying in their way to contribute and interact with you in the hope that it might get fixed."

I completely disagree. The design of complex software involves tradeoffs, and even if you make very good decisions, there will always be disadvantages to your approach that will get pointed out over and over. You can never make everyone happy.

jnoller · on July 10, 2010

Thank you; that's one of the nicest ways I've heard it put.

jrockway · on July 10, 2010

In theory, the maximum speedup you could achieve by making Python perfectly multi-threaded on a 4 core machine is 4x. In comparison, rewriting the critical section in Haskell (a very Python-like language), you'll get a 50x speedup. On one core.

The "dynamic language" family is not for high-performance number-crunching. It's for gluing together extremely complex applications, at which it excels. A 2x speedup just doesn't matter much.

(You can use cooperative multitasking to scale out anyway; I can write a Perl application that easily handles 30,000 open TCP connections each with its own thread of execution; a stack, C stack, etc. And 30,000 is an OS limit, not a Perl limit... if I had more sockets, I could serve many more connections. Remember, most things that are hard to program are complex systems. Writing the performance-intensive-but-simple parts in a different language is easy and effective. Why waste the Python core developers' time making it good at something it's going to be bad at, when they could be spending the time making it better at something it's good at?)

nostrademons · on July 10, 2010

Actually, no, it's worse than that. Threaded code on Python often runs slower on multiple CPUs than on a single CPU. David Beazley ran some numbers on a simple benchmark than indicated that running the same program with two threads on two CPUs was twice as slow than running the same program on a single thread on one CPU. It's not just a matter of not being able to use extra cores: the GIL actively slows down the other threads through context-switching overhead:

http://blip.tv/file/2232410

Also, the serialization introduced by the GIL hurts worst when you're gluing together complex applications. Say that you have a finely-tuned C++ app that spends about 20% of its time in an embedded Python interpreter (number chosen to make the math easy). Now you make the C++ multithreaded and run it on four cores. The 80% of the CPU time spent in C++ gets parallelized effectively, so now it's only 20% of the original runtime. But the Python access is all serialized by the GIL, so it's still 20% of the original time, or 50% of the new runtime. At this point, you're highly likely to get thread contention issues between different threads attempting to acquire the GIL (as per the video), so your Python performance goes down even more - which makes thread contention even more likely, and so on.

By contrast, if you'd rewritten the Python in Haskell, the most you could get was a 25% speedup, because most of the execution time was spent in C++ anyway.

jrockway · on July 10, 2010

Yeah; I've read that many implementations of GIL-less Python make it significantly slower overall. Slow on one core, slow on four cores. Great.

With repsect to your second example, I would argue that if you're gluing, you put the glue on the outside of the core -- hence, your single-threaded Python wrapper would be running your fast, multi-threaded Haskell or C++. In that case, the only pain point is supplying data and interpreting the results, which is usually well inside the realm of Perl/Python/Ruby's capabilities.

nostrademons · on July 10, 2010

Most large-size mixed Python/C++ systems have multiple layers:

http://c2.com/cgi/wiki?AlternateHardAndSoftLayers

"Inside" or "outside" of the core doesn't have much meaning in this context. Typically, you have a C++ driver that provides your application's main(), which invokes the Python interpreter, which can then call back into C libraries. Or you might wrap that itself with a Python script. That's the approach that the Tornado web server takes: a Python driver program, which enters an epoll select loop, which calls back into the Python to handle requests, which may themselves call functions written in C. The point is that it's not just Python scripts calling super-fast C libraries, it's also often large C++ apps invoking Python scripts for lesser-used features.

dfox · on July 9, 2010

I think that GIL is pretty tangential to multi-core. If you want something to execute in parallel on SMP system it is always better to have it in separate address spaces. This is even more true in dynamic language runtime, because almost anything you do (such as accessing global namespace) requires some kind of synchronization and for CPU-intensive code slowdown from this synchronisation tends to be larger than from sequential execution with GIL for almost any significant number of concurrent threads (such as 3 threads on Core 2 Quad on Linux 2.6.2something)

blasdel · on July 9, 2010

The people making serious real-world efforts at this kind of thing have problems bigger than one machine -- they need to be distributed anyway, so why not focus on multi-process solutions?

fauigerzigerk · on July 10, 2010

How do you imagine the design of individual processes participating in that distributed architecture? Each of them is going to have to keep a lot of data in memory and allow concurrent access to it.

Your argument is like saying we don't have to bother with wheels on aircraft because flying is a much bigger problem than rolling on wheels.

runT1ME · on July 9, 2010

> requires some kind of synchronization and for CPU-intensive code slowdown from this synchronisation tends to be larger than from sequential execution with GIL for almost any significant number of concurrent threads (such as 3 threads on Core 2 Quad on Linux 2.6.2something)

what? You're saying the absolute worst case synchronization scenario is on par with the GIL, so you might as well not have it?

CrLf · on July 10, 2010

I'm not a very heavy user of Python, but I have used it for a few projects. I've never hit the need to have multiple threads in a Python project, but I wonder:

Is the GIL that much of a bottleneck? Isn't the multiprocessing module and its locking primitives enough for most work where CPU-bound processing needs to be distributed over multiple cores? When that fails can't that part of the software be done in C with Python as a wrapper language?

I see people here complaining about how the GIL limits their use of Python for serving requests and whatnot. Stuff that looks like the canonical example of something that can be solved by a multi-process module.

But I also see people complaining about stuff that is really CPU-bound, without seeing them state that they already overcome the main problem with this stuff, which is making them (theoretically) parallel in the first place.

I'm from an unix background, where threading is as heavy (or as lightweight) as multi-processing. Maybe much of the complainers come from Windows, where the equivalent of fork is very heavy compared to threading. Maybe the problem is under Python, and not really the GIL.

I'm not saying that the GIL is not a problem. But if it hasn't been removed yet, maybe the problems that come from removing it are even worse.

And Python isn't the only tool around. You can combine languages in a single project...

jnoller · on July 10, 2010

As the maintainer of multiprocessing, I can say I've never really been bothered by the GIL. I know about it - I know that if I have CPU bound tasks, I'm going to better off using processes, ergo, multiprocessing. I spin up some processes, pop in a queue and I'm off the the races.

That said - the majority of my code uses plain old python threads. Web load testing tools, subprocess execution, etc, etc - anything with I/O works fine for me contained in threads and since thats where I spend most of my time (in I/O) they work fine for me.

In my last "multiprocessing heavy" chunk of code, I wasn't using it for local processes and work-sharing. I was using it to spread work over a network of hosts using managers and the other tools within it.

The one gotcha with jumping in between the two is serialization. When you deal with multiprocessing, your objects which are passed in between the pools/queues/processes much be pickle-able - this means for tasks which contain lots of unpickle-able, shared state multiprocessing is not a good answer.

Essentially, having free threading in cPython would mean that you could have your cake (concurrency on multiple cores) and eat it (without incurring serialization, mutable shared state) too.

ramchip · on July 10, 2010

I tend to use threads for simple I/O cases, not as an optimization per se. For example, I have an image processing library thanks to which I can do a simple image.save(), but since I don't want to sit idle for milliseconds until it returns I have a thread pool run it. Same thing if I have an expensive operation and I want the GUI to keep responding.

Then again, I'll confess that I am not much of a Python user either and multiprocessing may be a sufficient answer.

aaronbrethorst · on July 9, 2010

I don't know anything about Python, and had no idea what the GIL was until I scanned through the article, and found out a little more about it 5 paragraphs from the end. Please follow the inverted pyramid structure! http://en.wikipedia.org/wiki/Inverted_pyramid

Here's how the blog post could have started:

"CPython, the standard Python implementation, cannot use coroutines, lightweight processes, fork/join frameworks, and other non-sequential programming techniques due to its Global Interpreter Lock, or GIL. Brett Cannon, a Python core developer, unfairly dismisses this fundamental flaw: <quote from brett>"

mfukar · on July 10, 2010

Let me start by saying I don't like the GIL. It's limiting and buggy (last week we discovered a bug involving GIL and on-demand imports for constructing Unicode objects...it's hiding pretty well). However, if I were to judge it as a design decision and not as something I can "fix", I would seriously say I'm neutral about it.

Why? It's pretty simple: since the language doesn't have any acceptable way to provide concurrency with threads, I'm going OS on its ass. And so should you, "complainers" or not. The solution is probably not ideal, and it's cetrainly A Bad Thing to many of you, but Unix programming showed the way. That's right. We should be doing more of it (in this context). A lot more of this (again, in this specific context). I’m talking about fork(2), execve(2), pipe(2), socketpair(2), select(2), kill(2), sigaction(2), and so on and so forth. These are our friends. They want so badly just to help us.

Sure, it's not always nice. Typically, if you want to share a large amount of state, you'd have to do it manually, which would result in code not so pretty as the Python lovers usually write. That may be a disadvantage to them. This point is moot, because I remember my professors screaming to us to avoid shared state as much as possible when doing concurrent programming - if you need to do it, you should rethink your solution before you rethink your tools.

I've agreed with Brett on multiple occasions in the past about this; people seem to think that changing the way Python is implemented to encompass more and more of certain features the way they like them would somehow make the language more powerful and popular and all around awesome. That's not true. Most of the "complainers" are people that chose Python because of its power, not wanting to actually learn anything other than an API. Well, folks, you can't have a single tool for every job.

cageface · on July 9, 2010

I'm skeptical that allowing threads in the interpreter is really the right away to achieve good concurrent performance. Xavier Leroy makes the case better than I can:

http://caml.inria.fr/pub/ml-archives/caml-list/2002/11/64c14...

microtonal · on July 9, 2010

Did you read this quote near the end?

"Shared-memory multiprocessors have never really 'taken off', at least in the general public."

The assumptions that held in 2002 seem to be gone these days. There are many 8-16 CPU users these days, especially in OCaml's main public (academia).

cageface · on July 9, 2010

True. In that respect the situation has changed quite a bit and you might imagine that the potential gains might be significantly higher.

The extra complexity still sounds pretty daunting though. You're going to make development of Python core for the common case (single-threaded) a lot harder for the benefit of the exceptional case. I think it's fair to say that most of the people that really need full-bore SMP performance are going to want a faster, lower-level language than Python to do their heavy lifting in anyway. Game developers are probably ahead of most of the rest of us in this area.

dfox · on July 9, 2010

I think that many VM programmers tend to overestimate the effort required for implementation of fully multithreaded VM/interpreter. I suppose that better angle to look at this issue is: does fully multithreaded interpreter solve any real issue?

ErrantX · on July 9, 2010

All I will say on this matter (to any post about the GIL) is that Python is a language I love very much.

But my main work use is concurrency and threading; writing that native threading in C is fine and dandy, but it does seriously offset the benefits of doing the rest in Python.

(for the record; I do understand and agree with the rationale behind needing the GIL - but it is frustrating to come up against it when trying to push python to your boss as a "really simple language we could hack this up with")

jnoller · on July 10, 2010

I never had this problem; we use python threads heavily at work - they're all I/O bound and they work just fine. We switch to multiprocessing when it's CPU bound. Works pretty well.

JabavuAdams · on July 10, 2010

Unfortunately, this makes CPython fairly useless for high-performance games and tools.

Obviously, that's not its primary market, but this GIL thing is a non-obvious land-mine that people need to be aware of when considering Python for a project.

The problem with the "just re-write the slow parts in C or C++" argument is that in my experience, this isn't actually faster than just carefully designing C++ from the start, using modern libraries. It's really only faster for people who are really awful at memory management.

keytweetlouie · on July 10, 2010

Thanks for pushing to get python in a more competitive position. Even if you don't like threads it's a bad idea to not let anyone use them. With growing multi-core machines developers want all the options available to them. I don't want my language to tell me what I can't do. Running python on other vm's is a fine idea but the most effort goes into cpython NOT jython or others. I wouldn't want great features on a VM that gets little attention from the python core developers. Ironically I'm complaining and not helping.sorry. I would love to tackle these issue's if I had the time.

xenophanes · on July 9, 2010

Python programs execute in sequence. No Fork/Join frameworks, no coroutines, no lightweight processes, nothing. Your Python code will execute in sequence if it lives in the same process space.

The answer from Brett and Guido to concurrency? Develop your code in C, or write your code to execute in multiple processes.

Could someone explain this to me? It says no fork/join to get concurrency, but then it says you can use multiple processes. Fork makes another process. I'm confused. What's a "process space"?

fjh · on July 9, 2010

It says "no Fork/Join frameworks", not "no Fork/Join". You can definitely fork processes in Python: http://docs.python.org/library/os.html#os.fork

Using different processes allows you to execute code in parallel, but different processes can't access each other's memory, while threads within a process share their memory space and can therefore operate on the same data.

xenophanes · on July 9, 2010

Is there a different kind of fork that would make "fork/join frameworks" work?

afhof · on July 9, 2010

I am sure it has been said before, but it might not be a bad idea to try what Linux is doing. The BKL is used less and more granular locks are starting to take its place.

pmjordan · on July 9, 2010

It's not quite so easy. The kernel is in a special position in that it can distinguish between and defend against two types of race condition:

- races caused by different CPUs accessing the same resource. Unlike userspace, the kernel knows what each CPU is doing (roughly) at any given time.

- races caused by preemption. Basically, this means the running thread/process is paused and the CPU is scheduled to run a different thread/process which then does something that interferes with whatever the original thread/process was in the middle of doing.

First, the kernel can prevent the latter altogether by marking sections of code non-preemptible. What this means for races is that you can stop the CPU from being forced to context switching while accessing a contented resource, thus guaranteeing speedy progress out of the critical section.

Secondly, there are spinlocks. This means that if a thread of execution tries to gain exclusive control over a contented resource, instead of relinquishing control over the CPU (thus rescheduling/context switching), it just sits there in a live loop waiting for the resource to be freed. On the surface, this seems like a bad idea: the CPU can't do anything while it's waiting for another CPU to do its stuff with the contended resource, and it's just wasting precious cycles. However, because the kernel is a known entity, it's possible to guarantee that all resource accesses will be extremely short because the critical section running on another CPU is non-preemptible. Generally, this technique is used when the critical sections are (much) shorter than the cost of a context switch.

In userspace, the two types of races are indistinguishable, and you don't have separate weapons for fighting them. Spinlocks are a really bad idea in userspace unless you know your software is the only (major) user of CPU time on the system, and only runs in as many threads as there are CPUs.

tl;dr: The kernel (a) has more information and (b) more control, so you can't apply those techniques to Python.

btilly · on July 9, 2010

That is not necessarily a good idea. It is quite a few years old, but http://www.bitmover.com/llnl/smp.pdf is very informative of potential problems in the path that Linux is heading down.

metachris · on July 9, 2010

The comments are a great read!

j_baker · on July 10, 2010

Am I the only heavy python user who didn't know who Brett Cannon is?

jackdied · on July 11, 2010

Sadly no, but now I know for sure you aren't Jim Baker (the jython commiter).

Most python users have never heard of anyone other than Guido. That's fine because people who contribute to open source projects don't do it for the fame, and that's reasonable because you can't be expected to memorize the names of all the people who wrote the tools you use. That said, it can be frustrating being not Guido (Brett gets heaps of credit inside the community but zero percent of Guido's name recognition).

1st story: my first PyCon I went out to the smokers deck and introduced myself (this was '02 or '03 - I'm not sure we even had name badges). The other guys turned out to be Tim Peters, Christian Tismer, and Alex Martelli. Python jackpot.

2nd story: I told that first story to a random lunch table of people at PyCon last year (as a rule I eat lunch with no one I know at cons) and the collective reaction was: who?

So yeah, don't feel bad about not knowing who Brett is (but do buy him a beer if you run into him). And definitely don't get involved in open source for the fame. [Bonus trivia: Gustavo also has a commit bit, as do a couple other people in this thread]