Brett points out in the comments that they would accept a patch doing away with the GIL if it keeps C module compatibility.
However, the GIL cannot be done away with while keeping compatibility with existing C modules and keeping performance up - that's an impossible requirement. Doing away with the GIL means doing away with refcounting, or the performance penalty is unavoidable; and doing away with refcounting means major changes to the C API that third-party modules rely on.
The issue people have is that Python 3.0 introduces a bunch of changes that break code in various ways - mostly for aesthetic reasons - yet doesn't deliver the things people keep asking for again and again: performance and dropping the GIL.
Oh, and the people saying "use multiple processes" have never really had a problem where the process needs multiple gigabytes of state to do its work - state that is trivially shared between threads, but can only be shared between processes with a great deal of manual work.
Yes, you eventually distribute over multiple machines too, but that doesn't solve the problem that you cannot afford to run a separate Python process on each of 12 cores, each taking 10 GB of memory, when you only have 32 GB.
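To illustrate (the index and the sizes below are made up, not from anyone's real app), here's a toy sketch of why threads are attractive in that situation: every thread reads the same in-memory structure, so it exists once rather than once per core.

    import threading

    # Stands in for gigabytes of read-only state; a multi-process design would
    # need a copy per worker, or an explicit sharing mechanism.
    big_index = {i: str(i) * 100 for i in range(1_000_000)}

    def worker(keys, results, slot):
        # Every thread reads the one shared dict directly - no serialization, no copies.
        results[slot] = sum(len(big_index[k]) for k in keys)

    results = [0] * 4
    threads = [threading.Thread(target=worker, args=(range(i, 1_000_000, 4), results, i))
               for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(sum(results))

Of course, under the GIL those threads don't actually run the Python code in parallel - which is exactly the complaint.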
However, complaining won't help. Maybe Unladen Swallow or PyPy will eventually resolve the issue, but it doesn't look like it's high on their priority lists either. Well, tough luck for us lazy programmers who are getting all this stuff for free.
So thank you Brett, Guido and everyone else working relentlessly on Python. It's great. You know what bothers us, and even if you never fix it, I'll still be grateful for all the hard work you've done!
Actually, the good news is that Python 3 already has an improved GIL - it's much better than the 2.x version, modulo one or two issues - and Unladen Swallow's JIT work is already well under way in the Python SVN tree. I'm hoping the latter will land in Python 3.2.
All told, removing the GIL without throwing all of the C extensions and other work people have built up over the years onto a bonfire is really, really hard. We're getting incremental speed improvements, and if we're lucky, Unladen Swallow will let us jump ahead performance-wise very quickly.
As a side note, I agree with you - threads have their place, and many of my apps make heavy, heavy use of Python threads. I also use multiprocessing when I know the work is CPU-bound. You have to know which is the right tool, and when.
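As a rough sketch of that choice (the URL and the workload here are just placeholders): threads pay off for I/O-bound work because the GIL is released while waiting on the socket, while CPU-bound work needs separate processes to use more than one core.

    import threading
    from multiprocessing import Pool
    from urllib.request import urlopen

    def fetch(url, sizes, i):
        # I/O-bound: the GIL is released while blocked on the network.
        with urlopen(url) as resp:
            sizes[i] = len(resp.read())

    def crunch(n):
        # CPU-bound: only separate processes can use multiple cores under the GIL.
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        urls = ["https://www.python.org"] * 4
        sizes = [0] * len(urls)
        threads = [threading.Thread(target=fetch, args=(u, sizes, i))
                   for i, u in enumerate(urls)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        print(sum(sizes))

        with Pool(4) as pool:
            print(sum(pool.map(crunch, [5_000_000] * 4)))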
Don't be fooled: right now Python 3 does _nothing_ to let two Python code paths actually execute at the same time.
It just fixes some pathological cases where the GIL behaved even worse than it should.
Unladen Swallow doesn't try to fix the GIL either, and it's doubtful that it will be able to; PyPy has a greater chance of achieving that, but it's still not on their roadmap.
"Oh, and for people saying "use multiple processes" have never really had a problem where the process has multiple gigabytes of state that it needs to do its work. The state that can be easily shared in multithreaded mode, but can be only shared with a great deal of manual work in multiprocess situation."
Doesn't Python have fork with copy-on-write semantics?
The copy-on-write semantics are a property of fork, an OS call (which is why it's exposed in Python as os.fork), not of Python itself. So if your OS supports COW for forked processes, Python gets it too.
Check the man pages for fork(2) and vfork(2) for fork-related COW information. There's some interesting stuff in there.
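A minimal sketch of what that looks like from Python, assuming a POSIX system (the table below just stands in for real state): the child gets a copy-on-write view of the parent's memory, so nothing is physically duplicated until one side writes.

    import os

    big_table = {i: i * i for i in range(1_000_000)}   # built once, before the fork

    pid = os.fork()
    if pid == 0:
        # Child: sees the same data without it being copied up front.
        print("child sees", len(big_table), "entries")
        os._exit(0)
    else:
        # Parent: waits for the child; both sides share the pages until one writes.
        os.waitpid(pid, 0)
        print("parent still has", len(big_table), "entries")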
As a side note, there's an interesting write-up in the perlfork man page on the portability of fork between UNIX and Win32 (forgive me for mentioning Perl in a Python thread), and the hoops that had to be jumped through to emulate fork on Win32 while keeping it compatible with real fork from the perspective of the Perl script itself.
Copy-on-write doesn't propagate writes back to the original process, which may be what you need, and there are situations where a single process space is preferred (multiple threads sharing a file descriptor, etc.), so "just use processes" isn't always the right answer. It can be done, but it's not always the nicest solution.
You do know that every temporary reference to an object in Python increments its refcount, which is a write? Each of those writes causes a 4 KB page to be copied, and they end up scattered around enough to matter.
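To make that concrete, here's a rough Linux-only sketch (my own illustration; it relies on /proc/self/smaps_rollup, which needs a reasonably recent kernel): the forked child only reads the data, yet the refcount writes dirty the pages the objects live on.

    import os

    def private_dirty_kb():
        # Amount of memory this process has written to and therefore un-shared (Linux).
        with open("/proc/self/smaps_rollup") as f:
            for line in f:
                if line.startswith("Private_Dirty:"):
                    return int(line.split()[1])
        return 0

    data = [str(i) * 50 for i in range(1_000_000)]   # a few hundred MB of small objects

    pid = os.fork()
    if pid == 0:
        # Child: shares the parent's pages via copy-on-write at first.
        before = private_dirty_kb()
        _ = sum(len(s) for s in data)                # "read-only" at the Python level...
        after = private_dirty_kb()                   # ...yet refcount writes copy the pages
        print("child dirtied ~%d MB just by iterating" % ((after - before) // 1024))
        os._exit(0)
    os.waitpid(pid, 0)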
What we've done is mmap multi-GB files into memory from separate processes. It works, but it's a cumbersome workaround that makes you do far more work than you should have to just to be able to use multiple cores.
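Roughly what that workaround looks like, with a hypothetical file name and a trivial per-chunk task standing in for the real processing: every worker maps the same file, so the physical pages are shared through the page cache instead of being duplicated per process.

    import mmap
    import os
    from multiprocessing import Pool

    DATA_FILE = "big_dataset.bin"    # hypothetical multi-GB input file
    _view = None

    def _init():
        # Each worker process maps the whole file read-only; the OS shares the pages.
        global _view
        f = open(DATA_FILE, "rb")
        _view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    def count_newlines(chunk):
        start, end = chunk
        return _view[start:end].count(b"\n")

    if __name__ == "__main__":
        size = os.path.getsize(DATA_FILE)
        step = size // 12 or 1
        chunks = [(i, min(i + step, size)) for i in range(0, size, step)]
        with Pool(12, initializer=_init) as pool:
            print(sum(pool.map(count_newlines, chunks)))

The cumbersome part is exactly what the code hides: the data has to live in a file with a layout every process can interpret, rather than as ordinary Python objects.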