
> To be fair, I allocated the stacks manually in one large allocation; otherwise it dies quite quickly running out of VM mappings.

Okay, so the test you did doesn't actually reflect the use case in practice. Can I expect to reach 200,000 threads if the threads are not all created at exactly the same moment? What if (God forbid) they're doing memory allocation? And if it does work out, will everything be handled efficiently?



Hope comex replies to your question. Typical green thread usage is spawn-em-as-you-need-em, so if in order to spawn lots of 1:1 threads I need to do it all up front, that could be very limiting or complicated.


Yeah, I just made a mistake - you can increase the maximum number of mappings using /proc/sys/vm/max_map_count. I tried doing that and switching back to normal stack allocation (but still specifying the minimum size of 16KB using pthread_attr_setstacksize), and it didn't change the number of threads I was able to create.

...in fact, neither did removing the setstacksize call and having default 8MB stacks. I guess this makes sense: of course the extra VM space reserved for the stacks doesn't require actual RAM to back it; there is some page table overhead, but I guess it's not enough to make a significant difference at this number of allocations. Of course, on 32-bit architectures this would quickly exhaust the address space.

If increasing max_map_count hadn't worked, it would still be possible to allocate stacks on the fly - but you would get a bunch of them in one mmap() call, and therefore in one VM mapping, and dole them out in userland. However, in that case guard pages wouldn't separate different threads' stacks, so you would have to generate code that manually checks the stack pointer to avoid security issues from stack overflows, rather than relying on crashing by hitting the guard page. Rust actually already does this, mostly unnecessarily; I'm not sure what Go is doing these days, but I think it does too. Anyway, given the above result I suspect this won't be an actual issue, at least until the number of threads goes up by an order of magnitude or so.


Thanks for your reply. Wonder how other platforms fare.


That doesn't seem that unrealistic — you could allocate your stacks using slab allocation, for example. I wonder why the kernel allocator doesn't do a better job, though.



