Hacker News
Stories from November 6, 2011
1. Missing Person - Tracy Williams, Technorati Employee (technorati.com)
525 points by sbisker on Nov 6, 2011 | 56 comments
2. Programmers' salaries at Google $250k (and up) (jacquesmattheij.com)
406 points by llambda on Nov 6, 2011 | 265 comments
3. Teachers write their own online textbook, save district $175,000 (therepublic.com)
312 points by tokenadult on Nov 6, 2011 | 100 comments
4. You're a developer, so why do you work for someone else? (intermittentintelligence.com)
234 points by cotsog on Nov 6, 2011 | 107 comments

From CTO of 10gen

First, I tried to find any client of ours with a track record like this and have been unsuccessful. I personally have looked at every single customer case that’s ever come in (there are about 1600 of them) and cannot match this story to any of them. I am confused as to the origin here, so answers cannot be complete in some cases.

Some comments below, but the most important thing I wanted to say is if you have an issue with MongoDB please reach out so that we can help. https://groups.google.com/group/mongodb-user is the support forum, or try the IRC channel.

> 1. MongoDB issues writes in unsafe ways by default in order to win benchmarks

The reason for this has absolutely nothing to do with benchmarks, and everything to do with the original API design and what we were trying to do with it. To be fair, the uses of MongoDB have shifted a great deal since then, so perhaps the defaults could change.

The philosophy is to give the driver and the user fine-grained control over acknowledgement of write completions. Not all writes are created equal, and it makes sense to be able to check on writes in different ways. For example, with replica sets you can do things like “don’t acknowledge this write until it’s on nodes in at least 2 data centers.”
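
To make the "2 data centers" rule concrete, here is a minimal sketch in plain Python. It only models the decision and is not driver or server code: the member names, the "dc" tag, and both function names are hypothetical, chosen for illustration.

```python
# Hypothetical model of a "nodes in at least 2 data centers" acknowledgement
# rule. This is an illustration only, not MongoDB's implementation.

def distinct_tag_values(acked_members, member_tags, tag):
    """Return the distinct values of `tag` among members that acknowledged."""
    return {
        member_tags[member][tag]
        for member in acked_members
        if tag in member_tags.get(member, {})
    }


def write_is_acknowledged(acked_members, member_tags, tag="dc", required=2):
    """True once the write is on members spanning `required` distinct tag values."""
    return len(distinct_tag_values(acked_members, member_tags, tag)) >= required


# Assumed topology: two members in one data center, one in another.
MEMBER_TAGS = {
    "node-a": {"dc": "ny"},
    "node-b": {"dc": "ny"},
    "node-c": {"dc": "sf"},
}
```

Under this model, a write held by node-a and node-b is on two nodes but only one data center, so it would not yet count as acknowledged; adding node-c satisfies the rule.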

> 2. MongoDB can lose data in many startling ways

> 1. They just disappeared sometimes. Cause unknown.

There has never been a case of a record disappearing that we have not been able to trace either to a bug that was fixed immediately or to environmental issues. If you can link to a case number, we can at least try to understand or explain what happened. Clearly a case like this would be incredibly serious, and if this did happen to you, I hope you told us and that we were able to understand and fix it immediately.

> 2. Recovery on corrupt database was not successful, pre transaction log.

This is expected: repair was generally meant for single servers, and running a single server is itself not recommended without journaling. If a secondary crashes without journaling, you should resync it from the primary. As an FYI, journaling is the default and almost always used in v2.0.

> 3. Replication between master and slave had gaps in the oplogs, causing slaves to be missing records the master had. Yes, there is no checksum, and yes, the replication status had the slaves current

Do you have the case number? I do not see a case where this happened, but if true, it would obviously be a critical bug.

> 4. Replication just stops sometimes, without error. Monitor your replication status!

If you mean that an error condition can occur without issuing errors to a client, then yes, this is possible. If you want verification that replication is working at write time, you can do it with the w=2 getLastError parameter.
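
As a rough illustration of the w=2 option, the sketch below builds the command document that would be sent after a write. The helper name and the timeout value are assumptions made for this example; it only constructs a dictionary and does not talk to a server.

```python
# Sketch: build a getLastError command document asking for acknowledgement
# from 2 replica set members. The helper name is hypothetical.

def get_last_error_command(w=2, wtimeout_ms=None):
    """Return a getLastError command document as a dict.

    The command name is inserted first because the server reads the first
    key as the command (Python 3.7+ dicts keep insertion order).
    """
    command = {"getlasterror": 1, "w": w}
    if wtimeout_ms is not None:
        # Optional: give up waiting for replication after this many ms.
        command["wtimeout"] = wtimeout_ms
    return command


# Assumed usage: wait for 2 members, but no longer than 5 seconds.
command = get_last_error_command(w=2, wtimeout_ms=5000)
```

The resulting document could then be passed to whatever run-command call your driver provides. Without a timeout, a w=2 request can block for as long as the second member is unreachable, which is why the sketch exposes one.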

> 3. MongoDB requires a global write lock to issue any write

> Under a write-heavy load, this will kill you. If you run a blog, you maybe don't care b/c your R:W ratio is so high.

The read/write lock is definitely an issue, but a lot of progress has been made, with more to come. 2.0 introduced better yielding, reducing the scenarios where locks are held through slow IO operations. 2.2 will continue the yielding improvements and introduce finer-grained concurrency.

> 4. MongoDB's sharding doesn't work that well under load

> Adding a shard under heavy load is a nightmare. Mongo either moves chunks between shards so quickly it DOSes the production traffic, or refuses to move chunks altogether.

Once a system is at or exceeding its capacity, moving data off is of course going to be hard. I talk about this in every single presentation I’ve ever given about sharding[0]: do not wait too long to add capacity. If you try to add capacity to a system at 100% utilization, it is not going to work.

> 5. mongos is unreliable

> The mongod/config server/mongos architecture is actually pretty reasonable and clever. Unfortunately, mongos is complete garbage. Under load, it crashed anywhere from every few hours to every few days. Restart supervision didn't always help b/c sometimes it would throw some assertion that would bail out a critical thread, but the process would stay running. Double fail.

I know of no such critical thread. Can you send more details?

> 6. MongoDB actually once deleted the entire dataset

> MongoDB, 1.6, in replica set configuration, would sometimes determine the wrong node (often an empty node) was the freshest copy of the data available. It would then DELETE ALL THE DATA ON THE REPLICA (which may have been the 700GB of good data)

> They fixed this in 1.8, thank god.

Cannot find any relevant client issue, case, or commit. Can you please send something that we can look at?

> 7. Things were shipped that should have never been shipped

> Things with known, embarrassing bugs that could cause data problems were in "stable" releases--and often we weren't told about these issues until after they bit us, and then only b/c we had a super duper crazy platinum support contract with 10gen.

There is no crazy platinum contract, and every issue we ever find is put into the public jira. Every fix we make is public. Fixes have cases which are public. Without specifics, this is incredibly hard to discuss. When we do fix bugs we will try to get the fixes to users as fast as possible.

> 8. Replication was lackluster on busy servers

This simply sounds like a case of an overloaded server. As I mentioned before, if you want guaranteed replication, use the w=2 form of getLastError.

> But, the real problem:

> 1. Don't lose data, be very deterministic with data

> 2. Employ practices to stay available

> 3. Multi-node scalability

> 4. Minimize latency at 99% and 95%

> 5. Raw req/s per resource

> 10gen's order seems to be, #5, then everything else in some order. #1 ain't in the top 3.

This is simply not true. Look at commits, look at what fixes we have made when. We have never shipped a release with a secret bug or anything remotely close to that and then secretly told certain clients. To be honest, if we were focused on raw req/s we would fix some of the code paths that waste a ton of cpu cycles. If we really cared about benchmark performance over anything else we would have dealt with the locking issues earlier so multi-threaded benchmarks would be better. (Even the most naive user benchmarks are usually multi-threaded.)

MongoDB is still a new product, there are definitely rough edges, and a seemingly infinite list of things to do.[1]

If you want to come talk to the MongoDB team, both our offices hold open office hours[2] where you can come and talk to the actual development teams. We try to be incredibly open, so please come and get to know us.

-Eliot

[0] http://www.10gen.com/presentations#speaker__eliot_horowitz
[1] http://jira.mongodb.org/
[2] http://www.10gen.com/office-hours

6. What being hopelessly single taught me about pitching tech celebs (geekwire.com)
184 points by iseff on Nov 6, 2011 | 48 comments
7. AI are taking jobs (economist.com)
184 points by bmahmood on Nov 6, 2011 | 154 comments

HN has a curious relationship to money some of the time.

a) Many of us watched our classmates go into management consulting, investment banking, finance, medicine, law, etc. These are all fields where $250k is not exactly outlandish for a 29 year old.

b) We have a lot of friendly, accessible fellow posters who write computer code and are financially very successful.

c) We talk about $X million valuations and $YY million acquisitions and $Z billion IPOs the way catering company owners discuss the price of tomatoes.

d) We have heard many credible people complain of how difficult it is to hire and retain engineers. So difficult, in fact, that they'd pay $10k+ just for an introduction or $50k+ for an actual placement.

e) We understand incentive structures and equity grants exist.

f) We routinely read industry news like "Google and Facebook in heated war for talent", "The going rate for an acquisition-to-hire is $1 million per engineer", "Four large technology firms were engaged in a gentlemen's agreement to conspire against their employees until the government told them it was cartelicious", "Productivity per engineer is going through the roof", "Company Z supports Q00,000 paying customers per engineer", "Efforts of individual engineers have succeeded in adding millions to billions of dollars of value to some companies", etc etc etc, and generally seem to be at least as savvy as C-students in Microecon 101.

and yet

g) Talk of engineers receiving wages above some magic threshold is met with disbelief, scorn, and a wee bit of jealousy.

It isn't hugely interesting to me what any particular person at a particular company is making, but is $250k an outlandish total compensation number? No, it is clearly achievable. Do you have to be a programming demigod to achieve it? No, compensation moves along several different axes and programming ability isn't the main one. Is this anecdote a freak of nature which we'll never see again? No, signs indicate that this will become increasingly common over time.

9. Why the cheapest maple syrup is the best (theatlantic.com)
164 points by vwoolf on Nov 6, 2011 | 54 comments
10. Ask HN: Startup promised me a job, then backed out after the internship
151 points by throwaway87 on Nov 6, 2011 | 88 comments
11. 1TB Hard Drive Prices up 180% in a Month (zorinaq.com)
143 points by mrb on Nov 6, 2011 | 53 comments
12. TouchFire: Finally a real keyboard for the iPad (kickstarter.com)
134 points by kiriappeee on Nov 6, 2011 | 50 comments
13. Steve Jobs and Apple's Influence on Gaming Massively Overstated (forbes.com/sites/insertcoin)
130 points by drey on Nov 6, 2011 | 96 comments
14. JavaScript's sweet spot (blog.mrale.ph)
115 points by paufernandez on Nov 6, 2011 | 18 comments
15. The Pure CSS3 Content Slider (iamceege.com)
111 points by J3L2404 on Nov 6, 2011 | 15 comments
16. Khan Academy Gets 5 Million to Expand Faculty & Platform & to Build a School (hackeducation.com)
99 points by apievangelist on Nov 6, 2011 | 31 comments
17. Voyager 2 to Switch to Backup Thruster Set (nasa.gov)
96 points by J3L2404 on Nov 6, 2011 | 56 comments
18. MongoGate — or let's have a serious NoSQL discussion (erlang.de)
94 points by zeit_geist on Nov 6, 2011 | 32 comments
19. The Tax Haven That's Saving Google Billions (businessweek.com)
95 points by bishnu on Nov 6, 2011 | 60 comments
20. Growl 1.3.1 is out, no longer free (itunes.apple.com)
91 points by avirambm on Nov 6, 2011 | 99 comments

Hi,

I run engineering for foursquare. About a year and a half ago my colleagues and I made the decision to migrate to MongoDB for our primary data store. Currently we have dozens of MongoDB instances across several different data clusters storing over a TB of data and handling 10s of thousands of requests per second (mostly reads but the write load is reasonably high as well).

Have we run into problems with MongoDB along the way? Yes, of course we have. It is a new technology and problems happen.

Have they been problematic enough to seriously threaten our data? No they have not.

Have Eliot and the rest of his staff @ 10gen been extremely responsive and helpful whenever we run into problems? Yes, absolutely. Their level of support is amazing.

MongoDB is a complicated beast (as are most datastores). It makes tradeoffs that you need to understand when thinking about using it. It's not necessarily for everyone. But it most certainly can be used by serious companies building serious products. Foursquare is proof of that.

I'm happy to answer any questions about our experience that the HN community might have.

-harryh

22. CoffeeScript Means Giving Up on JavaScript (w2lessons.com)
82 points by mwbiz on Nov 6, 2011 | 65 comments
23. Jawbone UP Review (shawnwall.tumblr.com)
81 points by shawnwall on Nov 6, 2011 | 27 comments
24. Show HN: 1M Song Dataset dev in 10 mins (mortardata.com)
77 points by kky on Nov 6, 2011 | 22 comments
25. Daylight Saving Time is a huge mess (johndcook.com)
76 points by ColinWright on Nov 6, 2011 | 51 comments
26. Regular Expression Engine in 14 lines of Python (lisp.org)
74 points by okal on Nov 6, 2011 | 38 comments

Maybe my case is unusual because YC takes applications online, but I don't like it when people walk up to me and "pitch" me by reciting some preformulated speech about their startup. I can almost never understand what they're talking about. And it makes me feel like a target, in much the same way it probably does to women when guys walk up to them and recite preformulated pickup lines.

The unit of conversation with a "tech celeb" need not be a pitch. I'd suggest trying an ordinary conversation instead. I don't know about other people, but it would definitely work better with me.


Been there, done that. Being a developer is a small fraction of what you need in order to create a successful product. If you are the kind of person who can do it, you don't need to read posts like these.

I don't know about this guy, but it looks like he hasn't done it, and he has little idea of what lies ahead. Best of luck to him.

29. Obviously Correct: implications for language design (ezyang.com)
66 points by samstokes on Nov 6, 2011 | 24 comments
30. Music NGram Viewer (peachnote.com)
65 points by sew on Nov 6, 2011 | 11 comments
