I look at the 1990s picture of Brewster Kahle and think: He surely didn't get paid as much as me, but what did I do? Play insignificant roles in various software subscription services, many of which are gone now. And what did he do? Held on to an idea for decades.
The combined value of The Internet Archive -- whether we count just the infrastructure, just the value of the data, or the actual utility to mankind -- vastly exceeds an individual contributor's output at almost every well-paying internet startup. At the simple cost of not getting to pocket that value.
I thought this was going to be a longer rant about how Python needs to... Go away. Which, as a long-time Python programmer and contributor, and at one time an avid proponent of the language, is a suggestion I would entertain. I think all of ML being in Python is a colossal mistake that we'll pay for for years.
The main reasons: it is slow, its type system is significantly harder to use than other languages', and it's hard to distribute. The only reason to use it is inertia. Obviously inertia can be a sufficient reason in many cases, but I would like to see the industry consider Python last, and instead consider TypeScript, Go, or Rust (depending on use case) as a best practice. Python would be considered deprecated and only used for existing codebases like PyTorch. Why would you write a web app in Python? The types are terrible and it's slow. There are way better alternatives.
This is something that always bothered me while I was working at Google too: we had an amazing compute and storage infrastructure that kept getting crazier and crazier over the years (in terms of performance, scalability and redundancy) but everything in operations felt slow because of the massive size of binaries. Running a command line binary? Slow. Building a binary for deployment? Slow. Deploying a binary? Slow.
The answer to ever-increasing binary sizes was always "let's make the infrastructure scale up!" instead of "let's... not do this crazy thing maybe?". By the time I left, there were some new initiatives towards the latter and a feeling that "maybe we should have put limits in much earlier", but retrofitting limits onto the existing bloat was going to be exceedingly difficult.
Easy answer to your last point: Work machine and Non-work machine. If I'm working for a company and the company needs MS Office, they will give me a machine with MS Office. I will treat that machine like a radioactive zone. Full Hazmat suit. Not a shred of personal interaction with that machine. It exists only to do work on and that's that. The company can take care of keeping it up to date, and the company's IT department can do the bending over the table on my behalf as MS approaches with dildos marked "Copilot" or "Recall" or "Cortana" or "React Native Start Menu" or "OneDrive" or whatever.
Meanwhile, my personal machine continues to be Linux.
This is what I'm doing at my work now. I'm lucky enough to have two computers, a desktop PC that runs Linux, and a laptop with Windows 11. I do not use that laptop unless I have to deal with xlsx, pptx or docx files. Life is so much better.
I recently moved to a Dutch municipality that runs its own non-profit ISP. They installed a symmetric 1 Gbps fiber connection with a static IP at my house for 40 euros per month.
The service is solid, there’s no upselling or throttling, and hosting things from home just works. I bring this up because when we talk about “open”, “fair” and “monopolies”, the model of a local, non-profit ISP backed by the municipality could offer a real alternative. It doesn’t directly solve the peering issues, but it shifts the balance of power (and cost) somewhat.
> If they were ok with it being 2-4x slower, they would just write ordinary Python or Javascript.
Python and JavaScript are much more than 4x slower than C/C++ for workloads that are git-like (a significant amount of compute, not just I/O-bound).
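For a rough sense of that gap, here's a minimal sketch comparing a compute-bound loop in pure Python against the same bytes processed by C-backed stdlib code. The two don't compute the same checksum, so this only illustrates interpreter overhead, not a rigorous benchmark:

```python
# Rough illustration of interpreter overhead on a compute-bound loop.
# Not a rigorous benchmark: the two functions compute different checksums,
# but both touch every byte once. Ratios vary by machine; often 50-100x+.
import timeit
import zlib

data = bytes(range(256)) * 4096  # ~1 MB of sample data

def checksum_pure_python(buf: bytes) -> int:
    # Byte-at-a-time loop: every iteration pays interpreter dispatch cost.
    acc = 0
    for b in buf:
        acc = (acc + b) & 0xFFFFFFFF
    return acc

py_time = timeit.timeit(lambda: checksum_pure_python(data), number=10)
c_time = timeit.timeit(lambda: zlib.crc32(data), number=10)
print(f"pure Python: {py_time:.3f}s  C-backed crc32: {c_time:.3f}s  "
      f"~{py_time / c_time:.0f}x slower")
```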
> C (and C++) are fundamentally important because of their use in performance sensitive contexts like operating systems, browsers and libraries
That's a fun thing to say but it isn't really true. C/C++ are fundamentally important for lots of reasons. In many cases, folks choose C and C++ because that's the only way to get access to the APIs you need to get the job done.
This is all correct. I've been running my own servers for many years now, keeping things simple and saving a lot of money and time (setting anything up in AWS or Azure is horribly complicated and the UIs are terrible).
One thing to remember is that you do need to treat your servers as "cloud servers" in the sense that you should be able to re-generate your entire setup from your configuration at any time, given a bunch of IPs with freshly-installed base OSs. That means ansible or something similar.
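The key property is that every step is idempotent, so rerunning the whole thing against fresh or existing hosts converges to the same state. A toy sketch of the idea in plain Python (the hosts and package list are hypothetical; in practice this is exactly what ansible does for you, declaratively):

```python
# Toy sketch of "regenerate the whole setup from config": every step is
# idempotent, so it can be rerun against a fresh OS install or an existing
# host and converge to the same state. Hosts and packages are hypothetical.
import subprocess

HOSTS = ["10.0.0.11", "10.0.0.12"]     # freshly installed base OS + SSH
PACKAGES = ["nginx", "postgresql"]     # desired state

def ensure_packages(host: str) -> None:
    # apt-get install is idempotent: already-installed packages are skipped.
    subprocess.run(
        ["ssh", host, "sudo", "apt-get", "install", "-y", *PACKAGES],
        check=True,
    )

for host in HOSTS:
    ensure_packages(host)
```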
If you insist on using cloud (virtual) servers, do yourself a favor and use DigitalOcean, it is simple and direct and will let you keep your sanity. I use DO as a third-tier disaster recovery scenario, with terraform for bringing up the cluster and the same ansible setup for setting everything up.
I am amused by the section about not making friends saying this :-) — most developer communities tend to develop a herd mentality, where something is either all the rage, or is "dead", and people are afraid to use their brains to experiment and make rational decisions.
Me, I'm rather happy that my competitors fight with AWS/Azure access rights management systems, pay a lot of money for hosting and network bandwidth, and then waste time on Kubernetes because it's all the rage. I'll stick to my boring JVM-hosted Clojure monolith, deployed via ansible to a cluster of physical servers, and live well off the revenue from my business.
"No more [...] slow compile times with complex ownership tracking."
Presumably this is referring to Rust, which has a borrow checker and slow compile times. The author is, I assume, under the common misconception that these facts are closely related. They're not; I think the borrow checker runs in linear time, though I can't find confirmation of this, and in any event profiling reveals that it accounts for only a small fraction of compile times. Rust compile times are slow because the language has a bunch of other, non-borrow-checking-related features that trade off compilation speed for other desiderata (monomorphization, LLVM optimization, procedural macros, crates as a translation unit). Also, the rustc codebase is huge and fairly arcane, and not that many people understand it well; while there's a lot of room for improvement in principle, it's mostly not low-hanging fruit. It would require major architectural changes, and so a large investment of resources which no one has put up.
Ubiquiti is awful; it's a cloud-centric ecosystem. The best "prosumer-grade" stuff is probably OpenWrt. If you need more power, OPNsense or a plain Linux distro on an x86 machine.
1. Professional software engineers who can listen in order to learn about the problem space, and who are willing to come to understand it. This takes humility.
2. The people experiencing the problem. They might not write perfect code and it might not be maintainable long term. But their understanding of the problem and pain points is so good that they can solve just that and not write 90% of the code the professionals wrote...
I've seen this over and over again and can only tip my hat to the people who fixed their own problem. Just like for a dev, that means going into an unfamiliar domain and teaching yourself.
Autoconf is one of the absolutely hilarious things about UNIX. On the one hand, we've got people optimizing kernels down to individual instructions (often doing very unsafe, dirty C tricks underneath), sometimes with super-clunky and overly complex APIs as a result... and on the other hand you have all the shell-script absolute nuttery like the behemoth heap of kludges that is autoconf. The disconnect is crazy to me.
And, oh by the way, underneath? Shells have some of the most bonkersly dumb parsers and interpreters; absolutely embarrassingly dumb stuff like aliases that can override keywords and change the parsing of a shell script. Some (most) shells interpret a shell script one line at a time, and "built-ins" like the condition syntax in ifs look like syntax but might be little binaries underneath (look up how "[" and "]" are handled in some shells--they might not be built in at all!).
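The "[" point is easy to verify: on many systems it ships as an actual executable, even though most shells also define a builtin that shadows it. A quick check (results vary by OS):

```python
# Check whether "[" exists as a real executable on this system.
# Most shells have a builtin "[" that shadows the binary, but the binary
# is still there for the shells (and exec calls) that don't.
import shutil
import subprocess

path = shutil.which("[")
print("[ on PATH at:", path)  # e.g. /usr/bin/[ on typical Linux/macOS

if path:
    # Run the binary directly; "[" demands a literal "]" as its last arg.
    result = subprocess.run([path, "hello", "=", "hello", "]"])
    print("exit status:", result.returncode)  # 0 means the test was true
```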
What a wild irony that all the ickiest parts of UNIX--shells and scripts and autoconf and all that stringly-typed pipe stuff--ended up becoming (IMHO) the most reusable, pluggable programming environment yet devised...
I concur with most of these arguments, especially about longevity. But this only applies to smallish files like configurations, because I don't agree with the last paragraph regarding efficiency.
I have had to work with large 1 GB+ JSON files, and it is not fun. Amazing projects exist, such as jsoncons for streaming JSON and simdjson for parsing JSON with SIMD, but as far as I know the latter still does not support streaming and even has an open issue for files larger than 4 GiB. So you cannot have streaming for memory efficiency and SIMD parsing for computational efficiency at the same time. You want streaming because holding the whole JSON in memory is wasteful and sometimes not even possible. JSONL tries to change the format to fix that, but now you have another format that you need to support.
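To make the JSONL trade-off concrete, a minimal sketch (file names and record shape are hypothetical): whole-file JSON has to be fully resident in memory before you see the first record, while JSONL lets you hold one record at a time:

```python
# Minimal sketch of the memory difference. File names are hypothetical.
import json

# Whole-file JSON: the entire document is parsed into memory at once.
with open("events.json") as f:
    events = json.load(f)            # 1 GB file -> >1 GB of live objects

# JSONL: one record per line, so only a single record is ever live.
count = 0
with open("events.jsonl") as f:
    for line in f:
        record = json.loads(line)    # bounded memory regardless of file size
        count += 1                   # stand-in for real per-record work
print(count, "records")
```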
I was also contemplating the mentioned formats for another project, but they are hardly usable when you need to store binary data, such as images, compressed data, or simply arbitrary data. Storing binary data as base64 strings seems wasteful. Random access into these files is also an issue, depending on the use case. Sometimes it would be a nice feature to jump over some data, but for JSON, you cannot do that without parsing everything in search of the closing bracket or quotes, accounting for escaped brackets and quotes, and nesting.
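On the base64 point, the overhead is easy to quantify: base64 maps every 3 bytes to 4 ASCII characters, so arbitrary binary data grows by roughly a third before JSON string escaping is even considered. A quick check:

```python
# Quick check of base64 overhead: 3 bytes -> 4 characters, ~33% growth.
import base64
import os

blob = os.urandom(3_000_000)           # 3 MB of arbitrary binary data
encoded = base64.b64encode(blob)
print(len(encoded) / len(blob))        # ~1.33
```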
The entire idea of push notifications on browsers was obviously toxic from the start, especially the privileged status "Do you want to enable notifications?" popups had.
I think the idea comes from the 2010s hype about Phone-Ifying The Desktop. Someone clearly thought they were recreating the Google Reader / RSS ecosystem (Mozilla had RSS in the browser, and it flopped)... but everyone else was just enthusiastic about dark patterns that were viable in mobile apps but didn't exist in a desktop browser.
uBlock Origin comes close, and surpasses it in some ways (I used both for that reason), but lacks separate control of cookies, images, scripts, etc. So you can't accept a particular third party's images without also accepting its scripts, cookies, etc.
I mention it mainly in the hope that we can popularise its maintained fork 'nuTensor'.
After trying uBlock (as in attempting to also cover what I used to use uMatrix for) for a few weeks I think it's insufficient and nuTensor is the better option for me, but it quickly won't be if ~nobody uses it and it falls by the wayside.
Alternatively, uBO could support the few details it lacks from uM? It seems like the problem was basically difficulty/time constraints in supporting both... but I don't know why they were ever separate? There's plenty of overlap. If uBO had uM's granularity in 'advanced mode', that'd be perfect.
Any browser plugin is inferior to using a hosts file. Hosts files blackhole requests to listed hostnames before a connection is even attempted. These browser plugins only help if you're using the specific browser — they aren't going to help that Electron/desktop app that's phoning home. They won't help block inline media links (Messages on a Mac pre-rendering links) that show up in your chat programs and attempt to resolve to Facebook. They also won't block any software dependency library that you install without properly checking if it's got some social media tracking engine built in.
I don't even waste time or CPU cycles with browser-based blocking applications. Steven Black's[1] maintained hosts files are the best for blocking adware, malware, fakenews, gambling, porn and social media outlets.
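For anyone who wants to try this, a minimal sketch that fetches the unified list and appends it to the system hosts file (the raw URL assumes the repo's usual GitHub layout; back up /etc/hosts first and run with appropriate privileges):

```python
# Minimal sketch: fetch Steven Black's unified hosts list and append it to
# /etc/hosts. Back the file up first; needs root. The URL assumes the
# repo's usual raw-file layout on GitHub.
import urllib.request

URL = "https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts"

with urllib.request.urlopen(URL) as resp:
    blocklist = resp.read().decode("utf-8")

# Entries look like "0.0.0.0 ads.example.com": listed hostnames resolve to
# an unroutable address, so no connection is ever attempted.
with open("/etc/hosts", "a") as hosts:
    hosts.write("\n# --- Steven Black unified blocklist ---\n")
    hosts.write(blocklist)
```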
The social fabric has been re-configured by the least socially adept people in society.
I should know. I'm not terribly socially adept; I grew up on IRC channels and forums because I struggled to connect with people in person.
But now everyone is on the internet, using social networks designed by people who, like me, aren't very social, or worse, by people who only understand social interaction through the lens of "what can this person do for me".
We're in a really strange time.
I used to go online to get away from everyone and try to find other people like me.
Now I have to go offline to in-person events hosting things that appeal to people like me, because everyone is online and there's no avoiding the crowd anymore.
Right, because it's not actually a real truck. Real trucks are body-on-frame, a structure that doesn't distort under load. The CT is a unibody (monocoque), which can't support a real load from above or behind without deforming or snapping into pieces. It's a joke. It's a Hyundai Santa Fe that's been jacked up and weighed down with overthick steel panels that bring no value to the vehicle as a truck. It's a car pretending to be a pickup truck, which is fine for the suburban family that needs to grab a few bags of lawn fertilizer at Home Depot once a year, but it's not a real pickup truck, and any comparison to even a 20-year-old F-150 would be a crushing defeat in all normal pickup tasks.
That the language includes a package manager that fetches an assortment of libraries from who knows whom on demand doesn't exactly inspire my confidence in the process. Alice's secure AES implementation might bring Eve's string padding function along for the ride.
Rust(TM) the language might be (memory) safe in theory but I have serious issues (t)rusting (t)rust and anything built with it.
This argument always feels like a motte and bailey to me. Users don't literally care what tech is used to build a product. Of course not, why would they?
But that's not how the argument is used in practice, where it serves to justify bloated apps, bad engineering, and corner-cutting. When people say “users don’t care about your tech stack,” what they really mean is that product quality doesn’t matter.
Yesterday File Pilot (no affiliation) hit the HN frontpage. File Pilot is written from scratch and it has a ton of functionality packed into a 1.8 MB download. As somebody on Twitter pointed out, a debug build of "hello world!" in Rust clocks in at 3.7 MB. (No shade on Rust)
Users don't care what language or libraries you use. Users care only about functionality, right? But guess what? These two things are not independent. If you want to make something that starts instantly you can't use Electron or Java. You can't use bloated libraries. Because users do notice. All else equal, users will absolutely choose the zippiest products.
Not just you, archive.today has a beef with Cloudflare. I wasn’t even using Cloudflare intentionally, but IIRC Firefox has a DNS privacy setting (DNS over HTTPS, which uses Cloudflare by default) that I had to disable.
Since May 2018[35][36] Cloudflare's 1.1.1.1 DNS service would not resolve archive.today's web addresses, making it inaccessible to users of the Cloudflare DNS service. Both organizations claimed the other was responsible for the issue. Cloudflare staff stated that the problem was on archive.today's DNS infrastructure, as its authoritative nameservers return invalid records when Cloudflare's network systems made requests to archive.today. archive.today countered that the issue was due to Cloudflare requests not being compliant with DNS standards, as Cloudflare does not send EDNS Client Subnet information in its DNS requests.
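The standoff is easy to observe from the outside; a sketch using the third-party dnspython package to ask Cloudflare's resolver directly and compare it with another public resolver (what you see depends on whether the dispute is still ongoing):

```python
# Query archive.today against Cloudflare's 1.1.1.1 and Google's 8.8.8.8.
# Requires the third-party dnspython package (pip install dnspython).
# Results depend on the current state of the Cloudflare/archive.today dispute.
import dns.resolver

for server in ("1.1.1.1", "8.8.8.8"):
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [server]
    try:
        answer = resolver.resolve("archive.today", "A")
        print(server, "->", [rr.to_text() for rr in answer])
    except Exception as exc:
        print(server, "-> failed:", type(exc).__name__)
```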
A lot of languages are doing single static binary deploys now. Rust, Nim, Go. It's a really nice pattern.
Static binaries are so much easier than the gross PHP / Ruby / Python pattern that has to ship directories full of files that (usually) have to be put in the correct place.
It's also easier than shipping a runtime like a JVM.
With a single binary, containers get even slimmer.
The whole push to the cloud has always fascinated me. I get it - most people aren't interested in babysitting their own hardware. On the other hand, a business of just about any size that has any reasonable amount of hosting is better off with their own systems when it comes purely to cost.
All the pro-cloud talking points are just that - talking points that don't persuade anyone with any real technical understanding, but serve to introduce doubt to non-technical people and to trick people who don't examine what they're told.
What's particularly fascinating to me, though, is how some people are so pro-cloud that they'd argue with a writeup like this with silly cloud talking points. They don't seem to care much about data or facts, just that they love cloud and want everyone else to be in cloud, too. This happens much more often on sites like Reddit (r/sysadmin, even), but I wouldn't be surprised to see a little of it here.
It makes me wonder: how do people get so sold on a thing that they'll go online and fight about it, even when they lack facts or often even basic understanding?
I can clearly state why I advocate for avoiding cloud: cost, privacy, security, a desire to not centralize the Internet. The reason people advocate for cloud for others? It puzzles me. "You'll save money," "you can't secure your own machines," "it's simpler" all have worlds of assumptions that those people can't possibly know are correct.
So when I read something like this from Fastmail which was written without taking an emotional stance, I respect it. If I didn't already self-host email, I'd consider using Fastmail.
There used to be so much push for cloud everything that an article like this would get fanatical responses. I hope that it's a sign of progress that that fanaticism is waning and people aren't afraid to openly discuss how cloud isn't right for many things.
> The combined value of The Internet Archive -- whether we count just the infrastructure, just the value of the data, or the actual utility to mankind -- vastly exceeds an individual contributor's output at almost every well-paying internet startup. At the simple cost of not getting to pocket that value.
I wish I believed in something this much.