Building a fast all-SSD NAS on a budget

erulabs · on July 27, 2022

Jeff previously reviewed our all-SSD NAS, which is far more budget, if anyone is interested: https://www.jeffgeerling.com/blog/2021/kubesails-pibox-mini-...

Love seeing these projects from him, but this is a rare miss in my opinion. This is the strange middle ground where it's closer to professional than "budget", by quite a bit. At some point, only a tiny tiny fraction of users need 40TB of space. I guess what I'm saying is, this isn't so much a NAS as a specialized youtuber video recording appliance. We have a number of home-hosting users on our platform that run entire racks filled to the brim - and while that's awesome - it's just not what "budget" or "NAS" implies, and it's an extremely limited audience.

dylan604 · on July 28, 2022

At the end of the video, the config he settled on doesn't have 40TB of useable space. He only has 16TB of useable. Essentially, a RAID-10 with a hot swap. He's only getting the speed of striping 2 drives together. Sounds like he's playing it safe choosing a bit of redundancy rather than just straight speed.

geerlingguy · on July 28, 2022

In this case, the majority of my income depends on the edit server being online (thus the decision to go basically RAID10 + hot spare). I have a full backup plus an offsite copy, but they are over 2.5 Gbps on spinning disk, or take 6+ hours to restore from offsite, so they're emergency use only.

I'm also trying to get my homelab set up to be a bit more robust / automated / reproducible, and trying out a bunch of different ideas. I'll likely settle on something else in a year's time, but the main motivation for now was to get all the storage into my rack on my 10 Gbps network.

dylan604 · on July 28, 2022

I've felt the pain of everything you're experiencing. I've managed storage pools with 48 individual drives of spinning rust. Originally, for the lulz, we striped all 48 as RAID0 just to see the speeds essentially saturating the bus. We then reconfigured to 4 arrays with 16 drives each into RAID6, then striped those into RAID60. We started playing with removing a few drives from each array as hot spares.

At one point when I was younger, that was all fun. Now, I just want it to work so I can go back to having a life and let others do all the experimental stuff. But now I get to live vicariously through people like you posting their results. Good stuff!

jmole · on July 27, 2022

Ehhhh, he said "All-SSD NAS on a budget", not a "Budget SSD NAS".

Agree though that this was a bit of a miss; the cost of drives here dwarfs everything else, and it makes the choice of case, MB, etc. nearly irrelevant and needlessly cost optimized. Why not spend an extra $60 and get a 5.25->6x 2.5 hot swap cage, for example?

woleium · on July 28, 2022

I suspect he just wanted a sponsor for the kit he didn't already have :)

ahepp · on July 28, 2022

If you're going to make an all SSD NAS, wouldn't it at least make sense to have a 10gbps network link and more than 3gbps PCI bandwidth?

intothemild · on July 28, 2022

Yeah this was the big criticism on r/homelab, that it's not a budget build at all, it's very misleading. Most of that build was free because he is who he is.

The only other thing that annoyed me is calling TrueNAS Core (FreeBSD) a "Distro" in the same sentence as a Linux distro.

liuliu · on July 27, 2022

TBH, not sure if spending $3500 on 40TB storage v.s. ~$800 with rotating disks at the same storage capacity. You can put $200 on top with a 2TB NVMe SSD as cache.

The reason to question this is that 40TB seems small if you want to have a NAS for small video editing studios. And for personal use, you probably not going to need more than 2TB work set paged in at any given moment.

walrus01 · on July 27, 2022

personally if I had to do this I would go with rotating disk for bulk storage in a NAS, and something like two 2TB to 4TB size NVME SSDs in a proper video editing workstation motherboard directly attached to pci-e 4.0 bus.

This will be considerably faster for working with "immediate" needs of video files rather than over a 10GbE network.

like, a difference of 900MBps over network vs 2500MBps with local sequential read/writes on NVME SSD on same motherboard.

TheDudeMan · on July 27, 2022

"I find myself needing a central storage solution disconnected from my main workstation."

walrus01 · on July 27, 2022

there would still be such in what I described, just that it would have a much higher capacity composed of 16-18TB each spinning disks.

geerlingguy · on July 28, 2022

In this case the storage is used directly for the editing, though. I am able to close a project on one computer, hop over to another and within seconds start editing at full speed (without requiring proxy footage).

HDDs can't provide the latency that I need, unless I was running a dozen or two to overcome the horrible seek speeds (or had at least double the RAM so the entire project would be cached in RAM).

derefr · on July 28, 2022

What's wrong with proxy footage, though?

Especially if the NAS was hyperconverged-in-the-small (i.e. had the spare compute to render out proxy footage for uploaded files in the background.)

geerlingguy · on July 28, 2022

Nothing really, but it's just one extra layer of inconvenience. If you can edit full res, it's nice to be able to (especially for color grading and cropping purposes).

jjcm · on July 27, 2022

Possibly for smaller projects, for anything remotely sizable, 2TB is likely not going to cut it. 5k prores is 1TB for every 30min of footage, which means you're only getting an hour out of a 2TB drive.

Storage needs for any pro video workflow get very large, very quick.

Dylan16807 · on July 27, 2022

Well you're talking about 60fps ProRes 4444 (XQ?) footage, and the article is talking about 30fps ProRes 422 LT footage. That's 6.5x (10x?) less data for the same size project, and not much difference in how "pro" it is.

tiernano · on July 27, 2022

That is true... But in that case, double or quadruple your SSD...

sorenjan · on July 27, 2022

I'm not familiar with NAS file systems. Is it fairly straight forward to use hard drives with SSDs as transparent cache, and make it look like a single file share?

js2 · on July 28, 2022

I had written a long comment about ZFS caching and still didn't do a good job, so let me link you to this instead:

https://www.truenas.com/docs/references/slog/

The short version is:

- ZFS caches reads into RAM. Gobs of RAM helps. You can also add a secondary read cache device (L2ARC).

- ZFS can caches synchronous writes to an SLOG device, but if you've disabled synchronous writes it won't make a difference. An SLOG device makes no difference for asynchronous writes.

I use ZFS for my Time Machine backups (among other things) and have synchronous writes disabled for the Time Machine datasets.

Before we had ZFS, the traditional way to speed up NAS was a raid controller with a battery-backed RAM cache.

sorenjan · on July 28, 2022

Thank you for your answer. I take it that's a no to the "fairly straight forward" part of the question, but I'll look into TrueNAS if I'll decide to build a NAS.

liuliu · on July 28, 2022

Before ZFS, you can layering bcache on top of your file system. But otherwise, ZFS L2ARC I believe nowadays also does persistence L2ARC.

pdimitar · on July 28, 2022

How do you use your ZFS NAS for Time Machine backups? Very curious, I'd like to do the same.

js2 · on July 28, 2022

I'm using TrueNAS which makes it pretty easy. I'm doing it via SMB (Samba). Previously I was doing it via AFP (Netatalk). This looks more or less correct:

https://www.reddit.com/r/MacOS/comments/lh0yjc/configure_a_t...

Basically it creates a dataset and a Samba share, and the configures Samba with a couple plugins so that: 1) a dataset is created for each user that connects to the share, so there's per-user datasets; 2) a ZFS snapshot is created automatically whenever Time Machine disconnects from the share. (2) is so that you can roll back if need be.

In the past, I've had Time Machine decide that the sparse image it uses for a backup is corrupt and it wants to zero it out and start over. The ZFS snapshots lets you recover from that, though I haven't had it occur under macOS Monterey and/or since I switched to SMB from AFP.

On the parent dataset I disabled sync.

If you google for "TrueNAS Time Machine" you'll find some discussions of all this in the TrueNAS forums.

sgarland · on July 28, 2022

That sounds great, but my biggest complaint with Free/TrueNAS is its insistence on the GUI. I expose a ZFS pool running in Debian to my network. All configuration for it an any other VM is handled in Ansible, and the OS images themselves are built with Packer + Ansible.

I suspect the "Purpose: Multi-user Time Machine" selection is doing a lot of heavy lifting, but it's hidden away under a shiny frontend.

js2 · on July 28, 2022

I totally understand if TrueNAS isn't for you. I use a mix of configuring it via the GUI and via the terminal. FWIW, it stores its entire config (everything but ssh keys) in an sqlite DB which you can export/import so despite the GUI, if you really wanted, you could configure it using Ansible or whatever with a bit of hackery.

The "Multi-user Time Machine" is doing a bit of heavy-lifting yes. If you're really interested I can pull-out the details of the Samba config and how the ZFS dataset is configured. I do think it's using some TrueNAS specific Samba plugins, but I guess in theory those are open-source and could be compiled for Debian.

sgarland · on July 28, 2022

It's a very nice product, and I've installed it before and imported my pool just to try it out. But since I already have monitoring, and my day job is automating all the things, it wasn't a good fit. For people who want an eminently reliable datastore with decent observability, it is perfect.

> If you're really interested I can pull-out the details of the Samba config and how the ZFS dataset is configured.

I would be thrilled to try implementing that, yes!

js2 · on July 29, 2022

> my day job is automating all the things

So is mine. but good golly, who the heck wants to keep doing their day job at home? For my home gear, I try to keep the IT stuff to a minimum. I've been using FreeNAS/TrueNAS for maybe a decade now and it just chugs along, no muss, no fuss. Migration and upgrades have been trivial.

But anyway...

Here's my samba config:

https://gist.github.com/jaysoffian/43636b535ec37fdd31f0514bc...

The "ixnas:" options require the Samba ixnas VFS. It's here, but I actually don't see these options in it:

https://github.com/truenas/samba/blob/release/22.02.3/source...

Here's a discussion of what they do though it's pretty obvious from their names:

https://www.truenas.com/community/threads/configuration-opti...

The fruit VFS is part of the normal Samba distribution.

Here's the dataset properties:

https://gist.github.com/jaysoffian/2d2ac21d60bc40194e01f362c...

Good luck.

rektide · on July 27, 2022

For compare $16/TB is pretty awesome[1], which for that budget would be about 217GB. ~5.5X.

[1] https://diskprices.com/?locale=us&condition=new&capacity=12-...

mattalex · on July 28, 2022

He also has a hard drive based NAS, this is an additional local buffer for current work.

walrus01 · on July 27, 2022

I would be extremely cautious about using any consumer grade TLC or quad-level-cell SSD in a "NAS" for serious purposes because of well known write lifespan issues.

There's a reason that a big difference in price exists between a quad-level-cell 2TB SSD and an expensive enterprise grade one with a much higher TB-write-before-dead rating.

This might look cool but check back in a few years and see how much of the drives' cumulative write lifespan is worn out.

I also cannot even imagine spending $4000+ on a home file server/NAS with copper only 10GbE NIC and it not having at least one 10G SFP+ interface network card.

Okay, so he wants it to be tiny? But in a home environment the major problem is more power consumption and noise, so you can often go with a well ventilated 4U height rackmount case for full size ATX motherboard, which is roughly the size of a midtower PC case turned on its side.

This lets you use motherboards that will have enough PCI-E 3.0 x8 slots for at least one dual-port Intel SFP+ 10G NIC which are very, very cheap on ebay these days.

derefr · on July 28, 2022

> I would be extremely cautious about using any consumer grade TLC or quad-level-cell SSD in a "NAS" for serious purposes because of well known write lifespan issues.

I don't know what you're using your NAS for, but the author is using it as scratch space for raw video files. It's not an OLTP DBMS or anything. It just needs really fast ingest of files beyond the capacity that a DRAM cache can provide.

> I also cannot even imagine spending $4000+ on a home file server/NAS with copper only 10GbE NIC and it not having at least one 10G SFP+ interface network card.

The author's editing rig doesn't necessarily have a 10G NIC, let alone being attached to a 10G switch with runs of CAT6a; and there's only one device talking to this NAS at a time (as the author cannot be in two places at once.) So what'd be the point?

walrus01 · on July 28, 2022

If the author's editing rig doesn't have a 10G port, what is the point of spending money in $ per TB for the SSD on the other end of a network connection, because something like a 6 drive raid of spinning disks can very easily saturate approx. 100MBps on a 1000baseT link.

kalleboo · on July 28, 2022

Latency for scrubbing in projects

mbreese · on July 27, 2022

TrueNAS used to be designed to boot off of smaller SataDOMs that were used only for boot. They were effectively WORM. At least, it used to be a few years ago. Everything that was written for the server was either a RAM disk or spread out amongst the RAID drives (as a separate partition, which has its own issues, but still).

I had assumed this is what he was using the TLC SSD for. If that’s the case, so long as there isn’t much writing to it, it should be fine.

walrus01 · on July 27, 2022

The 8TB Samsung QVO drives are quad level cell consumer grade drives, that's his main storage.

awiesenhofer · on July 28, 2022

> well known write lifespan issues

Reports of SSDs write span issues have been greatly exaggerated ;)

At least nowadays even with QVOs thats not something consumers have to think of much anymore.

These SSDs have 8TB, so to exceed its write endurance Jeff would need to write 4TB to all of them each day for 3 full years.

cyounkins · on July 28, 2022

SMART will report expected remaining lifespan. Another thing you can do is write k TB per drive where k is 0, 1, ... N-1 where N is the number of drives. This staggers the endurance so they don't all fail simultaneously.

Dylan16807 · on July 27, 2022

> I would be extremely cautious about using any consumer grade TLC or quad-level-cell SSD in a "NAS" for serious purposes because of well known write lifespan issues.

How many full-drive writes does your average video editing server need? I would expect a pretty small number. The average source file is sitting there for weeks or months.

geerlingguy · on July 28, 2022

Yeah, my bet is that the rarity of writes (some metadata in the video project files) will give these drives plenty of longevity.

For a use case like database transactions, log storage and frequent data dumps, the game changes quite a bit. I would definitely shy away from the QVO drives for that use case.

I've had these drives in service for about 8 months in my regular NAS, before transferring them to this new build, and they are all checking out okay still.

But this is also why I'm doing the striped mirror plus a hot spare. The only real challenge would be if the drives have a firmware issue, and they all die at the same moment after like 4 years due to a bug (like how HN's servers died...).

technofiend · on July 28, 2022

Jeff edits on a Mac with a 10 gigabit nic; he says as much in the article. Unless he's in an odd duck situation with an SFP device connected to his Mac I'm not sure what value going to SFP and back would add?

geerlingguy · on July 28, 2022

SFP is nice for the switch-to-device connection. It lets you choose between DAC (cheap and easy to route) if you're close enough, fiber, or copper (your choice). If I had an option, I'd use SFP+ everywhere.

But copper is usually a little simpler for consumer/prosumer devices. Someone does make a Thunderbolt to SFP+ adapter but that things like $300!

technofiend · on July 28, 2022

Yeah that $300 device didn't seem like anything you'd use for a budget setup, so unless you need the distance advantage of SFP+ over copper, I don't see a reason to use it. Other than it's cool, of course. Which may be reason enough. :-)

icehawk · on July 28, 2022

Personally I'd not be too worried since the last QVO devices I put in a NAS lasted three years, and had about 300TB of reads and 500TB of writes before they triggered the SMART endurance alerts and were replaced.

At 500x whole-drive rewrites I think I got my money's worth out of an $84 1TB drive.

fomine3 · on July 28, 2022

Samsung QVO SSDs aren't very cheap to justify its lower TBW, especially if you write much like that. There are many TLC drives with great TBW rating at similar price, like Seagate Firecuda. Some are DRAM-less but HMB is good enough for vs QLC drives.

nichch · on July 27, 2022

I was thinking the same thing, but wouldn’t these be okay if his workload is mainly WORM?

hatware · on July 27, 2022

This is definitely an engineering disaster. Sometimes we get too caught up in how to do something that we never ask ourselves if we should.

mrb · on July 28, 2022

"I was able to get consistent performance using a striped RAIDZ mirror pool with four SSDs and a hot spare"

That doesn't make sense. The author is not using raidz. In ZFS terminology, he's mirroring two striped vdevs (like raid10), plus using a hotspare. And that is a bad choice, as he gets only 16 TB usable, and the mirrored stripes could fail with certain combinations of two drives failures.

Instead, he should have set up a raidz2 across 5 drives: 50% more usable space (24 TB), can tolerate any two drive failing, and it would give him higher performance on sequential I/O.

I have a raidz2 pool across 6 spinning rust 18TB hard drives, and my server can handle 900 MB/s sequential reads, and 700 MB/s sequential writes (benchmarked locally). It it was built on 5 SAMSUNG 870 QVO SATA III SSD drives like the author, it would certainly get to 1.5+ GB/s in both sequential reads and writes. In other words the bottleneck would shift to the 10 Gb/s network (1.25 GB/s). For comparison the author discloses his writes are limited to 700 MB/s (over the network, so his bottleneck is not the network link but local I/O contention).

That's right, a 6-HDD raidz2 matches the write performance of his 5-SSD raid10 setup. And that's because the write overhead of raid10 is 100%, while the write overhead of a 6-drive raidz2 is only 50%.

Edit: Oh and the strange performance issue that he noticed "disappeared" after a while is most likely due to cache recovery. The drive can sustain writes of about 490 MB/s for a little while then it drops to 170 MB/s, but after idling 5 minutes it can recover the initial speed. See https://www.tomshardware.com/reviews/samsung-870-qvo-sata-ss...: "As we noticed with the 1TB model, the 8TB model’s cache recovery mechanisms work similarly. After letting the drive rest at idle for 30 seconds, the 870 QVO gains back 6GB of its cache. It recovers fully with 5 minutes of idle time. "

geerlingguy · on July 28, 2022

Note that the 8TB model has about 50TB of cache (quite a bit more than the 1TB model—it doubles in size with each doubling of storage capacity).

One concern Wendell had with RAIDZ2 was the potential for the older Xeon to be a bottleneck for writes. It probably wouldn't be, but I didn't do too much testing with that layout. I might still, we'll see.

mrb · on July 29, 2022

For reference, my server CPU is an AMD Ryzen 7 PRO 5750G (65W TDP, 3.8-4.6GHz, 8-core, 16-threads) and CPU usage is around 8% when writing at 700 MB/s on my raidz2. In other words, only a little more than 1 of the 16 hardware threads is utilized to handle this workload. Your older Xeon D-1521 should be able to handle that, but it's true that being an older processor, it is pushing it closer to its limits. Handling concurrently 10 Gb/s of network traffic would likely push it to more than 50% utilization. That's one reason I like to not cheap out too much on the CPU, especially for a 10GbE ZFS NAS.

CharlesW · on July 27, 2022

In this case "on a budget" means $4,329. That's reasonable if it speeds up billable work, but sadly the cost puts it a bit out of reach for my home office.

dylan604 · on July 27, 2022

"I edit videos non-stop nowadays."

For a video editor, at least one that's been around long enough to remember DAS and SAN solutions, $4300k for 40TB of edit capable storage is cheap.

Perspective is everything.

bobdvb · on July 28, 2022

The 60TB Isilon I ordered for video storage was 115k... (5 years ago granted)

dylan604 · on July 28, 2022

The first Isilon system I worked on was in 2008. It was less than 60TB and more expensive than 115k and used Gigabit connections. The first fibre connected SAN I used was in 2006, and was closer to 300k by the time all of the HBA cards, SFPs, 1000 per seat licenses, and the drive arrays were added together for around 48TB.

And peeps wonder why we think $4500 is cheap!

PaywallBuster · on July 27, 2022

the budget option is < 800$ ?

magicalhippo · on July 27, 2022

Still quite a lot. All you really need is an old i7 and a 10GbE Mellanox Connect-X2 or Connect-X3 card from eBay for $10-20.

CTDOCodebases · on July 27, 2022

I agree with your message but it's hard to find an i7 that supports ECC RAM and the Mellanox Connect-X 2 has lost support in modern distros.

Best bet is to pick up an old HP Z620 of find someone who is upgrading their old Xeon homelab. Generally its a choice of cheap, quiet, energy efficient and you can only pick two of these options.

geerlingguy · on July 28, 2022

In my case, one of the main goals was quiet, which ruled out some decent used 1U servers I found at the recycler :(

Someday I'll have a proper closet but right now my rack is under the kitchen and my wife complains when the servers get above about 60 dB.

CTDOCodebases · on July 28, 2022

Yeah I live in a studio and quiet is hard. I am also fussy and don’t want anything that looks out of place.

I actually built my desktop as a near silent PC with the goal of it becoming my NAS when I upgrade but I didn’t understand the implications of running ZFS without ECC at the time.

magicalhippo · on July 28, 2022

FWIW I've run ZFS on my NAS since 2009, and I have had no detectable file corruption due to not using ECC. There might be something somewhere but it's not worse than other filesystems or anything in that regard.

So while I'd certainly sleep sounder with ECC on it, it's not a complete horror show like someone likes to portray it as.

CTDOCodebases · on July 28, 2022

Yes I guess it comes down to the likely hood of the RAM going bad without the user realising it.

I have had one stick give me issues over the last 10 years but that is out of probably 20 sticks.

antisthenes · on July 28, 2022

Would your home office benefit from 40TB of SSD space?

Or is it one of those things that would be cool to have but not really necessary?

neilv · on July 27, 2022

That's a neat 2U case design, and will fit in some very shallow wall-mount network switch cabinets.

For installing outside of a machine room/closet/center, if you're using 2U of height, you might also fit a PSU with a larger and quieter fan, since all the Flex PSUs I've had come with noticeably loud fans. (I replace them with Noctuas, but it isn't a fun kind of soldering, IMHO.)

The components from the build would also fit in a Supermicro 1U short-depth chassis, especially if you can go a little deeper in your cabinet. (My new K8s server got a used Supermicro 1U chassis for ~$60 shipped, including a PSU. In the photo on https://www.neilvandyke.org/kubernetes/ , it's the 1U immediately below the 4U.)

gigatexal · on July 27, 2022

Anyone know if any core level work on ZFS where there’s an effort to audit the code base for speed ups given the big differences in designs between spinning rust and SSDs?

mastax · on July 27, 2022

I do know that if you have very fast NVMe SSDs (>6000MB/s or so) ZFS is not currently able to give you the whole performance, due to time spent memcpying to/from the ARC[0]. Direct IO support could eventually alleviate this[1].

[0]: https://github.com/openzfs/zfs/issues/8381

[1]: https://github.com/openzfs/zfs/pull/10018

jsmith99 · on July 27, 2022

Oracle themselves have been selling all flash ZFS appliances for a long time so I imagine this is a development focus.

mgerdts · on July 28, 2022

SATA SSDs top out at about 550 MB/s. ZFS does ok with these. NMVe SSDs top out at somewhere between 3000 MB/s (typical enterprise PCIe Gen 3) and 14000 MB/s (mythical just released PCIe Gen 5 drives). ZFS (out of the box, anyway) can’t drive PCIe Gen 4 or newer drives very hard.

If your IO all goes over a 10 Gb link, that will be your bottleneck before ZFS is.

Since 1 MB record size was used, this may be a candidate for a bunch of mirrored HDDs with a few TB of nvme for l2arc and ZIL. But with a bunch of SSDs already on hand, why bother?

FullyFunctional · on July 28, 2022

My "budget" NAS can trivially saturate 10 GbE with a pair of Samsung SSD 980 PRO 2TB. It cannot saturate my 100 GbE connection. While it's true that TrueNAS perhaps can't quite get the same performance that you could get from XFS on the same hardware, that's really irrelevant for me as the features (snapshots and send) and data integrity is far more important.

mgerdts · on July 29, 2022

10 Gb/s =~ 1.2 GB/s

Each 980 Pro can do sequential reads at 7 GB/s. ZFS with record size 128 KiB will do IOs at a size that will perform about as well as sequential regardless of the actual pattern.

AFAICT ZFS has no need to push your drives beyond 10% of their overall capability assuming the clients are performing sequential reads or the reads are at least 128k. A more stressful read test would be a 4k random read where the working set doesn’t fit in the ARC. This would trigger a lot of read inflation.

The write side is a bit more complicated as the drives may be able to write at over 5 GB/s if they have 30+% of their NAND erased, else the write rate will be closer to 1.4 GB/s. With a mirror this is pretty close to the network bandwidth. With some reads mixed in for COW the drive could become a bottleneck before the network or ZFS.

FullyFunctional · on Aug 1, 2022

I think we agree so I'm not sure what you meant. My point was that you could have made the stated "budget" all-SSD much much cheaper than what he did. A pair of 980 Pro would easily beat it on bandwidth. How close it would get on IOPS I don't know, but if you stay away for 128 KiB record size you should be in the neighborhood.

mbreese · on July 27, 2022

I doubt any Oracle SSD performance enhancements will make it into OpenZFS though.

sorenjan · on July 27, 2022

> Every minute of 4K ProRes LT footage (which is a very lightweight format, compared to RAW) is 3 GB of space

Do you really need to save all of that footage? I would think keeping the pro res footage for the current projects on the workstation and reencoded archive video on a NAS would be sufficient. I'm not a video professional, but I suspect it's easy to fall in the trap of thinking that you need to save everything in highest possible quality in case you need it later, but what are the realistic chances of that? If you end up needing some old footage again, AV1 coded 4K or even HEVC 1080p would probably be just fine. The final result are Youtube videos after all.

I know he mentions editing from it, but that's enough space for more than a week of pro res video.

dylan604 · on July 28, 2022

> I'm not a video professional

I am a video professional, and you always keep your originals. However, the part being left out of the conversation is that as a storage pool for editing goes, this isn't deep storage. Content is typically only left on the edit storage while the project is active. Even the TFA mentions he goes all the way to cloud cold storage.

Edit storage has always been expensive, and maintaining capacity was always a juggling act. Just because terms like NAS are being used, one should not think of this type of storage as a dump it and forget it type of storage.

There are many levels of professional. On the high end, the footage from the camera is copied to multiple hard drives on set. These are the backups, and us old timers still use the terms camera originals. These are as sacrosanct as film negatives, only there's magically multiple copies. This data gets transferred to the editor's storage. Once the edit is completed, the edit session and other content used in the edit may be transferred to the camera original drives for archival. On the other end of professional, you take the SD card out of the camera and transfer directly to the edit storage. In these situations, woe be unto thee that doesn't make proper backups. At that point, it's really more pro-sumer than professional, but hey, if they're getting paid, they're professional enough.

sorenjan · on July 28, 2022

So lets say you release a 10 minute Youtube video. You now have 60 minutes of raw footage with multiple takes, you sitting in front of the rolling camera going over the script, ums and ahs, and maybe a take where the phone rang in the middle of a sentence. What are you going to use that footage for in the future, and is this hypothetical use case absolutely dependent on having raw 4K video? I understand that your answer is from a general video production standpoint, but my question was more in the line of "is this really necessary for this kind of content creation?".

And yes, I did see the part about editing, but are you really editing nine days worth of footage at once? Would it make more sense to put less but faster storage directly in the workstation?

dylan604 · on July 28, 2022

Again, you're limiting to just YouTube creator styles, and I'm including full on editorial for things like corporate training, episodic, feature, music videos, etc. That's my world. Obviously, the longer the final piece, the more footage necessary. However, something short like a music video can require a huge amount of footage that you might not consider. All of those quick edits of just a few frames to seconds of a shot wasn't that short from the camera. Some of those types of shots might not have ever been intended to be used and be something in between takes. It's all creative, so there are no rules in music videos.

Doing that kind of editing does require access to all of that footage at any moment. If you're doing supervised sessions with the client sitting next to you means that they on a whim could decide they want to work on a totally different part of the content. When the client is footing the bill by the hour, you don't get to waste time "loading" content.

It's really one of those things that until you've walked a mile in another person's shoes, it's best to not go blindly suggesting "better" based on one's limited knowledge. Asking questions for a better understanding is a totally acceptable way of learning, and I'm all for it. We're danger close to the former. Hopefully, I'm not sounding like an asshat leaning towards the latter.

sorenjan · on July 28, 2022

The article is written by a Youtuber for his Youtube content creation. If you're going to extend the scope to feature films and music videos because that's what you're used to, obviously my assumptions won't be valid. I'm not biking to work on a $15k carbon fiber bike or storing my left over dinner in a walk in freezer either.

And I repeat another point. Using his own numbers, this editing NAS has space for over 220 hours, more than 9 consecutive days. What kind of music video production even has that much video footage? Like I wrote in my initial comment, I fully understand using plenty of uncompressed video while editing, it's the saving of hundreds of GB per finished and delivered Youtube video I question.

geerlingguy · on July 28, 2022

I typically have 5-10 videos in production in a given week (doesn't mean I'm working on them actively, but I'll sometimes shoot parts of different videos depending on the schedule).

Some videos are 10 minutes, others are more like 20. On average I end up with 100-200 GB/video by the end.

So you're correct that I don't need 14TB (that's overkill) for now... but anyone who plans storage knows you're better front loading the extra capacity if you can afford it, because needs always seem to grow faster as time goes on (e.g. 4K to 8K, ProRes LT to something with more color space maybe...).

And right now I keep all my "A-roll" (scripted takes, usually with about double the run time of the actual video), but I'm considering setting up a script to automatically convert those clips to H.265 afterwards so I can still have reasonably good archival footage but at like 1/100th the file size.

All B-roll shots (Dolly shots, handheld closeups, illustrations, etc.) I save at full original resolution, because they are often useful in follow-up videos. But that's usually only like 30-40% of that 100-200 GB per project.

(Edited to add: sometimes there's a unicorn video that also has raw Timelapse footage where you roll at normal rate then generate a Timelapse afterwards (and use some of the real-time footage). Those can have hours of footage and a couple have gotten beyond 400 GB. But I try to limit that just because of the hassle.

Easier to have one camera devoted to the Timelapse, and another for some other shots for real-time use.)

sorenjan · on July 28, 2022

> I'm considering setting up a script to automatically convert those clips to H.265 afterwards so I can still have reasonably good archival footage but at like 1/100th the file size.

FFmpeg has gotten decent AV1 support with SVT-AV1 recently. I think it would make for an interesting video to compare different codecs and see which one is "best" for a certain set of trade offs. You could make a script that encodes a video snippet while varying parameters like codec, preset, crf, grain, 10-bit, hardware or software encoding, etc, use one of the automatic quality metrics like Netflix's VMAF, and plot size, quality, encode time, power usage against each other.

dylan604 · on July 28, 2022

slight nitpick, as timelapse is an actual thing for me. your sped up footage isn't timelapse. tomato/tomahto, old dog says words have meanings, blah blah. On tape, we'd call that varispeed playback. Now it's more of a timewarp (even if you don't do the ramping).

Please, don't cheapen my career shooting actual timelapse! ;P

geerlingguy · on July 28, 2022

Note that I shoot both, depending on the project needs ;)

I usually like to use my Nikon D700 for that.

dylan604 · on July 28, 2022

>The article is written by a Youtuber for his Youtube content creation.

This creator can't do other editing gigs besides his own content? Granted, he's probably pretty busy keeping up with his own stuff, but don't put baby in the corner by suggesting they only work on YT content. You'd be amazed at how many balls an editor can juggle. I currently have a couple of corporate gigs in various stages, wrapping up a music video, and a plethora of personal side projects on my system right now. Currently in preproduction meetings with new client to specifically help them start their YT and other social media content creation. So because the conversation started with a YT content creator means the conversation can't be expanded to introduce you to the bigger picture of the same topic?

vineyardmike · on July 28, 2022

Love the write up and insight into the world of a professional.

Don’t take this the wrong way, I’m genuinely curious… what is a video professional doing on HN? Was it a past life? Is tech just a side interest? I see many non-tech professionals on here (doctors lawyers too) and I’m always curious what drew people in!

dylan604 · on July 28, 2022

I've been banging code since I was 10 copying BASIC from the back of a magazine into a C64. Got into video production in high school. Saw an entire table full of existing video equipment get replaced by a single video card inside of an Amiga. Decided then and there that I'd be doing video on computers. Got a job during college that put me in really cool video/film post house. Learned the entire process of working with film from developing, transferring to video, editing that video, then taking that edit back to film negative editing. Once it went full digital and streaming, because of my knowledge of how the post process works, I knew how to prep video to remove interlacing, 3:2 pulldown, etc for handing off to streaming platforms. That work can require a lot of custom code to automate. So when I get bored and burned out of the video engineering/coding jobs, I go back to production for a little while. When I get tired of being broke as an "artist", I go back to coding for money. Covid killed my last round of production work, so I went back to video engineering for couple of start ups.

probably more than you really wanted to know, but that's it summed up in graph. i've learned that if a job requires working with a computer in any way, knowing how to bang code will come in handy even if it's not a coding job. just knowing how to automate a few things here and there makes cubicle life less mundane.

rektide · on July 27, 2022

Personally I kind of fright from the "have you considered throwing away your best copies" as a sales strategy.

On the flip side, I think it would help a lot a lot a lot if there was more discourse out there about compressing high-dynamic-range content. AV1 supposedly has some capability to do a good-ish job with HDR. The idea of taking reels of raw video and spending a couple days squishing it to 1/10th the sizes but preserving the quality/flexibility very very well is a value proposition that I dont think is clearly attainable, even though it seems technically perhaps within reach.

Right now, I think the general feeling is, raw is raw & everything else bakes in a vast amount of assumptions & constrictions. Some advocacy that compressed video can be as flexible, as dynamic, as capable need to be more present, elaborated, & proven before anyone's going to be comfortable throwing away the bits. These are people's life's works & a couple hundred or thousand dollars a year more in storage isnt a real factor for such integral, near work.

sorenjan · on July 28, 2022

I don't have a sales strategy, I'm not selling anything. I'm questioning the need of buying needless storage for data hoarding you're never going to use.

Have you seen Jeff's videos? They're mostly him talking to the camera in what I think is his basement, or closeups of various PCBs. If you're running a stock photo company I get wanting to keep footage at max quality, but once your Youtube video is published, do you really need to save all those alternate takes and cut out bits? I just don't see them being very useful in the future, and if you do end up needing some of it for B roll or whatever, would a couple of seconds of compressed video really make a drastic difference for the project as a whole? Especially since the final result ends up on Youtube and consumed on a phone screen or a TV two meters away.

Much of the content that's produced today is not trying to be timeless classics with endless rewatch value, it's ephemeral and only really relevant for a short time. If you can reupload your videos to other video hosts in the future that's probably good enough. Nobody is watching reviews of three year old Raspberry Pi add on boards for the low noise in dark parts of the image, or the accurate color reproduction, or even the 4K instead of 1080p resolution. Kill your darlings!

vineyardmike · on July 28, 2022

For many people it’s an art. Just because it’s a raspberry pi video doesn’t mean people don’t want to be proud of the video quality and color accuracy and all that jazz!

Why comment your code or name your variables well if the project will be over soon? No one is buying your product for the clean code behind it. If you need it later just use a reverse compiler. It’s good enough for a project you may never need!

People do things for the art and to be proud of the quality. And you don’t want to throw that out!

smilekzs · on July 28, 2022

Not a video pro, but a counterpoint: "lossy encode, then process" is inferior to "process, then lossy encode". Reasoning: Some processing steps may amplify the noise introduced by lossy encoding. Other processing steps may be sensitive to SNR. Such steps chained together in a pipeline can make the otherwise small extra noise unacceptably big.

sorenjan · on July 28, 2022

Note that I suggested reencoding footage for archiving, not processing. Although again, the final result ends up on Youtube and gets reencoded by them and viewed on all kind of devices.

aetherspawn · on July 27, 2022

At this speed I’m thinking you’re probably going to bottleneck on the network/switch.

Ubiquiti have a cheap fiber optic switch you could try. You could also try a 2x 10G SFP+ configuration, which would give you 20 Gbps (but only 10Gbps per client).

karmicthreat · on July 27, 2022

I just went through getting the parts for my own NAS. All SSD was way overkill for my needs, I ended up going with spinning disks and a SLOG cache. I kept waffling about the motherboard to use but I ended up with a x470d4u motherboard with a Ryzen 7 4700GE which brings the TDP down to 35W. I wanted this to be kind of quiet. I will put a 10Gb network card on it eventually.

Maybe not THE BEST (tm) choices. But I was getting bad decision paralysis choosing parts.

alexk307 · on July 27, 2022

Huge fan of Jeff’s work on YouTube! Highly recommend checking it out if this blog interests you

tambourine_man · on July 28, 2022

I love Jeff’s channel.

Very entertaining and I learn a ton.

I have a Raspberry Pi NAS with 10TB (2TB of redundancy) but it’s in a box made out of MDF. And while I had fun making it, let’s just say I’m not a woodworker.

I’m amazed at how professional the stuff he does looks all while making it seem easy.

nope96 · on July 28, 2022

ECC RAM, yay! I've read so many posts like this where they did NOT use ECC RAM.

abracadaniel · on July 28, 2022

It's surprisingly hard to do on a budget, since you're essentially locked into Xeon products and while CPUs are available used, getting a motherboard to support them isn't easy. The combo in this post, or waiting for a a full system to show up on ebay are probably the only real options. I'm really looking forward to DDR5 getting wide adoption, as that's really going to expand our options.

peterbraden · on July 28, 2022

What do you use to mount this on a mac? My bottleneck with my NAS is the atrocious speeds I get over samba or NFS, and even after spending a ton of time messing with the configs, I can’t get more than sluggish speeds.

ahepp · on July 28, 2022

I followed some guides I found online and ended up with the following:

    ~ cat /etc/auto_master 
    #
    # Automounter master map
    #
    +auto_master            # Use directory service
    #/net                   -hosts          -nobrowse,hidefromfinder,nosuid
    /home                   auto_home       -nobrowse,hidefromfinder
    /Network/Servers        -fstab
    /-                      -static
    /-                 auto_nfs     -nobrowse,nosuid

    ~ cat /etc/auto_nfs 
    /System/Volumes/Data/Users/ahepp/nfs/public -fstype=nfs,vers=4,resvport,rw nfs://10.128.1.2:/zpa/nfs/public

I haven't done a lot of performance testing, but I have no problem playing 10MB/s blurays off the NFS share (and that's via wifi)

On the FreeBSD server in the closet:

    root@tlon:~ # cat /etc/exports 
    V4: /
    /zpa/nfs/public -mapall=nobody

retcon · on July 27, 2022

Video is nicely sequential and eminently suitable for spinning disk service.

What's really confusing me is the 10G lan configuration because it's leaving little headroom above uncompressed 4K 4:4:4 10bit (12bit is broadcast and where source permits archive standard). What about multiple streams for a/b'ing a grade or first/ second camera edit roll?

Ed. added "really" replaced "for" with "above"

jeffbee · on July 27, 2022

I would have been pretty tempted to build this with a W480 Xeon platform having 2x thunderbolt ports. Conceivably that could have broken through the 1GB/s ceiling the article is seeing with 10g ethernet.

mbreese · on July 27, 2022

1.1GB/sec throughput is pretty good over a 10Gb/sec network. That’s 88% saturation. Right?

jeffbee · on July 27, 2022

That's pretty much as fast as it can possibly go, yes. That's why I would try thunderbolt instead. It's faster.

mbreese · on July 28, 2022

That’s what I didn’t understand. If you want this accessible over a network, you wouldn’t want to use a direct thunderbolt connection. As mentioned below, you could try 25Gbps Ethernet, but you’d also need to upgrade the clients. I assume the clients are currently 10GigE (copper?). As it is, you’re pretty much saturating the network bandwidth. Really, he could have also saturated the bandwidth with spinning HDDs too, given enough of them.

jeffbee · on July 28, 2022

The network sounded pretty small in this case, so 2x thunderbolt ports would have done the job. Anyway the means of attachment would be ip-over-thunderbolt, which is a thing that is supported by both Mac OS and Linux, so it can all be bridged to a traditional ethernet if needed for other clients.

geerlingguy · on July 28, 2022

Someday I plan on upgrading to 25 Gbps or greater, and this build may still provide headroom since it has the free x16 PCIe 3.0 slot. But storage bandwidth would probably fall off before network bandwidth at that point.

technofiend · on July 28, 2022

Jeff, if you do end up with a noise-friendly place to rack a server then options open up. The last dl360 I bought was less than $1k and came with a Mellonox dual port 40 gigabit NIC. Equipment is still cheap in the used market because those who haven't just moved to the cloud are still upgrading to newer and faster servers and networks as older servers and switches are fully depreciated and scrapped. Eventually that stream will diminish but the cycle will repeat every few years.

gfody · on July 28, 2022

for about the same budget you could buy a 4-bay 2.5" nvme nas enclosure and a single 30tb ssd, and have 3 bays to expand if/when you need to

nuker · on July 28, 2022

He should’ve better used Thunderbolt 3 DAS chassis.

rr888 · on July 27, 2022

Has anyone tried to replace NAS with a cloud service? If you have gig internet it should but I'm not sure if dropbox etc can keep up.

hatware · on July 27, 2022

Usually you go the other way with that.

smilekzs · on July 28, 2022

If you're paranoid about data security and availability (not a bad thing IMO!), using both NAS and some kind of off-site backup would be ideal.