Should I use ZFS?




















As a side note, if it's not soldered on, I recommend replacing the SSD. SSDs have a finite write lifetime, and they're not especially expensive. Unless you have a very good reason not to, just choose the file system suggested by the installer. This will give you a configuration that is very well tested and therefore has the fewest surprises.

Also, if you encrypt your hard drive, you had better have very well tested backups as well, or you should be prepared to lose data. Testing means testing both backup and restore.

ZFS is great on servers with lots of RAM, lots of CPU, and lots of disks.
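On the "test the restore, not just the backup" point above: here is a minimal sketch of what such a check can look like, restoring the backup into a scratch directory and then comparing file hashes against the live data. The paths and directory layout are hypothetical; adapt them to whatever backup tool you actually use.

```python
# Minimal restore test: after restoring a backup to a scratch directory,
# compare file hashes against the live data. Paths below are hypothetical.
import hashlib
from pathlib import Path

def tree_hashes(root: Path) -> dict[str, str]:
    """Map each relative file path under root to the SHA-256 of its contents."""
    hashes = {}
    for path in root.rglob("*"):
        if path.is_file():
            hashes[str(path.relative_to(root))] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes

def verify_restore(source: Path, restored: Path) -> bool:
    """Return True only if the restored tree matches the source tree exactly."""
    src, dst = tree_hashes(source), tree_hashes(restored)
    missing = src.keys() - dst.keys()
    mismatched = {p for p in src.keys() & dst.keys() if src[p] != dst[p]}
    for problem in sorted(missing):
        print(f"missing from restore: {problem}")
    for problem in sorted(mismatched):
        print(f"contents differ: {problem}")
    return not missing and not mismatched

if __name__ == "__main__":
    # Hypothetical locations: adjust to wherever your backup tool restored to.
    ok = verify_restore(Path("/home/user/documents"), Path("/tmp/restore-test/documents"))
    print("restore verified" if ok else "restore FAILED verification")
```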

My disk is only GB. So what do you reckon about the difference it would make? I don't see a lot of point in using anything except ext4 for something that small.

The defaults with LUKS and ext4 in the Ubuntu installer should work well. Why is XFS better than ext4 for large spinning disks? Many of its internal data structures are B-trees instead of the linear lists that ext4 generally uses.

Above 2 TB, XFS starts becoming advantageous, and above 4 TB ext4's linear data structures start becoming a serious problem. In my experience, XFS has better read performance than ext4 with lots of small files, but write performance is significantly worse. It's popular on web servers, where reads greatly outnumber writes, but on a personal laptop where I sometimes compile things, I still prefer ext4.
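The scaling argument is the familiar indexed-versus-linear lookup one. The toy benchmark below is only an analogy for the on-disk metadata structures, not a filesystem benchmark; the list sizes and keys are arbitrary.

```python
# Toy illustration of the scaling argument: binary-search (indexed) lookup vs a
# linear scan. This is only an analogy for B-tree vs linear on-disk metadata,
# not a real filesystem benchmark; sizes and keys are arbitrary.
import bisect
import random
import timeit

def linear_lookup(items: list[int], key: int) -> bool:
    return any(item == key for item in items)        # O(n) scan, like a flat list

def indexed_lookup(items: list[int], key: int) -> bool:
    i = bisect.bisect_left(items, key)               # O(log n), like a B-tree walk
    return i < len(items) and items[i] == key

for n in (10_000, 100_000, 1_000_000):
    items = sorted(random.sample(range(10 * n), n))
    keys = [random.randrange(10 * n) for _ in range(20)]
    t_lin = timeit.timeit(lambda: [linear_lookup(items, k) for k in keys], number=1)
    t_idx = timeit.timeit(lambda: [indexed_lookup(items, k) for k in keys], number=1)
    print(f"n={n:>9,}: linear {t_lin:.4f}s  indexed {t_idx:.4f}s")
```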

Definitely simpler to use than anything else. But performance-wise it sucked big time, at least on Solaris. ZFS, on Solaris, not robust?

Not just me and my employer, but many, many others rely on ZFS for critical production storage, and have done so for many years. Most likely not, although by staying away from dedup, giving it enough RAM, and following the pretty much general recommendation to use only mirror vdevs in your pools, it can be competitive. Something solid with data integrity guarantees? This reminds me. We had one file server, used mostly for package installs, that used ZFS for storage.

One day our Java package stops installing. The package had become corrupt. So I force a manual ZFS scrub. No dice. I download a slightly different version. No problems. I grab the previous problematic package and put it in a different directory, with no other copies on the file system; again it becomes corrupt. If I had to guess, it was getting the file hash confused. Performance would often degrade heavily and unpredictably on such occasions.

We didn't lose data more often than with other systems. Also JBOD only. Sounds like you turned on dedupe, or had an absurdly wide stripe size. You do need to match your array structure to your needs as well as tune ZFS. And you're just wrong about snapshots and filesystem counts. ZFS is no speed demon, but it performs just fine if you set it up correctly and tune it.

Stripe size could have been a problem, though we just went with the default there, as far as I recall. Most of the first tries just followed the Sun docs; we later only changed things until performance was sufficient. Dedupe wasn't even implemented back then. Maybe you also don't see as massive an impact because your hardware is a lot faster. The Xs were predominantly meant to be cheap, not fast. No cache, insufficient RAM, slow controllers, etc.

The Xs were the devil's work. Terrible BMC, terrible RAID controller; even the disk caddies were poorly designed. The BMC couldn't speak to the disk controller, so you had no out-of-band storage management. I had to run a fleet of them, truly an awful time.

It uses its own cache layer, and eats RAM like hotcakes. Sort of. But no snapshots. Wanna use LVM for snapshots? I've never been able to see any difference with the workloads I run, whereas with LVM it was pervasive and inescapable. That was with the old LVM snapshots. Modern CoW snapshots have a much smaller impact. Plus XFS developers are working on internal snapshots, multi-volume management, and live fsck (live check already works, live repair to come). I don't doubt this, but do you have any documentation?

You would have to look at the implementation directly. The user documentation isn't great for documenting performance considerations, sadly. Essentially it comes down to this: a snapshot LV contains copies of old blocks which have been modified in the source LV. Whenever a block is updated in the source LV, LVM needs to check whether that block has previously been copied into every corresponding snapshot LV.

For each snapshot LV where this is not the case, it needs to copy the block into that snapshot LV. This means that there is O(n) complexity in the checking and copying. And in the case of "thin" LVs, it will also need to allocate the block to copy to, potentially for every snapshot LV in existence, making the process even slower. The effect is write amplification effectively proportional to the total number of snapshots. ZFS snapshots, in comparison, cost essentially the same no matter how many you create, because the old blocks are put onto a "deadlist" of the most recent snapshot, and this doesn't need repeating for every other snapshot in existence.

Older snapshots can reference them when needed, and if a snapshot is deleted, any blocks still referenced are moved to the next-oldest snapshot. Blocks are never copied and only have a single direct owner. This makes the operations cheap.
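To make the asymptotics concrete, here is a deliberately crude model of the two schemes. It ignores allocation, metadata, and thin-pool details entirely; it only shows that the LVM-style path does per-snapshot work on every overwrite, while the deadlist-style path does constant work.

```python
# Deliberately simplified models of snapshot cost. "LVM-style" copies an old
# block into every snapshot that hasn't seen it yet (work grows with the number
# of snapshots); "ZFS-style" just records the superseded block on the newest
# snapshot's deadlist (constant work per overwrite). Real implementations are
# far more involved; this only illustrates the asymptotics discussed above.

class LvmStyleVolume:
    def __init__(self):
        self.blocks = {}          # block number -> data
        self.snapshots = []       # each snapshot: {block number -> old data}

    def snapshot(self):
        self.snapshots.append({})

    def write(self, blkno, data):
        old = self.blocks.get(blkno)
        if old is not None:
            for snap in self.snapshots:      # O(number of snapshots) per write
                if blkno not in snap:        # copy-on-first-write into each one
                    snap[blkno] = old
        self.blocks[blkno] = data


class ZfsStyleDataset:
    def __init__(self):
        self.blocks = {}          # block number -> data
        self.snapshots = []       # each snapshot: deadlist of superseded blocks

    def snapshot(self):
        self.snapshots.append([])

    def write(self, blkno, data):
        old = self.blocks.get(blkno)
        if self.snapshots and old is not None:
            self.snapshots[-1].append((blkno, old))   # O(1): newest deadlist only
        self.blocks[blkno] = data


if __name__ == "__main__":
    for vol in (LvmStyleVolume(), ZfsStyleDataset()):
        for _ in range(100):
            vol.snapshot()
        for blk in range(1000):
            vol.write(blk, b"v1")
        for blk in range(1000):
            vol.write(blk, b"v2")            # LVM-style does ~100x the bookkeeping here
        print(type(vol).__name__, "ok")
```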

That's for the old "fat" LVM snapshots, right? No way the new CoW thin LVs have such a big overhead for snapshots. There will be a much bigger overhead in accounting for all of the allocations from the "thin pool". The overlying filesystem also lacks knowledge of the underlying storage. The snapshot must be able to accommodate writes up to and including the full size of the parent block device in order to remain readable, just like the old-style snapshots did. That's the fundamental problem with LVM snapshots; they can go read-only at any point in time if the space is exhausted, due to the implicit over-commit which occurs every time you create a snapshot.

The overheads with ZFS snapshots are completely explicit and all space is fully and transparently accounted for. You know exactly what is using space from the pool, and why, with a single command. With LVM separating the block storage from the filesystem, the cause of space usage is almost completely opaque. Much obliged in advance, thank you kindly. What is bad about playing around with leftover hardware? Nothing at all; it's what was done to that hardware that's the travesty here.

It takes an extraordinary level of incompetence and ignorance to even get the idea to slap Linux with dmraid and LVM on that hardware, then claim that it was faster and more robust, without understanding how unreliable and fragile that constellation is, and that it was faster because all the reliability was gone. If a sector goes bad between the time when you last scrubbed and the time when you get a disk failure (which is pretty much inevitable with modern disk sizes), you're screwed. You lose a single disk's performance due to the checksumming.

The one thing I will say is that it does struggle to keep up with NVMe SSDs; otherwise I've always seen it run at drive speed on anything spinning, no matter how many drives. Have you seen any benchmarks for the scenario you've described? Have you got any info on how to do the required tuning that's geared towards a home NAS? Group your disks in bunches of 4 or 5 per RAID-Z, no more.

And have them on the same controller or SAS expander per bunch. Use striping over the bunches. Don't use hot spares; for performance, maybe avoid RAIDz6. Try out and benchmark a lot. I think the optimal number of RAIDz5 disks is 3, if you just want performance. But this wastes lots of space, of course. That's why I don't think there is a recipe; you have to try it out for each new kind of hardware. And as another thread pointed out, stripe size is also an important parameter.

However, since nobody really knows what the hell z1 and z2 are, and z1 is easy to mix up with RAID1 for non-ZFS people, calling it z5 and z6 is far less confusing. It's the number of parity disks, pretty simple. There has been occasional talk of making the number arbitrary, though presently only raidz1, raidz2, and raidz3 exist. I think speed is not the primary reason many (most?) people choose ZFS.
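To see what those parity counts mean for usable space, here is a rough helper that estimates capacity for a pool striped over identical RAID-Z vdevs. It only accounts for parity overhead; padding, metadata, and the usual free-space headroom are ignored, and the example layouts are made up.

```python
# Rough usable-capacity estimate for a pool striped over identical RAID-Z vdevs.
# Only parity overhead is modelled; padding, metadata, slop space and the usual
# "keep the pool well below full" advice are ignored.

def raidz_usable_tb(disks_per_vdev: int, parity: int, vdev_count: int, disk_tb: float) -> float:
    if parity not in (1, 2, 3):
        raise ValueError("RAID-Z supports 1, 2 or 3 parity disks (raidz1/2/3)")
    if disks_per_vdev <= parity:
        raise ValueError("need more disks than parity disks per vdev")
    data_disks = disks_per_vdev - parity
    return data_disks * vdev_count * disk_tb

if __name__ == "__main__":
    # Example: twelve 4 TB disks arranged in different (made-up) layouts.
    layouts = [
        ("3 vdevs of 4-disk raidz1", 4, 1, 3),
        ("2 vdevs of 6-disk raidz2", 6, 2, 2),
        ("1 vdev of 12-disk raidz3", 12, 3, 1),
    ]
    for name, width, parity, count in layouts:
        print(f"{name}: {raidz_usable_tb(width, parity, count, 4.0):.0f} TB usable")
```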

Expecting all Linux drivers to be GPL-licensed is unrealistic and just leads to crappy user experiences. Linux is able to run proprietary userspace software. Even most open source zealots agree that this is necessary. Why are all drivers expected to use the GPL? There's a lot of technical underpinnings here that I'll readily admit I don't understand.

If I speak out of ignorance, please feel free to correct me. I am also not an expert in this space - but if I understand correctly the reason the linux Nvidia driver sucks so much is that it is not GPL'd or open source at all.

I think the answer to this is: drivers are expected to use the GPL if they want to be mainlined and maintained; as Linus said, other than that you are "on your own".

My experience is that the Linux Nvidia drivers are better than the competitors' open-source drivers. But it means I can't use Wayland. If the driver were open source, someone would have submitted a GBM patch and I wouldn't be stuck in this predicament. You can do that on Intel and AMD drivers and other open-source graphics drivers, which, due to being open source, allow third parties like Red Hat to patch in GBM support in drivers and Mesa when required.

The Nvidia driver does not support GBM code paths. Therefore Wayland does not work on Nvidia. And because the Nvidia driver is not open source, someone else cannot patch GBM in. Technically, you can use Wayland. What you cannot use is applications that use OpenGL or Vulkan acceleration. If your Wayland clients use just shm to communicate with the compositor, it will work.

Is that experience recent? AMD drivers used to be terrible, and Intel isn't even competition. Vega is fine, Raven Ridge had weird bugs last time I looked, and with RX I couldn't even boot Proxmox 6.

Why is Intel not competition? In laptops, I want only Intel, nothing else. Performance-wise, Intel is streets behind. I know. But do you need that performance for what you do on the computer? For most uses, an Intel GPU is fine. But if you do need that performance, Intel isn't an option. If you don't, there is no reason to even consider Nvidia.

They serve different needs. I'm currently running an AMD card because I thought the drivers were better. I was mistaken; I still have screen tearing that I can't fix. No doubt someone more knowledgeable about Linux could fix this issue, but I never had any issues with my Nvidia blobs.

That's not to say Nvidia don't have their own issues. I eventually bought an Nvidia card to replace it so I could stop having problems. It's been smooth ever since. I have both an Nvidia and an AMD card. This was true until relatively recently, but no longer. There is just no incentive for this that I can see. Linux is an open-source effort.

Linus has said that he considers open source "the only right way to do software". Out-of-tree drivers are tolerated, but the preferred outcome is for drivers to be open sourced and merged into the main Linux tree.

The idea that Linux needs better support for out-of-tree drivers is like someone going to church and saying to the priest "I don't care about this Jesus stuff, but can I have some free wine and cookies please". Full disclosure: my day job is to write out-of-tree drivers for Linux :) I would expect a large fraction of Nvidia's GPU sales to be from customers wanting to do machine learning. What platform do these customers typically use?

But because it's not GPLed, it will never be mainlined into the kernel, so you have to install it separately. Critically, Nvidia has a GPL'd shim in the kernel code, which lets them keep a stable ABI. The kind of shim Linus isn't interested in for ZFS. But I boot into Windows when I want battery life and quiet fans on my laptop.

You make it sound like the idea is "if you GPL your driver, we'll maintain it for you", which is kinda bullshit. For one, kernel devs only really maintain what they want to maintain. They'll do enough work to make it compile, but they aren't going to go out of their way to test it. Regressions do happen. More importantly, though, they very purposefully do not maintain any stability in the driver ABI.

The policy is actively hostile to the concept of proprietary drivers. Which is really kind of hilarious considering that so much modern hardware requires proprietary firmware blobs to run. Nvidia is pretty much the only remaining holdout here on the hardware driver front. But it is a problem that you can't reliably have out-of-tree modules. Also, Linus is wrong: there's no reason that the ZoL project can't keep the ZFS module in working order, with some lag relative to updates to the Linux mainline, so as long as you stay on supported kernels and the ZoL project remains alive, then of course you can use ZFS.

And you should use ZFS because it's awesome. That is the bit I'm trying to get at. Yes, it would be best if ZFS were just part of Linux, and maybe some day it can be, after Oracle is dead and gone or under new leadership and strategy. But it's almost beside the point. Every other OS supports installing drivers that aren't "part" of the OS.

I don't understand why Linux is so hostile to this very real use case. Sure it's not ideal, but the world is full of compromises. I'm not sure Linux is especially hostile. A new OS version of, say, Windows can absolutely break drivers from a previous version. Linux absolutely is especially hostile. Windows will generally try to support existing drivers, even binary-only ones, and give plenty of notice for API changes.

Linux explicitly refuses to offer any kind of stability for its internal API. Linux is generally not happy about seeing any out-of-tree drivers. But that is also not without reason: in a certain way, Linux balances in a field where they are, and want to stay, open source. But a lot of users (and sometimes the companies paying some "contributors", too) are companies which are not always that happy about open source.

Though take that argument with a large grain of salt; there are counterarguments to it, too. There's a unique variable here, and that's Oracle. That shouldn't actually matter; it should just depend on the license. But millions in legal fees say otherwise.

As a Linux user and an ex-Android user, I absolutely disagree, and would add that the GPL requirement for drivers is probably the biggest feature Linux has! Yes, the oftentimes proprietary Android Linux drivers are such a pain.

Not only do they make it harder to reuse the hardware outside of Android, but they also tend to cause delays with Android updates and sometimes make it impossible to update a phone to a newer Android version even if the phone producer wants to do so. Android did start making this less of a problem with the HAL and such, but it's still a problem, just a smaller one. There is a big difference between a company distributing a proprietary Linux driver, and the Linux project merging software under a GPL-incompatible license.

In the first case it is the Linux developers who can raise the issue of copyright infringement, and it is the company that has to defend its right to distribute. In the latter, the roles are reversed, with the Linux developers having to argue that they are in compliance with the copyright license.

A shim layer is a poor legal bet. It assumes that a judge, who might not have much technical knowledge, will agree that putting this little bit of technical trickery between the two incompatible works somehow turns them from a single combined work into two cleanly separated works.

It could work, but it could also very easily be seen as meaningless obfuscation. It is this relationship that distinguishes two works from a single work.

An easy way to see this is how a music video works. If I create a file with a video part and an audio part, and distribute it, legally this will be seen as me distributing a single work. I also need additional copyright permission in order to create such a derivative work, rights that go beyond just distributing the different parts. If I were to argue in court that I am just distributing two different works, then the relationship between the video and the music would be put into question.

Userspace software is generally seen as an independent work. One reason is that such software can run on multiple platforms, but the primary reason is that people simply don't see it as an extension of the kernel. There is an "approved" method: write and publish your own kernel module. However, if your module is not GPL-licensed, it cannot be published in the Linux kernel itself, and you must keep up with the maintenance of the code.

This is a relatively fair requirement imo. The issue here is that which parts of the kernel API are allowed for non-GPL modules is a moving target from version to version, which might as well be interpreted as "just don't bother anymore". I wonder if this was exactly what they intended, i.e. to discourage non-GPL modules in general.

And ZFS might just have been accidentally hit by this, but is in a situation where it can't put things into the tree. It's a feature, not a bug.

Linux is intentionally hostile to binary-blob drivers. Torvalds described his decision to go with the GPLv2 licence as "the best thing I ever did". Again, Linux took over the world! As for Nvidia's proprietary graphics drivers, they're an unusual case, because of the "derived works" concept.

The GPL wasn't intended to overreach to the point that a GPL web server would require that only GPL-compatible web browsers could connect to it, but it was intended to block the creation of a non-free fork of a GPL codebase. There are edge cases, as there are with everything, such as the Nvidia driver situation I mentioned above. The problem is already addressed: if someone wants to contribute code to the project, then its licensing must be compatible with the prior work contributed to the project.

That's it. But why are all drivers expected to be "part of the project"? We don't treat userspace Linux software that way. We don't consider Windows drivers part of Windows. It's pretty simple: once they expose such an API, they'd have to support it forever, hindering the kind of refactoring that happens all the time.

With all the drivers in the tree, they can simply update every driver at the same time to whatever new in-kernel API they're rolling out or removing. And being that the majority of drivers would arguably have to be GPL anyway, and thus open-source, the advantages of keeping all the drivers in tree are high, and the disadvantages low.

That said, they do expose a userspace filesystem driver interface, FUSE. While it's not the same as an actual kernel FS driver (performance in particular), it effectively allows what you're asking for by exposing an API you can write a filesystem driver against without it being part of the kernel code.
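For a sense of how low the barrier is, here is a minimal read-only FUSE filesystem in Python. It is only a sketch: it assumes the third-party fusepy package and a kernel with FUSE support, and the mount point and file contents are made up for illustration.

```python
# Minimal read-only FUSE filesystem exposing a single hello.txt file.
# Assumes the third-party "fusepy" package (pip install fusepy) and FUSE support
# in the kernel. Run as: python hellofs.py /path/to/existing/mountpoint
import errno
import stat
import sys

from fuse import FUSE, FuseOSError, Operations  # provided by fusepy

CONTENT = b"Hello from userspace!\n"

class HelloFS(Operations):
    def getattr(self, path, fh=None):
        if path == "/":
            return {"st_mode": stat.S_IFDIR | 0o755, "st_nlink": 2}
        if path == "/hello.txt":
            return {"st_mode": stat.S_IFREG | 0o444, "st_nlink": 1,
                    "st_size": len(CONTENT)}
        raise FuseOSError(errno.ENOENT)

    def readdir(self, path, fh):
        return [".", "..", "hello.txt"]

    def read(self, path, size, offset, fh):
        if path != "/hello.txt":
            raise FuseOSError(errno.ENOENT)
        return CONTENT[offset:offset + size]

if __name__ == "__main__":
    # Mount in the foreground, read-only, at the directory given on the command line.
    FUSE(HelloFS(), sys.argv[1], foreground=True, ro=True)
```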

Given that the kernel is nearly 30 years old, do you not find it slightly incredible that there has been no effort to stabilise the internal ABI while every other major kernel has managed it, including FreeBSD? There are ways and means to do this. It would be perfectly possible to have a versioned VFS interface and permit filesystems to provide multiple implementations to interoperate with different kernel versions. I can understand the desire to be unconstrained by legacy technical debt and be able to change code at will.

I would find that liberating. However, this is no longer a project run by dedicated amateurs. It made it to the top, and at this point in time, it seems undisciplined and anachronistic. You know, come to think of it, is there anything stopping Linux from having a Due to the parallel, it would only be the work of a couple hours to port any existing FUSE server over into being such a kernel module.

And, given how much code could be shared with FUSE support, adding support for this wouldn't even require much of a patch. Seems like an "obvious win", really. It's not the context switch that kills you for the most part, but the nature of the API and its lack of direct access to the buffer cache and VMM layer. Making a stable FKSE leads to the same issues. That's why Windows moved WSL2 to being a kernel running on Hyper-V rather than in-kernel. Their IFS (installable filesystem) driver stack screws up where the buffer cache manager sits, and it was pretty much impossible to change.

At that point, the real apples-to-apples comparison left NT lacking. Running a full kernel in another VM ended up being faster because of this. I mean, it doesn't really work that way; you can't just port a userspace program into a kernel module.

Both of those things are not easily fixable: FKSE would still effectively be using the FUSE APIs, and syscalls don't translate directly into callable kernel functions, and definitely not into the ones you should be using; like scheduling, talking to storage, the network, whatever is needed to actually implement a useful filesystem. The problem is that nobody is interested in doing that, and that's why we are in this situation in the first place. Yes, which Linus has also poo-pooed: "People who think that userspace filesystems are realistic for anything but toys are just misguided."

I mean, he's right. Nearly every system that puts the FS in user space has abysmal performance; the one exception I can think of off the top of my head is XOK's native FS, which is very, very different from traditional filesystems at every layer in the stack, and has abysmal performance again once two processes are accessing the same files. Oh, I totally agree. Which is fine if that's his attitude, but if so I do wish he'd be more direct about it rather than making claims that are questionable at best.

And yet people use them all the damn time, because they're incredibly useful and, even more importantly, are relatively easy to put together compared to kernel modules. Linus is just plain wrong on this one. But for something like your root filesystem? Not going to happen. His point is that FUSE is useful and fine for things that aren't performance-critical, but it's fundamentally too slow for cases where performance is relevant. The problem with FUSE file systems is not that they aren't part of the kernel's VCS repo, but that they require a context switch to user space.

It is the policy of Linux development at work. The Linux kernel doesn't break userspace: you can safely upgrade the kernel and your userspace will keep working. But the Linux kernel breaks internal APIs easily, and kernel developers take responsibility for all the in-tree code. So if a patch in the memory-management subsystem breaks some drivers, kernel developers will find the breakages and fix them. Yeah, because the Windows kernel breaks backward compatibility in kernel space less frequently, and because hardware vendors are ready to maintain drivers for Windows.

You can license both kernel modules and FUSE implementations any way you see fit. That's a non-issue. It seems that some people are oblivious to the actual problem, which is that some people want their code to be mixed into the source code of a software project without having to comply with the rights holder's wishes, as if their will shouldn't be respected.

I'm not sure you can commit your source code to the Windows kernel project. No, no one wants to force ZFS into the Linux kernel. I think everyone agrees that it needs to be out-of-tree the way things currently are. Because running proprietary binaries in kernel space is not a good idea, nor is it compatible with the vision of Linux?

ZFS isn't proprietary; it's merely incompatible with the GPL. But in practice this can only be decided on a case-by-case basis.

The only way Linux could work on this is by: 1. adding an exception to their GPL license to exclude kernel modules from GPL constraints (which obviously won't happen, for a bunch of reasons); 2. turning Linux into a microkernel with userland drivers and driver interfaces which are not license-encumbered (which again won't happen, because this would be a completely different system); or 3. relicensing under something more permissive, such as Apache v2. Guess what, that won't happen either, or at least I would be very surprised. Vendors are expected to merge their drivers in mainline because that is the path to getting a well-supported and well-tested driver. Drivers that get merged are expected to use a GPL2-compatible license because that is the license of the Linux kernel.

If you're wondering why the kernel community does not care about supporting an API for use in closed-source drivers, it's because it's fundamentally incompatible with the way kernel development actually works, and the resulting experience is even more crappy anyway.

Variations of this question get asked so often that there are multiple pages of documentation about it [0] [1]. The tl;dr is that closed-source drivers get pinned to the kernel version they're built for and lag behind. When the vendor decides to stop supporting the hardware, the drivers stop being built for new kernel versions and you can basically never upgrade your kernel after that. In practice it means you are forced to use that vendor's distro if you want things to work properly.

All that says to me is that if you want your hardware to be future-proof, never buy Nvidia. All the other Linux vendors have figured out that it's nonsensical to sell someone a piece of hardware that can't be operated without secret bits of code. If you ever wondered why Linus was flipping Nvidia the bird in that video that was going around a few years ago, this is why. To answer your excellent question (and ignore the somewhat unfortunate slam on people who seem to differ with your way of thinking), it is an intentional goal of software freedom.

The idea of a free software license is to allow people to obtain a license to the software if they agree not to distribute changes to that software in such a way that downstream users have fewer options than they would with the original software. Some people are at odds with the options available with licenses like the GPL. Some think they are too restrictive. Some think they are too permissive. Some think they are just right.

With respect to your question, it's neither here nor there whether the GPL hits a sweet spot or not. What's important is that the original author decided that it did and chose the license.

I don't imagine that you intend to argue that a person should not be able to choose the license that is best for them, so I'll just leave it at that. The root of the question is "What determines a change to the software".

Is it if we modify the original code? What if we add code? What if we add a completely new file to the code? What if we add a completely new library and simply link it to the code? What if we interact with a module system at runtime and link to the code that way? The answers to these questions are not well defined. Some of them have been tested in court, while others have not.

There are many opinions on which of these constitutes changing of the original software. These opinions vary wildly, but we won't get a definitive answer until the issues are brought up in court. Before that time period, as a third party who wishes to interact with the software, you have a few choices. You can simply take your chances and do whatever you want.

You might be sued by someone who has standing to sue. You might win the case even if you are sued. It's a risk. In some cases the risk is higher than in others (probably roughly ordered in the way I ordered the questions).

Another possibility is that you can follow the intent of the original author. You can ask them, "How do you define changing of the software?"

Unlike most file systems, ZFS combines the features of a file system and a volume manager. This means that, unlike other file systems, ZFS can create a file system that spans a series of drives, or a pool.

Not only that, but you can add storage to a pool by adding another drive. ZFS will handle partitioning and formatting.
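As an illustration of how little ceremony that involves, the sketch below just shells out to the standard zpool commands. The pool name and device paths are placeholders; commands like these will wipe the disks they are given, so treat it as a sketch rather than a recipe.

```python
# Illustration of creating a mirrored pool and later growing it by adding a
# second mirror vdev. Pool name and device paths are placeholders; these
# commands need root and will destroy any data on the devices they are given.
import subprocess

def zpool(*args: str) -> None:
    subprocess.run(["zpool", *args], check=True)

if __name__ == "__main__":
    # Create a pool named "tank" from one mirrored pair of disks.
    zpool("create", "tank", "mirror", "/dev/sdb", "/dev/sdc")
    # Later, grow the pool by striping in a second mirrored pair.
    zpool("add", "tank", "mirror", "/dev/sdd", "/dev/sde")
    # Show the resulting layout.
    zpool("status", "tank")
```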

Copy-on-write is another interesting feature. On most file systems, when data is overwritten, it is lost forever. On ZFS, the new information is written to a different block. Once the write is complete, the file system's metadata is updated to point to the new info. This ensures that if the system crashes, or something else happens while the write is taking place, the old data will be preserved. It also means that the system does not need to run fsck after a system crash.
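A toy model of that write path shows why a crash mid-write is harmless: the new data goes to a fresh block, and only the final pointer update makes it visible. This is a heavy simplification of what ZFS actually does (no checksums, no transaction groups, no on-disk format), purely for illustration.

```python
# Toy copy-on-write store: writes never touch the old block; the new block only
# becomes visible when the metadata pointer is swapped at the end. A crash
# before that swap simply leaves the old version in place. (Heavily simplified:
# no checksums, no transaction groups, no real on-disk format.)

class CowStore:
    def __init__(self):
        self.blocks = {}        # block id -> bytes ("the disk")
        self.pointers = {}      # file name -> block id ("the metadata")
        self.next_id = 0

    def _allocate(self, data: bytes) -> int:
        self.next_id += 1
        self.blocks[self.next_id] = data
        return self.next_id

    def write(self, name: str, data: bytes, crash_before_commit: bool = False):
        new_block = self._allocate(data)     # step 1: write data to a new block
        if crash_before_commit:
            return                           # old pointer untouched, old data intact
        self.pointers[name] = new_block      # step 2: atomically point at the new block

    def read(self, name: str) -> bytes:
        return self.blocks[self.pointers[name]]

store = CowStore()
store.write("report.txt", b"version 1")
store.write("report.txt", b"version 2", crash_before_commit=True)  # simulated crash
assert store.read("report.txt") == b"version 1"                    # nothing lost
store.write("report.txt", b"version 2")
assert store.read("report.txt") == b"version 2"
```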

Copy-on-write leads into another ZFS feature: snapshots. ZFS uses snapshots to track changes in the file system. When a snapshot is created, no additional space is used. As new data is written to the live file system, new blocks are allocated to store this data. So snapshots mainly track changes to existing files, rather than the addition and creation of new files. Snapshots can be mounted read-only to recover a past version of a file.
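In day-to-day use that looks something like the sketch below. The dataset, mount point, and file names are placeholders; the hidden .zfs/snapshot directory is reachable whether or not it is visible in listings (that is controlled by the snapdir property).

```python
# Taking a snapshot and reading an old file version back out of it. Dataset,
# mount point and file names are placeholders.
import subprocess
from pathlib import Path

DATASET = "tank/home"                 # placeholder dataset
MOUNTPOINT = Path("/tank/home")       # placeholder mount point

# Create a read-only, point-in-time snapshot of the dataset.
subprocess.run(["zfs", "snapshot", f"{DATASET}@before-upgrade"], check=True)

# ... time passes, files change ...

# Recover the earlier copy of a file straight out of the snapshot directory.
old_copy = MOUNTPOINT / ".zfs" / "snapshot" / "before-upgrade" / "notes.txt"
print(old_copy.read_text())

# Rolling the live dataset back (discarding everything since the snapshot)
# would be: zfs rollback tank/home@before-upgrade
```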

It is also possible to roll back the live system to a previous snapshot. All changes made since the snapshot will be lost. Whenever new data is written to ZFS, it creates a checksum for that data. When that data is read, the checksum is verified. If the checksum does not match, then ZFS knows that an error has occurred and will automatically attempt to correct it. ZFS also provides its own RAID levels: RAID-Z2 requires at least two storage drives and two drives for parity, and RAID-Z3 requires at least two storage drives and three drives for parity.
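A stripped-down model of that checksummed read path: keep a checksum for every block, verify it on every read, and repair a bad copy from redundancy (a mirror here) when verification fails. Real ZFS stores checksums in parent metadata blocks and supports several hash algorithms; this toy version only shows the verify-and-repair idea.

```python
# Stripped-down model of checksummed reads with self-healing from a mirror.
# ZFS actually stores checksums in parent (metadata) blocks and supports
# multiple hash algorithms; this only illustrates the verify-and-repair idea.
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class MirroredStore:
    def __init__(self):
        self.disk_a = {}      # block id -> data
        self.disk_b = {}      # mirror copy
        self.checksums = {}   # block id -> expected checksum

    def write(self, blkno: int, data: bytes):
        self.disk_a[blkno] = data
        self.disk_b[blkno] = data
        self.checksums[blkno] = checksum(data)

    def read(self, blkno: int) -> bytes:
        data = self.disk_a[blkno]
        if checksum(data) != self.checksums[blkno]:      # corruption detected on read
            good = self.disk_b[blkno]
            if checksum(good) != self.checksums[blkno]:
                raise IOError(f"block {blkno}: both copies are bad")
            self.disk_a[blkno] = good                    # heal the bad copy
            return good
        return data

store = MirroredStore()
store.write(7, b"important data")
store.disk_a[7] = b"bit rot!"               # simulate silent corruption on one disk
assert store.read(7) == b"important data"   # read self-heals from the mirror
assert store.disk_a[7] == b"important data"
```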


