mirror of https://github.com/koverstreet/bcachefs-tools.git (synced 2025-12-08 00:00:12 +03:00)
add Documentation directory
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
parent 7d5817d9c2
commit 9694ddaba1
Documentation/CodingStyle.rst
Normal file
@@ -0,0 +1,186 @@
.. SPDX-License-Identifier: GPL-2.0

bcachefs coding style
=====================

Good development is like gardening, and codebases are our gardens. Tend to them
every day; look for little things that are out of place or in need of tidying.
A little weeding here and there goes a long way; don't wait until things have
spiraled out of control.

Things don't always have to be perfect - nitpicking often does more harm than
good. But appreciate beauty when you see it - and let people know.

The code that you are afraid to touch is the code most in need of refactoring.

A little organizing here and there goes a long way.

Put real thought into how you organize things.

Good code is readable code, where the structure is simple and leaves nowhere
for bugs to hide.

Assertions are one of our most important tools for writing reliable code. If in
the course of writing a patchset you encounter a condition that shouldn't
happen (and will have unpredictable or undefined behaviour if it does), or
you're not sure if it can happen and not sure how to handle it yet - make it a
BUG_ON(). Don't leave undefined or unspecified behaviour lurking in the codebase.

By the time you finish the patchset, you should understand better which
assertions need to be handled and turned into checks with error paths, and
which should be logically impossible. Leave the BUG_ON()s in for the ones which
are logically impossible. (Or, make them debug mode assertions if they're
expensive - but don't turn everything into a debug mode assertion, so that
we're not stuck debugging undefined behaviour should it turn out that you were
wrong).

Assertions are documentation that can't go out of date. Good assertions are
wonderful.

Good assertions dramatically reduce the amount of testing required to shake
out bugs.

Good assertions are based on state, not logic. To write good assertions, you
have to think about what the invariants on your state are.

Good invariants and assertions will hold everywhere in your codebase. This
means that you can run them in only a few places in the checked-in version, but
should you need to debug something that caused the assertion to fail, you can
quickly shotgun them everywhere to find the codepath that broke the invariant.
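
As a rough illustration of a state-based assertion, here is a minimal userspace
sketch. The ``struct fifo`` type and its helpers are hypothetical and exist only
for this example, and plain ``assert()`` stands in for what would be a
``BUG_ON()`` in kernel code. The point is that the check describes the state
itself, so it can be dropped in anywhere:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size FIFO, used only to illustrate state invariants. */
#define FIFO_SIZE 8

struct fifo {
	size_t	head;	/* total number of items ever pushed */
	size_t	tail;	/* total number of items ever popped */
	int	data[FIFO_SIZE];
};

/*
 * The invariants describe the state itself, not the path that led to it,
 * so this check is valid at any point where no update is in flight.
 */
static void fifo_check_invariants(const struct fifo *f)
{
	assert(f->tail <= f->head);
	assert(f->head - f->tail <= FIFO_SIZE);
}

static int fifo_push(struct fifo *f, int v)
{
	fifo_check_invariants(f);

	if (f->head - f->tail == FIFO_SIZE)
		return -1;	/* full: a real condition, so an error path */

	f->data[f->head++ % FIFO_SIZE] = v;

	fifo_check_invariants(f);
	return 0;
}

static int fifo_pop(struct fifo *f, int *v)
{
	fifo_check_invariants(f);

	if (f->head == f->tail)
		return -1;	/* empty: also a real condition */

	*v = f->data[f->tail++ % FIFO_SIZE];

	fifo_check_invariants(f);
	return 0;
}
```

Note the split: "full" and "empty" can legitimately happen and get error paths,
while ``tail`` passing ``head`` should be logically impossible and gets an
assertion. When hunting a bug, ``fifo_check_invariants()`` can be sprinkled
into any other function that touches the structure.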

A good assertion checks something that the compiler could check for us, and
elide, if we were working in a language with embedded correctness proofs that
the compiler could check. Such languages exist today, but it'll likely still be
a few decades before this comes to systems programming languages. But we
can still incorporate that kind of thinking into our code and document the
invariants with runtime checks - much like the way people working in
dynamically typed languages may add type annotations, gradually making their
code statically typed.

Looking for ways to make your assertions simpler - and higher level - will
often nudge you towards making the entire system simpler and more robust.

Good code is code where you can poke around and see what it's doing -
introspection. We can't debug anything if we can't see what's going on.

Whenever we're debugging and the solution isn't immediately obvious, ask
whether the real problem is that we can't see what's going on. If we don't
know where the issue is because the relevant state is hidden - fix that first.

We have the tools to make anything visible at runtime, efficiently - RCU and
percpu data structures among them. Don't let things stay hidden.

The most important tool for introspection is the humble pretty printer - in
bcachefs, this means `*_to_text()` functions, which output to printbufs.

Pretty printers are wonderful, because they compose and you can use them
everywhere. Having functions to print whatever object you're working with will
make your error messages much easier to write (therefore they will actually
exist) and much more informative. And they can be used from sysfs/debugfs, as
well as tracepoints.
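
To show what "they compose" means in practice, here is a minimal userspace
sketch. It deliberately does not use the real printbuf API: ``struct outbuf``,
``out_printf()`` and the ``extent`` types below are hypothetical stand-ins
invented for this example.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a printbuf: a bounded, always-terminated buffer. */
struct outbuf {
	char	buf[128];
	size_t	pos;
};

static void out_printf(struct outbuf *out, const char *fmt, ...)
{
	va_list args;
	size_t space = sizeof(out->buf) - out->pos;
	int ret;

	va_start(args, fmt);
	ret = vsnprintf(out->buf + out->pos, space, fmt, args);
	va_end(args);

	if (ret < 0)
		return;
	/* On truncation, vsnprintf() has already written the terminator. */
	out->pos += (size_t) ret < space ? (size_t) ret : space - 1;
}

/* Hypothetical objects; the field names are illustrative only. */
struct extent_pos {
	unsigned long long	inode;
	unsigned long long	offset;
};

struct extent {
	struct extent_pos	pos;
	unsigned		sectors;
};

static void extent_pos_to_text(struct outbuf *out, const struct extent_pos *p)
{
	out_printf(out, "%llu:%llu", p->inode, p->offset);
}

/* Composes: the printer for the outer object calls the inner one. */
static void extent_to_text(struct outbuf *out, const struct extent *e)
{
	out_printf(out, "extent at ");
	extent_pos_to_text(out, &e->pos);
	out_printf(out, " sectors %u", e->sectors);
}
```

Because every printer takes the same output buffer, the same
``extent_to_text()`` can feed an error message, a debugfs file, or a
tracepoint without being rewritten for each.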

Runtime info and debugging tools should come with clear descriptions and
labels, and good structure - we don't want files with a list of bare integers,
like in procfs. Part of the job of the debugging tools is to educate users and
new developers as to how the system works.

Error messages should, whenever possible, tell you everything you need to debug
the issue. It's worth putting effort into them.

Tracepoints shouldn't be the first thing you reach for. They're an important
tool, but always look for more immediate ways to make things visible. When we
have to rely on tracing, we have to know which tracepoints we're looking for,
and then we have to run the troublesome workload, and then we have to sift
through logs. This is a lot of steps to go through when a user is hitting
something, and if it's intermittent it may not even be possible.

The humble counter is an incredibly useful tool. Counters are cheap and simple
to use, and many complicated internal operations with lots of things that can
behave weirdly (anything involving memory reclaim, for example) become
shockingly easy to debug once you have counters on every distinct codepath.

Persistent counters are even better.
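
A minimal sketch of "counters on every distinct codepath" might look like the
following. The reclaim loop, the ``cached_obj`` type and all counter names are
hypothetical; real kernel code would likely use percpu counters rather than a
plain array, which is omitted here to keep the example self-contained.

```c
#include <stdio.h>

/* Hypothetical codepaths in a cache-reclaim routine; names are illustrative. */
#define RECLAIM_COUNTERS()	\
	x(scanned)		\
	x(skipped_dirty)	\
	x(skipped_locked)	\
	x(freed)

enum reclaim_counter {
#define x(n)	RECLAIM_##n,
	RECLAIM_COUNTERS()
#undef x
	RECLAIM_NR
};

/* Names generated from the same list, so the labels cannot go out of sync. */
static const char * const reclaim_counter_names[] = {
#define x(n)	#n,
	RECLAIM_COUNTERS()
#undef x
};

static unsigned long reclaim_counters[RECLAIM_NR];

static inline void count(enum reclaim_counter c)
{
	reclaim_counters[c]++;
}

struct cached_obj { int dirty, locked; };

/* Every way out of the loop body bumps exactly one counter besides "scanned". */
static unsigned reclaim_scan(const struct cached_obj *objs, unsigned nr)
{
	unsigned freed = 0;

	for (unsigned i = 0; i < nr; i++) {
		count(RECLAIM_scanned);

		if (objs[i].dirty) {
			count(RECLAIM_skipped_dirty);
			continue;
		}
		if (objs[i].locked) {
			count(RECLAIM_skipped_locked);
			continue;
		}
		count(RECLAIM_freed);
		freed++;
	}
	return freed;
}

/* Labelled output, not a list of bare integers. */
static void reclaim_counters_print(FILE *f)
{
	for (unsigned i = 0; i < RECLAIM_NR; i++)
		fprintf(f, "%-16s %lu\n", reclaim_counter_names[i], reclaim_counters[i]);
}
```

With this in place, "reclaim isn't freeing anything" stops being a mystery:
one look at the counters shows whether objects were never scanned, or were
scanned and skipped, and for which reason.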

When debugging, try to get the most out of every bug you come across; don't
rush to fix the initial issue. Look for things that will make related bugs
easier to debug the next time around - introspection, new assertions, better
error messages, new debug tools - and do those first. Look for ways to make the
system better behaved; often one bug will uncover several other bugs through
downstream effects.

Fix all that first, and then the original bug last - even if that means keeping
a user waiting. They'll thank you in the long run, and when they understand
what you're doing you'll be amazed at how patient they're happy to be. Users
like to help - otherwise they wouldn't be reporting the bug in the first place.

Talk to your users. Don't isolate yourself.

Users notice all sorts of interesting things, and by just talking to them and
interacting with them you can benefit from their experience.

Spend time doing support and helpdesk stuff. Don't just write code - code isn't
finished until it's being used trouble-free.

This will also motivate you to make your debugging tools as good as possible,
and perhaps even your documentation, too. Like anything else in life, the more
time you spend at it the better you'll get, and you the developer are the
person most able to improve the tools to make debugging quick and easy.

Be wary of how you take on and commit to big projects. Don't let development
become product-manager focused. Oftentimes an idea is a good one but needs to
wait for its proper time - but you won't know if it's the proper time for an
idea until you start writing code.

Expect to throw a lot of things away, or leave them half finished for later.
Nobody writes only perfect code that all gets shipped, and you'll be much more
productive in the long run if you notice early when something isn't working out
and shift to something else. The experience gained and lessons learned will be
valuable for all the other work you do.

But don't be afraid to tackle projects that require significant rework of
existing code. Sometimes these can be the best projects, because they can lead
us to make existing code more general, more flexible, more multipurpose and
perhaps more robust. Just don't hesitate to abandon the idea if it looks like
it's going to make a mess of things.

Complicated features can often be done as a series of refactorings, with the
final change that actually implements the feature as a quite small patch at the
end. It's wonderful when this happens, especially when those refactorings are
things that improve the codebase in their own right. When that happens there's
much less risk of wasted effort if the feature you were going for doesn't work
out.

Always strive to work incrementally. Always strive to turn the big projects
into little bite-sized projects that can prove their own merits.

Instead of always tackling those big projects, look for little things that
will be useful, and make the big projects easier.

The question of what's likely to be useful is where junior developers most
often go astray - doing something because it seems like it'll be useful often
leads to overengineering. Knowing what's useful comes from many years of
experience, or talking with people who have that experience - or from simply
reading lots of code and looking for common patterns and issues. Don't be
afraid to throw things away and do something simpler.

Talk about your ideas with your fellow developers; oftentimes the best things
come from relaxed conversations where people aren't afraid to say "what if?".

Don't neglect your tools.

The most important tools (besides the compiler and our text editor) are the
tools we use for testing. The shortest possible edit/test/debug cycle is
essential for working productively. We learn, gain experience, and discover the
errors in our thinking by running our code and seeing what happens. If your
time is being wasted because your tools are bad or too slow - don't accept it,
fix it.

Put effort into your documentation, commit messages, and code comments - but
don't go overboard. A good commit message is wonderful - but if the information
was important enough to go in a commit message, ask yourself if it would be
even better as a code comment.

A good code comment is wonderful, but even better is the comment that didn't
need to exist because the code was so straightforward as to be obvious;
organized into small, clean and tidy modules, with clear and descriptive names
for functions and variables, where every line of code has a clear purpose.

Documentation/SubmittingPatches.rst
Normal file
@@ -0,0 +1,105 @@
Submitting patches to bcachefs
==============================

Here are suggestions for submitting patches to the bcachefs subsystem.

Submission checklist
--------------------

Patches must be tested before being submitted, either with the xfstests suite
[0]_, or the full bcachefs test suite in ktest [1]_, depending on what's being
touched. Note that ktest wraps xfstests and will be an easier way to run it for
most users; it includes single-command wrappers for all the mainstream
in-kernel local filesystems.

Patches will undergo more testing after being merged (including
lockdep/kasan/preempt/etc. variants). These are not generally required to be
run by the submitter - but do put some thought into what you're changing and
which tests might be relevant. Are you dealing with tricky memory layout work?
Then run kasan. Are you doing locking work? Then run lockdep. ktest includes
single-command variants for the debug build types you'll most likely need.

The exception to this rule is incomplete WIP/RFC patches: if you're working on
something nontrivial, it's encouraged to send out a WIP patch to let people
know what you're doing and make sure you're on the right track. Just make sure
it includes a brief note as to what's done and what's incomplete, to avoid
confusion.

Rigorous checkpatch.pl adherence is not required (many of its warnings are
considered out of date), but try not to deviate too much without reason.

Focus on writing code that reads well and is organized well; code should be
aesthetically pleasing.

CI
--

When running the full test suite, it's preferable to let a server farm do it
in parallel instead of running the tests locally, and then view the results
in a nice test dashboard (which can tell you which failures are new, and
presents results in a git log view, avoiding the need for most bisecting).

Such a CI exists [2]_, and community members may request an account. If you
work for a big tech company, you'll need to help out with server costs to get
access - but the CI is not restricted to running bcachefs tests: it runs any
ktest test (which generally makes it easy to wrap other tests that can run in
qemu).

Other things to think about
---------------------------

- How will we debug this code? Is there sufficient introspection to diagnose
  when something starts acting wonky on a user machine?

  We don't necessarily need every single field of every data structure visible
  with introspection, but having the important fields of all the core data
  types wired up makes debugging drastically easier - a bit of thoughtful
  foresight greatly reduces the need to have people build custom kernels with
  debug patches.

  More broadly, think about all the debug tooling that might be needed.

- Does it make the codebase more or less of a mess? Can we also do some
  organizing while we're at it?

- Do new tests need to be written? New assertions? How do we know and verify
  that the code is correct, and what happens if something goes wrong?

  We don't yet have automated code coverage analysis or easy fault injection -
  but for now, pretend we did and ask what they might tell us.

  Assertions are hugely important, given that we don't yet have a systems
  language that can do ergonomic embedded correctness proofs. Hitting an assert
  in testing is much better than wandering off into undefined behaviour la-la
  land - use them. Use them judiciously, and not as a replacement for proper
  error handling, but use them.

- Does it need to be performance tested? Should we add new performance counters?

  bcachefs has a set of persistent runtime counters which can be viewed with
  the 'bcachefs fs top' command; this should give users a basic idea of what
  their filesystem is currently doing. If you're doing a new feature or looking
  at old code, think about whether anything should be added.

- If it's a new on disk format feature - have upgrades and downgrades been
  tested? (Automated tests exist but aren't in the CI, due to the hassle of
  disk image management; coordinate to have them run.)

Mailing list, IRC
-----------------

Patches should hit the list [3]_, but much discussion and code review happens
on IRC as well [4]_; many people appreciate the more conversational approach
and quicker feedback.

Additionally, we have a lively user community doing excellent QA work, which
exists primarily on IRC. Please make use of that resource; user feedback is
important for any nontrivial feature, and documenting it in commit messages
would be a good idea.

.. rubric:: References

.. [0] git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
.. [1] https://evilpiepirate.org/git/ktest.git/
.. [2] https://evilpiepirate.org/~testdashboard/ci/
.. [3] linux-bcachefs@vger.kernel.org
.. [4] irc.oftc.net#bcache, #bcachefs-dev

Documentation/casefolding.rst
Normal file
@@ -0,0 +1,108 @@
.. SPDX-License-Identifier: GPL-2.0

Casefolding
===========

bcachefs has support for case-insensitive file and directory
lookups using the regular `chattr +F` (`S_CASEFOLD`, `FS_CASEFOLD_FL`)
casefolding attributes.

The main use case for casefolding is compatibility with software written
against other filesystems that rely on casefolded lookups
(e.g. NTFS and Wine/Proton).
Taking advantage of filesystem-level casefolding can lead to great
loading time gains in many applications and games.

Casefolding support requires a kernel with `CONFIG_UNICODE` enabled.
Once a directory has been flagged for casefolding, a feature bit
is enabled on the superblock which marks the filesystem as using
casefolding.
When the feature bit for casefolding is enabled, it is no longer possible
to mount that filesystem on kernels without `CONFIG_UNICODE` enabled.

On the lookup/query side: casefolding is implemented by allocating a new
string of `BCH_NAME_MAX` length using the `utf8_casefold` function to
casefold the query string.

On the dirent side: casefolding is implemented by ensuring the `bkey`'s
hash is made from the casefolded string and storing the cached casefolded
name with the regular name in the dirent.

The structure looks like this:

* Regular: [dirent data][regular name][nul][nul]...
* Casefolded: [dirent data][reg len][cf len][regular name][casefolded name][nul][nul]...

(Do note, the number of NULs here is merely for illustration; their count can
vary per-key, and they may not even be present if the key is aligned to
`sizeof(u64)`.)

This is efficient: file lookups that require casefolding perform identically
to regular lookups, with a hash comparison and a `memcmp` of the name.
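
The casefolded layout above can be sketched in userspace C as follows. This is
an illustration under stated assumptions, not the on-disk definition: it
assumes two 16-bit little-endian lengths immediately followed by the two
names, and ``struct cf_names``, ``cf_names_parse()`` and ``cf_name_matches()``
are hypothetical helpers invented for this example. The authoritative layout
is the one in `bch_dirent`.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified name block, assumed for illustration only:
 *   [reg len: u16 LE][cf len: u16 LE][regular name][casefolded name]
 * The real on-disk definition lives in struct bch_dirent.
 */
struct cf_names {
	const uint8_t	*name;
	size_t		name_len;
	const uint8_t	*cf_name;
	size_t		cf_name_len;
};

static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t) (p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the lengths do not fit inside the block. */
static int cf_names_parse(const uint8_t *block, size_t block_len,
			  struct cf_names *out)
{
	if (block_len < 4)
		return -1;

	out->name_len	 = get_le16(block);
	out->cf_name_len = get_le16(block + 2);

	if (out->name_len + out->cf_name_len > block_len - 4)
		return -1;

	out->name	= block + 4;
	out->cf_name	= out->name + out->name_len;
	return 0;
}

/*
 * Lookup compare: the query was casefolded once by the caller, so matching
 * a dirent is a length check plus memcmp() against the cached folded name.
 */
static int cf_name_matches(const struct cf_names *n,
			   const uint8_t *folded_query, size_t query_len)
{
	return n->cf_name_len == query_len &&
		!memcmp(n->cf_name, folded_query, query_len);
}
```

The design point this illustrates is that no casefolding happens per dirent
during a lookup: the folded name is already stored, so the comparison costs
the same as for a regular name.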

Rationale
---------

Several designs were considered for this system:
One was to introduce a dirent_v2. However, that would be painful, especially as
the hash system only has support for a single key type. This would also need
`BCH_NAME_MAX` to change between versions, and a new feature bit.

Another option was to store the names without the two lengths, and take half
the combined length of the contiguous regular and casefolded names as the
length of each. This would assume that the regular length == casefolded length,
but that could potentially not be true, if the uppercase unicode glyph had a
different UTF-8 encoded length than the lowercase unicode glyph.
It would be possible to disregard the casefold cache for those cases, but it was
decided to simply encode the two string lengths in the key to avoid random
performance issues if this edge case was ever hit.
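
A concrete instance of the length mismatch: U+212A KELVIN SIGN takes three
bytes in UTF-8, and Unicode case folding maps it to a plain ``k``, which takes
one. The sketch below only compares byte lengths; it does not call the kernel's
`utf8_casefold`, and ``guessed_name_len()`` is a hypothetical helper standing
in for the rejected "combined length / 2" scheme.

```c
#include <stddef.h>
#include <string.h>

/* U+212A KELVIN SIGN is three bytes in UTF-8 ... */
static const char kelvin_name[]   = "\xe2\x84\xaa";
/* ... and Unicode case folding maps it to U+006B 'k', a single byte. */
static const char kelvin_folded[] = "k";

/* The rejected scheme: assume both names are half of the combined length. */
static size_t guessed_name_len(size_t combined_len)
{
	return combined_len / 2;
}

/* Does the guess recover the true length of the regular name? */
static int guess_is_correct(const char *name, const char *folded)
{
	size_t combined = strlen(name) + strlen(folded);

	return guessed_name_len(combined) == strlen(name);
}
```

For an ASCII name such as "BLAH" the guess works, which is exactly why a
scheme relying on it would fail only rarely and unpredictably.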

The option settled on was to use a free bit in d_type to mark a dirent as having
a casefold cache, and then treat the first 4 bytes of the name block as lengths.
You can see this in the `d_cf_name_block` member of the union in `bch_dirent`.

The feature bit allows casefolding support to be enabled for the majority of
users, while still allowing users who have no need for the feature to use
bcachefs without it: `CONFIG_UNICODE` can increase the kernel size by a
significant amount due to the tables used, which may be the deciding factor in
whether bcachefs can be used on e.g. embedded platforms.

Other filesystems like ext4 and f2fs have a superblock-level option for
casefolding encoding, but bcachefs currently does not provide this. ext4 and
f2fs do not expose any encodings other than a single UTF-8 version. If further
encodings become desirable, they can be added trivially using the opts
mechanism.

dentry/dcache considerations
----------------------------

Currently, in casefolded directories, bcachefs (like other filesystems) will not cache
negative dentries.

This is because currently doing so presents a problem in the following scenario:

- Lookup file "blAH" in a casefolded directory
- Creation of file "BLAH" in a casefolded directory
- Lookup file "blAH" in a casefolded directory

The final lookup would fail if negative dentries were cached.

This is slightly suboptimal, but could be fixed in future with some vfs work.

References
----------

(from Peter Anvin, on the list)

It is worth noting that Microsoft has basically declared their
"recommended" case folding (upcase) table to be permanently frozen (for
new filesystem instances in the case where they use an on-disk
translation table created at format time.) As far as I know they have
never supported anything other than 1:1 conversion of BMP code points,
nor normalization.

The exFAT specification enumerates the full recommended upcase table,
although in a somewhat annoying format (basically a hex dump of
compressed data):

https://learn.microsoft.com/en-us/windows/win32/fileio/exfat-specification

Documentation/errorcodes.rst
Normal file
@@ -0,0 +1,30 @@
.. SPDX-License-Identifier: GPL-2.0

bcachefs private error codes
----------------------------

In bcachefs, as a hard rule we do not throw or directly use standard error
codes (-EINVAL, -EBUSY, etc.). Instead, we define private error codes as needed
in fs/bcachefs/errcode.h.

This gives us much better error messages and makes debugging much easier. Any
direct uses of standard error codes you see in the source code are simply old
code that has yet to be converted - feel free to clean it up!

Private error codes may subtype another error code; this allows for grouping of
related errors that should be handled similarly (e.g. transaction restart
errors), as well as specifying which standard error code should be returned at
the bcachefs module boundary.

At the module boundary, we use bch2_err_class() to convert to a standard error
code; this also emits a trace event so that the original error code can be
recovered even if it wasn't logged.
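
The subtyping scheme can be pictured with a small parent table, sketched below
in userspace C. Everything here is hypothetical and simplified: the ``ERR_*``
names, their numbering, the choice of ``EINTR`` as a parent, and the
``err_matches()`` / ``err_class()`` helpers are assumptions made for this
example. The real codes are defined in fs/bcachefs/errcode.h, and the real
bch2_err_class() additionally emits a trace event, which is omitted here.

```c
#include <errno.h>

/* Hypothetical private codes; the real table is in fs/bcachefs/errcode.h. */
#define ERR_START	2048

enum private_err {
	ERR_transaction_restart = ERR_START,
	ERR_transaction_restart_relock,
	ERR_transaction_restart_split_race,
	ERR_ENOSPC_disk_reservation,
	ERR_MAX,
};

/* Parent of each private code: another private code, or a standard errno. */
static const int err_parents[] = {
	[ERR_transaction_restart - ERR_START]			= EINTR,
	[ERR_transaction_restart_relock - ERR_START]		= ERR_transaction_restart,
	[ERR_transaction_restart_split_race - ERR_START]	= ERR_transaction_restart,
	[ERR_ENOSPC_disk_reservation - ERR_START]		= ENOSPC,
};

/* Does (negative) err equal class, or subtype it at any depth? */
static int err_matches(int err, int class)
{
	err = -err;

	while (err >= ERR_START && err < ERR_MAX && err != class)
		err = err_parents[err - ERR_START];

	return err == class;
}

/* Module boundary: walk up to the standard code that callers understand. */
static int err_class(int err)
{
	err = -err;

	while (err >= ERR_START && err < ERR_MAX)
		err = err_parents[err - ERR_START];

	return -err;
}
```

Internal code can then test for a whole group at once, e.g.
``err_matches(ret, ERR_transaction_restart)``, while the specific code that
was returned still identifies the exact site it came from.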

Do not reuse error codes! Generally speaking, a private error code should only
be thrown in one place. That means that when we see it in a log message we can
see, unambiguously, exactly which file and line number it was returned from.

Try to give error codes names that are as descriptive of the error as
reasonably possible. Frequently, the error will be logged at a place far removed
from where the error was generated; good names for error codes mean much more
descriptive and useful error messages.

Documentation/future/idle_work.rst
Normal file
@@ -0,0 +1,79 @@
Idle/background work classes design doc
=======================================

Right now, our behaviour at idle isn't ideal: it was designed for servers that
would be under sustained load, to keep pending work at a "medium" level, to
let work build up so we can process it in more efficient batches, while also
giving headroom for bursts in load.

But for desktops or mobile - scenarios where work is less sustained and power
usage is more important - we want to operate differently, with a "rush to
idle" so the system can go to sleep. We don't want to be dribbling out
background work while the system should be idle.

The complicating factor is that there are a number of background tasks, which
form a hierarchy (or a digraph, depending on how you divide it up) - one
background task may generate work for another.

Thus proper idle detection needs to model this hierarchy.

- Foreground writes
- Page cache writeback
- Copygc, rebalance
- Journal reclaim

When we implement idle detection and rush to idle, we need to be careful not
to disturb the existing behaviour too much, since it works reasonably well when
the system is under sustained load (or perhaps improve it in the case of
rebalance, which currently does not actively attempt to let work batch up).

SUSTAINED LOAD REGIME
---------------------

When the system is under continuous load, we want these jobs to run
continuously - this is perhaps best modelled with a P/D controller, where
they'll be trying to keep a target value (e.g. fragmented disk space,
available journal space) roughly in the middle of some range.
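
To make the P/D (proportional/derivative) idea concrete, here is a minimal
sketch. It is an assumption-laden illustration, not a proposal for the actual
implementation: ``struct pd_controller``, its gains and
``pd_controller_update()`` are all hypothetical names chosen for this example.

```c
/*
 * Hypothetical proportional/derivative controller: all names and gains are
 * illustrative, not taken from bcachefs.
 */
struct pd_controller {
	long	target;		/* desired level, e.g. middle of the range */
	long	last_error;	/* error seen on the previous sample */
	long	p_gain_inv;	/* proportional term = error / p_gain_inv */
	long	d_gain_inv;	/* derivative term = delta / d_gain_inv */
};

/*
 * Feed in the current level of pending work (e.g. fragmented sectors) once
 * per interval; returns how much work to issue during the next interval.
 */
static long pd_controller_update(struct pd_controller *pd, long level)
{
	long error	= level - pd->target;
	long delta	= error - pd->last_error;
	long rate	= error / pd->p_gain_inv + delta / pd->d_gain_inv;

	pd->last_error = error;

	/* Below target and not rising quickly: let work batch up. */
	return rate > 0 ? rate : 0;
}
```

The proportional term pushes the level back towards the target, while the
derivative term reacts to how fast the level is changing, so a sudden burst
gets a stronger response than a level that has been steady for a while.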

The goal under sustained load is to balance our ability to handle load spikes
without running out of some resource (free disk space, free space in the
journal), while also letting some work accumulate to be batched (or become
unnecessary).

For example, we don't want to run copygc too aggressively, because then it
will be evacuating buckets that would have become empty (been overwritten or
deleted) anyways, and we don't want to wait until we're almost out of free
space because then the system will behave unpredictably - suddenly we're doing
a lot more work to service each write and the system becomes much slower.

IDLE REGIME
-----------

When the system becomes idle, we should start flushing our pending work
more quickly so the system can go to sleep.

Note that the definition of "idle" depends on where in the hierarchy a task
is - a task should start flushing work more quickly when the task above it has
stopped generating new work.

For example, rebalance should start flushing more quickly when page cache
writeback is idle, and journal reclaim should only start flushing more quickly
when both copygc and rebalance are idle.

It's important to let work accumulate when more work is still incoming and we
still have room, because flushing is always more efficient if we let it batch
up. New writes may overwrite data before rebalance moves it, and tasks may be
generating more updates for the btree nodes that journal reclaim needs to flush.

On idle, how much work we do at each interval should be proportional to the
length of time we have been idle for. If we're idle only for a short duration,
we shouldn't flush everything right away; the system might wake up and start
generating new work soon, and flushing immediately might end up doing a lot of
work that would have been unnecessary if we'd allowed things to batch more.

To summarize, we will need:

- A list of classes for background tasks that generate work, which will
  include one "foreground" class.
- Tracking for each class - "Am I doing work, or have I gone to sleep?"
- And each class should check the class above it when deciding how much work to issue.
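
The three requirements above could be sketched roughly as follows. Since this
is a future-design document, nothing here reflects existing code: the class
list, ``struct work_state`` and the ``work_to_issue()`` policy (a base batch
that grows linearly with upstream idle time) are assumptions made purely for
illustration, and ``ramp_interval`` must be nonzero.

```c
#include <stdint.h>

/* Hypothetical work classes, ordered from top of the hierarchy down. */
enum work_class {
	WORK_foreground,
	WORK_writeback,
	WORK_copygc_rebalance,
	WORK_journal_reclaim,
	WORK_NR,
};

struct work_state {
	uint64_t	last_active[WORK_NR];	/* time each class last did work */
};

static void work_class_mark_active(struct work_state *s, enum work_class c,
				   uint64_t now)
{
	s->last_active[c] = now;
}

/* How long has everything above @c been idle? Zero for the top class. */
static uint64_t upstream_idle_time(const struct work_state *s,
				   enum work_class c, uint64_t now)
{
	uint64_t most_recent = 0;

	if (c == WORK_foreground)
		return 0;

	for (unsigned i = 0; i < (unsigned) c; i++)
		if (s->last_active[i] > most_recent)
			most_recent = s->last_active[i];

	return now > most_recent ? now - most_recent : 0;
}

/*
 * Work to issue this interval: the normal batch while upstream is busy,
 * growing in proportion to how long upstream has been idle, capped at
 * everything that is pending.
 */
static uint64_t work_to_issue(const struct work_state *s, enum work_class c,
			      uint64_t now, uint64_t pending,
			      uint64_t base_batch, uint64_t ramp_interval)
{
	uint64_t idle	= upstream_idle_time(s, c, now);
	uint64_t want	= base_batch + base_batch * (idle / ramp_interval);

	return want < pending ? want : pending;
}
```

A short upstream pause barely changes the batch size, so work keeps
accumulating; a long pause ramps up until everything pending is flushed, which
is the "rush to idle" behaviour described above.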

Documentation/index.rst
Normal file
@@ -0,0 +1,38 @@
.. SPDX-License-Identifier: GPL-2.0

======================
bcachefs Documentation
======================

Subsystem-specific development process notes
--------------------------------------------

Development notes specific to bcachefs. These are intended to supplement the
:doc:`general kernel development handbook </process/index>`.

.. toctree::
   :maxdepth: 1
   :numbered:

   CodingStyle
   SubmittingPatches

Filesystem implementation
-------------------------

Documentation for filesystem features and their implementation details.
At this moment, only a few of these are described here.

.. toctree::
   :maxdepth: 1
   :numbered:

   casefolding
   errorcodes

Future design
-------------

.. toctree::
   :maxdepth: 1

   future/idle_work