Crate arc_swap
Making Arc itself atomic
This library provides the ArcSwapAny type (which you probably don’t want to use directly) and several type aliases that set it up for common use cases:

- ArcSwap, which operates on Arc<T>.
- ArcSwapOption, which operates on Option<Arc<T>>.
- IndependentArcSwap, which uses slightly different trade-off decisions ‒ see below.

Note that as these are type aliases, the useful methods are defined on ArcSwapAny directly, not on the aliases, so that is where to look for documentation.
This is similar to RwLock<Arc<T>>, but it is faster, the readers are never blocked (not even by writes) and it is more configurable.
Or, you can look at it this way. There’s Arc<T> ‒ it knows when it stops being used and can therefore clean up memory. But once there’s an Arc<T> somewhere, shared between threads, it has to keep pointing to the same thing. On the other hand, there’s AtomicPtr, which can be changed even when shared between threads, but it doesn’t know when the data pointed to is no longer in use, so it doesn’t clean up. This is a hybrid between the two.
Motivation
First, the C++ shared_ptr can act this way. The fact that this is true only of the surface API ‒ all the implementations I could find hide a mutex inside ‒ wasn’t known to me when I started working on this. So I decided Rust needed to keep up there.
Second, I like hard problems and this seems like an apt supply of them.
And third, I actually have a few use cases for something like this.
Performance characteristics
It is optimised for read-heavy situations with only occasional writes. A few examples might be:
- Global configuration data structure, which is updated once in a blue moon when an operator manually makes some changes, but is read from all over the program all the time (see the sketch after this list). Reading it should be cheap and multiple threads should be able to read it at the same time.
- Some in-memory database or maybe routing tables, where lookup latency matters. Updating the routing tables isn’t an excuse to stop processing packets even for a short while.
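For instance, the global configuration case might look like the following rough sketch. The once_cell crate and the handle_request/operator_update functions are illustrative assumptions, not part of this library:

```rust
use std::sync::Arc;

use arc_swap::ArcSwap;
use once_cell::sync::Lazy;

// A process-wide configuration slot. Readers call `CONFIG.load()` on every
// access; writers swap in a whole new snapshot with `store`.
static CONFIG: Lazy<ArcSwap<String>> =
    Lazy::new(|| ArcSwap::from(Arc::new(String::from("default"))));

fn handle_request() {
    // Cheap and uncontended; many threads can do this at the same time.
    let cfg = CONFIG.load();
    println!("using config: {}", *cfg);
}

fn operator_update(new_value: String) {
    // Rare and comparatively expensive; readers are never blocked by it.
    CONFIG.store(Arc::new(new_value));
}
```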
Lock-free readers
All the read operations are always lock-free. Most of the time, they are actually wait-free. The only exception is the first load access in each thread (across all the pointers), which is merely lock-free.
So, when the documentation talks about contention, it means multiple CPU cores having to sort out who changes the bytes in a cache line first and who goes next. This slows things down, but everything still rolls forward and stops for no one, unlike mutex-style contention, where one thread holds the lock and the other threads get parked.
Unfortunately, there are cases where readers block writers from completion. This is much more limited in scope than with Mutex or RwLock, and a steady stream of readers will not prevent an update from happening indefinitely (only a reader stuck in a critical section could, and when used according to the recommendations, the critical sections contain no loops and are only several instructions short).
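To make that recommendation concrete, here is a minimal sketch of short versus long-lived access. The function names and the Vec<u8> payload are made up for the example:

```rust
use std::sync::Arc;

use arc_swap::ArcSwap;

fn read_short(shared: &ArcSwap<Vec<u8>>) -> usize {
    // Recommended pattern: the Guard lives only for these few instructions,
    // so the critical section stays short and writers are never held up long.
    shared.load().len()
}

fn read_long(shared: &ArcSwap<Vec<u8>>) -> Arc<Vec<u8>> {
    // If the value is needed for a long time, take an owned Arc out instead
    // of keeping the Guard (and thus the critical section) alive.
    Arc::clone(&shared.load())
}
```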
Speeds
The baseline speed of read operations is similar to using an uncontended Mutex. However, load suffers no contention from any other read operations and only slight ones during updates. The load_full operation is additionally contended only on the reference count of the Arc inside ‒ so, in general, while Mutex rapidly loses its performance when being in active use by multiple threads at once and RwLock is slow to start with, ArcSwapAny mostly keeps its performance even when read by many threads in parallel.
Write operations are considered expensive. A write operation is more expensive than access to an uncontended Mutex and on some architectures even slower than an uncontended RwLock. However, it is faster than either under contention.
There are some (very unscientific) benchmarks within the source code of the library.
The exact numbers are highly dependent on the machine used (both the absolute numbers and the relative ones between different data structures). Not only does the architecture have a huge impact (eg. x86 vs ARM), but even AMD vs. Intel or two different Intel processors can differ. Therefore, if raw speed matters more to you than the wait-free guarantees, you’re advised to do your own measurements.
However, the intended selling point of the library is consistency of performance, not outperforming other locking primitives on average. If you do worry about the raw performance, you can have a look at Cache.
Choosing the right reading operation
There are several load operations available. While the general go-to one should be load, there may be situations in which the others are a better match.
The load usually only borrows the instance from the shared ArcSwapAny. This makes it faster, because different threads don’t contend on the reference count. There are two situations when this borrow isn’t possible. If the content of ArcSwapAny gets changed, all existing Guards are promoted to contain an owned instance. The other situation derives from the internal implementation. The number of borrows each thread can have at any given time (across all Guards) is limited. If this limit is exceeded, an owned instance is created instead.
Therefore, if you intend to hold onto the loaded value for an extended time span, you may prefer load_full. It loads the pointer instance (Arc) without borrowing, which is slower (because of the possible contention on the reference count), but doesn’t consume one of the borrow slots, which makes it more likely for the following loads to have a slot available. Similarly, if some API needs an owned Arc, load_full is more convenient.
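A minimal sketch contrasting the two operations (the stored number is arbitrary):

```rust
use std::sync::Arc;

use arc_swap::ArcSwap;

fn main() {
    let shared = ArcSwap::from(Arc::new(42usize));

    // `load` borrows: cheap, but it occupies one of the thread's borrow
    // slots for as long as the Guard lives.
    let guard = shared.load();
    assert_eq!(**guard, 42);
    drop(guard);

    // `load_full` hands out an owned Arc: slower (it touches the reference
    // count), but it holds no borrow slot and fits APIs that want an Arc.
    let owned: Arc<usize> = shared.load_full();
    assert_eq!(*owned, 42);
}
```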
There’s also load_signal_safe. This is the only method guaranteed to be safely usable inside a unix signal handler. It has no advantages outside of signal handlers, which makes it kind of a niche one.
Additionally, it is possible to use a Cache to get a further speed improvement at the cost of a less comfortable API and of possibly keeping the older values alive for longer than necessary.
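As a rough sketch of that trade-off (assuming the Cache::new/Cache::load interface from the cache module; see its documentation for the authoritative API):

```rust
use std::sync::Arc;

use arc_swap::{ArcSwap, Cache};

fn main() {
    let shared = ArcSwap::from(Arc::new(String::from("hello")));

    // The cache keeps its own copy of the current Arc and only checks the
    // shared storage for changes, so repeated loads are very cheap ‒ but the
    // cached value stays alive until a later load notices an update.
    let mut cache = Cache::new(&shared);
    let value: &Arc<String> = cache.load();
    assert_eq!(**value, "hello");
}
```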
Atomic orderings
It is guaranteed that each operation performs at least one SeqCst atomic read-write operation, therefore even operations on different instances have a defined global order of operations.
Customization
While the default ArcSwap and load are probably good enough for most needs, the library allows a wide range of customizations:

- It allows storing nullable (Option<Arc<_>>) as well as non-nullable pointers (see the sketch after this list).
- It is possible to store other reference counted pointers (eg. if you want to use it with a hypothetical Arc that doesn’t have weak counts), by implementing the RefCnt trait.
- It allows choosing the internal fallback locking strategy through the LockStorage trait.
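For example, the nullable variant can start out empty and be filled in later ‒ a minimal sketch using the documented ArcSwapOption alias:

```rust
use std::sync::Arc;

use arc_swap::ArcSwapOption;

fn main() {
    // The nullable variant stores Option<Arc<T>> and can start out empty.
    let shared: ArcSwapOption<String> = ArcSwapOption::from(None);
    assert!(shared.load().is_none());

    shared.store(Some(Arc::new(String::from("present"))));
    assert!(shared.load().is_some());
}
```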
Examples
```rust
extern crate arc_swap;
extern crate crossbeam_utils;

use std::sync::Arc;

use arc_swap::ArcSwap;
use crossbeam_utils::thread;

fn main() {
    let config = ArcSwap::from(Arc::new(String::default()));
    thread::scope(|scope| {
        // One writer thread publishes a new configuration…
        scope.spawn(|_| {
            let new_conf = Arc::new("New configuration".to_owned());
            config.store(new_conf);
        });
        // …while ten reader threads poll until they see it.
        for _ in 0..10 {
            scope.spawn(|_| loop {
                let cfg = config.load();
                if !cfg.is_empty() {
                    assert_eq!(**cfg, "New configuration");
                    return;
                }
            });
        }
    })
    .unwrap();
}
```
Alternatives
There are other means to get similar functionality you might want to consider:
Mutex<Arc<_>> and RwLock<Arc<_>>
They have significantly worse performance in the contended scenario but are comparable in uncontended cases. They are directly in the standard library, which means better testing and fewer dependencies.
The same, but with parking_lot
Parking lot contains alternative implementations of Mutex and RwLock that are faster than the standard library primitives. They still suffer from contention.
crossbeam::atomic::ArcCell
This internally contains a spin-lock equivalent and is very close to the characteristics of parking_lot::Mutex<Arc<_>>. It is unofficially deprecated; see the relevant issue.
crossbeam-arccell
It is mentioned here because of the name. Despite the name, this does something very different (which might possibly solve similar problems). Its API is not centered on Arc or any kind of pointer; rather, it has snapshots of its internal value that can be exchanged very fast.
AtomicArc
This one is probably the closest thing to ArcSwap on the API level. Both read and write operations are lock-free, but neither is wait-free, and the performance of reads and writes is more balanced ‒ while ArcSwap is optimized for reading, AtomicArc is „balanced“. The biggest current downside is that it is in a prototype stage and not released yet.
Features
The unstable-weak feature adds the ability to use arc-swap with the Weak pointer too, through the ArcSwapWeak type. This requires the nightly Rust compiler. Also, the interface and support are not part of the API stability guarantees and may be arbitrarily changed or removed in future releases (it is mostly waiting for the weak_into_raw nightly feature to stabilize before being stabilized in this crate).
Re-exports
pub use cache::Cache;
Modules
access | Abstracting over accessing parts of stored value.
cache | Caching handle into the ArcSwapAny.
gen_lock | Customization of where and how the generation lock works.
Structs
ArcSwapAny | An atomic storage for a reference counted smart pointer like Arc or Option<Arc>.
Guard | A temporary storage of the pointer.
Traits
RefCnt | A trait describing smart reference counted pointers.
Type Definitions
ArcSwap | An atomic storage for Arc.
ArcSwapOption | An atomic storage for Option<Arc>.
IndependentArcSwap | An atomic storage that doesn’t share the internal generation locks with others.