Preface
I'm currently writing an AMD hypervisor in Rust, and for that I need to allocate memory for lots of different structures. Not all of them use the same kind of memory, so I decided to create a helper structure that allocates and deallocates it for me.
My Implementation
My implementation has two different allocation types:

- Normal: non-paged pool memory allocated with `ExAllocatePoolWithTag`
- Contiguous: physically contiguous memory allocated with `MmAllocateContiguousMemorySpecifyCacheNode`
The first implementation looked like this:
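In essence it was a thin wrapper around the raw pointer plus an allocation type. Here is a minimal sketch of the idea (the field and method names are illustrative, not the original ones, and the `Deref`/`DerefMut` implementations are skipped):

```rust
use core::ptr::NonNull;

/// Which kernel API the memory came from.
pub enum AllocType {
    Normal,     // ExAllocatePoolWithTag
    Contiguous, // MmAllocateContiguousMemorySpecifyCacheNode
}

pub struct AllocatedMemory<T> {
    ptr: NonNull<T>,
    alloc_type: AllocType,
}

impl<T> AllocatedMemory<T> {
    pub fn alloc(bytes: usize) -> Option<Self> {
        // Allocate non-paged pool memory with ExAllocatePoolWithTag ...
        todo!()
    }

    pub fn alloc_contiguous(bytes: usize) -> Option<Self> {
        // Allocate contiguous memory with MmAllocateContiguousMemorySpecifyCacheNode ...
        todo!()
    }
}

impl<T> Drop for AllocatedMemory<T> {
    fn drop(&mut self) {
        match self.alloc_type {
            AllocType::Normal => { /* ExFreePool(self.ptr.as_ptr() as _) */ }
            AllocType::Contiguous => { /* MmFreeContiguousMemory(self.ptr.as_ptr() as _) */ }
        }
    }
}

// Deref/DerefMut implementations skipped
```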
Doesn't look too complicated, huh? That's exactly what I thought, but there is one big problem with this. Can you spot it?
Let's look at the following example code:
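Something along these lines, where `Foo` and `Bar` are placeholder types and the methods are the ones from the sketch above:

```rust
fn example() -> Option<()> {
    // `foo` uses normal pool memory, `bar` uses contiguous memory.
    let foo = AllocatedMemory::<Foo>::alloc(core::mem::size_of::<Foo>())?;
    let bar = AllocatedMemory::<Bar>::alloc_contiguous(core::mem::size_of::<Bar>())?;

    // ... initialize and use foo and bar ...

    Some(())
    // End of scope
}
```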
Now, what do you think happens at the end of the scope?
1. Both allocations will be dropped and deallocated
2. Only `bar` will be dropped and deallocated
3. Only `foo` will be dropped and deallocated
4. Nothing will be deallocated
You might have guessed it: answer 3 is correct. But what happens to `bar`? It's simple: because we aren't capturing `T` inside `AllocatedMemory`, the compiler thinks it can drop the allocation immediately. For some reason, the compiler even optimized the pointer to `0x0`, so if you run the code, the driver would crash due to an access violation (`0xffffffffc0000005`) because we are passing a null pointer to `ExFreePool`/`MmFreeContiguousMemory`. We can't just add a null check either, because then we would be leaking memory.
Here's what I think is happening:
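Annotated with the placeholder types from above, the drop order looks roughly like this:

```rust
fn example() -> Option<()> {
    let foo = AllocatedMemory::<Foo>::alloc(core::mem::size_of::<Foo>())?;
    let bar = AllocatedMemory::<Bar>::alloc_contiguous(core::mem::size_of::<Bar>())?;
    // `bar` doesn't look like it owns a `Bar`, so its allocation is dropped
    // right here, with the pointer already optimized to 0x0.

    // ... `foo` is still used down here ...

    Some(())
    // End of scope: Foo is dropped.
}
```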
In the disassembly, you can see the allocation with `MmAllocateContiguousMemorySpecifyCacheNode`, and then in the last few lines it immediately tries to deallocate it again.
This bug was driving me insane because I didn't know what was causing it. I tried so many different things that didn't work:
- Disabled all unneeded features
- Disabled LTO (link-time optimization)
- Tried a ton of different compilation flags
- Removed `AllocType` to check if it was a compiler bug/optimization
- Replaced `NonNull<T>` with `*mut T`
After none of my fixes worked, I decided to consult the Rust Nomicon. Unfortunately, that didn't solve my problems either. I learned about tricky scenarios and common mistakes again, but none of them helped me figure out a solution.
Looking at existing implementations
I already knew about `Box<T>` and `Vec<T>`, so I wondered how they are implemented behind the scenes, because they work similarly.
Vec<T>
`Vec<T>` is backed by `RawVec`, which uses `Unique<T>` for the data pointer. `Unique<T>` is a wrapper around `NonNull<T>`, but it behaves as if it was an instance of `T`. That's exactly what we want.
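In current versions of the standard library, the (simplified) definition looks like this:

```rust
#[repr(transparent)]
pub struct Unique<T: ?Sized> {
    pointer: NonNull<T>,
    // The marker tells the drop checker that we logically own a `T`,
    // which is what makes Unique<T> behave like an instance of T.
    _marker: PhantomData<T>,
}
```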
The `Drop` implementation also uses `#[may_dangle]` to assert that the destructor of a generic type is guaranteed to not access any expired data. You can read more about that in the Rust Nomicon - An Escape Hatch.
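Paraphrased from the standard library source, the relevant part looks like this:

```rust
unsafe impl<#[may_dangle] T, A: Allocator> Drop for Vec<T, A> {
    fn drop(&mut self) {
        unsafe {
            // Drop the initialized elements as a slice...
            ptr::drop_in_place(ptr::slice_from_raw_parts_mut(self.as_mut_ptr(), self.len))
        }
        // ...while the backing RawVec takes care of freeing the buffer.
    }
}
```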
I tried both implementing `Unique<T>` and `#[may_dangle]`, but that didn't solve my problem.
Box<T>
This one is much simpler because it's just a `Unique<T>` pointer (so it asserts that `Box` owns the data).
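Its definition is essentially just this:

```rust
pub struct Box<T: ?Sized, A: Allocator = Global>(Unique<T>, A);
```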
There are many different allocator implementations, but I was only interested in one thing: the `Drop` implementation. I thought: "Maybe they do some black magic to make sure that the inner elements are dropped correctly." And yes, they do, because there's literally nothing there.
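The whole `Drop` implementation is an empty body (paraphrased from the standard library source):

```rust
unsafe impl<#[may_dangle] T: ?Sized, A: Allocator> Drop for Box<T, A> {
    fn drop(&mut self) {
        // FIXME: Do nothing, drop is currently performed by compiler.
    }
}
```

In other words, dropping the pointed-to value and freeing the allocation is handled by the compiler itself, not by library code.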
Since none of my previous attempts worked, I tried to replace `AllocatedMemory<T>` with `Box<T>`, but it still didn't work. I tried a few different examples and figured out the problem.
Here's what the structures look like:
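Roughly, they have the following shape: the outer type owns another heap allocation whose `Drop` has to run (the names and fields here are stand-ins):

```rust
// Placeholder structures: `Foo` owns a `Box<Bar>`, so dropping a `Foo`
// must also drop (and free) the inner allocation.
struct Bar {
    data: u64,
}

struct Foo {
    bar: Box<Bar>,
    value: u64,
}
```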
For example, this code works:
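Along these lines:

```rust
// Fully initialized through Box::new: dropping `foo` drops the `Foo`,
// which in turn drops and frees the inner `Box<Bar>`.
let foo: Box<Foo> = Box::new(Foo {
    bar: Box::new(Bar { data: 42 }),
    value: 1,
});
drop(foo); // everything is freed exactly once
```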
Yet these two examples don't work:
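Both of the failing variants went through zeroed, uninitialized memory, roughly like this:

```rust
use core::mem::MaybeUninit;

// 1. Box::new_zeroed(): the memory is zeroed, but the `Foo` is never
//    properly initialized. Its inner `Box<Bar>` field is a null pointer,
//    and dropping a Box<MaybeUninit<Foo>> never runs Foo's drop code.
let foo: Box<MaybeUninit<Foo>> = Box::new_zeroed();

// 2. MaybeUninit::zeroed() without ever calling assume_init():
//    dropping the MaybeUninit will not call Foo's drop code either.
let mut foo: MaybeUninit<Foo> = MaybeUninit::zeroed();
unsafe { (*foo.as_mut_ptr()).value = 1 };
// `foo` goes out of scope here; Foo::drop is never called.
```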
`Box::new_zeroed()` refers to the `MaybeUninit::zeroed()` documentation, which mentions the following:
> Note that dropping a `MaybeUninit<T>` will never call T's drop code. It is your responsibility to make sure T gets dropped if it got initialized.
So it is actually not the Rust compiler that's wrong, it's me. I'm still not sure why my `AllocatedMemory` implementations were not working, but I now know that lots of effort has been put into making `Box<T>` sound and safe.
Final solution
While going through the source code of `Box<T>`, I also noticed the new allocator API. Since I needed to redesign the `AllocatedMemory` structure anyway, I decided to use `Box` with custom allocators instead.

I already have a global allocator that allocates pool memory, so I only need to write one for physical memory. We only have to implement the `Allocator` trait, and we are good to go:
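Here is a sketch of what such an allocator can look like. The FFI declarations are written out by hand for the sake of the example (in a real driver they come from the kernel bindings), and the address and cache parameters are only placeholders:

```rust
#![feature(allocator_api)]

use core::alloc::{AllocError, Allocator, Layout};
use core::ptr::NonNull;

// Hand-written declarations of the kernel APIs used below.
extern "system" {
    fn MmAllocateContiguousMemorySpecifyCacheNode(
        number_of_bytes: usize,
        lowest_acceptable_address: i64,
        highest_acceptable_address: i64,
        boundary_address_multiple: i64,
        cache_type: u32,
        preferred_node: u32,
    ) -> *mut u8;
    fn MmFreeContiguousMemory(base_address: *mut u8);
}

const MM_CACHED: u32 = 1; // MEMORY_CACHING_TYPE::MmCached
const MM_ANY_NODE_OK: u32 = 0x8000_0000;

pub struct PhysicalAllocator;

unsafe impl Allocator for PhysicalAllocator {
    fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        let ptr = unsafe {
            MmAllocateContiguousMemorySpecifyCacheNode(
                layout.size(),
                0,        // lowest acceptable physical address (placeholder)
                i64::MAX, // highest acceptable physical address (placeholder)
                0,        // no boundary requirement
                MM_CACHED,
                MM_ANY_NODE_OK,
            )
        };

        let ptr = NonNull::new(ptr).ok_or(AllocError)?;
        Ok(NonNull::slice_from_raw_parts(ptr, layout.size()))
    }

    unsafe fn deallocate(&self, ptr: NonNull<u8>, _layout: Layout) {
        unsafe { MmFreeContiguousMemory(ptr.as_ptr()) }
    }
}
```

`allocate` returns a `NonNull<[u8]>` spanning the requested size, and `deallocate` simply hands the pointer back to `MmFreeContiguousMemory`.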
Now we can either pass `alloc::Global` or `PhysicalAllocator` to the functions like this:
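For example:

```rust
use alloc::alloc::Global;
use alloc::boxed::Box;

// Physically contiguous memory:
let foo = Box::new_in(0xdead_beef_u64, PhysicalAllocator);

// Normal pool memory via the global allocator:
let bar = Box::new_in(0xdead_beef_u64, Global);
```

Both forms require the nightly `allocator_api` feature.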
Instead of passing `alloc::Global` (the global allocator) to `Box::new_in`, we can also just use the default implementation `Box::new`.
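With the pool-memory allocator registered as the global allocator (`KernelAlloc` here is the one mentioned above), these two lines are equivalent:

```rust
// The pool-memory allocator, registered as the global allocator.
#[global_allocator]
static GLOBAL: KernelAlloc = KernelAlloc;

// Equivalent: both go through the global allocator.
let foo = Box::new_in(0xdead_beef_u64, Global);
let foo = Box::new(0xdead_beef_u64);
```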
There are even fallible allocations if you want to handle allocation errors yourself: `Box::new_in` is just a wrapper around `Box::try_new_in` and calls `handle_alloc_error(layout)` upon failure.
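For example (sketched inside a function that returns an `Option` here):

```rust
let foo = match Box::try_new_in(0xdead_beef_u64, PhysicalAllocator) {
    Ok(foo) => foo,
    // Handle the failed allocation yourself instead of aborting,
    // e.g. by returning an error status from the current driver routine.
    Err(_) => return None,
};
```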
So now, instead of using `AllocatedMemory<T>`, we can use `Box<T>` and get rid of all the bugs and crashes.
Conclusion
I actually really like this change. We can reuse existing implementations and allocate memory wherever we want. We also reduced the amount of code we have to maintain, and other people can understand the code more easily.
I also think I'd have a hard time trying to implement `Drop` myself when `Box` is using the compiler for that. There's probably a good reason for that. If you know why, please let me know.
Thanks for reading.