Strings, C# and Unmanaged Memory
Hibernating Rhinos
Michael Yarichuk
Strings in C#
• Immutable and reference type
• Allocated on the managed heap
One line of code → three new allocations
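The code from the original slide is not preserved, so the line below is only a hypothetical illustration of the point. Because strings are immutable, each chained call that changes the text returns a new string on the managed heap. The class and method names are assumptions made for this sketch.

```csharp
using System;

public static class AllocationDemo
{
    // One line of code, three new strings:
    //   Trim()               -> "Hello World"  (allocation 1)
    //   ToLowerInvariant()   -> "hello world"  (allocation 2)
    //   Replace(...)         -> "hello raven"  (allocation 3)
    // The input string itself is never modified.
    public static string Normalize(string input)
    {
        return input.Trim().ToLowerInvariant().Replace("world", "raven");
    }
}
```

Calling `AllocationDemo.Normalize("  Hello World  ")` leaves the original string untouched and produces two short-lived intermediates that the garbage collector has to clean up later.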
(Invisible) cost of the garbage collector
Massive usage of strings → lots of allocations → lots of GC cycles
Custom strings implementation
First, a little example…
• Import into RavenDB
• Total size of data ~4GB
• 4.5 million documents
Memory allocations (chart): custom strings vs. .NET strings
Time spent in GC (chart): custom strings vs. .NET strings
So, when can such strings be useful?
• Serialization
• Large scale text processing
• Memory-mapped files + JSON/XML
• Custom network protocols
The idea
Unmanaged memory: String 1 (20 KB) | String 2 (46 KB)
• byte* Ptr, int Size = 20 KB → String 1
• byte* Ptr, int Size = 46 KB → String 2
The definition
Pointer arithmetic!
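The definition shown on the slide is not preserved. A minimal sketch of the idea, assuming a struct that holds only a pointer and a size as in the diagram, could look like the following. The name `UString` appears later in the deck, but the members `Slice`, the indexer and the demo class are assumptions for illustration. The code needs unsafe blocks enabled (`AllowUnsafeBlocks`).

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// A string is just "where the bytes start" plus "how many bytes there are".
public unsafe struct UString
{
    public readonly byte* Ptr;
    public readonly int Size;

    public UString(byte* ptr, int size)
    {
        Ptr = ptr;
        Size = size;
    }

    // Pointer arithmetic: the byte at position index lives at Ptr + index.
    public byte this[int index]
    {
        get
        {
            if ((uint)index >= (uint)Size) throw new IndexOutOfRangeException();
            return *(Ptr + index);
        }
    }

    // A substring is a new (pointer, size) pair over the same memory: no copy.
    public UString Slice(int offset, int size)
    {
        if (offset < 0 || size < 0 || offset > Size - size)
            throw new ArgumentOutOfRangeException();
        return new UString(Ptr + offset, size);
    }

    // Interop with System.String: decoding allocates a managed string.
    public override string ToString() => Encoding.UTF8.GetString(Ptr, Size);
}

public static unsafe class UStringDemo
{
    public static string SliceDemo()
    {
        byte[] utf8 = Encoding.UTF8.GetBytes("Hello world");
        IntPtr block = Marshal.AllocHGlobal(utf8.Length); // unmanaged memory
        try
        {
            Marshal.Copy(utf8, 0, block, utf8.Length);
            var whole = new UString((byte*)block, utf8.Length);
            return whole.Slice(6, 5).ToString(); // bytes 6..10
        }
        finally
        {
            Marshal.FreeHGlobal(block);
        }
    }
}
```

In this sketch `Slice` costs two field writes and no heap allocation, which is where the savings over `string.Substring` would come from.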
Why byte and not char?
UTF-8 encoding: Length != Size
• "שַׂלוֹם עוֹלַם" → EFACABD6B7D79CEFAD8BD79D20D7A2EFAD8BD79CD6B7D79D (24 bytes)
• "Hello world" → 48656C6C6F20776F726C64 (11 bytes)
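The two hex dumps can be reproduced with the standard `Encoding.UTF8` API. In this sketch the Hebrew text is written with `\u` escapes decoded from the bytes on the slide; the class and member names are assumptions. Both strings have the same `Length` in characters, yet their sizes in bytes differ, which is why a byte-based string has to track size rather than character count.

```csharp
using System;
using System.Text;

public static class LengthVsSize
{
    // Code points decoded from the slide's first hex dump.
    public const string Hebrew =
        "\uFB2B\u05B7\u05DC\uFB4B\u05DD \u05E2\uFB4B\u05DC\u05B7\u05DD";

    public const string English = "Hello world";

    // Number of bytes the text occupies once encoded as UTF-8.
    public static int Utf8Size(string s) => Encoding.UTF8.GetByteCount(s);

    // Upper-case hex dump of the UTF-8 bytes, without separators.
    public static string Utf8Hex(string s) =>
        BitConverter.ToString(Encoding.UTF8.GetBytes(s)).Replace("-", "");
}
```

Here `Hebrew.Length` and `English.Length` are both 11, while `Utf8Size` returns 24 and 11 respectively. As a `char`-based .NET string, each of them occupies 22 bytes of character data.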
Implementation detail example
More differences…
VS
P/Invoke to memcmp
System.String interop
Encoding Type
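As a rough illustration of the "P/Invoke to memcmp" item, the sketch below compares two byte buffers by calling the C runtime's `memcmp`. The library names (`msvcrt.dll` on Windows, `libc` elsewhere), the class name and the tie-breaking rule for buffers of different sizes are assumptions of this sketch, not necessarily what the talk's implementation does. The code needs unsafe blocks enabled.

```csharp
using System;
using System.Runtime.InteropServices;

public static unsafe class NativeCompare
{
    [DllImport("msvcrt.dll", EntryPoint = "memcmp", CallingConvention = CallingConvention.Cdecl)]
    private static extern int MemcmpWindows(byte* a, byte* b, UIntPtr count);

    [DllImport("libc", EntryPoint = "memcmp", CallingConvention = CallingConvention.Cdecl)]
    private static extern int MemcmpUnix(byte* a, byte* b, UIntPtr count);

    // Ordinal, byte-by-byte comparison of two unmanaged buffers.
    public static int Compare(byte* a, int sizeA, byte* b, int sizeB)
    {
        // Compare only the bytes both buffers have in common.
        var common = (UIntPtr)(uint)Math.Min(sizeA, sizeB);
        int result = RuntimeInformation.IsOSPlatform(OSPlatform.Windows)
            ? MemcmpWindows(a, b, common)
            : MemcmpUnix(a, b, common);

        // If the common prefix is equal, the shorter buffer sorts first.
        return result != 0 ? result : sizeA.CompareTo(sizeB);
    }

    // Convenience overload for managed arrays, pinned for the duration of the call.
    public static int Compare(byte[] a, byte[] b)
    {
        fixed (byte* pa = a, pb = b)
        {
            return Compare(pa, a.Length, pb, b.Length);
        }
    }
}
```

Unlike `string.Compare`, this comparison is purely ordinal on the encoded bytes and ignores culture rules, which is one of the behavioral differences from `System.String` worth keeping in mind.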
Optimizations…
• Centralized Factory for managing lifetime of strings
• Can be used with Flyweight, Object Pool and caching
Some useful techniques for implementation
Memory issues
• Buffer overlaps
• Using a memory block after release
• Leaks
Buffer overlaps – Electric Fence
NOACCESS | READWRITE | NOACCESS
Memory Leaks – Reference Counting
Reference Count = 1 Reference Count = 2
UString UString
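The diagram suggests several string instances sharing one memory block, with a counter deciding when the block may be freed. A minimal sketch of that technique follows; the class and member names are hypothetical, and the block is allocated with `Marshal.AllocHGlobal` purely for illustration.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// One unmanaged block shared by several string instances.
public sealed class RefCountedBuffer
{
    private IntPtr _memory;
    private int _refCount = 1; // the creator holds the first reference

    public RefCountedBuffer(int size)
    {
        _memory = Marshal.AllocHGlobal(size);
    }

    public int ReferenceCount => Volatile.Read(ref _refCount);

    public bool IsReleased => _memory == IntPtr.Zero;

    // Called whenever another string starts pointing into this block.
    public void AddReference()
    {
        if (IsReleased) throw new ObjectDisposedException(nameof(RefCountedBuffer));
        Interlocked.Increment(ref _refCount);
    }

    // Called when a string is done; the last caller frees the memory.
    public void Release()
    {
        if (Interlocked.Decrement(ref _refCount) == 0)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
    }
}
```

The sketch does not guard against `AddReference` racing with the final `Release`; a production version would need a stronger protocol for that case.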
Memory Leaks/Usage after release – Finalizer
Using unmanaged memory doesn’t mean we can’t use .NET mechanisms!
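One way to read this slide is that a managed wrapper can own the unmanaged block, so the finalizer acts as a safety net against leaks and a cleared pointer catches use after release. The sketch below assumes that reading; the class name and members are hypothetical, and the code needs unsafe blocks enabled.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

public sealed unsafe class UnmanagedBlock : IDisposable
{
    private IntPtr _memory;

    public int Size { get; }

    public UnmanagedBlock(int size)
    {
        _memory = Marshal.AllocHGlobal(size);
        Size = size;
    }

    // Use after release is turned into an exception instead of a wild pointer.
    public byte* Ptr
    {
        get
        {
            IntPtr memory = _memory;
            if (memory == IntPtr.Zero) throw new ObjectDisposedException(nameof(UnmanagedBlock));
            return (byte*)memory;
        }
    }

    // Deterministic release: the normal path.
    public void Dispose()
    {
        Free();
        GC.SuppressFinalize(this);
    }

    // Safety net: runs only if Dispose was forgotten, so the block does not leak.
    ~UnmanagedBlock()
    {
        Free();
    }

    private void Free()
    {
        // Exchange makes a double Dispose (or Dispose plus finalizer) harmless.
        IntPtr memory = Interlocked.Exchange(ref _memory, IntPtr.Zero);
        if (memory != IntPtr.Zero) Marshal.FreeHGlobal(memory);
    }
}
```

A finalizer could also log that it ran, which would flag every forgotten `Dispose` during testing.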
There is no silver bullet…
Questions?
michael.yarichuk@hibernatingrhinos.com
https://github.com/ravendb/ravendb/tree/v4.0/src/Sparrow


Editor's Notes

  • #26 The optimizations I am going to talk about should not be used for all cases, just like a microscope should not be used