Software. Efficiency. Scalability.

Entia non sunt multiplicanda praeter necessitatem

Posts Tagged ‘dev

Understanding GC: Freeing memory (part I)

with one comment

Let me start by saying that I am a very casual driver. I have never bothered looking under the hood of my Accord. In fact, the reason I drive a Honda is because I don’t have to. I don’t want to debug my car; I want it to just work. Having said that, if I were to depend on my car, whether I did racing or farming, I would make sure that I understood how the machinery worked: what it could, what it could not do, and what would be the best way to keep it running smoothly. After all, you don’t want you car to fail on you when you really can’t afford it.

For me, Garbage Collection is one of those “under the hood” things. It’s a fuel pump for your applications. It is extremely important to understand how it works.

Memory management does two things: allocates and releases memory. Garbage Collection, a memory management systems, made it to mainstream in late 90s, mostly thanks to Java and .NET. It helped resolve a lot of issues that plagued development at the time:

  1. Memory leaks. It was a huge problem then and it is still a problem now. Even companies like Google can’t keep their products without leaks. GC has made it a lot harder to introduce memory leaks. While the issue remains, it doesn’t happen nearly as much.
  2. Having references to “dead” objects. As a consequence of fighting against #1, you can get a reference to a piece of memory that had already been freed. If memory manager allocated something else in the same space you were in a lot of trouble. GC completely resolved this.
  3. Allocating on top. If you programmed for WIN32 API you might remember ERROR_INSUFFICIENT_BUFFER and all kinds of jumping through hoops to figure out how much memory was needed. GC also completely resolved this issue since you can actually allocate objects as you need in your libraries and then return them to the caller.
  4. Concurrency. In multi-threaded systems it wasn’t always quite clear if a thread could release an object. One of the most common ways to deal with it was reference counting. Usually it had to be done under a lock, and it scaled poorly with the number of threads. With GC, you can share your objects between threads without worrying about anything.

All these things come at a price. GC is by no means cheap nor simple. In fact, a good GC implementation is an extremely complex task to undertake. Even for such successful projects as MONO it can take years to implement a compacting GC:

“During the last years Paolo and Mark have been developing a copying garbage collector for Mono… The code is still experimental and should not be used in any kind of production environment. It has been checked into Mono’s trunk repository and is available as of revision 61240, it turned on using the –with-gc=sgen, but it is not recommended production use yet (as of July 15th 2009).”

Here is a short list of possible complications:

  1. GC either needs to halt the system or implement read/write barriers to protect the data while GC is working.
  2. Generational GC tend to implement write barriers to keep “old” data separated from the “new” data, as its more likely that the “new” data points to the “old” data.
  3. Allocating large objects tends to trash gen0 too fast. That’s why you see things as Large Object Heap.
  4. Pinned memory regions. Let’s say you requested an I/O read into a region of memory. This region is now “pinned” and can’t be compacted. This usually causes arenas to break down into smaller pieces.
  5. Applications that don’t fit generational GC model (most of the young objects die in gen0) incur a much greater collecting costs.
  6. Collecting becomes more and more expensive as your memory footprint grows. For instance, Cached# starts triggering gen2 collections too often at around 8gb and ends up spending most of the CPU time in GC reaching 14gb (2×4 Xeon X5550 2.66Ghz, 24GB of RAM). Of course, you can start 7x2GB instances of Cached#  to avoid this issue, but it’s a GC issue nevertheless.

As you can see, there are quite a few trade offs. For majority of applications the pros significantly outweigh the cons. However, the cons are still there, and developers need to be aware of them. I’ve noticed that more and more people think that GC is a “silver bullet” with no downsides. It is simply not true.

Written by Mikhail Opletayev

December 28, 2009 at 8:11 pm

Posted in development

Tagged with , , , ,

SQLite vs. SqlServerCE

leave a comment »

Yesterday, I switched one of my projects from Microsoft SQL Server CE 3.5 to SQLite. The project crawls through a set of files and stores index information in a simple database. I will later use this information to organize the files and do searches.

I went and downloaded SQLite ADO.NET on a whim and plugged it into my project. The results were pretty stunning.

On the test data set (around 50MB worth of files), the initial indexing went around 30% faster. The size of the index file went down from 12MB to 7MB. All this on top of the fact that there is no run-time installation for SQLite!

Kudos to SQLite team. It’s a very impressive product, a lean and speedy small database.

Written by Mikhail Opletayev

December 16, 2009 at 6:50 pm

Posted in db, software

Tagged with , ,

Optimize your string operations

leave a comment »

Optimization does not always mean optimizing your code logic. In fact, more often than not, it means optimizing your data structures. Here is how a little trick can save a lot of CPU cycles on common string operations.

Strings are usually stored as an array of chars with a #0 at the end. One of the most common optimizations is putting string length in the negative offset. You spend time calculating it once and every time you do something with a string that requires its length you capitalize on using an extra piece data. Another very useful thing is to padding strings with ” at the end. Let me explain what I mean by this:

When you allocate a chunk of memory for a string you always get a memory chunk aligned to the machine word. For example, if you are trying to allocate 3 bytes you will get 4 bytes on a 32-bit machine and 8 bytes on a 64-bit machine. What happens to the extra bytes at the end? Usually nothing. They are wasted- but they don’t HAVE to be.

What you can do instead is spend a couple extra cycles and set them all to #0.

Let’s take a look at the string “HELLO”. It takes 8 bytes on both 32- and 64-bit CPUs.  Normally it looks like this:

[ 'H', 'E', 'L', 'L', 'O', '#0', '?', '?']

But, you could easily make it look like this:

[ 'H', 'E', 'L', 'L', 'O', '#0', '#0', '#0']

It will take a several CPU cycles but consider the benefits for a 64-bit application:

1) Less cycles spent on string comparison, one of the most common string operations. In this example, it only takes 1 compare to find out if the content of two “HELLO” strings is equal!

2) It is much cheaper to calculate the length of a string. You do one compare per 8 bytes and then do 1 byte scan of the tail:

#include <stdio.h>

__int64 str_len_64( char* str ) {
	char *e = str;
	// main 64 bit scan
	while( *(__int64*)e && 0xFF00000000000000 ) 
		e += sizeof( __int64 );
	// tail 8 bit scan
	while( *e ) e++;
	return e - str;
}

void main( char** argv, int argc ) {
	// even for a short string
	char* str = "HELLO, WORLD\0\0\0\0";
	printf( "string: %s\n", str );
	// we can reduce the number of compares from 13 to 6
	printf( "length: %d\n", str_len_64( str ) );
}

3) Most of the other operations (including copy) can safely treat a string as an array of __int64, greatly reducing the number of CPU operations needed.

It’s a little trick, but the savings can be very impressive on a large scale system, especially if you go 64 bit. Just think about how many times you compare and copy strings.

Written by Mikhail Opletayev

December 14, 2009 at 10:30 am

Posted in software

Tagged with , , , ,

Anemic Domain Model

leave a comment »

Today I stumbled across a conversation about the Anemic Domain Model.

It sounds like a some sort of terrible disease. Martin Fowler blasted it as an anti-pattern. Wikipedia page states 1 benefit to 7 liabilities.

Even people that humor it tend to say it is limited to only inexperienced teams and small projects:

Jimmy Bogard writes:

Times when you can’t do a full domain model can include situations where you don’t have access to a domain expert, your team isn’t experienced with OO, or time constraints limit the conversations and analysis needed to build a decent domain model. If the domain isn’t complex, or is largely data-in, data-out, then those are contexts where a full domain model likely isn’t needed as well.

Greg Yong writes:

Our domain is not extremely complex…. Our team is not highly functional in object oriented design and analysis. When I say this I mean that team members must be highly functional in order to attempt creating a domain model (all other attempts will be doomed to failure, likely as an anemic domain model).

Sounds pretty bad, to be honest. Still, somehow I am not convinced. The reason I am not convinced is very simple: complexity. Look at the quotes above, they both mention ADM being not half as bad for “not so complex” domains. But do you really want to have your domains that complex?

Complex systems tend to require lots of configuration glue to keep them together. They are hard to learn and require developers to understand the whole thing before they can become fully productive. In the real world, complex projects tend to grow larger and more complex, original designers move on or move up, new features and entities keep being are added, and eventually such projects turn into a Katamari ball.

Anemic Data Model, on the other hand, means that you have very simple data objects (if they grow too big or too complex it’s a sign that you need to split them) and a set of well defined interfaces for processing these objects. Instead of making your model more complex you separate components and make them interact using interfaces, shifting towards a loosely coupled model from a complex monolithic model.

It’s a bit of an old Unix paradigm: make one component do one thing really well, define it’s interface, interact with other components when you need to.

Loosely coupled projects are a lot easier to maintain, evolve, and scale. Simple data structures and simple interfaces allow for just that. If your project is getting too complex, figuring out how to simplify data structures and interfaces might be a better choice than working out a more complex data model.

P.S. This is one of the reasons why protocol buffer messages produced by proto-cs are declared as sealed. They are data messages, there is no need to inherit from them.

Written by Mikhail Opletayev

December 4, 2009 at 4:43 am

Posted in software

Tagged with , , ,

Follow

Get every new post delivered to your Inbox.