Alo Sarv
lead developer

Developer's Diary

Saturday, September 03, 2005

CPU usage fixed, ipv4addr is broken, but what about that chunkselector?

The performance hog has been found: the statusbar (the very last line in the console, which displays speeds in real time) had recently started consuming far more resources. The likely explanation is the increased number of events being processed, since the statusbar was redrawn in real time (during event loops). Now the statusbar is updated only once per 100ms - fast enough to give accurate information, but no longer eating so much CPU. CPU usage dropped to roughly half (when the statusbar was enabled - this patch has no effect when running with the -b {background} or --disable-status flags).
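A minimal sketch of this kind of rate limiting, assuming a hypothetical `RedrawThrottle` helper (not hydranode's actual code): the event loop may ask on every event, but a redraw is allowed only when at least 100ms have passed since the previous one.

```cpp
#include <chrono>

// Hypothetical helper for illustration; the name and interface are assumptions.
// Allows an action at most once per `interval`, however often it is asked.
class RedrawThrottle {
public:
    using Clock = std::chrono::steady_clock;

    explicit RedrawThrottle(std::chrono::milliseconds interval)
        : m_interval(interval), m_last(), m_hasDrawn(false) {}

    // Returns true if a redraw is due at time `now`, and records it as done.
    bool shouldRedraw(Clock::time_point now = Clock::now()) {
        if (m_hasDrawn && now - m_last < m_interval) {
            return false;  // redrawn too recently, skip this event
        }
        m_last = now;
        m_hasDrawn = true;
        return true;
    }

private:
    std::chrono::milliseconds m_interval;
    Clock::time_point m_last;
    bool m_hasDrawn;
};
```

The event handler would then call something like `if (throttle.shouldRedraw()) { /* print status line */ }`, so the cost of redrawing no longer grows with the number of events.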

Another issue identified today concerns the IPV4Address class: apparently it expects the port value in host byte order, but expects (and stores) the IP address in network byte order. This causes a lot of strange mess. For example, I was parsing compact tracker responses last night (apparently not so optional after all - some trackers reject clients that don't support them) and was wondering why on earth the tracker sends the IP in one byte ordering and the port in the other.
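For reference, the compact tracker format packs each peer into 6 bytes, 4 for the IP and 2 for the port, both in network (big-endian) byte order. The sketch below decodes such a list into a hypothetical `PeerAddr` struct that keeps both fields in host byte order; it is not the IPV4Address class itself, only an illustration of treating the two fields consistently.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical type for illustration; this is not hydranode's IPV4Address.
struct PeerAddr {
    std::uint32_t ip;    // host byte order
    std::uint16_t port;  // host byte order
};

// Decode a compact peer list: 6 bytes per peer, big-endian IP then
// big-endian port. A trailing partial entry is ignored.
inline std::vector<PeerAddr> parseCompactPeers(const std::string& data) {
    std::vector<PeerAddr> peers;
    for (std::size_t i = 0; i + 6 <= data.size(); i += 6) {
        const unsigned char* p =
            reinterpret_cast<const unsigned char*>(data.data()) + i;
        PeerAddr a;
        a.ip = (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
               (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
        a.port = static_cast<std::uint16_t>((p[4] << 8) | p[5]);
        peers.push_back(a);
    }
    return peers;
}
```

Because the bytes are assembled with shifts, the result is the same on little-endian and big-endian hosts, and no field is left in a different ordering from the other.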

The new chunkselector(tm) isn't progressing though - I've given it a lot of thought today, but with little progress. There are a lot of issues to be addressed - performance, memory usage and effectiveness are the primary concerns, and balancing between those three is tricky. The ideal solution would:
The original approach that attempted to address these targets relied on per-byte chunks, i.e. each chunk takes 16 bytes of memory (all size management in hydranode uses 64-bit variables). However, for BT this could quickly lead to 80kb of memory being used for a chunkmap (plus various overheads), which is unacceptable.
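To make the arithmetic concrete, here is a small sketch. The `ChunkRange` layout (two 64-bit offsets) and the 256 KiB piece size used in the note below are assumptions for illustration, not taken from hydranode.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed layout: a chunk described by two 64-bit offsets, which is one way
// to arrive at 16 bytes per chunk.
struct ChunkRange {
    std::uint64_t begin;  // offset of the first byte in the chunk
    std::uint64_t end;    // offset of the last byte in the chunk
};
static_assert(sizeof(ChunkRange) == 16, "two 64-bit fields occupy 16 bytes");

// Bytes needed to describe a file split into fixed-size chunks,
// excluding any container overhead.
constexpr std::uint64_t chunkMapBytes(std::uint64_t fileSize,
                                      std::uint64_t chunkSize) {
    return ((fileSize + chunkSize - 1) / chunkSize) * sizeof(ChunkRange);
}
```

With an assumed piece size of 256 KiB, a torrent of 5000 pieces (roughly 1.2 GiB) needs 5000 x 16 = 80000 bytes, which matches the 80kb figure before overheads.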

The really tricky business here is the rarest-chunk selection. If we were to deal with a single layer only, i.e. choose the rarest chunk from one layer - for example, BT would always choose the rarest BT chunk, disregarding any ed2k chunk information - it would be simple. But what I'd like to achieve is that if a 9500kb chunk is completely missing on the ed2k network, the BT module would prioritize that one, and vice versa. This would lead to the most effective chunk selection in multi-network downloads, with each network selecting whichever chunk is rarest across all networks. How to actually implement this, and even further, implement it so that it scales well even with thousands of chunks, still eludes me.
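A deliberately naive sketch of the intent, not a solution: every name here (`Chunk`, `Layer`, `pickRarest`) is hypothetical, and the scoring rule (own availability plus the lowest availability among overlapping chunks in each other layer) is just one assumed way to combine the layers.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical types for illustration only.
struct Chunk {
    std::uint64_t begin;  // first byte of the chunk
    std::uint64_t end;    // last byte of the chunk, inclusive
    std::uint32_t avail;  // sources offering this chunk on its own network
};
using Layer = std::vector<Chunk>;  // all chunks of one network

inline bool overlaps(const Chunk& a, const Chunk& b) {
    return a.begin <= b.end && b.begin <= a.end;
}

// Returns the index in `own` of the chunk with the lowest combined
// availability, or -1 if `own` is empty.
inline int pickRarest(const Layer& own, const std::vector<Layer>& others) {
    int best = -1;
    std::uint64_t bestScore = std::numeric_limits<std::uint64_t>::max();
    for (std::size_t i = 0; i < own.size(); ++i) {
        std::uint64_t score = own[i].avail;
        for (const Layer& layer : others) {
            std::uint32_t lowest = std::numeric_limits<std::uint32_t>::max();
            bool found = false;
            for (const Chunk& c : layer) {
                if (overlaps(own[i], c)) {
                    found = true;
                    lowest = std::min(lowest, c.avail);
                }
            }
            if (found) score += lowest;  // rarest overlapping chunk counts
        }
        if (score < bestScore) {
            bestScore = score;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

This scans every chunk of every other layer for each candidate, so it shows the desired behaviour but has exactly the scaling problem described above; it also ignores which chunks are already complete or in use.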

Madcat, ZzZz


Regarding your in-progress optimization of the chunk selection algorithm, it seems to me that you can minimize traversal by appropriately using composite keys. For instance, looking up on ID_Partial always seems to be followed by a manual lookup on m_useCnt: in this case, you can gain some speed by making ID_Avail use a composite key on (m_partial,m_useCnt). Drop me a line if you'd like to discuss this further. HTH,

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo
Thanks, I had completely forgotten about composite keys. This should prove very useful in a number of contexts (not only the ChunkSelector) :)
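The idea behind a composite key can be sketched with the standard library alone. This is a stand-in using `std::multiset`, not the Boost.MultiIndex `composite_key` the comment refers to; the field names `m_partial` and `m_useCnt` come from the comment, while their types, `m_begin`, and the helper names are assumptions.

```cpp
#include <cstdint>
#include <set>
#include <tuple>

struct ChunkInfo {
    std::uint64_t m_begin;    // hypothetical field identifying the chunk
    bool          m_partial;  // name from the comment above, type assumed
    std::uint32_t m_useCnt;   // name from the comment above, type assumed
};

// Orders entries by (m_partial, m_useCnt), so all partial chunks sit
// together and are already sorted by use count.
struct ByPartialThenUse {
    bool operator()(const ChunkInfo& a, const ChunkInfo& b) const {
        return std::tie(a.m_partial, a.m_useCnt) <
               std::tie(b.m_partial, b.m_useCnt);
    }
};
using ChunkIndex = std::multiset<ChunkInfo, ByPartialThenUse>;

// Returns the least-used partial chunk, or nullptr if there is none.
// One ordered lookup replaces "find partial chunks, then scan m_useCnt".
inline const ChunkInfo* leastUsedPartial(const ChunkIndex& index) {
    ChunkInfo probe = {0, true, 0};
    ChunkIndex::const_iterator it = index.lower_bound(probe);
    return it == index.end() ? nullptr : &*it;
}
```

Since `true` sorts after `false`, the first entry at or after `(true, 0)` is the partial chunk with the smallest use count, found in logarithmic time.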
