Alo Sarv
lead developer

Developer's Diary
irc.hydranode.com/#hydranode

Saturday, September 03, 2005

CPU usage fixed, ipv4addr is broken, but what about that chunkselector?

The performance hog has been found: it seems the statusbar (the very last line in the console, which displays speeds in real time) recently started consuming a lot more resources. This is probably explained by the increased number of events being processed, since the statusbar was updated in real time (during event loops). Now the statusbar is only updated once per 100ms - fast enough to give accurate information, but no longer consuming so much CPU. CPU usage dropped roughly by half (if the statusbar was enabled - this patch has no effect when running with the -b {background} or --disable-status flags).
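The throttling idea can be sketched roughly as follows. This is not the actual hydranode patch: the class name `RedrawThrottle` and its interface are made up for illustration, and it uses `std::chrono` from modern C++ rather than whatever timer the real code relies on.

```cpp
#include <chrono>

// Hypothetical helper (not hydranode code): answers "has enough time passed
// since the last redraw?" so the event loop can skip most statusbar updates.
class RedrawThrottle {
public:
    using Clock = std::chrono::steady_clock;

    explicit RedrawThrottle(std::chrono::milliseconds interval)
        : m_interval(interval), m_last(), m_first(true) {}

    // Returns true at most once per interval; the very first call always passes.
    bool shouldRedraw(Clock::time_point now = Clock::now()) {
        if (!m_first && now - m_last < m_interval) {
            return false; // too soon, skip this redraw
        }
        m_first = false;
        m_last = now;
        return true;
    }

private:
    std::chrono::milliseconds m_interval;
    Clock::time_point m_last;
    bool m_first;
};
```

The event loop would then call `shouldRedraw()` on every iteration and only repaint the statusbar when it returns true, so the cost of drawing no longer grows with the number of events.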

Another issue identified today concerns the IPV4Address class: apparently it expects the port value in host byte order, but expects (and stores) the IP address in network byte order. This causes a lot of strange mess (for example, I was parsing compact tracker responses last night {apparently not so optional after all - some trackers reject clients that don't support them}, and was wondering why on earth it was sending the IP in one byte ordering and the port in the other).
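For reference, a compact tracker response packs each peer into 6 bytes: a 4-byte IP followed by a 2-byte port, both in network (big-endian) byte order. A minimal sketch of decoding that into one consistent convention could look like this. The `Peer` struct and both function names are hypothetical and have nothing to do with the real IPV4Address class; here both fields are simply kept in host byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical peer record: both fields are stored in host byte order.
struct Peer {
    std::uint32_t ip;
    std::uint16_t port;
};

// Decodes a compact peer list (6 bytes per peer, big-endian IP then port).
// Trailing bytes that do not form a full 6-byte entry are ignored.
std::vector<Peer> parseCompactPeers(const std::string& data) {
    std::vector<Peer> peers;
    const unsigned char* base =
        reinterpret_cast<const unsigned char*>(data.data());
    for (std::size_t i = 0; i + 6 <= data.size(); i += 6) {
        const unsigned char* p = base + i;
        Peer peer;
        // Assemble with shifts so the result is the same on any host.
        peer.ip = (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
                  (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
        peer.port = static_cast<std::uint16_t>((p[4] << 8) | p[5]);
        peers.push_back(peer);
    }
    return peers;
}

// Formats a host-order address as dotted decimal, for display or logging.
std::string toDotted(std::uint32_t ip) {
    return std::to_string((ip >> 24) & 0xFF) + "." +
           std::to_string((ip >> 16) & 0xFF) + "." +
           std::to_string((ip >> 8) & 0xFF) + "." +
           std::to_string(ip & 0xFF);
}
```

Building the values with shifts instead of casting the raw bytes avoids any dependence on the host's endianness, which is exactly the kind of mix-up described above.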

The new chunkselector(tm) isn't progressing though - I've given it a lot of thought today, but with little to show for it. There are a lot of issues to be addressed - performance, memory usage and effectiveness are the primary concerns, and balancing between those three is tricky. The ideal solution would:
The original approach that attempted to address these targets relied on per-byte chunks, i.e. each chunk takes 16 bytes of memory (all size management in hydranode uses 64-bit variables). However, for BT this could quickly lead to 80kb of memory being used for a chunkmap (plus various overheads), which is unacceptable.

The really tricky business here is the rarest-chunk selection. If we were to deal with a single layer, i.e. choose the rarest from one layer only - for example, BT would always choose the rarest BT chunk, disregarding any ed2k chunk information - it would be simple. But what I'd like to achieve is that if a 9500kb chunk is completely missing on the ed2k network, the BT module would prioritize that one, and vice versa. This would lead to the most effective chunk selection in multi-network downloads, with each network selecting whichever chunk is rarest across all networks. How to actually implement this, and even further - how to implement it so that it scales well even with thousands of chunks - still eludes me.
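To make the goal concrete, here is a deliberately naive brute-force sketch of cross-layer selection. It only illustrates the desired behaviour and does nothing about the scaling problem: every call rescans all chunks. The `Layer` struct, the fixed chunk size per layer, and the scoring rule (own availability plus the lowest availability among overlapping chunks in each other layer) are all assumptions made for this example, not hydranode's design.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical layer: one network's view of the file, split into fixed-size
// chunks, with the number of known sources for each chunk.
struct Layer {
    std::uint64_t chunkSize;
    std::vector<std::uint32_t> avail;
};

// Lowest availability among the chunks of `layer` overlapping [begin, end).
std::uint32_t minAvail(const Layer& layer, std::uint64_t begin, std::uint64_t end) {
    if (layer.avail.empty() || layer.chunkSize == 0 || begin >= end) return 0;
    const std::size_t lastIndex = layer.avail.size() - 1;
    const std::size_t first = std::min<std::size_t>(begin / layer.chunkSize, lastIndex);
    const std::size_t last = std::min<std::size_t>((end - 1) / layer.chunkSize, lastIndex);
    std::uint32_t lowest = layer.avail[first];
    for (std::size_t i = first + 1; i <= last; ++i) {
        lowest = std::min(lowest, layer.avail[i]);
    }
    return lowest;
}

// Index of the chunk in layers[own] with the lowest combined score.
// Ties go to the lowest index. O(chunks * layers * overlap): naive on purpose.
std::size_t pickRarest(const std::vector<Layer>& layers, std::size_t own) {
    const Layer& mine = layers[own];
    std::size_t best = 0;
    std::uint64_t bestScore = std::numeric_limits<std::uint64_t>::max();
    for (std::size_t i = 0; i < mine.avail.size(); ++i) {
        const std::uint64_t begin = i * mine.chunkSize;
        const std::uint64_t end = begin + mine.chunkSize;
        std::uint64_t score = mine.avail[i];
        for (std::size_t j = 0; j < layers.size(); ++j) {
            if (j != own) score += minAvail(layers[j], begin, end);
        }
        if (score < bestScore) {
            bestScore = score;
            best = i;
        }
    }
    return best;
}
```

With a BT-like layer of 4-byte chunks and availabilities {3, 3, 1, 3, 3, 3}, plus an ed2k-like layer of 12-byte chunks and availabilities {5, 0}, the BT layer on its own would pick chunk 2, but the combined score picks chunk 3, the first BT chunk inside the range that is missing on the other network. A real selector would also have to skip completed or locked chunks and avoid the full rescan.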

Madcat, ZzZz



Comments:
Hello,

Regarding your in-progress optimization of the chunk selection algorithm, it seems to me that you can minimize traversal by appropriately using composite keys (http://tinyurl.com/acw5v). For instance, a lookup on ID_Partial always seems to be followed by a manual lookup on m_useCnt: in this case, you can gain some speed by making ID_Avail use a composite key on (m_partial,m_useCnt). Drop me a line if you'd like to discuss this further. HTH,

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo
 
Thanks, I had completely forgotten about composite keys. This should prove very useful in a number of contexts (not only the ChunkSelector) :)

Madcat.
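The composite-key idea from the comment above can be illustrated with a small standard-library analogue. This is not Boost.MultiIndex and not the actual ChunkSelector code: it uses a plain `std::set` ordered on (m_partial, m_useCnt), and the `Chunk` struct, its field types, and the function name are assumptions for the sake of the example.

```cpp
#include <cstdint>
#include <set>
#include <tuple>

// Hypothetical chunk record; the member names mirror the ones in the comment.
struct Chunk {
    std::uint64_t m_begin; // start offset, used here only as a tie-breaker
    bool m_partial;        // true if the chunk is partially downloaded
    std::uint32_t m_useCnt; // how many sources are currently using it
};

// Orders chunks by (m_partial, m_useCnt), then by offset to keep keys unique.
struct ByPartialThenUseCnt {
    bool operator()(const Chunk& a, const Chunk& b) const {
        return std::tie(a.m_partial, a.m_useCnt, a.m_begin) <
               std::tie(b.m_partial, b.m_useCnt, b.m_begin);
    }
};

using ChunkIndex = std::set<Chunk, ByPartialThenUseCnt>;

// Finds the partial chunk with the lowest use count in one ordered lookup,
// with no manual scan over m_useCnt. Returns end() if no chunk is partial.
ChunkIndex::const_iterator leastUsedPartial(const ChunkIndex& index) {
    const Chunk probe{0, true, 0};
    // true sorts after false, so everything at or after the probe is partial,
    // and the first such element has the smallest m_useCnt.
    return index.lower_bound(probe);
}
```

Because the ordering already covers both fields, the "find partial chunks, then look through their use counts by hand" step collapses into a single `lower_bound` call, which is the speed-up the comment is hinting at.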
 