The performance hog has been found - it seems the statusbar (the very last line in the console, which displays speeds in real time) had recently started consuming a lot more resources. This can be explained by the increased number of events being processed, since the statusbar was updated in real time (during event loops). Now the statusbar is only updated once per 100ms - fast enough to give accurate information, but no longer consuming so much CPU. CPU usage roughly halved (if the statusbar was enabled - this patch has no effect when running with the -b {background} or --disable-status flags).
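To illustrate the idea, here is a minimal sketch of rate-limiting a redraw to once per 100ms. It is not the actual hydranode code; the class name `ThrottledRedraw` and its `poll()` method are made up for this example, and it assumes the event loop can simply call `poll()` on every iteration.

```cpp
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical helper (not hydranode API): runs a redraw callback at most
// once per interval, no matter how often poll() is called.
class ThrottledRedraw {
public:
    using Clock = std::chrono::steady_clock;

    ThrottledRedraw(std::function<void()> draw,
                    std::chrono::milliseconds interval)
        : m_draw(std::move(draw)), m_interval(interval) {}

    // Call from the event loop as often as you like.
    // Returns true if the callback was actually run.
    bool poll(Clock::time_point now = Clock::now()) {
        if (m_drawnOnce && now - m_last < m_interval) {
            return false;  // too soon since the last redraw, skip it
        }
        m_last = now;
        m_drawnOnce = true;
        m_draw();
        return true;
    }

private:
    std::function<void()> m_draw;
    std::chrono::milliseconds m_interval;
    Clock::time_point m_last{};
    bool m_drawnOnce = false;
};
```

With an interval of `std::chrono::milliseconds(100)`, thousands of events per second still result in at most ten redraws per second.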
Another issue was identified today in the IPV4Address class - apparently it expects the port value in host byte order, but expects (and stores) the IP address in network byte order. This causes a lot of strange mess (for example, I was parsing compact tracker responses last night {apparently not so optional after all - some trackers reject clients that don't support them}, and was wondering why on earth it was sending the IP in one byte ordering and the port in the other).
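For reference, a compact tracker response packs each peer into 6 bytes: a 4-byte IPv4 address followed by a 2-byte port, both in network byte order. Below is a rough sketch of decoding that format so that both fields end up in the same (host) byte order. The `Peer` struct and `parseCompactPeers()` are hypothetical names for this example only, and this is not how IPV4Address currently behaves.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical type for this example: both fields are in host byte order.
struct Peer {
    std::uint32_t ip;
    std::uint16_t port;
};

// Decodes a compact peer list: 6 bytes per peer, 4-byte IP + 2-byte port,
// both big-endian (network byte order) on the wire.
std::vector<Peer> parseCompactPeers(const std::string& data) {
    if (data.size() % 6 != 0) {
        throw std::invalid_argument("compact peer list must be a multiple of 6 bytes");
    }
    std::vector<Peer> peers;
    peers.reserve(data.size() / 6);
    for (std::size_t i = 0; i < data.size(); i += 6) {
        const unsigned char* p =
            reinterpret_cast<const unsigned char*>(data.data() + i);
        Peer peer;
        // Assemble byte by byte, so the result does not depend on the
        // endianness of the machine we are running on.
        peer.ip = (static_cast<std::uint32_t>(p[0]) << 24) |
                  (static_cast<std::uint32_t>(p[1]) << 16) |
                  (static_cast<std::uint32_t>(p[2]) << 8) |
                   static_cast<std::uint32_t>(p[3]);
        peer.port = static_cast<std::uint16_t>((p[4] << 8) | p[5]);
        peers.push_back(peer);
    }
    return peers;
}
```

Whichever byte order a class settles on, keeping the IP and the port in the same one avoids exactly the kind of confusion described above.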
The new chunkselector(tm) isn't progressing though - I've given it a lot of thought today, but with little progress. There are a lot of issues to be addressed - performance, memory usage and effectiveness are the primary concerns, and balancing between those three is tricky. The ideal solution would:
- Select the rarest chunks from multiple variable-sized chunk maps (e.g. from a combination of 9500kb and 256kb chunks, choose the rarest sub-chunk).
- Prefer completing partially-downloaded chunks over starting new chunks
- Prefer user-defined chunks when starting new chunks (e.g. first/last pieces, specific files in BT case)
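The three preferences above could be expressed as a single ordering over candidate chunks. The sketch below is only one possible reading, not a design decision: it assumes partially-downloaded chunks always win, then user-preferred ones, then the rarest, and it assumes a flat list of candidates already exists. The `Candidate` struct and both function names are invented for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical candidate record for this example.
struct Candidate {
    std::uint64_t begin;    // first byte of the chunk
    std::uint64_t end;      // one past the last byte
    std::uint32_t sources;  // how many sources have this chunk (lower = rarer)
    bool partial;           // already partially downloaded?
    bool userPreferred;     // e.g. first/last piece, or a user-selected file
};

// True if a should be picked before b. Tuples compare lexicographically and
// false < true, so the flags are negated to make "set" sort first.
bool betterThan(const Candidate& a, const Candidate& b) {
    return std::make_tuple(!a.partial, !a.userPreferred, a.sources, a.begin) <
           std::make_tuple(!b.partial, !b.userPreferred, b.sources, b.begin);
}

// Returns the best candidate, or nullptr if there are none.
const Candidate* selectChunk(const std::vector<Candidate>& candidates) {
    auto it = std::min_element(candidates.begin(), candidates.end(), betterThan);
    return it == candidates.end() ? nullptr : &*it;
}
```

This scans every candidate on each selection, so it says nothing about the memory and scaling concerns; it only pins down the priority order.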
The original approach that attempted to address these targets relied on per-byte chunks, i.e. each chunk takes 16 bytes of memory (all size management in hydranode uses 64-bit variables). However, for BT this could quickly lead to 80kb of memory being used for a chunkmap (plus various overheads), which is unacceptable.
The really tricky business here is the rarest-chunk selection. If we were to deal with a single layer, i.e. choose the rarest from one layer only - for example, BT would always choose the rarest BT chunk, disregarding any ed2k chunk information - it would be simple. But what I'd like to achieve is that if a 9500kb chunk is completely missing on the ed2k network, the BT module would prioritize that one, and vice versa. This would lead to the most effective chunk selection in multi-network downloads, with each network selecting whichever chunk is rarest across all networks. How to actually implement this, and even further - implement this so that it scales well even with thousands of chunks - still eludes me.
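To make the goal concrete, here is a naive brute-force sketch of cross-layer selection. It assumes each network keeps one layer of fixed-size chunks with a per-chunk source count, and it scores a chunk by the lowest source count found in any layer over the same byte range (one possible definition of "rarest across all networks"). `Layer`, `minSources()` and `pickRarest()` are made-up names, already-completed chunks are ignored, and this does nothing about the scaling problem.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical layer: one chunk map per network.
struct Layer {
    std::uint64_t chunkSize;             // bytes per chunk in this layer
    std::vector<std::uint32_t> sources;  // sources[i] = source count of chunk i
};

// Lowest source count among this layer's chunks overlapping [begin, end).
std::uint32_t minSources(const Layer& layer, std::uint64_t begin,
                         std::uint64_t end) {
    std::uint32_t lowest = std::numeric_limits<std::uint32_t>::max();
    if (layer.sources.empty() || layer.chunkSize == 0 || begin >= end) {
        return lowest;
    }
    const std::uint64_t first = begin / layer.chunkSize;
    const std::uint64_t last = (end - 1) / layer.chunkSize;
    for (std::uint64_t i = first; i <= last && i < layer.sources.size(); ++i) {
        lowest = std::min(lowest, layer.sources[i]);
    }
    return lowest;
}

// Index of the chunk in layers[own] whose byte range is rarest across all
// layers. Ties go to the lowest index.
std::size_t pickRarest(const std::vector<Layer>& layers, std::size_t own,
                       std::uint64_t fileSize) {
    const Layer& mine = layers[own];
    std::size_t best = 0;
    std::uint32_t bestScore = std::numeric_limits<std::uint32_t>::max();
    for (std::size_t i = 0; i < mine.sources.size(); ++i) {
        const std::uint64_t begin = i * mine.chunkSize;
        const std::uint64_t end = std::min(begin + mine.chunkSize, fileSize);
        std::uint32_t score = std::numeric_limits<std::uint32_t>::max();
        for (const Layer& layer : layers) {
            score = std::min(score, minSources(layer, begin, end));
        }
        if (score < bestScore) {
            bestScore = score;
            best = i;
        }
    }
    return best;
}
```

Every selection here walks all chunks of all layers, which is precisely the cost that needs to be avoided with thousands of chunks; the sketch only shows the behaviour being aimed for.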
Madcat, ZzZz