Alo Sarv
lead developer

Developer's Diary
irc.hydranode.com/#hydranode
Monday, February 28, 2005
Download completion working (almost) properly now ;)
Seems I fell asleep last night before I even got to the blog post; sorry about that. To make up for it, here's a post with two days' worth of fixes and improvements.
I guess the most important improvement is that it's now possible to actually complete downloads successfully, w/o crashing or corrupting or anything else. I admit completing stuff is still somewhat flaky, but it generally shouldn't crash anymore. The problem, however, is that when PartData is destroyed, the object is deleted before the ed2k clients handle the events, so there are a LOT of calls being made into already-freed memory, so basically, it still SHOULD crash. The problem is rather generic though: an object posts EVT_DESTROY, the owner handles it and deletes the object, while the "watchers" or "users" are still cleaning up their stuff. I'm currently looking into possible solutions, one of which involves setting the object owner to receive the event as the last handler (boost::signals allows setting the order of handlers).
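To make the "owner handles the event last" idea a bit more concrete, here's a minimal sketch. It deliberately does NOT use boost::signals; OrderedSignal, HandlerGroup and runDestroySequence are made-up names for a tiny stand-in, where the group number plays the role of the handler ordering mentioned above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a signal with ordered handlers: handlers in a
// lower-numbered group run before handlers in a higher-numbered group.
class OrderedSignal {
public:
    using Handler = std::function<void()>;

    void connect(int group, Handler h) {
        assert(h && "handler must be callable");
        m_handlers.emplace(group, std::move(h));
    }

    // Calls every handler, lowest group first.
    void emit() const {
        for (const auto& entry : m_handlers) {
            entry.second();
        }
    }

private:
    std::multimap<int, Handler> m_handlers;
};

// Watchers clean up first, the owner (who deletes the object) goes last.
enum HandlerGroup { WATCHERS = 0, OWNER = 1 };

// Simulates one destroy event and returns the order the handlers ran in.
std::vector<std::string> runDestroySequence() {
    std::vector<std::string> order;
    OrderedSignal onDestroy;
    // The owner connects first on purpose; its group still makes it run last.
    onDestroy.connect(OWNER, [&order] { order.push_back("owner deletes object"); });
    onDestroy.connect(WATCHERS, [&order] { order.push_back("client cleans up"); });
    onDestroy.emit();
    return order;
}
```

With this ordering the watchers never touch the object after the owner has freed it, no matter who connected first.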
There were also a ton of bugs in chunk verification and final rehashes, all of which should be fixed now. Note that if you had any previous downloads, final rehashes may fail, but hydranode recovers from them smoothly and re-downloads corrupt chunks (the reason being that for a couple of days here, the chunk-hash verification code wasn't operating at all, so data downloaded during that time may have errors in it).
SharedFile is now also somewhat smarter. Namely, it will self-destruct if it detects that the physical file it pointed to has disappeared. This check is only done when someone attempts to read from the file, so it comes w/o any additional performance overhead. Moving the file from temp to incoming is now also done in WorkThread. There we have a small optimization still to make - namely, when temp and incoming are on the same partition, the moving could be done in-place in the main thread, since it's (nearly) instantaneous.
I also updated MSVC compatibility, however, the compiler is again crashing, on four files in the ed2k module. The symptoms seem somewhat similar to what we already know of from the gcc 3.3.5 problems, however, there are still no definite solutions. Current recommendation is to use gcc 3.4 (this includes darwin too, where 3.3.5 is the default compiler apparently).
Not posting change-log tonight, it's way too long - just browse the CVS list archives if you're interested in the gory details.
Madcat, ZzZz
PS: Had hydranode running all-nighters, and later some more 5-10h sessions w/o crashing. It's still dependent on the server it's connected to, and download speed slowly drops because mules drop us from queues since we don't do reasks, but in general, it seems relatively stable.
Update: Received additional information from the MultiIndex library author regarding the compiler-crashing problems, so we just might get the dead compilers back to life tomorrow (too tired to test it right now) :)
Friday, February 25, 2005
SourceExchange v1 support + the usual fix-set
The fix and/or improvements set for today includes the following items:
- Completely disabled the chunk-request rotational thingie. It's way too error-prone, I even saw (at least) one eMule who got it wrong (or I got it wrong). Whatever the reason, this thing isn't very useful, and quickly leads to errors (and hence transferring data twice), and since it's completely optional apparently (all clients seem to work fine w/o using it), hydranode won't be doing it anymore. (It still supports it if a remote client does it, but it doesn't send this stuff itself anymore).
- Fixed a rather obscure bug in RangeList::getContains() method, which returned a different (but equally correct) range, however, PartData relied on it returning the first range that contains, not any range that contains. This caused some obscure errors with having a file at 100% but not completing (missing 300 bytes etc), and being unable to complete.
- Ensure that when a chunk completes, the chunk-hash-job is posted to HashThread before PartData has a chance to submit fulljob; delay fulljob until all pending-jobs are finished (because if there are chunk-hashes still pending, it doesn't make sense to do a full rehash yet).
- Fixed PartData::addSourceMask() to update a proper range of ChunkMap (e.g. based on chunksize). It worked so far, but would've broken when dealing with multi-net, multi-chunksize downloads.
- Added PartData::delSourceMask(), which, as the name says, removes a source mask.
- ED2k.Clients now properly handles the source mask adding/removing; no more duplicate source masks etc. Also, Client::ChunkMap is now shared between SourceInfo and DownloadInfo, so they don't get out of sync again.
- Beginning to add operations/data to output from modules (to future GUI), ED2k.ServerList now exports the server names, as well as allows connecting to specific servers. Go to /modules/ed2k/serverlist in hnshell and type `help` for more information.
- ED2K: Added support for SourceExchange v1, one of many eMule extended protocol features. Currently the support is only enabled in listen-mode - we respond to queries, but don't send queries ourselves - this is done in order to have more time to investigate (and watch other clients' behaviour) when and how often we are allowed to do so. v2 [adds client userhash] and v3 [supports Hybrid ID's (e.g. *.*.*.0 high-id IPs)] support will most likely be added in the near future (while v2 sounds useless, v3 is indeed useful, and you can't get v3 w/o also supporting v2).
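The "delay fulljob until all pending chunk-hash jobs are finished" item from the list above boils down to a counter and a flag. A minimal sketch of that idea follows; RehashGate and its method names are made up for illustration and are not the real PartData API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: tracks pending chunk-hash jobs and tells the caller
// when it is safe to start the full rehash.
class RehashGate {
public:
    // A chunk finished downloading and its hash job was queued.
    void chunkJobPosted() { ++m_pending; }

    // A chunk-hash job came back (verified or not).
    void chunkJobFinished() {
        assert(m_pending > 0 && "finished more jobs than were posted");
        --m_pending;
    }

    // The download reached 100%; remember that a full rehash is wanted.
    void requestFullRehash() { m_fullWanted = true; }

    // True only once a full rehash is wanted AND no chunk jobs are pending.
    bool shouldStartFullRehash() const {
        return m_fullWanted && m_pending == 0;
    }

private:
    std::size_t m_pending = 0;
    bool m_fullWanted = false;
};
```

The point of the gate is simply that a failed chunk hash can still knock the file back below 100% before the (expensive) full rehash is started.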
Madcat, too tired to write any funny (?) stuff tonight, ZzZz
Thursday, February 24, 2005
zzzzzzz
Bah, how come whenever I happen around here [e.g. blog] I'm so tired that I don't feel chatty at all, so I just end up copy/pasting the last cvs commits and falling asleep? *sighs*. This probably ain't too fun to read at all ... I mean - what could be more boring than reading some tired developer's cvs commit logs ...
Well, I did fix a bunch of bugs, added command history support to hnshell (use up/down arrows for navigation)... cleaned up hydranode output, trace stuff is now disabled by default (--enable-trace configure flag to enable it - it's a major cpu hog)... this means hydranode output is now pretty nice and clean ...
Oh, and when log messages are printed to hnshell, your typed-in command no longer disappears ... well, actually it does disappear, but it is now automagically restored to its previous state (you shouldn't notice it ever did disappear :P).
chemical made some improvements to the statistics graphs script, however, I haven't gotten around to checking it myself yet ... was so damn busy watching TWO designers fighting over hydranode logo and web-site designs ... lots of blood and violence ... but hey - you don't argue about taste, you fight about it :P Hopefully they can come up with something real cool for the site/logo ... some ideas were really promising.
ZombieCat, ZzZz
Wednesday, February 23, 2005
Scheduler fixed; DOCS R.I.P.
Bah, I'm getting out of sync again, it's 14:35 already, and I'm still awake ... ohwell.
Probably the most notable fix today was getting rid of that goddamn ssocket.h check failure, which always triggered after about an hour of uptime. I was cursing and swearing every morning for the past few days about why on earth hydranode crashed last night just as I turned off my monitor, just as it had started picking up some nice 60-90kb/s transfer rates ... until it struck me - THAT's what triggers the bug - namely, hitting the download speed limit. So -> quickly lowered the download limit to 1kb/s, and reran it - voila - it crashed. After that, it was just hunt & destroy - apparently, we were very sure of ourselves when we wrote the scheduler (yes, I'm aware I'm talking of myself in plural...), and somewhere along the lines went over the line. The bug was that when we got a SOCK_READ event, we never bothered to check if we already had a DownloadReq alive. Thus, if there was one, and we constructed another one and inserted it into the relevant map, it overwrote the existing one, leaving a dangling pointer (which nonetheless was used by SchedBase, thus leading to a crash under certain circumstances).
Other notable news was the parting of DOCS - Dynamic Object Creation System. Perhaps you remember it from the time it was originally created - sept 23 2004 blog post. Today, with relief, we send the subsystem on its final journey. Ashes to ashes, dust to dust, may you rest in peace.
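For the curious, the shape of the fix is roughly "look before you insert". The sketch below is NOT the actual scheduler code; DownloadReq here is a dummy struct, and RequestMap / onSockRead are hypothetical names used only to show the pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>

// Illustrative type only; hydranode's real scheduler classes differ.
struct DownloadReq {
    explicit DownloadReq(int socketId) : socket(socketId) {}
    int socket;
};

class RequestMap {
public:
    // Returns the request for this socket, creating one only if none exists.
    // Blindly constructing and inserting a new request on every read event
    // would replace the old one while other code still points at it.
    DownloadReq* onSockRead(int socketId) {
        auto it = m_reqs.find(socketId);
        if (it != m_reqs.end()) {
            return it->second.get();   // reuse the live request
        }
        auto inserted = m_reqs.emplace(socketId, std::make_unique<DownloadReq>(socketId));
        return inserted.first->second.get();
    }

    std::size_t size() const { return m_reqs.size(); }

private:
    std::map<int, std::unique_ptr<DownloadReq>> m_reqs;
};
```

Two read events on the same socket now hand back the same request object, so nobody ends up holding a pointer to a request that was silently replaced.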
Madcat, ZzZz
Tuesday, February 22, 2005
Some less fixes
Wasn't too interesting a day ... mostly testing things, and relaxing a bit ... still, some improvements:
- Fixed sockets leakage problem. We were leaving sockets open when an outgoing connection failed. Doesn't happen anymore.
- Now detects a few more (unknown) clients who don't do chunk-rotation properly.
- [patch by chemical] Autosaves config and tempfiles every 10 minutes also
- Don't drop the full file when final rehash fails - only drop corrupt chunks (if possible). [untested]
- reqDownload() is only called when connection is active (caused unhandled exception before)
That last one caused me to lose one of my testfiles (which completed, 744mb one), cos one or more chunkhashes failed :(. So back to 0% with that one (well, it's actually back up at 9%). The second one is now at 48% ... *sigh*
I also discovered that emule drops clients from its uploadqueue if they haven't sent OP_REASKFILEPING for an hour, so it seems UDP support just became the top-needed feature in our todo list.
Madcat, ZzZz
PS: Yes, I'm aware the site is damn slow; nothing I can do about it, just hope it clears up soon.
Monday, February 21, 2005
Duplicate sources fixed; more fixes/fine-tuning
Another full day of stabilization and fine-tuning. Da list:
- No more crashes when running out of servers in serverlist
- hnanalyzer perl script (by chemical), which parses hydranode.log and generates nice statistics graphs. Found in utils/ subdir, resulting graphs go to $(HOME)/.hydranode/graph1.png
- Fixed bug #6 (crash in Hash::operator=).
- Gracefully handle the situation where a client sends StartUploadReq WITHOUT including the filehash, and hasn't been given UploadInfo (or QueueInfo) status yet (e.g. ignore the packet).
- Some clients (using ClientID 0xe8) send AcceptUploadReq when transferring is already in progress, this caused us to send chunkreqs multiple times, leading to some chunks being transferred multiple times. Now chunkreqs are only sent when needed.
- Lower select() timeout to 50ms. This affects entire application running speed (used to be 100ms), thus raising the precision of speed calculations, and other things.
- Added passive source acquisition - when a client wants a file from us, and we are also downloading it, request download from that client too. (Just for the record: eMule seems to do it also)
- Fixed duplicate sources problem; ClientList now uses multi_index lookups for id-lookups.
- Hydranode should recover from some more odd network behaviours on remote client side more gracefully.
- make dist works correctly again. Entire testsuite also now compiles again.
- Added latest source tarball to downloads page.
Next up... don't know yet. There's UDP stuff, but I'm not sure I want to go there just yet - would like to get more TCP-based protocol features done, like secident, some useful packets (multipacket for example), source exchange, and some more...
Madcat, ZzZz
PS: The two test-files are now at 58% and 40% completed, respectively. Last night the core ran 7 hrs before crashing (the server connection bug), we'll see how long it survives this time.
Sunday, February 20, 2005
New features? No wait ... MORE bugfixes.
After having a discussion with our ProjectMaster(tm), we came to the conclusion that I could spend my entire life trying to perfect things in core, without ever getting anywhere. So - here's the new me, heading towards the land of new features and lots of bugs, so that I can actually come out with a real product some day.
Thus:
- Handle exceptions in scheduler gracefully, logging the error but continuing nonetheless.
- Handle exceptions in ed2kclient API when requesting downloads.
- Handle exceptions while reading shared file gracefully.
- Divide bandwidth equally again between requests. Speeds up hnshell again.
- Don't store all filenames we receive from network - we don't do anything useful with them (right now), and they just waste cpu time (in metadb.cpp:370 loop) and memory (they are stored in two places), and in the end, they are COMPLETELY useless for hydranode operation.
Note: This will be re-added when some user interface needs it, but until then, we can keep it disabled.
- Suppress warnings about unhandled tags in hello packets (waaay too many of those).
- Suppress warnings from unhandled MuleInfo packet tags that we don't care about.
- Also check for m_sourceInfo before dropping clients. (this was the reason we were losing 30% of sources all the time).
- Don't disconnect a client just for sending NoFile packet at the wrong time.
- Don't notify about SOCK_CONNECTED twice (the event is emitted from networking subsystem, no need to emit it from scheduler too) (was causing us to send Hello packets TWICE to each and every client).
- Hexdumps in hnshell now look correct again.
- Fixed scheduler speed calculations - they were off by one 100ms value, leading to somewhat lower actual limit than was expected.
- FileDesc (e.g. comment) packets are now properly handled and contents displayed.
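A side note on the "off by one 100ms value" fix in the list above: the real calculation isn't shown here, but to illustrate why a single 100ms slot matters, here's a sketch of a sliding-window speed meter. SpeedMeter, its slot layout and the one-second window are all assumptions made for the example, not the scheduler's actual design.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical speed meter: about one second of history in 100 ms slots.
class SpeedMeter {
public:
    static constexpr std::size_t SLOTS = 10;   // 10 x 100 ms = 1 second

    // Records `bytes` transferred at time `nowMs` (milliseconds, monotonic).
    void add(std::uint64_t nowMs, std::uint32_t bytes) {
        advance(nowMs);
        m_slots[m_tick % SLOTS] += bytes;
    }

    // Bytes seen in the current window, as observed at `nowMs`.
    std::uint32_t speed(std::uint64_t nowMs) {
        advance(nowMs);
        std::uint32_t sum = 0;
        for (std::uint32_t s : m_slots) sum += s;
        return sum;
    }

private:
    // Moves to the slot for `nowMs`, zeroing every slot that has expired.
    void advance(std::uint64_t nowMs) {
        const std::uint64_t tick = nowMs / 100;
        assert(tick >= m_tick && "time must not go backwards");
        std::uint64_t steps = tick - m_tick;
        if (steps > SLOTS) steps = SLOTS;      // whole window expired
        for (std::uint64_t i = 1; i <= steps; ++i) {
            m_slots[(m_tick + i) % SLOTS] = 0;
        }
        m_tick = tick;
    }

    std::array<std::uint32_t, SLOTS> m_slots{};
    std::uint64_t m_tick = 0;
};
```

Summing nine slots instead of ten (or eleven) in a meter like this shifts the measured speed by one slot's worth of data, which is the kind of error that makes the effective limit drift away from the configured one.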
Ok ok, so not new features, just another set of bugfixes ... but I'm getting there... Based on those improvements, I've been able to run hydranode for 4-5 hour sessions, without any major problems (leaving it running overnight again to see how it goes for an even longer session). If there are no more fatal bugs, we can finally de-feature-freeze the code and dive into the actual new features we've all been waiting for.
One session's log output analyzed and put into graph is viewable here.
The current open topic is ed2k clientlist, where we fail to detect duplicate sources properly, and thus every 20 minutes we double the sources (the new sources are received from server, and we fail to detect we already have them). The source of the bug is still unknown, but it's been around like forever. While we're on the topic - I've been thinking about using boost multi_index again at clientlist, since we need a damn lot of different lookups there, and some are rather tricky... based on my experience with it in partdata, I think it'd simplify things a lot, however, we've gotta be careful to not turn it into another compiler-killer - clients.cpp already takes like 7 seconds and uses 300+mb ram to compile even on high-end systems, adding multi_index lib in there would probably be the last straw that broke the camel's back ...
On other news, the official hydranode bugtracker (experimental) is up now, all bugs are welcome. Anonymous submitting is allowed (until someone starts abusing it of course), although registration is strongly encouraged. The url:
http://bugs.hydranode.com/
Madcat, ZzZz
PS: The two big testfiles - had to reset them today due to some problems. However, 7 hrs after the reset, they are both again at ~16.5% completed.
Saturday, February 19, 2005
Testing and tracking and then all goes blurry...
Thanks to chemical, the partdata was fixed for real this time. As noted earlier, there were problems in the flushing method, where we were unable to seek to the right positions, properly overwrite data etc w/o destroying existing contents. The trick was to open the file in "r+b" mode (std::fstream myfile(loc, std::ios::in|std::ios::out|std::ios::binary) in C++ terms) - only then could we do what we needed to do with it.
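Here's a small self-contained sketch of that open mode in action. overwriteAt and readAll are made-up helper names, not PartData code; the only thing taken from the above is the in|out|binary open mode.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Writes `data` at byte `offset` of an existing file, overwriting in place.
// Opening with in|out|binary keeps the existing contents; opening with
// std::ios::out alone would truncate the file first.
bool overwriteAt(const std::string& path, std::streamoff offset, const std::string& data) {
    std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!f) {
        return false;   // the file must already exist for this open mode
    }
    f.seekp(offset, std::ios::beg);
    f.write(data.data(), static_cast<std::streamsize>(data.size()));
    f.flush();
    return static_cast<bool>(f);
}

// Helper for trying it out: reads a whole file into a string.
std::string readAll(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
}
```

Writing three bytes at offset 3 of a ten-byte file leaves the file ten bytes long, with only those three bytes changed.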
I also discovered we were losing a lot of data to corruption. Investigation showed that while we did save temp files on shutdown, we didn't flush them on shutdown, thus all data in buffers was lost on restart, while partdata thought it was there. This shouldn't happen anymore.
The more serious problems however, seem to be somewhere around the scheduler. As we already know, the speed limiter isn't working. I know the source and symptom of the bug, but I haven't gotten to fix it yet. I've been running long tests, and discovered that after roughly an hour or so of uptime, something goes wrong and all network activity stops - even hnshell becomes unavailable. Traces lead to - again, the scheduler, somewhere deep inside the events/buffers handling code, but there it gets all fuzzy and blurry. Whatever the reason, SOCK_READ events are emitted from the Networking Subsystem, but never reach client code - and the only explanation is they somehow get lost in the scheduler (which sits between the networking subsystem and client code).
Other interesting stuff - the long-running test involves downloading two large files (777mb and 744mb). After roughly 8 hours (and numerous restarts, due to various bugs), I'm already at 24.41% and 3.99%, respectively. For the record - the files have a total of ~150 sources available to me (emule finds 1400 sources globally, but hydranode is limited to the current server only). We'll see how this first large-file download progresses over the next days.
Also wrote a little app to parse hydranode.log, output a .csv file from it, imported it to excel and generated this statistics graph of one of my sessions. Nothing fancy, just (hopefully) interesting to see.
Madcat, ZzZz
Friday, February 18, 2005
More on PartData
"All programmers are optimists. Perhaps this modern sorcery especially attracts those who believe in happy endings and fairy godmothers. Perhaps the hundreds of nitty frustrations drive away all but those who habitually focus on the end goal. Perhaps it is merely that computers are young, programmers are younger, and the young are always optimists. But however the selection process works, the result is indisputable: 'This time it will surely run,' or 'I just found the last bug.'" - Frederick P. Brooks, Jr., "The Mythical Man-Month"
What did I say about finding the last bug in PartData last night? Well, think again. But this time - really - this time I found the last of those PartData bugs. Really. Believe me, this is the last one! Now if I could only figure out how to fix it...
Thing is, apparently, writing data to disk never really worked at all. "How come?", you may ask. Well, when we're downloading very small files, we get the data sequentially, and write to disk sequentially, and all is nice and groovy. However, as soon as things get complex, we need to seek around the file, writing data to random positions. Indeed I had the seeking code there, and naturally assumed that when I did myfile.seekp(begin); myfile.write(mydata);, it wrote the data to the indicated offset (wx, mfc and stdio behave this way). Dang, no it didn't. Instead, it seeked to the EOF, and wrote it there (in the case where the offset was past EOF). So, then I go and say "mkay, so you don't want to seek past EOF... let's fill the file with nulls, and do our stuff then." Dang, now it INSERTS the data into the indicated position, instead of overwriting the nulls. Hum, ok, apparently, if I do myfile.seekp(amount, std::ios::end), it seeks past EOF correctly too, but the inserting problem still remains. I must be seriously missing something important here, because this can't be happening...
On other news, I discovered mldonkeys are sending us chunks twice ... at first, it looked rather weird - why on earth would they send us chunks twice... but then I realized - mules (and compatible clients) use a chunk-request-rotation system - you request chunk1, chunk2, chunk3, you get chunk1, then you request chunk2, chunk3, chunk4, and get chunk2, etc. However, mldonkeys (some new ones, which use id 0x0a), apparently don't understand this logic, and start sending chunk2 and chunk3 twice, in the above example. Now hydranode detects these clients, and requests each chunk only once from them.
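The rotation scheme is easier to see as code. This is only a sketch of the request pattern described above, with zero-based chunk numbers and a window of three as in the example; nextChunkRequest is a made-up function, and the branch for non-rotating clients (ask only for the one chunk that hasn't been requested yet) is my guess at a reasonable way to "request each chunk only once", not a description of the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds the list of chunk numbers for the next request packet.
// `received` = how many chunks have already arrived from this client,
// `total`    = how many chunks we want from it in all,
// `rotating` = whether the client copes with overlapping requests.
std::vector<std::size_t> nextChunkRequest(std::size_t received, std::size_t total, bool rotating) {
    assert(received <= total);
    const std::size_t window = 3;   // three chunks per request packet
    std::size_t begin = received;
    if (!rotating && received > 0) {
        // Everything below received + window - 1 was already asked for once.
        begin = received + window - 1;
    }
    std::vector<std::size_t> req;
    for (std::size_t i = begin; i < received + window && i < total; ++i) {
        req.push_back(i);
    }
    return req;
}
```

A rotating client gets {0,1,2}, then {1,2,3}, then {2,3,4}; a non-rotating one gets {0,1,2}, then {3}, then {4}, so no chunk number is ever sent to it twice.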
Madcat, ZzZz
Thursday, February 17, 2005
PartData fixed; misc fixes on ED2K::Client side
You know - if you tell people long and hard enough to leave you alone and stop disturbing you - they finally get the point, and do so. Thus, we're back in action (for now, at least).
First and foremost, managed to fix PartData - the problem in there which caused file completion to fail was really stupid - I forgot to flush buffers before scheduling the full rehash, so the last buffer contents were never written to disk, thus causing the hash to fail.
Also, made several fixes and improvements on the ed2k side - namely, now it properly removes its event handlers when partdata is destroyed (using boost::signals::connection objects, automatically disconnected when sourceinfo is destroyed also). FileStatus / FileName packets can, apparently, arrive at arbitrary times, so when they do, we now handle them gracefully, and assume the sender is a source for those (otherwise it wouldn't send them), if it's not already.
There seems to be some bug left in scheduler, namely, in handleDownloads() method:
uint32_t amount = getFreeDown() / pendingReqs;
uint32_t getFreeDown() { return getDownLimit() - getDownSpeed(); }
For some reason, getDownSpeed() returned a larger number than getDownLimit(), which caused an integer underflow (the unsigned subtraction wraps around to a huge value), and moments later a std::bad_alloc exception, since the amount variable was used to allocate a data buffer. The assumption behind that code was that speed never exceeds limit, however, now that I start to think about it, I saw 160+kb/s transfer rates rather often during my earlier tests, with a 100kb/s hardcoded download speed limit set... so I guess the download speed checker isn't functioning properly right now.
I added local workaround for the bug to avoid those crashes, so I could leave hydranode running overnight (dloading few 700mb files for testing), but the real fix needs to be worked on tomorrow (I'm not even going to commit this local fix - it's bad karma).
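For illustration, a guard against the wrap-around could look something like the sketch below. This is not necessarily the local workaround mentioned above (that one isn't shown anywhere), and freeDown / perRequestAmount are made-up free functions standing in for the scheduler methods.

```cpp
#include <cassert>
#include <cstdint>

// Remaining download budget, clamped at zero instead of wrapping around.
// With unsigned arithmetic, limit - speed wraps to a huge value whenever
// speed > limit, which is what led to the oversized buffer allocation.
std::uint32_t freeDown(std::uint32_t limit, std::uint32_t speed) {
    return speed >= limit ? 0u : limit - speed;
}

// Per-request share of the remaining budget; also avoids dividing by zero
// when there are no pending requests.
std::uint32_t perRequestAmount(std::uint32_t limit, std::uint32_t speed, std::uint32_t pendingReqs) {
    if (pendingReqs == 0) {
        return 0;
    }
    return freeDown(limit, speed) / pendingReqs;
}
```

This only stops the crash; it does nothing about the speed exceeding the limit in the first place, which is the part that still needs the real fix.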
Madcat, ZzZz
Tuesday, February 15, 2005
Distractions ... distractions ... arghhhhh
Been a damn long day, I was supposed to go to sleep 9 hours ago already, but one thing led to another ... you know how it goes. Anyway, been a damn lot of distractions and issues (from the physical world) which are hindering development. For starters, my USB mouse died; I went out and got myself a new USB mouse, and a new USB keyboard too. 5 minutes after I got home with those, the keyboard died too, leaving me with a VERY long and puzzled face. Some investigation revealed that the front-panel USB connectors of my box were connected to the mobo reversed (ground / +5v were swapped), which explains why they died...
Inbetween all that, I had to write a small port-forwarding/logging app for a friend, which turned out to become a ~180-liner, using, naturally, hydranode networking, events etc APIs. This gave me a nice chance to test out the engines in a completely different context, where socket responsiveness is very important (it's not so important inside hydranode - 0.1 sec delays at times were not noticeable). Anyway, after switching the main loop to use 1ns timeouts, it became as responsive as one could ever want, so I can say the hydranode API is pretty useful, and usable for non-p2p contexts as well, which is a Good Thing.
Tomorrow I'll see if I can get the brand new fried usb keyboard replaced under warranty... if not, then I'm really unhappy. In any case, I'm telling everyone [in rl] to STOP DISTURBING ME, 'cos I need to work, and I can't work if everyone WANTS something from me all the damn time... :( After I get the keyboard to warranty, I hope I can shut off all distractions and dive into full dev-mode and make some real progress on the download handling now.
Madcat, zZzz
Monday, February 14, 2005
First GUI attempts
I'm trying to move the development more and more towards windows, but little success so far. The reason behind developing on windows is that the vast majority of users are on windows. And windows users expect fundamentally different approach to final application than Linux users - namely, graphical. While HydraNode works (somewhat) on Linux, it doesn't work at all on windows. Yes, it does compile, yes it does run, but no, it does not work. Because it's missing anything even resembling a graphical interface, and doing stuff in windows console just doesn't cut it.
So I've been trying to cook up some kind of shell, along with some controls to help with further development - constantly parsing pages and pages of trace output in console got rather tedious a long time ago.
The GUI library choice pretty much came down to QT, after their announcement that QT4 for Windows will be released under GPL in late Q2/2005. First impressions on QT library were very positive - things that I had spent several days trying to get working with wxWidgets worked right out of the box. However, there are limitations.
Namely, one idea was to compile the engine to a dll, and link the GUI against it. With this approach, the GUI could do direct core function calls, subscribe to core events directly using the core event engine, etc, giving very fast responsiveness. However, I quickly discovered that QT's preprocessor, Moc, breaks on Boost headers, namely, the Boost.Signals ones, which is not very surprising. There are means of working around it, for example, by not including any hydranode headers from moc'ed GUI headers, however, it quickly turns into one big mess.
As an afterthought, I realized the idea wouldn't have worked anyway, since it would've broken remote GUI capabilities. The only reason for attempting this was to tie the engine and GUI together for the windows platform only, in order to allow easier development/debugging, however, it seems it's a no-go.
So, the GUI still needs to do everything it needs to do over Core/GUI communication layer, and that layer must simply provide a ton of features required by modern user interfaces, that windows users expect.
Madcat, ZzZz
Saturday, February 12, 2005
Still here
As mentioned earlier, I'm doing some RL stuff that needs to be taken care of... hopefully I can get back to active development by monday. Figured I'll drop a blog post, and indicate that I'm still alive and around :)
Meanwhile, those that are having crashes with GCC 3.3 with current CVS, I could use full `gcc --version` output, as well as system configuration information (distro, RAM, CPU). This seems to be a bug in GCC 3.3 when handling very memory-intensive files (I recall hearing of similar issue about aMule compilation on some big file), so if it is the same issue, I might have to break PartData up some more to get around it. On my GCC 3.3-based test systems (slackware, gentoo), it compiles fine, so I guess those distros have incorporated the relevant patches into the compiler, while some others haven't ... or it could be my large amounts of RAM ... as said, need more info.
Madcat, ZzZz
Thursday, February 10, 2005
Porting day
Been a rather busy day, despite the fact that I was supposed to be dealing with RL issues today (they were postponed). The main topic of today was bringing win32/MSVC compatibility back to normal. Towards that end, the following improvements were made (along with some misc stuff):
- Disabled usage of Tags in PartData ChunkMap (reduces symbol name lengths coming from Boost.MultiIndex to fix crashes on GCC 3.3 and MSVC 7.1 - still crashes on Darwin tho [might not be related]).
- Added duplicate getContains() method to RangeList class to compensate for MSVC's const-handling problems.
- Changed the Boost.Bind placeholder variable names from _1, _2 etc to _b1, _b2 etc, to avoid conflicts with Boost.Lambda placeholders (yes, I did re-define Boost.Lambda placeholders to __1, __2 earlier already) on MSVC, which seems to get confused (possibly related to name resolution in different namespaces? With a using boost::lambda directive in effect, lambda placeholders are brought into the namespace, and boost::bind placeholders are already in the top-level anonymous namespace, causing the conflict).
- Added some more helper ; symbols for the MSVC parser, which can't cope with function-level try/catch blocks very well.
- HydraNode shell server now properly handles the win32 telnet client, which (a) sends 0x0d 0x0a on EOL (posix telnet sends 0x0d only), and (b) sends 0x08 on the backspace key (posix telnet sends DEL (0x7f) on the backspace key by default).
- PartData info display is now properly formatted for win32 78-char console window.
- New shell command added: `hs`, which displays hasher statistics (amount hashed, avg speed).
- Use std::ios::binary openmode for handling binary files - otherwise things break on win32.
- Hasher is now using filesystem::path types instead of raw strings for paths also, solving several problems regarding hashing.
- Re-enabled debug symbols generation on GCC 3.3.
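The telnet item in the list above (EOL and backspace differences between clients) can be sketched as a small input filter. LineBuffer is a made-up class, not the actual shell server code, and the exact rules (CR ends a line, LF or NUL right after CR is swallowed, a lone LF also ends a line, both 0x08 and 0x7f erase a character) are assumptions chosen to cover both clients.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Feeds raw telnet bytes into a one-line editor and returns completed lines.
class LineBuffer {
public:
    std::vector<std::string> feed(const std::string& bytes) {
        std::vector<std::string> lines;
        for (unsigned char c : bytes) {
            const bool afterCr = m_lastWasCr;
            m_lastWasCr = (c == 0x0d);
            if (c == 0x0d || (c == 0x0a && !afterCr)) {
                // End of line: CR, or a lone LF.
                lines.push_back(m_current);
                m_current.clear();
            } else if (c == 0x0a || (c == 0x00 && afterCr)) {
                // Second byte of a CR LF / CR NUL pair: nothing left to do.
            } else if (c == 0x08 || c == 0x7f) {
                // Backspace (win32 telnet) or DEL (posix telnet).
                if (!m_current.empty()) {
                    m_current.pop_back();
                }
            } else {
                m_current.push_back(static_cast<char>(c));
            }
        }
        return lines;
    }

private:
    std::string m_current;      // line being typed
    bool m_lastWasCr = false;   // previous byte was CR
};
```

Both "helq", 0x08, "p", CR, LF and "helq", 0x7f, "p", CR come out as the single line "help".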
The GCC bug, which generates tons of link-time warnings when debug symbols are disabled, is still unresolved, and I have no leads on where it came from either. Ideas anyone?
Bottom line: hashing, metadb handling, sharedfiles handling etc work correctly on win32 now.
Next up: Fix the PartData bug which causes us to lose some ranges of data constantly during downloading...
Madcat, ZzZz
Wednesday, February 09, 2005
Compiler problems, misc stuff
Apparently, the new PartData, along with the usage of the Boost.MultiIndex library, turned out to be a compiler-killer. At the current state, I have MSVC 7.1 [win32] crashing on me, GCC 3.3.5 and GCC 3.4.3 [linux] giving a crapload of linker warnings (GNU ld bug, resolved in GCC 4.0 branch), GCC 3.3 crashing when debug symbols (-g, -ggdb) are enabled (compiler bug related to very long symbol names that come with the hardcore template metaprogramming used by the multiindex lib), and GCC 3.3 [darwin] crashing on OSX even when debug symbols are disabled.
On the MSVC side, I investigated the issues, and have narrowed the list of problems down somewhat (they seem to be related to tags usage), but actual fixes are still not there yet. There also seem to be problems with MSVC handling the new Range API code (it can't handle some constructs there), so more work is needed towards that end too. I still have no extended information on what's causing the GCC [linux] linker warnings (except for a bunch of compiler patches), and one or two possible workarounds for the GCC 3.3 debug symbols handling issues (I currently disabled debug symbols generation when compiling with 3.3 automatically). As for GCC crashing on darwin, I'm not sure yet whether it's a compiler bug, or the fact that my iMac is too old and simply runs out of resources - perhaps someone with a faster Mac could try compiling current CVS?
On other news, the next few days I need to take care of some real life stuff, so I might be offline for long hours. I hope to be able to return to coding near the end of the week at the latest - we're SOO close to finally having stable downloading capabilities, so I can't wait to get other things out of the way so I can return to coding.
Madcat, ZzZz
Monday, February 07, 2005
Lots of fixes & improvements
Been a pretty nice and progressive development day, and since it's 5am and I'm rather wired, I'll just stick to dumping a compact summary of today's CVS updates here and get to sleep. So here goes:
- Temp files are now moved to completed dir
- ED2KPacket::readPartMap() method now adds correct amount of chunk bits (it formerly added extra padding at the end).
- ED2KClient now adds source masks and hashsets to PartData.
- Fixed Client::DownloadInfo::write() method chunk-finding loop
- Added getCompleted()[size], getSourceCnt() and getFullSourceCount() methods to PartData.
- Terminal size is now negotiated between telnet client and hydranode telnet server as defined by RFC 1073. Terminal size updates are also passed as the window is resized.
- Added `vd` [view downloads] command, which lists all current downloads
- Removed obsolete Range stream output operator, which was causing broken temp files on random cases (oddly enough, Linux/x86/gcc didn't detect duplicate operator, but OSX gcc did :o)
- It's now possible to build modules as [built_in]. To do so, compile the module with -DBUILT_IN, and link the module's object files to the final binary. (Currently needs to be done by hand, build-system support for it is problematic.)
- New bandwidth-calculation engine in Scheduler - instead of former real-time bandwidth tracking, we now track bandwidth at 100ms precision - should be enough for our needs. In my tests, 170+kb/s transfer speeds had no effect on overall application performance after this upgrade.
- Cleaned up lots of extraneous debug/trace messages from various files.
- Trim new download filenames to avoid spaces at begin/end of filenames.
- PartData now displays fancy graph during load.
- SharedFile::isPartial() and SharedFile::isComplete() methods now properly handle the situation where we have PartData at 100% completed state.
- PartData now detects itself when loading 100% complete temp file and attempts to complete itself again.
- Fixed lock generation - now adheres more closely to requested size (it was always giving size - 1 locks).
- PartData now resets itself completely if full rehash fails.
- ShellClient includes a nice progress-o-meter for monitoring download state.
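For the terminal-size item in the list above: RFC 1073 (NAWS) has the telnet client report its window size in a subnegotiation of the form IAC SB NAWS <width hi> <width lo> <height hi> <height lo> IAC SE, where any data byte equal to 255 is sent doubled. Below is a minimal sketch of decoding that payload. `TermSize` and `parseNaws` are made-up names for illustration, not hydranode's actual telnet server code, and the sketch assumes the caller has already cut out the bytes between `IAC SB NAWS` and `IAC SE`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical holder for the negotiated size (not a hydranode type).
struct TermSize {
    std::uint16_t width;
    std::uint16_t height;
};

// Decodes the bytes found between "IAC SB NAWS" and "IAC SE".
// Returns false if the payload is malformed; 'out' is untouched then.
bool parseNaws(const std::vector<std::uint8_t>& payload, TermSize& out) {
    const std::uint8_t IAC = 255;
    std::vector<std::uint8_t> data;
    for (std::size_t i = 0; i < payload.size(); ++i) {
        if (payload[i] == IAC) {
            // A literal 255 arrives doubled; a lone IAC here is an error.
            if (i + 1 >= payload.size() || payload[i + 1] != IAC) {
                return false;
            }
            ++i; // skip the second byte of the doubled pair
        }
        data.push_back(payload[i]);
    }
    if (data.size() != 4) {
        return false; // NAWS carries exactly two 16-bit values
    }
    // Both values are sent in network byte order (high byte first).
    out.width = static_cast<std::uint16_t>((data[0] << 8) | data[1]);
    out.height = static_cast<std::uint16_t>((data[2] << 8) | data[3]);
    return true;
}
```

Since the client sends the same subnegotiation again whenever the window is resized, the same decoder would serve both the initial negotiation and later updates.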
The only problem I'm aware of right now would be fixed if I had a clue where it lies. Thing is, it seems we're losing ~30kb of data somewhere between downloading and the target file. When PartData thinks it's complete and performs a full rehash, the rehash fails, and further investigation showed that the resulting file from the download was ~30kb smaller than expected. The weirdest thing is that the regress-test still runs flawlessly, so I've got no clue whatsoever right now where the problem is. Hope tomorrow brings better luck in searching for this [hopefully_last] bug.
Madcat, ZzZz
Sunday, February 06, 2005
Stabilization ...
The PartData::getLock() method bug got fixed (the bug actually lay inside the Range API, namely, the Range::erase() method). However, the PartData::getRange() method (where the actual chunk-selection code, the most important part of PartData v2, lies) also seems to have some problems. So it still needs further debugging.
I merged my local code tree to CVS, since the amount of differences between CVS and my local tree started to get out of hand (hadn't committed anything in almost two weeks). Now I can get back to more regular CVS updates, and have better means of keeping track of changes.
Below is the complete list of changes merged to CVS (changes between 01/24/05 and 02/06/05):
- hydranode/tests/test-hasher/ (Makefile.am test-hasher.cpp Makefile): Regress-test for Hasher v2.
- hydranode/tests/test-range/ (Makefile.am test-range.cpp test-range2.cpp): Regress-test for Range API v2.
- hydranode/tests/test-partdata/ (Makefile.am test-partdata.cpp): Regress-test for PartData v2.
- hydranode/tests/Makefile.am: Added test-ipfilter, test-range, test-partdata and test-hasher subdirs.
- hydranode/modules/hnsh/shellclient.cpp: Cosmetic changes. Now uses global lambda_placeholders.h for lambda stuff.
- hydranode/include/hn/ (bind_placeholders.h lambda_placeholders.h osdep.h): Bind placeholders moved to bind_placeholders.h. lambda_placeholders.h added.
- hydranode/configure.ac: Enable 64-bit file I/O support.
- hydranode/include/hn/event.h: Now uses bind_placeholders.h
- hydranode/include/hn/eventbase.h: Cosmetic and documentary changes.
- hydranode/ (include/hn/rangelist.h include/hn/range.h src/Makefile.am): Range API v1 replaced globally with Range API v2.
- hydranode/include/hn/range2.h: Range2.h removed.
- hydranode/ (4 files in 2 dirs): PartData v2.
- hydranode/ (include/hn/hash.h src/hash.cpp): Part renamed to Chunk in Hash/HashSet classes.
- hydranode/ (include/hn/hasher.h src/hasher.cpp): Hasher v2, now based on WorkThread API.
- hydranode/include/hn/hashsetmaker.h: Added MD4Hash generation.
- hydranode/ (8 files in 3 dirs): Use ints for events instead of enums.
- hydranode/ (include/hn/scheduler.h include/hn/sockets.h src/sockets.cpp): Wording change - timeouted -> timed out
- hydranode/ (include/hn/workthread.h src/workthread.cpp): Simplification and cleanup in WorkThread API.
- hydranode/modules/ed2k/ (clients.cpp clients.h packets.cpp packets.h): Now uses PartData v2 and Range API v2.
- hydranode/modules/ed2k/ (ed2k.cpp ed2ksearch.cpp): Now includes sharedfile.h
- hydranode/src/eventbase.cpp: EventTableBase methods definitions moved to source file.
- hydranode/src/hashsetmaker.cpp: Misc fixes, cleanups and changes.
- hydranode/src/hydranode.cpp: Disabled few trace masks.
- hydranode/src/Makefile.am: Added ipfilter.cpp to compilation.
- hydranode/src/ipfilter.cpp: Now includes rangelist.h
Everything not dealing with downloading should work just as before (I hope). Downloading works in theory; in practice, it frequently runs into endless loops during part-selection and/or chunk-locking - that's the part that needs further debugging.
Madcat.
Saturday, February 05, 2005
Deploying new PartData
Deploying PartData in engine classes
FilesList and SharedFile classes were modified to work with the new PartData. Due to the cleaner nature of PartData v2, another ~300 lines of code could be dropped overall in those two classes. PartData saving/loading works. Construction/destruction works.
Deploying PartData in ed2k module
Client::DownloadInfo class was modified to interface with the new PartData, significantly simplifying the code there (~150 lines less code than before).
ED2KPacket::ReqChunks packet interface was modified to work with new PartData and new Range API.
Bottom line & state of things
The application compiles and runs again (first time in a while), however, there are still a few remaining runtime problems during downloading - obviously, the new PartData needs further stabilization. Current top issue seems to be the PartData::getLock() method, which seems somewhat broken - works occasionally, but breaks sometimes.
Once the overall things stabilize to a state equal-to or better-than the current CVS version (over a week old), I'll commit the new stuff to CVS. Until then... wrk wrk.
Madcat, ZzZz
Friday, February 04, 2005
Intermodule deps; PartData v2 passes regress-testing
Yesterday I had to lend my CPU resources for some video encoding for some quick cash (hey, kitty's gotta eat!), explaining the missing blog post. However, that forced break gave me a chance to figure out the inter-module dependencies engine, as well as add full 64-bit file I/O support capabilities.
The latter was rather simple - just needed _LARGE_FILES and _FILE_OFFSET_BITS defines enabled to get 64-bit STL IOStreams enabled (hydranode itself is fully 64-bit aware). The inter-module dependencies thingie is somewhat more interesting tho.
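As a rough illustration of the 64-bit file I/O switch: on glibc-based systems, defining `_FILE_OFFSET_BITS` to 64 before the first system header makes `off_t` a 64-bit type even on 32-bit targets. The sketch below only checks that effect. `hasLargeFileOffsets` is a made-up helper name, the header used is POSIX-specific, and in a real build the define would normally come from the compiler command line (which is presumably what the configure.ac change arranges) rather than from a source file.

```cpp
// Must be defined before the first system header is included,
// otherwise it has no effect on the types those headers declare.
#define _FILE_OFFSET_BITS 64

#include <cassert>
#include <sys/types.h>  // POSIX header providing off_t

// Hypothetical helper: true when off_t can address files beyond 2 GiB.
inline bool hasLargeFileOffsets() {
    return sizeof(off_t) >= 8;
}
```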
The main problem with inter-module dependencies is that when a module requires another module, we cannot really load the module unless all pre-requisites are already loaded. This is usually handled by the platform's dynamic loader automatically when dealing with shared libraries in general, however, since we're doing explicit shared library loading, we need to perform this step ourselves. Thus, the only way we can really get to know the dependencies of a module is via a support file - modulename.cfg (the extension may change during implementation). That configuration file will be looked for in the same dir as the module itself and parsed by the HydraNode dynamic loader, which then loads all module pre-requisites prior to the module itself. The file format will be XML, since it needs to be human-writable. For example, the Kademlia module might declare that it requires the ed2k module, etc. The config file is not limited to pre-requirements. It can include an HTML description of the module (for displaying in the user interface), and any other info we feel like putting there.
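The "load pre-requisites first" step boils down to a depth-first walk over the declared dependencies. Here is a minimal sketch of that idea, assuming the per-module support files have already been parsed into a plain name-to-requirements map. `DepMap`, `loadOrder` and the cycle check are illustrative assumptions, not the actual HydraNode loader.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical: module name -> modules it requires, as they might be
// read from each module's support file.
typedef std::map<std::string, std::vector<std::string> > DepMap;

namespace detail {
// Depth-first walk: append a module only after all its requirements.
void visit(const std::string& name, const DepMap& deps,
           std::set<std::string>& loading, std::set<std::string>& done,
           std::vector<std::string>& order) {
    if (done.count(name)) {
        return; // already scheduled for loading
    }
    if (!loading.insert(name).second) {
        // We came back to a module that is still being resolved.
        throw std::runtime_error("circular module dependency: " + name);
    }
    DepMap::const_iterator it = deps.find(name);
    if (it != deps.end()) {
        for (std::size_t i = 0; i < it->second.size(); ++i) {
            visit(it->second[i], deps, loading, done, order);
        }
    }
    loading.erase(name);
    done.insert(name);
    order.push_back(name);
}
} // namespace detail

// Returns the order in which modules must be loaded so that every
// pre-requisite precedes the module that needs it.
std::vector<std::string> loadOrder(const std::string& module,
                                   const DepMap& deps) {
    std::set<std::string> loading, done;
    std::vector<std::string> order;
    detail::visit(module, deps, loading, done, order);
    return order;
}
```

With a map saying that kademlia requires ed2k, `loadOrder("kademlia", deps)` yields ed2k first and kademlia second. Two modules that require each other cannot be ordered at all, which is why the sketch reports that case instead of looping.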
That was yesterday. Today was a full coding day - under the topic of PartData. While finalizing and testing the new PartData, I also found one bug in the new Range API (erasing a range from RangeList where the begin offset of the erased range == an existing range's begin offset), and one bug in the hasher (it was reading 1 byte less during chunkhashes, and the regress-test was buggy and didn't catch it). Inside PartData, there were also a number of fixes and improvements. At the end of the day, my nice download simulator is running smoothly, indicating PartData is ready for deployment. The saving/loading code wasn't regress-tested, due to its simplicity; it'll be tested as soon as the API is deployed over the existing codebase and put to use.
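To make the Range API bug described above concrete, here is a small sketch of erasing a range from a list of non-overlapping ranges, with the "same begin offset" case handled explicitly. It uses a plain `std::map` of inclusive begin -> end offsets as a stand-in. `RangeMap` and `eraseRange` are made-up names, and the real RangeList in hydranode is certainly structured differently.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for a range list: non-overlapping, inclusive
// ranges stored as begin -> end.
typedef std::map<std::uint64_t, std::uint64_t> RangeMap;

// Removes the inclusive range [b, e] from every range it overlaps.
void eraseRange(RangeMap& ranges, std::uint64_t b, std::uint64_t e) {
    assert(b <= e);
    RangeMap result;
    for (RangeMap::const_iterator it = ranges.begin();
         it != ranges.end(); ++it) {
        const std::uint64_t rb = it->first;
        const std::uint64_t re = it->second;
        if (re < b || rb > e) {
            result[rb] = re; // no overlap, keep as-is
            continue;
        }
        // Left remainder exists only if the existing range starts
        // strictly before the erased one. When rb == b there is no
        // left part, and b - 1 must not be computed at all.
        if (rb < b) {
            result[rb] = b - 1;
        }
        // Right remainder exists only if the existing range extends
        // past the end of the erased one.
        if (re > e) {
            result[e + 1] = re;
        }
    }
    ranges.swap(result);
}
```

Erasing [0, 9] from a single range [0, 99] leaves just [10, 99]. The equal-begin case is exactly the one where a naive "split into left and right parts" implementation tends to emit a bogus left part.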
For the curious:
include/hn/partdata.h [479 lines]
src/partdata.cpp [531 lines]
src/hasher.cpp [168 lines]
include/hn/hasher.h [216 lines]
For the record - PartData v2 is still 500+ lines smaller than PartData v1, while doing everything the old PartData did, and much more, while being easier/safer to use (and modify), so we can conclude the rewrite was well worth the time and effort.
Madcat, ZzZz
Tuesday, February 01, 2005
Hasher passes regress-testing
Another full day of regress-testing the new hasher code, and now it seems we caught the remainder of bugs in it. There were several issues regarding non-regular file types, files of size 0 etc.
Next up - finish the PartData. Needs saving/loading code, and lots of regress-testing.
Madcat, ZzZz
PS: Regarding the enums problem noted in the last blog post - we can use ints for events instead. The actual events would still be defined as enums, but since enums are basically constant integers, they can be converted and compared to/from/between ints. This will only affect the event handler function prototypes, so it requires very few changes to the existing codebase.
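A tiny sketch of what that looks like in practice, with made-up event names and a made-up `Recorder` handler (none of these are hydranode's real identifiers): the events stay enums, while the handler prototype takes a plain int.

```cpp
#include <cassert>

// Hypothetical event enums, one per event source.
enum PartDataEvent { PD_ADDED = 1, PD_COMPLETE, PD_DESTROY };
enum SocketEvent { SOCK_CONNECTED = 1, SOCK_LOST };

// Hypothetical handler: its prototype takes int rather than a specific
// enum type, so the same signature serves every event source.
struct Recorder {
    int lastEvent;
    bool sawComplete;
    Recorder() : lastEvent(0), sawComplete(false) {}
    void onEvent(int evt) {
        lastEvent = evt;
        // An int compares directly against an enumerator.
        if (evt == PD_COMPLETE) {
            sawComplete = true;
        }
    }
};

// Posting side: an unscoped enum converts implicitly to int.
inline void postEvent(Recorder& r, int evt) {
    r.onEvent(evt);
}
```

One thing the sketch makes visible: once converted to int, values from different enums can coincide (here `SOCK_LOST` and `PD_COMPLETE` are both 2), so a handler has to know which source it is attached to.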
