I rewrote the algorithm that TheShadow mentioned to make it clearer.
/*
 * 1) A piece is received for storage.
 * 2) Allocate space to hold the piece.
 * 3) If the newly allocated space is where the piece belongs, write it
 *    there. Return!
 * 4) Otherwise, search for the piece that belongs in the newly allocated
 *    space.
 * 5) If that piece is not found, check the location where the received
 *    piece belongs. If a piece is stored there, move it into the newly
 *    allocated space and write the received piece in its old location;
 *    otherwise write the received piece into the newly allocated space.
 *    Return!
 * 6) If the piece that belongs in the newly allocated space was found,
 *    move it there, leaving its old location empty.
 * 7) Check the location where the received piece belongs. If a piece is
 *    stored there, move it into the location emptied in step 6 and write
 *    the received piece in its place; otherwise write the received piece
 *    into the emptied location. Return!
 *
 * The result is a maximum of 2 moves and 3 writes.
 * If you have 8 pieces (0-7) and you get 6,4,3,5,7,1,0,2:
 * 0) You get 6; 6 goes in 0.
 * 1) You get 4; 4 goes in 1.
 * 2) You get 3; 3 goes in 2.
 * 3) You get 5; 3 goes in 3; 5 goes in 2.
 * 4) You get 7; 4 goes in 4; 7 goes in 1.
 * 5) You get 1; 5 goes in 5; 1 goes in 1; 7 goes in 2.
 * 6) You get 0; 6 goes in 6; 0 goes in 0.
 * 7) You get 2; 7 goes in 7; 2 goes in 2.
 */
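To sanity-check the steps, here is a minimal in-memory sketch of the placement logic in C (no real disk I/O, one slot allocated on the end per received piece; the names are mine). Running it on the 6,4,3,5,7,1,0,2 example reproduces the trace above.

#include <stdio.h>

#define NPIECES 8

static int slots[NPIECES];   /* slots[i] = piece stored in slot i, -1 = empty */
static int nslots = 0;       /* number of slots allocated so far              */

static int find_slot_of(int piece)      /* -1 if the piece is not stored yet  */
{
    for (int i = 0; i < nslots; i++)
        if (slots[i] == piece)
            return i;
    return -1;
}

static void place(int piece)
{
    int new_slot = nslots++;            /* 2) allocate a slot on the end       */
    slots[new_slot] = -1;

    if (piece == new_slot) {            /* 3) the new slot is the piece's home */
        slots[new_slot] = piece;
        return;
    }

    int free_slot = new_slot;
    int old = find_slot_of(new_slot);   /* 4) piece that belongs in the new slot   */
    if (old != -1) {                    /* 6) it is stored elsewhere: move it home */
        slots[new_slot] = new_slot;
        slots[old] = -1;
        free_slot = old;                /*    its old slot becomes the free one    */
    }

    int home = piece;                   /* 5/7) where the received piece belongs */
    if (home < nslots && slots[home] != -1) {
        slots[free_slot] = slots[home]; /* move the piece squatting there aside  */
        slots[home] = piece;            /* and write the received piece at home  */
    } else {
        slots[free_slot] = piece;       /* home unallocated or already free      */
    }
}

int main(void)
{
    static const int order[NPIECES] = { 6, 4, 3, 5, 7, 1, 0, 2 };
    for (int i = 0; i < NPIECES; i++) {
        place(order[i]);
        printf("got %d:", order[i]);
        for (int j = 0; j < nslots; j++)
            printf(" %d", slots[j]);
        printf("\n");
    }
    return 0;
}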
I'm not very good at writing with the fewest words possible.
If you can rewrite what I have written above more clearly and with fewer words, I would appreciate it if you posted it.
You still have to deal with pieces that span multiple files.
You have to create a virtual file system that puts the files end to end, like a chain.
If you wish to see how it works, draw some boxes on paper that represent pieces and play with it.
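For the multi-file part, here is a rough sketch of that end-to-end mapping, assuming the file lengths are known up front (the names and layout are mine, not from any real client). It translates a global (offset, length) range into per-file chunks.

#include <stdio.h>

struct tfile { const char *path; long length; };

/* Translate a global (offset, length) range into per-file chunks. */
static void map_piece(long offset, long length,
                      const struct tfile *files, int nfiles)
{
    long start = 0;                     /* running global offset of each file */
    for (int i = 0; i < nfiles; i++) {
        long end = start + files[i].length;
        if (offset < end && offset + length > start) {
            long lo = offset > start ? offset : start;
            long hi = offset + length < end ? offset + length : end;
            printf("%s: offset %ld, %ld bytes\n",
                   files[i].path, lo - start, hi - lo);
        }
        start = end;
    }
}

int main(void)
{
    const struct tfile files[] = {
        { "a.bin", 700 }, { "b.bin", 300 }, { "c.bin", 1000 },
    };
    /* a 512-byte piece starting at global offset 512 spans all three files */
    map_piece(512, 512, files, 3);
    return 0;
}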
SpookyET said:
Thanks! The problem with bittorrent is that a piece can go over file boundaries. Some people care more about disk space usage than file fragmentation, thus I have to implement this. Sorting is horrible if you have to go through all the files to sort pieces.
After I mentioned your algorithm, I got this in return:
[16:12] (TheSHAD0W): Heh.
[16:12] (TheSHAD0W): No.
[16:12] (TheSHAD0W): That'd work, but you'd wind up with spikes in usage.
He's right, it has a potential for spiking.
[16:12] (TheSHAD0W): Here's how it's done, and Bram devised it...
[16:12] (TheSHAD0W): You get data for a new chunk.
[16:12] (TheSHAD0W): Allocate space on the end.
[16:12] (TheSHAD0W): If that space allocated is that chunk, write it there. Done.
[16:13] (TheSHAD0W): If there already exists data for that piece in previously allocated space, move it there, and its old spot is your new open space.
[16:14] (TheSHAD0W): Then, check if the data you got the piece for is in allocated space but something else is taking up its spot. If so, move that data to the open spot.
[16:14] (TheSHAD0W): Finally, write your data into that spot.
[16:14] (TheSHAD0W): Results in a maximum of two moves in order to get your data out.
However, the problem with this, IMHO, is that after writing a slot out you have moved data that *will* be moved again, whereas with the approach above it only *might* be. I think this approach will result in more overall disk I/O than my algorithm, but it will probably be spread out further.
In this case, I would probably benchmark a number of different approaches.
Consider allocating pages (perhaps 4 chunks per page) and permitting only half of a page to be used as scratch at any given time. Or perhaps others.
Now I'm curious, let's see what I can think up...
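For the page idea, here is a very rough sketch of what the bookkeeping could look like, assuming 4 chunks per page and at most half a page holding out-of-place data at once; the policy interpretation and every name in it are guesses of mine.

#include <stdio.h>

#define CHUNKS_PER_PAGE 4
#define SCRATCH_MAX     (CHUNKS_PER_PAGE / 2)

/* How many slots of this page hold a piece that does not belong there? */
static int scratch_count(const int *slots, int nslots, int page)
{
    int lo = page * CHUNKS_PER_PAGE, n = 0;
    for (int s = lo; s < lo + CHUNKS_PER_PAGE && s < nslots; s++)
        if (slots[s] != -1 && slots[s] != s)
            n++;
    return n;
}

/* May an out-of-place piece be parked in this slot without going over budget? */
static int can_park(const int *slots, int nslots, int slot)
{
    return scratch_count(slots, nslots, slot / CHUNKS_PER_PAGE) < SCRATCH_MAX;
}

int main(void)
{
    /* page 0 already holds two misplaced pieces (6 and 4); slot 2 is empty
     * but that page's scratch budget is used up, so parking is refused.  */
    int slots[] = { 6, 4, -1, 3, 7, -1, -1, -1 };
    printf("park in slot 5 (page 1): %s\n", can_park(slots, 8, 5) ? "yes" : "no");
    printf("park in slot 2 (page 0): %s\n", can_park(slots, 8, 2) ? "yes" : "no");
    return 0;
}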