Rainer Brockerhoff wrote:
…Imagine my surprise when I learned that Peter had just published the complete Hex Fiend source code, and also started up a Wiki to explain details…
That said, I haven’t had time to look at details of his data backend yet, but that too looks like it will save me at least a month of tinkering.
Well, I’ve now had a little time to look at the Hex Fiend code.
As I expected, it builds a tree of referenced/changed byte ranges for an edited file. There’s a generic “ByteSlice” class with concrete subclasses that represent either a range of bytes inside a file or a range of bytes in memory – the latter would be the result of an editing operation like typing or pasting in stuff. That much I’m already doing myself, albeit with different names.
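To make that concrete, here's a minimal sketch in Swift of that kind of slice hierarchy. The names (ByteSlice, FileByteSlice, MemoryByteSlice, ByteArray) are mine for illustration, not Hex Fiend's actual classes, and a real editor would keep the slices in a balanced tree keyed by cumulative offset rather than a plain array:

```swift
import Foundation

// A slice either points into the file on disk or holds edited bytes in memory.
protocol ByteSlice {
    var length: UInt64 { get }
    // Copy `range` (relative to this slice) into a buffer.
    func bytes(in range: Range<UInt64>) throws -> Data
}

// A slice backed by a range of the original file; nothing is read until asked for.
struct FileByteSlice: ByteSlice {
    let fileHandle: FileHandle
    let offsetInFile: UInt64
    let length: UInt64

    func bytes(in range: Range<UInt64>) throws -> Data {
        try fileHandle.seek(toOffset: offsetInFile + range.lowerBound)
        let count = Int(range.upperBound - range.lowerBound)
        return try fileHandle.read(upToCount: count) ?? Data()
    }
}

// A slice holding bytes produced by an edit (typing, pasting).
struct MemoryByteSlice: ByteSlice {
    let data: Data
    var length: UInt64 { UInt64(data.count) }

    func bytes(in range: Range<UInt64>) -> Data {
        data.subdata(in: Int(range.lowerBound)..<Int(range.upperBound))
    }
}

// The edited document is just an ordered sequence of slices.
struct ByteArray {
    var slices: [any ByteSlice] = []
    var length: UInt64 { slices.reduce(0) { $0 + $1.length } }
}
```

The nice property is that an edit never copies the untouched parts of the file; it just splits, replaces, and reorders slices.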
The interesting part comes when an edited file is saved. Hex Fiend goes to great lengths to optimize writing time, allocated RAM, and disk space; it uses threaded AVL trees and lots of neuron grease, and while I understood in very general terms what's supposed to be going on, the details are extremely daunting. My hat's off to the wizard. And he's lavished similar care on optimized searching, too.
Now, lifting all that code and plopping it into XRay II just wouldn't be cost-effective. Yes, the result is that you can open a 240GB file on a 250GB disk, swap huge chunks of it around, insert random bytes in the middle, and save it with no problem. Do I see this situation arising frequently for XRay II users? Frankly, no. Remember, the idea is to do structured editing of file contents, not necessarily pure hex editing… and Hex Fiend is already terrific at the latter (not to mention free). I see the hex editing panels in XRay II as better suited to editing small amounts of data, and as a convenience for viewing raw file contents without necessarily changing them.
So, falling back on the old method – saving the edited file to a temporary file, then swapping it with the original once writing succeeds – means complexity will go way down, at the expense of speed (noticeable only for really huge files) and of needing extra free space (same).
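For illustration, here's roughly what that simple save path looks like in Swift; the writeContents closure is a stand-in for streaming the slice tree out to disk, and the names are hypothetical:

```swift
import Foundation

// Write everything to a temporary file, then atomically swap it with the
// original only if writing succeeded; on failure the original is untouched.
func save(to originalURL: URL,
          writeContents: (FileHandle) throws -> Void) throws {
    let tempDir = try FileManager.default.url(for: .itemReplacementDirectory,
                                              in: .userDomainMask,
                                              appropriateFor: originalURL,
                                              create: true)
    let tempURL = tempDir.appendingPathComponent(UUID().uuidString)
    guard FileManager.default.createFile(atPath: tempURL.path, contents: nil) else {
        throw CocoaError(.fileWriteUnknown)
    }

    let handle = try FileHandle(forWritingTo: tempURL)
    do {
        try writeContents(handle)   // needs the full file size in free space
        try handle.close()
    } catch {
        try? handle.close()
        try? FileManager.default.removeItem(at: tempURL)
        throw error
    }

    // Replace the original only after the temporary file is complete.
    _ = try FileManager.default.replaceItemAt(originalURL, withItemAt: tempURL)
}
```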
A second problem is that, for Hex Fiend, a file is just a sequence of bytes – no structure. For me, on the contrary, changing a byte in one place – say, editing the representation of an ID3 tag in a music file, or a QuickTime atom in a movie file – will usually mean that one (or even several) count or length fields elsewhere in the file will also have to change to preserve the file's integrity. So my data representation tree also needs to reflect a particular file's format as decoded by a plugin – and there may be several plugins seeing the same file in different ways – and the nodes need to be more intelligent, notifying each other when necessary.
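Very roughly, something along these lines, again sketched in Swift; the names and the updateLengthField hook are hypothetical, just to show the notification idea, not XRay II's actual design:

```swift
import Foundation

// A node in the decoded representation of the file. When a node's bytes
// change size, every ancestor is told, so count/length fields elsewhere
// in the file can be adjusted and the file stays consistent.
final class FormatNode {
    var name: String
    var payload: Data                 // raw bytes owned by this node itself
    var children: [FormatNode] = []
    weak var parent: FormatNode?

    // Hypothetical hook a plugin can install, e.g. to rewrite a QuickTime
    // atom's size field or an ID3 frame length when sizes below it change.
    var updateLengthField: ((FormatNode) -> Void)?

    init(name: String, payload: Data = Data()) {
        self.name = name
        self.payload = payload
    }

    func add(_ child: FormatNode) {
        child.parent = self
        children.append(child)
    }

    // Total encoded size of this node and everything below it.
    var totalSize: Int {
        payload.count + children.reduce(0) { $0 + $1.totalSize }
    }

    // Editing a node's bytes walks up the tree, giving each ancestor a
    // chance to fix up its own length or count fields.
    func replacePayload(with newData: Data) {
        payload = newData
        var node: FormatNode? = self
        while let current = node {
            current.updateLengthField?(current)
            node = current.parent
        }
    }
}
```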
Still, seeing the Hex Fiend code has given me assurance that I can do it myself, so that’s good…