61
Math and Science / Re: A "new" compression format [subject to changes]
« on: November 25, 2013, 02:08:14 pm »
It would be significantly heavier than PuCrunch, and it wouldn't help for small data at all, due to the header.
This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to. 61
Math and Science / Re: A "new" compression format [subject to changes]« on: November 25, 2013, 02:08:14 pm »
It would be significantly heavier than PuCrunch, and it wouldn't help for small data at all, due to the header.
62
Other / Re: Draw circuits« on: November 25, 2013, 01:41:48 pm »
You can draw resistors with a common pencil. Hard to tune, though.
Drawing transistors would be much better. You could actually draw your logic design and immediately have it work. Would be tricky though. Maybe with 3 pens? N-type pen, P-type pen and this pen? 63
Math and Science / Re: A "new" compression format [subject to changes]« on: November 25, 2013, 01:26:56 pm »
Here's a version that works on older CPUs as well, probably a bit more useful
It's slightly slower, about 4 to 5%, which means I can still round it off to 0.5 miliseconds on the test file that I've been using so far. The difference is real and detectable though, not just noise. Various paths and buffer sizes are still hardcoded, but it shouldn't be hard to get it to work. The only changes are in the files decoder.asm and altdeflate_infl_ref.h 64
Math and Science / Re: A "new" compression format [subject to changes]« on: November 23, 2013, 04:54:38 am »
Alright, new benchmark results.
With sliding-window back-matches enabled, the compressed file (still with the same input, alice29.txt from the Canterbury Corpus, which is 152,089 bytes of ascii text) is now 56,052 bytes (this is not optimal, I haven't implemented things like lazy matches (yet?) in the compressor). That's about 55kb, with a high quality Deflate compressor you can get 51kb - but like I said, no lazy matches. More importantly, it takes about 0.5 miliseconds to decompress it (it varies a bit between 0.450 and 0.520, more often on the low side, but let's just round it up to 0.500). That's 290MB/s of output-throughput, or if you'd rather measure input-throughput, 106MB/s (that's not a common measure afaik, but there you go). I really expected it to get slower, since the histogram of match lengths showed the vast majority of them to be tiny, and tiny matches have a lot of overhead per byte. Will edit with code soon (it is at this point just test code though, not a usable product. Various paths are hardcoded. And the decompressor will only run on Haswell CPUs) 65
Other / Re: Wanting to get a desktop« on: November 19, 2013, 03:02:31 pm »
Don't defragment an SSD. There's no reason to do it (there is no seek latency so making files contiguous doesn't help), it's just pointless wear.
66
Other / Re: Wanting to get a desktop« on: November 19, 2013, 12:28:56 pm »what is a stock cooler?The cooler that comes with the CPU if you buy the boxed version (which you would buy). 67
Other / Re: Wanting to get a desktop« on: November 19, 2013, 12:18:37 pm »
How good "good enough" is depends on what you're doing with it, of course.
To play Skyrim dual-monitor, you're going to need a pretty nice GPU (or enjoy looking at pixels). To browse the internet, watch movies, and get rick rolled .. well that crap GPU in an i5 is plenty for that. Just make sure you can actually connect two monitors to the mobo then. How much RAM: well at least 4GB, but again it depends on what you're doing with it. VMs take a lot of RAM, and RAM doesn't cost anything really, so you might as well go for 8 to 16GB (more would be slightly silly, but most mobo's can take up to 32GB, if you're feeling silly) Good cooling system.. well for an i5 that you don't OC, you don't need much. Stock cooler will work of course, but you can go for something quieter if you want. 68
Other / Re: Wanting to get a desktop« on: November 19, 2013, 10:33:05 am »
AMD processors these days are cheap because they're not very good. They might be good in terms of perf/cost, but that only matters if you're going to buy a whole server farm full of them. You're only going to have 1 CPU, so you can't compensate for it being shit by just adding more CPUs.
If you buy a top of the line AMD CPU, it'll be about as good as an i5. Meanwhile, it will use about twice as much power (so your cooling system had better be prepared for that). You can OC it easily to match an i7, and then it will of course use even more power. Get an i5, or if you need more speed, an i7. Ok, on to other stuff. USB 3: just about every mobo supports it now. More monitors: get a good enough GPU, and look at the specs specific to that exact supplier(!!) and model. That is, a model from Sapphire doesn't necessarily have the same number of connectors as one from Club3D, ASUS, MSI, Gigabyte, you name it. That's one of the things they differentiate their prices on (other things being RAM and cooling and free games, etc) Storage: 1TB worth of SSDs (or a single one that big) is possible, but expensive. I'd get a nice but not very big one and put the OS and programs on it, dump big data on a slow hdd. Personally I only have a 256GB SSD and no conventional HDD, I don't store that much crap anyway. Other: don't forget to get a powerful enough PSU (modular, to avoid a mess of wires), look at cooling a bit (stock coolers are kind of meh, fans that come with cases are usually meh), when selecting a case, make sure it's deep enough for your GPU (high end GPUs are bloody long). 69
Math and Science / Re: A "new" compression format [subject to changes]« on: November 12, 2013, 03:56:02 pm »
I guess I'll do a quick example: suppose you have the data 4, 5, 5, 4, 0, 1, 2, 3, 4, 5, 5, EOS
The codelengths and codes assigned to those symbols will be 0: 3 001 1: 3 010 2: 4 0000 3: 4 0001 4: 2 10 5: 2 11 EOS: 3 011 So the message would be 10 11 11 10 001 010 0000 0001 10 11 11 011 That would be packed as 0xBE2806F6 or, as bytes, F6 06 28 BE (and a header of course but that one's simple) 70
Math and Science / Re: A "new" compression format [subject to changes]« on: November 12, 2013, 03:48:17 pm »
No they're Huffman codes, with longer ones and shorter ones.. the thing about the "9 rightmost bits" doesn't change that. That's just a property that those Huffman codes automatically have due to their construction and the size of the alphabet.
71
Math and Science / Re: A "new" compression format [subject to changes]« on: November 12, 2013, 12:03:06 pm »
Depends. On ARM calcs, no problem. But decoding the Huffman codes will probably suck on the z80 calcs.
72
Math and Science / Re: A "new" compression format [subject to changes]« on: November 12, 2013, 06:20:58 am »
Ok, here are the first benchmark results.
For the file alice29.txt from the Canterbury Corpus, using only Huffman compression and no matches (those don't quite work yet), the decompression time on my machine is (approximately and rounded upwards) 1.2 miliseconds. That's purely the decompression time, ie not counting file IO and such. The uncompressed file size is 152,089 bytes, the compressed file size is 87,784 bytes (it would be a lot smaller with matches, about 55kb, but they don't really work yet). That gives a output-throughput of 120MB/s. For reference, 7zip decompression speed (as measured by its built-in benchmarker) is about 50MB/s for a single core on my machine (limiting to a single core because my code is currently single-threaded, but that's not a limitation due to the format). I expect it to slow down when I enable matches, though. Long matches will be fast (as fast as memcpy, essentially), but the more common short matches have a lot of overhead. It isn't the most relevant comparison of course, LZMA doesn't even use Huffman compression at all. But it was the easiest comparison to make, and it's something. 73
TI Z80 / Re: Shutdown (a clone of Lights Out Deluxe)« on: November 11, 2013, 12:45:02 pm »
How do you ensure that the random instances are solvable?
74
Math and Science / Re: A "new" compression format [subject to changes]« on: November 09, 2013, 04:23:29 pm »
I'll benchmark it of course, but it'll be hard to filter out the contribution of the format. I already have an implementation mostly done.
75
Math and Science / A "new" compression format [subject to changes]« on: November 09, 2013, 03:17:32 pm »
I was trying to write a Deflate decompressor, and that's perfectly doable, but occasionally annoying. The "new" format I'm suggesting is basically Deflate with some changes. Some to make it less annoying, others because "why not". The goal is mostly "improved decompression speed".
The main structure is the same. Matches in a sliding window, with literals and match-lengths (and End of Block marker) in the same alphabet, and Huffman coding. But there are changes. The main change is that the Huffman codes are packed in dwords (specifically to help decoding with a 32bit "window"), starting at the most significant bit (helps with decoding). Those dwords are then saved with little-endian byte-order (to help x86 - ARM can work this more easily than x86 could in the reverse situation). They're aligned to their natural boundary, so there may be a little padding after the header (the point is to read in dwords, aligning them makes that more efficient on some architectures and only costs 3 bytes at worst). The Huffman codes are, of course, canonical, but a different kind of canonical than in Deflate: long codes must begin with zeroes. IE, using the rule "shorter codes have a numerically (if filled with zeros to the right) higher value than longer codes". This ensures that no more than 9 rightmost bits can ever differ from zero, which keeps your decoding tables small without trickery. The maximum length for a Huffman code is 31, up from 15. (that usually won't help, I'm throwing that in "because why not") Furthermore, the match-length symbols start at 256, with EndOfBlock assigned the code 288. This makes it slightly more convenient to index a table directly with the match-symbol. All 32 match-length symbols are used, following the same pattern of "extra bits" as in Deflate, but extended. The last 4 match-length-symbols have 6 "extra bits". The distance codes work like in Deflate64 (which is like Deflate, but with codes 30 and 31 being valid and getting 14 "extra bits"). The suggested way to decode this is to take two dwords, shift them left (with the bits of the second dword appearing in the lsb of the first dword) by some amount, count the leading zeroes (you can OR with 1 and use BSR since the maximum length of one symbol is 31 bits anyway), shift right by some amount depending on the number of leading zeroes, add some offset depending on the number of leading zeroes, then index a table with that. Alternatively, take a (possibly unaligned) qword, rotate it by 32 bits, then shift left (normally), count leading zeroes, etc.. The header is changed to this: if the first byte is 0, end of stream otherwise, the first dword (little-endian) has some bitfields: bit 0(lsb): 1 (constant) (to ensure the first byte is not zero) bit 1: 0 if the block is compressed, 1 is the block is stored uncompressed bit 2-31: length of this block if block is stored: just the bytes, raw. if block is compressed: dword (little endian): length of this block when decompressed uint16 (little endian): is_present, bit[n] (numbered from lsb up) indicates that the range of symbols [n * 16 .. n * 16 + 15] is part of the alphabet (1) or not (0) uint16 (little endian): is_default, bit[n] (see above) indicates that the range (see above) has its own 16bit mask (0) or is completely present (1) For every range that needs a mask (ie is present and not default), an uint16 (little endian) that indicates for every symbol whether it's part of the alphabet (1) or not(0) Two more masks for the match-lengths (these two masks are always present) Then, some bytes to store the code lengths: The lowest 5 bits indicate the code length (a length of 0 may be coded - enables you to drop a subrange mask and save a byte sometimes), the upper 3 bits are a repeat-count. If the repeat-count is 0, the repeat count is 8 + next_byte. [up to 3 bytes padding to align the next part] None of those changes are really spectacular, but IMO they all contribute a little bit to make the format less annoying. The part that I'm least sure about is the header. I'm also not sure whether allowing longer match distances is a mistake or not. Actually I'm not sure whether the whole deal for encoding matches is the right way to go or not. I couldn't come up with anything better, but it's a pain in the behind to parse. For most lengths `n`, parsing the match takes so much time that decoding `n` literals would have been faster. As always, suggestions are welcome (that's why I'm posting this in the first place). Especially if you've written a Deflate decompressor before (otherwise the changes might seem random). |
|