Author Topic: CalcRAR  (Read 6016 times)

0 Members and 2 Guests are viewing this topic.

Offline ACagliano

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 919
  • Rating: +32/-2
    • View Profile
    • ClrHome Productions
CalcRAR
« on: February 23, 2012, 09:36:57 am »
I believe that I once read about an idea by another programmer to make an Archiver of sorts, with the capacity to work with the DCS file structure. I learned on Cemetech that he put the project on hold. Reading this sort of made me want to work on a similar program myself, mainly for practice with different data types.

It'll work similarly to WinRAR for Windows and Stuffit Expander for Mac. That's why I name it CalcRAR. This program will create an archive file, a specially-formatted program with data that CalcRAR will read to determine what files to create, data to put in the files, and where to put the files. The file, after creation, will be compressed. When it is read, it will be decompressed. The best part, you can run this program just as you would anything else, given CalcRAR installed. The Archive is designed with the DCS AP feature, so when you run it, CalcRAR unpackages the Archive, automatically runs the program, then returns to CalcRAR, which removes the unpacked files and quits.

The reason I posted is because I need some advice. Is RLE the best compression to go with here? Or is another type better? And I may actually need some help implementing it.


Offline turiqwalrus

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 840
  • Rating: +51/-2
  • Wheeeeeee~!
    • View Profile
Re: CalcRAR
« Reply #1 on: February 23, 2012, 10:14:20 am »
well, the main problem here is that most options for compressing on-calc(especially if there's lots of features/fancy GUI, etc.) take up a lot of space, most of the time more than is saved by the compression of the program :P

Offline ACagliano

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 919
  • Rating: +32/-2
    • View Profile
    • ClrHome Productions
Re: CalcRAR
« Reply #2 on: February 23, 2012, 10:18:57 am »
well, the main problem here is that most options for compressing on-calc(especially if there's lots of features/fancy GUI, etc.) take up a lot of space, most of the time more than is saved by the compression of the program :P

Well, then what would you recommend? Just not compress it?

Offline turiqwalrus

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 840
  • Rating: +51/-2
  • Wheeeeeee~!
    • View Profile
Re: CalcRAR
« Reply #3 on: February 23, 2012, 11:31:59 am »
it might make more sense to compress it off-calc.
that would require computer access for every user though :\
I don't know...

Offline ACagliano

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 919
  • Rating: +32/-2
    • View Profile
    • ClrHome Productions
Re: CalcRAR
« Reply #4 on: February 23, 2012, 11:54:04 am »
Well, my idea was this...

The program is made, then copied to Archive. A second program is created in RAM. The first one is compressed and copied into the second. The first is then deleted. The second is the compressed Archive. That is how compression works.

Decompression I need to think about.

Offline DrDnar

  • LV7 Elite (Next: 700)
  • *******
  • Posts: 546
  • Rating: +97/-1
    • View Profile
Re: CalcRAR
« Reply #5 on: February 23, 2012, 01:01:17 pm »
I suggest looking up the DEFLATE standard, which is open and specified in an RFC, unlike RAR.
"No tools will make a man a skilled workman, or master of defense, nor be of any use to him who has not learned how to handle them, and has never bestowed any attention upon them. . . . Yes, [] the tools which would teach men their own use would be beyond price."—Plato's The Republic, circa 380 BC

Offline ACagliano

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 919
  • Rating: +32/-2
    • View Profile
    • ClrHome Productions
Re: CalcRAR
« Reply #6 on: February 23, 2012, 02:32:17 pm »
Well, I'm a bit confused as to application in z80. Would you perhaps be so kind as to write me a routine that compresses and decompresses a file? Or perhaps psuedocode it.
« Last Edit: February 23, 2012, 02:32:37 pm by ACagliano »

Offline shmibs

  • しらす丼
  • Administrator
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2132
  • Rating: +281/-3
  • try to be ok, ok?
    • View Profile
    • shmibbles.me
Re: CalcRAR
« Reply #7 on: February 23, 2012, 03:39:36 pm »
Would you perhaps be so kind as to write me a routine that compresses and decompresses a file? Or perhaps psuedocode it.
isn't that just asking for him to write the entire thing for you?
also, i was under the impression that more complex compression formats like this were compressed off-calc because the calculators have too little RAM to handle it. (see Iambian's pucrunch programs).
« Last Edit: February 23, 2012, 03:39:58 pm by shmibs »

Offline ACagliano

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 919
  • Rating: +32/-2
    • View Profile
    • ClrHome Productions
Re: CalcRAR
« Reply #8 on: February 23, 2012, 05:20:25 pm »
Well, I would like him to demonstrate, in pseudocode, the basic concept behind performing compression and decompression like that. There is more to my project than just that. :)

Yes, I know that the RAM limitation is an issue. However, I'm looking for a compression algorithm that can be used on calc. If it can't, it defeats the purpose of this project.

Offline thepenguin77

  • z80 Assembly Master
  • LV10 31337 u53r (Next: 2000)
  • **********
  • Posts: 1594
  • Rating: +823/-5
  • The game in my avatar is bit.ly/p0zPWu
    • View Profile
Re: CalcRAR
« Reply #9 on: February 23, 2012, 05:34:54 pm »
Would you perhaps be so kind as to write me a routine that compresses and decompresses a file? Or perhaps psuedocode it.
isn't that just asking for him to write the entire thing for you?
also, i was under the impression that more complex compression formats like this were compressed off-calc because the calculators have too little RAM to handle it. (see Iambian's pucrunch programs).

I think Iambian made a pucrunch compressor as an Axiom (or something similar to that).


For ACagliano, there are a couple different methods you could use, but RLE is definitely not the right choice. There are two main reasons you wouldn't use RLE. The first is that most programs on calc do not have long strips of the byte and the second is that if a program did have long amounts of the same number, the programmer would either compress it themselves or render that on the fly. What this means is that I would assume that RLE would make most programs larger

So, for a better alternative, so far in my life I've used LZ77 and DEFLATE. LZ77 is what's called a floating window where you search the past X bytes to see if you can find a match for the data your are currently looking at. X can be whatever you want, but for this project, since you probably won't be compressing stuff that won't fit in ram, you can just search the whole program for a match. DEFLATE uses the exact same principle as LZ77 except it also compresses the "data" used to tell the decompressor where to search for a byte.

What I would suggest is that first you take a look at LZ77 and get that running. Once you're happy with how that works (even if you're still increasing size when you compress), you can then make a few slight modifications and you'll be using DEFLATE, at which point you'll probably get some nice ratios.


If you want some source examples, I've done LZ77 and DEFLATE compression on the computer and I've decompressed them both on the calculator. (Though DEFLATE was done in TruVid so the code is rather scary (and runs at $F000)). The only problem with my code is that I made up my own data format to fit my needs.

And for information on these two, the LZ77 Wikipedia page is probably enough to get you going. For DEFLATE this is the page I used. (You can't make up DEFLATE like you can LZ77)
« Last Edit: February 23, 2012, 05:37:01 pm by thepenguin77 »
zStart v1.3.013 9-20-2013 
All of my utilities
TI-Connect Help
You can build a statue out of either 1'x1' blocks or 12'x12' blocks. The 1'x1' blocks will take a lot longer, but the final product is worth it.
       -Runer112

Offline ACagliano

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 919
  • Rating: +32/-2
    • View Profile
    • ClrHome Productions
Re: CalcRAR
« Reply #10 on: February 23, 2012, 05:45:10 pm »
Thanks, thepenguin. Instead of showing me the source code, can you perhaps try to give me an example stream of data, and then that same data compressed in LZ77. From there, perhaps I could decipher it.

Offline thepenguin77

  • z80 Assembly Master
  • LV10 31337 u53r (Next: 2000)
  • **********
  • Posts: 1594
  • Rating: +823/-5
  • The game in my avatar is bit.ly/p0zPWu
    • View Profile
Re: CalcRAR
« Reply #11 on: February 23, 2012, 06:42:49 pm »
LZ77 is good for really big amounts of data, and it gets rid of byte boundaries, but here's a small example.

First of all, since you'll be working with sizes no bigger than calculator ram, you should probably make your sliding window just big enough to include the whole program, (2,048, 4,096, 8,192, 16,384). So, for this example, I'll go with 2,048 (11 bits). Also, for max match size, I'll use 7 bits.

Also, 0 will mean uncompressed and will be followed by the number of uncompressed bytes (7 bits) and 1 will mean compressed and will be followed by search distance then size.

Key:
(1 bit)
8 bits
[11 bits]
{7 bits}

00 01 02 03 04 05 06 07 08 09 00 00 06 07 08 09 00 00 06 02 03 04 05 06 08 09 04 03 06 06 06 06 06 06 06 06 06 06 04 04 05 08 09 09 01 01 01 02 03 04 05

(0) {12} 00 01 02 03 04 05 06 07 08 09 00 00 (1) [6] {7} (1) [17] {5} (0) {5} 08 09 04 03 06 (1) [01] {9} (0) {8} 04 04 05 08 09 09 01 (1) [45] {5}

That wasn't easy. In this small example, even though we didn't fully make use of all the bits we had in the size fields, and this data was super short, we still went from 408 bits to 292 bits.

Also, if you manage to follow what happened there, you'll see that this method actually encompasses RLE ;D
« Last Edit: February 23, 2012, 06:44:13 pm by thepenguin77 »
zStart v1.3.013 9-20-2013 
All of my utilities
TI-Connect Help
You can build a statue out of either 1'x1' blocks or 12'x12' blocks. The 1'x1' blocks will take a lot longer, but the final product is worth it.
       -Runer112

Offline ACagliano

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 919
  • Rating: +32/-2
    • View Profile
    • ClrHome Productions
Re: CalcRAR
« Reply #12 on: February 24, 2012, 08:41:54 am »
I'm sorry thepenguin, but I am sooo lost.

Offline shmibs

  • しらす丼
  • Administrator
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2132
  • Rating: +281/-3
  • try to be ok, ok?
    • View Profile
    • shmibbles.me
Re: CalcRAR
« Reply #13 on: February 24, 2012, 12:30:30 pm »
in his compressed data, if a bit is a 0, it will tell the decompressor that the next section of bytes in the compressed stream is a string of uncompressed bytes. the next 7 bits in that same byte indicate the number of uncompressed bytes that will follow. the first uncompressed section in his code is this:
(0) {12} 00 01 02 03 04 05 06 07 08 09 00 00
with a 0 bit indicating that it's uncompressed, and the next 7 holding the number 12, indicating 12 bytes of uncompressed data will follow.

after that, the decompressor checks for a 0 or 1 in the next bit again. if it sees a 1, it knows that the next session is compressed, in that, rather than storing uncompressed data, it checks the number stored in the next 11 bits to find the point in the data that is being pointed to. it then checks the 7 bits after that, which hold the number of bytes to be copied from that spot in the data.

in his example, the first compressed section is this:
(1) [6] {7}
firstly, i'm fairly certain that this should be
(1) [7] {6}
instead, which says, move into the compressed data 7 bytes, and copy the next 6 bytes to that point in the uncompressed data.
he's taking advantage that this section:
06 07 08 09 00 00 06 07 08 09 00 00
has the same string of bytes occur twice in a row to have it only occur once in the compressed data, and just refer back to it for the second one.

i hope that helps a bit. =)
« Last Edit: February 24, 2012, 01:13:28 pm by shmibs »

Offline ACagliano

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 919
  • Rating: +32/-2
    • View Profile
    • ClrHome Productions
Re: CalcRAR
« Reply #14 on: February 24, 2012, 12:40:35 pm »
All it helps me to understand is the first bit. lol

So, bit 1 is the compressed/uncompressed status.
Bit 2-8 is the length of the data string.
What is the 11 and the other?
« Last Edit: February 24, 2012, 12:43:50 pm by ACagliano »