Author Topic: Compressing maps (Read 10423 times)

sjasogun1 · « **on:** March 27, 2013, 09:47:26 am »

Most people who program games on their calculator have encountered them: maps. They can take many shapes, sizes and forms, but they always have one thing in common. They occupy a lot of memory. Now, I think I have found a solution to this problem or at least an improvement from the current situation. It mainly revolves around using Matrices to store variables for 2D maps in.

Normally a matrix is very handy for storing numbers in that represent something on the map without having to worry about storing the coordinates for you since matrices have the coordinates built into them. The problem is that matrices tend to take up a lot of memory as they get bigger. For example, on my calculator, an fx-9860GII, a small 10x10 matrix takes 1236 bytes, and a 7x21 matrix that has the dimensions of the screen as far as ASCII goes takes 1800 bytes. And that is just a single matrix. A game will need many more to be of any use as a single screen-sized level won’t allow you to make much of a game that actually needs maps. Storing it in an array doesn’t make any difference either: it still takes 1800 bytes on my device (though it may save you a little bit on another device).

The solution I propose is as follows. You start by generating a list of prime numbers. You probably won’t need much more than 10 for simple maps that only include walls and have the exit hard-coded into them (meaning it is represented as an ‘empty’ square in the matrix but the code has something happen when the player stands on the spot where the exit is), but for maps that have more features on them you’ll need to generate more prime numbers as well (how many exactly will be detailed further in the explaination.)

Now, start by making yourself a map, probably something simple like a maze because you only have walls to work with for now. Put 1 in the matrix for any empty square and the first prime number on your list, which should be 2 if you did it correctly, for any wall. The reason for not using 0 as the 'empty' squares will become obvious in a minute. Now we have our first map. Nothing much, right? A player should be able to solve this maze in well under a minute. So, we’ll need more mazes. But instead of making another matrix to store the next maze in we’ll store it in this one, without changing it dimensions.

To do this you take your list of prime numbers. The first prime number is 2, but we already have 2 in our maze. So we’ll start with the second prime number: 3. Now, to make the process of making the new maze easier make yourself a new matrix. Make another maze of the same size as the previous one here, but this time use 3 for the walls instead of 2. Empty squares are still 1. Now, when you’ve finished, you multiply each square in the matrix with the matching square in the other one. So, you multiply A[1,2] with B[1,2] and A[5,12] with B[5,12]. This multiplication is why empty squares shouldn't be 0; multiplying with them would always give you zero. Before you start the multiplication, there is one important precaution to take: multiplying each square in a matrix with its corresponding square in another matrix is not the same as multiplying the two matrices. You’ll have to write a double loop in order to get the multiplication right.

Now, you end up with a matrix that looks like garbage. It has 1,2,3 and 6 scrambled all over it. The trick is that this matrix contains both of the original maps. Extracting them goes as follows: If you want the first map you take the first prime number (2) and go over each square of the combination matrix to see if the number in that square is divisible by 2. If so, count that square as a wall. If not, count that square as an empty square. If you draw the result on the screen you’ll see that it has produced your original maze again. You can extract the second maze in the same way, with the difference that you’ll take the second prime number (3) as your divisor. You can make many more mazes and multiply the squares in them with those in the combination matrix to store them all in the same one, going over the rest of the prime numbers. So 5 will be the wall in the third maze. 7 the wall in the fourth maze, 11 the wall in the fifth maze and so on.

This neat little trick makes use of one of the useful properties of prime numbers: every number has a unique set of prime factors. This means that if you take a random number like 59175839719 and figure out how to divide it into prime numbers there will be no other product of prime numbers that will produce the same number. In the compressed matrix this is easy to see. No prime number is divisible by 2 except 2 itself since the basic definition of prime numbers is that they can only be divided by themselves and 1. So, only the squares that were labeled with 2 in the first maze will ever be divisible by 2 since we only multiply by prime numbers. The same goes for all of the other prime numbers we used.

The only downside of this technique is that, in its current form, it only works for maps that have only two different kinds of squares: empty ones and walls in this case. There is an easy way to add more types of squares to the map though: simply skipping prime numbers. For example, in the first maze you’d use 1 for empty squares, 2 for walls, 3 for pitfalls that put you back a level and 5 for invisible walls. In the second maze you’d then use 1 for empty squares, 7 for walls, 11 for pitfalls and 13 for invisible walls. The only downside of this is that you need more prime numbers for more different features in your maps, scaling linearly for the amount of features. 1*10 prime numbers will allow you to make 10 maps with 1 feature (walls), 2*10=20 prime numbers will allow you to make 10 maps with 2 features (walls and pitfalls) and 3*10=30 prime numbers will allow you to make 10 maps with 3 features (walls, pitfalls and invisible walls). Luckily this only scales linearly so adding a feature will only add as many prime numbers as you want maps. Another advantage is that arrays (called Lists on my fx-9860GII) and matrices will usually reserve the same amount of memory for every element: enough to hold a number up to 9.99*10^99. This means that having a value like 6469693230 in your matrix (the product of the first 10 prime numbers) in your list won’t take any extra memory, effectively making the limit 53 prime numbers (after which the combined product of them, which is the worst case scenario, exceeds 10^99). Of course, this limit only applies to maps that only use walls: in maps with two features 2 and 3 can’t be multiplied, nor can 5 and 7 or 11 and 13, thereby raising your limit to 91 primes (took me a while to calculate that

). The limit to the amount of maps you can make does shrink as you add more features (from 53 to 45 in this case) but it won’t shrink too fast, especially not when you consider that adding more features gives maps potential to be harder thus requiring less to make an interesting game.
Also, if you need to store more maps you can simply make a second matrix in which you can re-use your prime numbers to store even more maps, thus reducing your memory load by a factor of 30~50, depending on the amount of features you add (not 30~53 as the code for generating primes, the storage for the primes and the code for extracting a map all take extra space as well, but not much and the 'memory reduction factor' only increases as you add more maps).

I hope you all like my idea and can put it to some use!

Builderboy · « **Reply #1 on:** March 27, 2013, 12:03:31 pm »

This sounds like a very intriguing new way to save memory! You better watch out though, since even though the maximum value for a number in a matrix is 10^99, the numbers are stored in floating point format. What this means is that even though you can store very large numbers, the precision of them will only be about the last 14 digits or so (depending on your calculator model). This will result in extremely large numbers being seemingly divisible by numbers they cannot be divisible by. For example, a large enough power of 2 will return 0 when you ask for remainder 3.

Xeda112358 · « **Reply #2 on:** March 27, 2013, 01:35:50 pm »

Another method that you might be interested in works like this:
Say you have 10 different tiles and numbers are stored with 14 digits of precision. You can then store each map layer with the nth digit in the number. To access the map in TI-BASIC code:

Code: [Select]

10fPart(10^-N[A]

If you aren't familiar with TI-BASIC, fPart( simply grabs the decimal part of a number (leaving off the integer part). If the language you are using doesn't have that, there is probably some equivalent to a floor() function (round down to the nearest integer) and you would essentially do #-floor(#).

If you have M different tiles, you can theoretically store up to floor(14/log(M)) different maps in a single matrix element, this way. However, I would go with 1 less than that to avoid precision problems. Your code for extracting a given matrix would be to replace the 10 in the above code with M.

I hope to see more ideas

Encoding data with primes and properties of numbers can have some pretty interesting effects.

Streetwalrus · « **Reply #3 on:** March 27, 2013, 02:05:35 pm »

Quote from: Xeda112358 on March 27, 2013, 01:35:50 pm

Encoding data with primes and properties of numbers can have some pretty interesting effects.

Yeah true, that's how compression works : complex math algorithms to gain a lot of space.

zeldaking · « **Reply #4 on:** March 27, 2013, 05:05:55 pm »

I have a question about this for Ti-basic. Matrices do take longer to display because you have to change each number back into a corresponding character correct? So to speed things up we could use strings, but similarly string take up memory especially when having many strings. Is there anyway to "compress" a string similarly to this matrices way, while still retaining speed?
----
Oohh I just came up with a good idea while thinking of some stuff.
Strings do take less memory than matrices, so say take a string: "123456789" comparable to the matrix of: [[1,2,3][4,5,6][7.8.9]]. So we take our string and use that prime factor "compression" talked about above. We take the matrix and put them instead into a string, "12121212" and whatever prime we are using and do the same way as the matrix. But instead using Substr( and converting to the correct character. I am not sure if that is faster, probably more confusing for map collision testing but oh well. Just a thought.

MGOS · « **Reply #5 on:** March 27, 2013, 05:51:26 pm »

well, the problem with that is the fact that the values each character can take are very limited. If you're using a LUT-String with InString( you can get to 256 different (probably less) values per character. Since 2*3*5*7 is already 210, 4 maps is the maximum to store in a string using the prime factor compression. I'm not sure whether strings are still smaller than matrices when you use two characters for each square. And it would be real mess to code.
In case you only need to store two values (1's and 0's) in each map, a binary compression would be a lot more efficient. Each bit of a number would be the value of the square in the corresponding map. With binary compression, you could store up to 7 (maybe 8 ) maps in a string, and a lot more in the matrices. When more different tiles are needed, it can be rather easily changed that a group of 2, 3 or 4 bits represent one tile (same method; explained in the initial post of sjasogun1).

sjasogun1 · « **Reply #6 on:** March 28, 2013, 06:44:09 am »

Quote from: Builderboy on March 27, 2013, 12:03:31 pm

This sounds like a very intriguing new way to save memory! You better watch out though, since even though the maximum value for a number in a matrix is 10^99, the numbers are stored in floating point format. What this means is that even though you can store very large numbers, the precision of them will only be about the last 14 digits or so (depending on your calculator model). This will result in extremely large numbers being seemingly divisible by numbers they cannot be divisible by. For example, a large enough power of 2 will return 0 when you ask for remainder 3.

You're absolutely right! I just tested it, and 14 digits is indeed the maximum precision for my model, which lowers the limit for 1-feature maps to 12 and 10 for 2-feature maps. A bit of a bummer, but still a significant gain, especially when considering that you can always take another matrix to store a dozen extra maps.

Quote from: Xeda112358 on March 27, 2013, 01:35:50 pm

Another method that you might be interested in works like this:
Say you have 10 different tiles and numbers are stored with 14 digits of precision. You can then store each map layer with the nth digit in the number. To access the map in TI-BASIC code:
Code: [Select]
10fPart(10^-N[A]If you aren't familiar with TI-BASIC, fPart( simply grabs the decimal part of a number (leaving off the integer part). If the language you are using doesn't have that, there is probably some equivalent to a floor() function (round down to the nearest integer) and you would essentially do #-floor(#).

If you have M different tiles, you can theoretically store up to floor(14/log(M)) different maps in a single matrix element, this way. However, I would go with 1 less than that to avoid precision problems. Your code for extracting a given matrix would be to replace the 10 in the above code with M.

I hope to see more ideas Encoding data with primes and properties of numbers can have some pretty interesting effects.

That's a very interesting method! There are only two small downsides it really has: one is being limited to a maximum of 10 features before it reduces your storage capability by half. But when you use two decimal places for each tile instead of 1 you can store as many as 100(!) different tiles, allowing you to store maps that are more artistic, while still reducing your storage by a factor of 6 (I take 12 decimal places as the limit just to make sure I don't go over it, otherwise the factor would be 7 (14/2)). The second is that this storage method is far less efficient when you use it with maps that don't need 10 different features, but even so it is more efficient than my method since it allows for up to 13 maps to be stored in a single matrix as opposed to my 12 maps which is only in the best-case-scenario of using only 1 terrain feature.

Hmmm... I think I might change the first post a bit to make this thread more of a discussion for different methods to compress maps, and perhaps even to simply 'compression methods' if the need to discuss compression for data other than maps arises.

Xeda112358 · « **Reply #7 on:** March 28, 2013, 10:57:33 am »

Quote from: sjasogun1 on March 28, 2013, 06:44:09 am

Quote from: Xeda112358 on March 27, 2013, 01:35:50 pm
Another method that you might be interested in works like this:
Say you have 10 different tiles and numbers are stored with 14 digits of precision. You can then store each map layer with the nth digit in the number. To access the map in TI-BASIC code:
Code: [Select]
10fPart(10^-N[A]If you aren't familiar with TI-BASIC, fPart( simply grabs the decimal part of a number (leaving off the integer part). If the language you are using doesn't have that, there is probably some equivalent to a floor() function (round down to the nearest integer) and you would essentially do #-floor(#).

If you have M different tiles, you can theoretically store up to floor(14/log(M)) different maps in a single matrix element, this way. However, I would go with 1 less than that to avoid precision problems. Your code for extracting a given matrix would be to replace the 10 in the above code with M.

I hope to see more ideas Encoding data with primes and properties of numbers can have some pretty interesting effects.

That's a very interesting method! There are only two small downsides it really has: one is being limited to a maximum of 10 features before it reduces your storage capability by half. But when you use two decimal places for each tile instead of 1 you can store as many as 100(!) different tiles, allowing you to store maps that are more artistic, while still reducing your storage by a factor of 6 (I take 12 decimal places as the limit just to make sure I don't go over it, otherwise the factor would be 7 (14/2)). The second is that this storage method is far less efficient when you use it with maps that don't need 10 different features, but even so it is more efficient than my method since it allows for up to 13 maps to be stored in a single matrix as opposed to my 12 maps which is only in the best-case-scenario of using only 1 terrain feature.

Hmmm... I think I might change the first post a bit to make this thread more of a discussion for different methods to compress maps, and perhaps even to simply 'compression methods' if the need to discuss compression for data other than maps arises.

If you use the method explained a little later, you can use a map with something like 5 features and get 20 maps in one matrix. With only 2 features, you can get 46 maps in one matrix because 14/log(2) is about 46.5. I used 10 as an example because it would be easier to think of each layer as a digit in base 10 instead of a digit in base M

If you wanted to use up to 30 tiles, you could get floor(14/log(30))=9 maps.

But I did think of another way to get a few more maps, just now. Floating Point numbers usually have a separate bit to store whether it is negative or positive, so you actually probably have 2*10¹⁴-1 numbers to work with. That is how it works on the TI calculators, anyway (10¹⁴-1 negative numbers, 10¹⁴-1 positive numbers, and 0).

Ooh, I just had an idea for quickly checking (with your method). For example, a 4-feature map will use 2,3 for the first layer, 5,7 for the second, and so on. Then to check layer 1, simply use gcd() of the matrix element and 6. The second layer, use the matrix element and 35 and so on. Then you just check if the value is {1,2,3,6} or {1,5,7,35} et cetera, corresponding to the tile.

Another neat method for storing maps is to use dynamic tiles that change based on the tiles it is in contact with. For example, a trick I used in TI-BASIC for storing map data for a maze was to store to a picture (black and white) the map. When drawing a tile, it would test the corresponding pixel, then check above, below, and to either side to see precisely which tile was needed. In this way, you could store a tilemap using a single bit for each tile, using 16 tiles, which is a rather crazy level of compression. In normal circumstances, you need at least 4 bits for each tile if you have 16 tiles (2^4=16). When I took remade it in assembly, it compressed a 32*32 map (1024 tiles) to 128 bytes, using 16 unique tiles. In pure TI-BASIC I think the best compression I could get would be using a 19-element list (a little over 171 bytes).

Here is a screenshot of one of the many offshoots I made using this technique:

sjasogun1 · « **Reply #8 on:** May 11, 2016, 01:11:48 pm »

Holy necropost batman!

Yeah yeah, I'm resurrecting this from its grave, but since the last time I visited here I've studied Computer Science at a university for a couple of years so I wanted to make an addendum to this technique in case anybody is still interested. Note that this is still interesting no matter what kind of calculator you are using, even if you don't use matrices.

So, the entire point of this compression in the first place was to reduce the wasted space from numbers having a much larger range than what you are actually making use of. My initial idea was to use prime numbers, but even that's wasteful is what I've found now that I have more experience with this sort of stuff. I won't touch on other compression techniques that rely on encoding here as a quick search on the site for "compression" already reveals plenty of topics about that.

So, the best way to actually do this is to use binary numbers instead. Take the number of tiles you have as N₁ and calculate B₁ = Floor(log₂(N₁)) as the number of bits you need to store your tiles. There's no need to use this number just yet, just store your tiles as the regular numbers in a data structure of your choice, like matrices. Be warned though that this only works if your tile id's start at 1 and increment by 1 - so for example if you have N=6 then the ID's for the tiles would be in the inclusive [1...6] range, leaving the 0 spot as a special 'this is not a tile' case that'll allow you to stack maps of different sizes using this method.

The trick kicks in when you want to store additional maps. Calculate the same B-value for your second map, so B₂ = Floor(log₂(N₂)). Now calculate the sum of all previous B-values, B_sum,1 = B₁ in this case since there is only one previous B-value. Make sure not to count the B-value of the current map! Now, multiply all of your tile ID's (again, the same range should be used) by 2^B_sum,1 and add each of these adjusted tile ID's to the ones of the first map. Of course, you can keep repeating this for more maps, for example with B_sum,2 = B₁ + B₂ for the large map. Now you can retrieve any map from this compound map by dividing by 2^{B_{sum,[map number - 1]}} (the same number the original ID's were multiplied by) and taking the Floor() of the result. So for the second map this would be Floor(ID / 2^B_sum,1)

The way this works is by relying on binary numbers. If you don't know what these are I'm sure there are good primers on them somewhere on the internet so I'm not gonna bother explaining them right here. Basically what this math does is calculate what the least amount of bits is that can store all your tile ID's (B) and shift every consecutive map over by the sum of all B-values for the previous maps.
So let's take N₁ = 6 and N₂ = 21 as an example. B₁ = 3 which is correct since the highest ID would be 6 which is 110 in binary, taking up the predicted 3 bits. B₂ = 5, also correct since 21 is 10101 in binary. Now multiplying by 2 in binary has the interesting property of shifting all the bits one space to the left - 6*2=12 and indeed 12 in binary becomes 1100. So when multiplying the tile ID's for the second map by 2^B_sum,1 = 2^B₁ = 2³ all the bits will shift three spaces to the left, ensuring that there are always three unused bits all the way on the right - which are precisely filled by the bits that form the tile ID's of the first map! No matter which tile ID's are transformed this way those bits will never be overwritten, thus preserving both maps. That this works for multiple maps is easy to see, as the next map will have its ID's shifted B_sum,2 = B₁ + B₂ = 3 + 5 = 8 places to the left, leaving room for both previous maps.

This is much more efficient than the prime number conversion, and simpler to boot since you don't have to make a prime number calculator and can simply use built-in logarithms for the math instead. As a comparison, the prime-number method could only store 10 maps with 2 features this way, while this method can store 46 bits in total with 14 decimals of precision in the floating point numbers, which means a whopping 23 maps with 2 features as opposed to the 10 from the old method (Or 46 if you use 0 as traversible space instead of a special 'this is outside of the map' value)

This method of compression can be used for other purposes as well - you could even use it on the table of N-values for each map to combine all of those! Simply treat each N-value as a map with only 1 tile ID and use the value itself for the N-value of this imaginary map and you're set! You could store all N-values for all maps that fit in one matrix/other data structure in a single number this way if you wanted to. And for the true geeks who want to compress even further you could even use other compression techniques on this already compressed map to compress it even further, though the gains will be lessened as is usually the case when you're stacking compression methods.

Well, there's my first post to omnimaga in I think over three years. If this is too necro-y feel free to just move this to another thread altogether, but I felt that it was appropriate to post this improved version of a trick I came up with three years ago (but which has been widely known in the much improved version in this post to most computer scientists for decades) in the same thread I originally posted it. Hope this helped someone!

Geekboy1011 · « **Reply #9 on:** May 11, 2016, 03:56:42 pm »

That's quite a nice writeup! I'm sure someone will make use of it seeing as this whole topic is interesting! Also definitely the right place for it. Nice to have you around again!

Author Topic: Compressing maps (Read 10423 times)

sjasogun1

Compressing maps

Builderboy

Re: Compressing maps

Xeda112358

Re: Compressing maps

Streetwalrus

Re: Compressing maps

zeldaking

Re: Compressing maps

MGOS

Re: Compressing maps

sjasogun1

Re: Compressing maps

Xeda112358

Re: Compressing maps

sjasogun1

Re: Compressing maps

Geekboy1011

Re: Compressing maps