HP Calculators / The .hpprgm format
« on: August 05, 2014, 04:41:36 am »
I spent some time reverse engineering the .hpprgm format (the format of the user programs of the Prime).
MAIN HEADER:
0x0000-0x0003:
size of the header, excludes itself
(so the next header begins at size+4)
0x0004-0x0005:
Amount of variables in table
0x0006-0x0007:
Amount of something? (Perhaps views or something?)
0x0008-0x0009:
Amount of exported functions in table
0x000A-0x000F:
unneeded?
Conn. kit generates
7F 01 00 00 00 00
but all zeros seems to work too.
0x0010-0x----:
Exported item table.
Entry format is as follows:
Type of item:
30 00 for variable,
31 00 for exported function
Name of item:
UTF-16, until 00 00 00 00
Then the next entry follows.
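A minimal Python sketch of reading the main header as laid out above. The function name and the returned dictionary are made up for illustration. It assumes the three counts are little-endian 16-bit values and that names are UTF-16 LE aligned to 2 bytes; neither has been checked against a wide range of files.

```python
import struct

# Item type markers from the exported item table described above.
ITEM_KINDS = {b"\x30\x00": "variable", b"\x31\x00": "function"}

def parse_main_header(data):
    """Parse the main header of a .hpprgm file (unverified sketch)."""
    (size,) = struct.unpack_from("<I", data, 0)
    # Assumption: the counts are little-endian unsigned 16-bit integers.
    n_vars, n_unknown, n_funcs = struct.unpack_from("<HHH", data, 4)
    end = 4 + size  # the size excludes itself, so the next block starts here
    items = []
    pos = 0x10
    while len(items) < n_vars + n_funcs and pos + 2 <= end:
        kind = ITEM_KINDS.get(bytes(data[pos:pos + 2]), "unknown")
        pos += 2
        # Scan in 2-byte steps so the search stays aligned to UTF-16 units.
        stop = pos
        while stop + 4 <= end and data[stop:stop + 4] != b"\x00\x00\x00\x00":
            stop += 2
        items.append((kind, data[pos:stop].decode("utf-16-le")))
        pos = stop + 4  # skip the 00 00 00 00 terminator
    return {"next_offset": end, "n_vars": n_vars,
            "n_funcs": n_funcs, "items": items}
```

The `next_offset` value is where the first VARIABLE VALUES block (or the code header, if there are no exported variables) should begin.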
VARIABLE VALUES:
(There are as many blocks of this type as you have exported variables and the
blocks are in the same order as the exported variables)
0x0000-0x0003:
size of the value, excludes itself
0x0004-0x0005:
01 00 for detecting that this is a list
02 00 for single value entry
IF single value entry:
0x0006-0x0007:
type:
10 01 for base-10 integer or float
11 20 for base-16 integer
12 02 for string
IF base-10 integer or float:
0x0008-0x000B:
exponent
signed little endian 32-bit integer
0x000C-0x0013:
mantissa
little endian packed BCD (binary-coded decimal): each hexadecimal digit is to be read as a decimal digit.
00 00 00 00 00 00 00 25 01 is supposed to be 1.25 in decimal,
00 00 00 00 00 00 00 28 06 -> 6.28 and so on
The value is mantissa*10^exponent
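A rough Python sketch of the decoding described above. The function names are hypothetical. The position of the decimal point (right after the most significant byte) is a guess inferred from the two examples only, and negative numbers are not handled because the notes above do not say how the sign is stored.

```python
import struct
from decimal import Decimal

def decode_bcd_mantissa(raw):
    """Read little-endian bytes whose hex digits are decimal digits."""
    digits = bytes(raw)[::-1].hex()  # most significant byte first
    if not digits.isdigit():
        raise ValueError("mantissa contains non-decimal hex digits")
    # Assumption: the most significant byte is the integer part and the
    # remaining bytes are the fraction, as in "25 01" -> 1.25.
    return Decimal(digits[:2] + "." + digits[2:])

def decode_decimal(exponent_bytes, mantissa_bytes):
    """Combine exponent and mantissa: value = mantissa * 10^exponent."""
    (exponent,) = struct.unpack("<i", exponent_bytes)  # signed 32-bit LE
    return decode_bcd_mantissa(mantissa_bytes) * Decimal(10) ** exponent
```

`Decimal` is used instead of `float` so that the decimal digits survive unchanged.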
IF base-16 integer:
0x0008-0x000B:
02 00 00 00 (why?)
0x000C-0x0013
55 63 62 00 00 00 00 00 becomes #626355
25 06 00 00 00 00 00 00 becomes #625
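Both examples above are consistent with a plain little-endian unsigned integer. A small Python sketch, with a made-up function name, assuming that reading holds for all values:

```python
def decode_hex_integer(raw):
    """Decode the 8 value bytes of a base-16 integer entry."""
    # Assumption: the field is an unsigned little-endian integer.
    value = int.from_bytes(raw, "little")
    # Return the number and the "#..." form shown in the examples above.
    return value, f"#{value:X}"
```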
IF string:
0x0008-0x0009:
Length of string in characters, excludes the trailing 00 00
0x000A-:
string itself, ends in 00 00
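A short Python sketch for the string case. The function name is hypothetical, and it assumes the length is a little-endian 16-bit count of UTF-16 code units. The offsets are relative to the start of the value block, including its 4-byte size field.

```python
import struct

def decode_string_value(block):
    """Extract the string from a single-value block of type 12 02."""
    # Assumption: the length is an unsigned little-endian 16-bit count
    # of UTF-16 code units, not bytes.
    (length,) = struct.unpack_from("<H", block, 0x08)
    start = 0x0A
    raw = block[start:start + 2 * length]  # trailing 00 00 is left out
    return raw.decode("utf-16-le")
```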
IF list:
0x0006-0x0007:
(Tends to be 16 00, 16 01 or 16 02)
0x0008-0x0009:
32-bit LE
Amount of members in list, let's call this N
0x000A-0x000B:
(Probably reserved for something, 7F 01 or 00 00)
N 4-byte values:
Actual list of values follows, they are in reverse order compared to what
they are in the source.
An entry in the list follows this formula:
Stuff gets clever and recursive, every entry in here is handled like a
"VARIABLE VALUE" itself minus the size integer at the beginning
CODE HEADER:
0x0000-0x0003:
size of the header, excludes itself
0x0004-:
Code in UTF-16 LE until 00 00
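Pulling the source text out of the code section could look roughly like this in Python. `read_code` and its `offset` parameter are made-up names; `offset` is wherever the code header starts after the last variable value block.

```python
import struct

def read_code(data, offset):
    """Return the program source stored in the code section."""
    (size,) = struct.unpack_from("<I", data, offset)
    start = offset + 4                      # the size excludes itself
    end = min(start + size, len(data))      # do not read past the file
    # Scan in 2-byte steps for the 00 00 terminator.
    stop = start
    while stop + 2 <= end and data[stop:stop + 2] != b"\x00\x00":
        stop += 2
    return data[start:stop].decode("utf-16-le")
```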
Sorry if this is common knowledge already, I did try to search the forums (no results for hpprgm). I did find this: http://tiplanet.org/hpwiki/HP_Prime/File_Format#User_BASIC_programs, but it seems that the information there is true only for some very simple cases. I have no wiki editing skills so I can't update that.
I hope someone else finds this useful and that someone continues the reverse engineering of the format (if that stuff truly is bytecode, it could be very useful). My own motivation has dropped due to other, more interesting projects unrelated to calculators, which is why I stopped working on this.
EDIT: Updated. That stuff is not bytecode; those are the encoded objects, like Tim Wessman said.