If you're dealing with the parser, I'm not sure how much of a speedup you can really expect to get by avoiding B_CALLs. But anyway...
I assume by DelAllocFPS you mean DeallocFPS. That routine is very, very simple: it multiplies the input (the number of entries) by 9, the size of one FPS entry in bytes, then subtracts the result from the current value of (FPS). It doesn't check that there are actually that many entries allocated on the FPS. So you could simply do
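ld hl, (FPS)      ; HL = current top of the FPS
ld de, -18        ; 2 entries * 9 bytes each
add hl, de        ; move the pointer down by 18 bytes
ld (FPS), hl      ; store the new top of the FPS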
(Depending on the situation, it might be good to verify that you're not popping past the start of the FPS, but the OS routines don't do that.)
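If you do want that check, here is a rough sketch of one way to do it. It assumes that fpBase is the system variable holding the address of the start of the FPS (that's the name I'd expect in the usual ti83plus.inc, but check your include file), and FpsUnderflow is just a made-up label for whatever error handling you want.

```asm
    ld hl, (FPS)        ; HL = current top of the FPS
    ld de, -18          ; 2 entries * 9 bytes each
    add hl, de          ; HL = proposed new top of the FPS
    ld de, (fpBase)     ; DE = start of the FPS (assumed variable name)
    or a                ; clear carry before the 16-bit subtract
    sbc hl, de          ; HL = new top - start; carry set if new top < start
    jr c, FpsUnderflow  ; hypothetical label: would pop past the start
    add hl, de          ; undo the subtract, HL = new top again
    ld (FPS), hl        ; commit the new top of the FPS
```

This costs a few extra bytes and cycles over the unchecked version, so it's probably only worth it if you can't otherwise be sure that two entries are on the stack.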