Author Topic: Interesting data structures and algorithm you may not have heard about  (Read 6406 times)


Offline harold

  • LV5 Advanced (Next: 300)
  • *****
  • Posts: 226
  • Rating: +41/-3
    • View Profile
I'll start out with some stuff I actually use that may not be widely known ("widely known" being stuff such as red-black trees, array-based heaps, circular buffers, hash tables, linked lists, you name it - first-year CS material):
  • Binary Decision Diagrams solve all sorts of problems that you thought were hard or impossible, such as boolean function equivalence, boolean satisfiability (and in fact counting the number of solutions, in time independent of the count), and Linear Boolean Programming. I've used BDDs with great success in many Project Euler problems (usually the "count solutions" part of BDDs) that were designed to require some mathematical insight. Great for "brute force - but smart".
  • Zero-Suppressed BDDs - a variation of BDDs especially suited to sparse functions, such as when solving some Exact Cover (or regular Set Cover) problems (not when choosing a set has very non-local results, such as with Sudoku). You can get a count of solutions, pick a uniformly distributed random solution, or find a solution that optimizes the sum of the weights of the chosen sets. Also good for representing all Hamiltonian paths through a graph, which you can use to solve some nontrivial instances of the Traveling Salesperson Problem.
  • Dynamic Programming. Not an algorithm, but a class of algorithms so big that you have already used DP even if you didn't know it. For example, you know how to calculate Fibonacci numbers in linear time? That's DP. You've used Dijkstra's algorithm? DP. Learn about DP, it will come in handy. Maybe this shouldn't be on the list - almost everyone has heard of it.
  • Branch and Bound. The wiki entry is just a stub, so no link. The idea is simple enough, but powerful: you do a Depth First Search, but you stop going down the recursion tree when the solution cannot possibly become better in this sub-tree than the best solution found so far. Especially good when combined with a fast approximation. This technique can save your ass when it looks like you are going to have to resort to actual brute force, as long as the cost function is monotonic while going down a sub-tree. Often applicable to solving puzzle games. It may seem niche, but I've had to use this in real-life software that is actually used by people.
  • Suffix Arrays take less space than a suffix tree and are useful for many of the same string problems (such as searching for substrings in sub-linear time). Other significant applications include a linear-time Burrows–Wheeler transform (useful for data compression) and linear-time Lempel-Ziv decomposition (probably useful for LZ77/Deflate).
  • Miller-Rabin primality test for when trial division gets too slow. It will test 64-bit numbers for primality in mere milliseconds.
  • Pollard's Rho algorithm for when trial division gets too slow and you actually need a factor, not just a boolean "IsPrime" (a minimal sketch follows this list).
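To make the Pollard's Rho item concrete, here is a minimal sketch using Floyd cycle detection with f(x) = x*x + c mod n. It assumes n is odd, composite and fits in 32 bits (so the squaring cannot overflow 64-bit arithmetic); the function names and the "retry with another c" convention are just choices for this example, not a canonical implementation.
Code:
#include <stdint.h>

// Greatest common divisor (Euclid)
static uint64_t gcd(uint64_t a, uint64_t b) {
   while (b) { uint64_t t = a % b; a = b; b = t; }
   return a;
}

// Pollard's Rho with f(x) = x*x + c mod n and Floyd cycle detection.
// Returns a nontrivial factor of n, or n itself if this c fails
// (in that case retry with a different c). Assumes n is odd, composite
// and below 2^32, so x*x cannot overflow 64 bits.
static uint64_t pollard_rho(uint64_t n, uint64_t c) {
   uint64_t x = 2, y = 2, d = 1;
   while (d == 1) {
      x = (x * x + c) % n;          // tortoise: one step
      y = (y * y + c) % n;          // hare: two steps
      y = (y * y + c) % n;
      d = gcd(x > y ? x - y : y - x, n);
   }
   return d;                        // a proper factor, or n on failure
}

For example, pollard_rho(8051, 1) returns 97 (8051 = 83 × 97); if a call returns n itself, retry with a different c.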
Let's get this going, I'm sure we can all learn something new and interesting from this thread if people start adding stuff to it.
« Last Edit: March 05, 2013, 03:33:33 pm by harold »
Blog about bitmath: bitmath.blogspot.nl
Check the haroldbot thread for the supported commands and syntax.
You can use haroldbot from this website.

Offline Scipi

  • Omni Kitten Meow~ =^ω^=
  • LV10 31337 u53r (Next: 2000)
  • **********
  • Posts: 1547
  • Rating: +192/-3
  • Meow :3
    • View Profile
    • ScipiSoftware
A lot of these I've never heard about.

I will have to take a look into understanding these and implementing them in a program or two :D

* (Sadly, HOMER-16 doesn't know any data structures or algorithms off the top of his head)

Imma Cat! =^_^= :3 (It's an emoticon now!)
Spoiler For Things I find interesting:
Spoiler For AI Programming:
Spoiler For Shameless advertising:

Spoiler For OldSig:





Spoiler For IMPORTANT NEWS!:
Late last night, Quebec was invaded by a group calling themselves, "Omnimaga". Not much is known about these mysterious people except that they all carried calculators of some kind and they all seemed to converge on one house in particular. Experts estimate that the combined power of their fabled calculators is greater than all the worlds super computers put together. The group seems to be holding out in the home of a certain DJ_O, who the Omnimagians claim to be their founder. Such power has put the world at a standstill with everyone waiting to see what the Omnimagians will do...

Wait... This just in, the Omnimagians have sent the UN a list of demands that must be met or else the world will be "submitted to the wrath of Netham45's Lobster Army". Such demands include >9001 crates of peanuts, sacrificial blue lobsters, and a wide assortment of cherry flavored items. With such computing power stored in the hands of such people, we can only hope these demands are met.

In the wake of these events, we can only ask, Why? Why do these people make these demands, what caused them to gather, and what are their future plans...

Offline harold

  • LV5 Advanced (Next: 300)
  • *****
  • Posts: 226
  • Rating: +41/-3
    • View Profile
  • Unrolled Linked List. Despite theoretical advantages, linked lists almost always turn out to be very disappointing compared to ArrayLists. Unrolled linked lists fix that to a certain degree - sometimes they may even beat ArrayLists.
  • Table-Based Huffman Decoding. The most well-known way (frequently given in textbooks, because it's The Obvious Way) is walking down the Huffman tree bit-by-bit until you reach a leaf. The table-based method, in its simplest form, decodes a symbol with a single table lookup. With this trick, the performance of Huffman decoding goes from "unacceptable" to "very good". Note that it requires canonical Huffman codes - which most compression formats use, so it's widely applicable.
  • K-d trees and variants, useful for various geometry problems. For example, I used them to pack textures of widely varying sizes onto a texture atlas. That isn't optimal, but it's a lot better than a naive approach and a lot faster than an optimal approach (it runs every time the game starts, and does not significantly increase the loading time).
  • Fermat Factorization, another trick for when trial division is too slow (a sketch follows below). It has different performance characteristics than Pollard's Rho: Rho is very good for small factors, Fermat Factorization is good for big factors (close to the square root of the number).
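A minimal sketch of Fermat factorization for an odd n, assuming n is small enough (say under 2^50) that double-precision square roots are exact enough for the perfect-square test; the function name is just a placeholder:
Code:
#include <stdint.h>
#include <math.h>

// Fermat factorization: find a, b with n = a*a - b*b = (a-b)*(a+b).
// Works best when n is odd and its two factors are close to sqrt(n).
// Returns the factor a - b (which is 1 when n is prime).
static uint64_t fermat_factor(uint64_t n) {
   uint64_t a = (uint64_t)ceil(sqrt((double)n));
   uint64_t b2 = a * a - n;                  // candidate for b*b
   for (;;) {
      uint64_t b = (uint64_t)(sqrt((double)b2) + 0.5);
      if (b * b == b2)
         return a - b;                       // n = (a-b) * (a+b)
      a += 1;                                // try the next a
      b2 = a * a - n;
   }
}

For example, fermat_factor(5959) returns 59 (5959 = 59 × 101). For a prime n this degenerates into a very long loop, so test primality first (e.g. with Miller-Rabin from the earlier post).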
Come on guys, post some more :)
« Last Edit: March 06, 2013, 04:26:34 am by harold »
Blog about bitmath: bitmath.blogspot.nl
Check the haroldbot thread for the supported commands and syntax.
You can use haroldbot from this website.

Offline calc84maniac

  • eZ80 Guru
  • Coder Of Tomorrow
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2912
  • Rating: +471/-17
    • View Profile
    • TI-Boy CE
Binary Indexed Tree, a data structure representing an array that supports both modifying an element and retrieving the sum of any range of elements in O(log n) time. In addition, the storage requirement is no more than that of a plain array of the elements.

The alternatives to this method are simply keeping an array of the elements, which is O(1) for modification and O(n) for range sum, or keeping an array of cumulative sums, which is O(n) for modification and O(1) for range sum. A binary indexed tree gives the best of both worlds at O(log n) for each operation, and it can be implemented in a very small amount of code.

Seriously small:
Code:
// Assumes globals: int array[array_size]; with all elements initially 0
// Note that indices start at 1

// Get sum from elements 1 to k
int get_prefix_sum(int k) {
   int sum=0;
   for(; k; k-=(k&-k))
      sum += array[k-1];
   return sum;
}

// Add n to element k
void add_to_element(int k, int n) {
   for(; k<=array_size; k+=(k&-k))
      array[k-1] += n;
}

Edit: Also, here are some related helper functions.
Spoiler For Spoiler:
Code:
// Sum from element a to b
int range_sum(int a, int b) {
   return get_prefix_sum(b) - get_prefix_sum(a-1);
}

// Get element k
int get_element(int k) {
   return range_sum(k,k);
}

// Set element k to value n
void set_element(int k, int n) {
   add_to_element(k, n - get_element(k));
}
« Last Edit: March 06, 2013, 01:47:31 pm by calc84maniac »
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

Offline harold

  • LV5 Advanced (Next: 300)
  • *****
  • Posts: 226
  • Rating: +41/-3
    • View Profile
Very good, Binary Indexed Trees are awesome, simple but powerful :)

Here's some more stuff I find interesting:
  • Bloom Filters. I haven't really used them yet, but they look very useful when you want to do a quick but inaccurate test so you can avoid a large portion of slower but accurate tests - for example, Google uses them in Chrome to pre-test a list of "dangerous" domains locally, so most queries don't have to go all the way to their servers.
  • The Hungarian Algorithm finds a "best assignment" of "tasks" to "actors". This is an important optimization in real life, as well as in many programs, including games (for example if you have a bunch of units and a bunch of tasks in an RTS, and you'd like to not just greedily pick the nearest unit for each task but calculate the optimal way to assign the tasks). It looks simple, but sadly it has steps that are difficult to implement nicely on a computer.
  • Interpolation Search is an adaptation of Binary Search that works extremely well when the items in the list are roughly uniformly distributed in their keyspace. Often neglected because it's not as "theoretically pretty" as basic Binary Search, but frequently a win in practice (a sketch follows this list).
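A minimal sketch of interpolation search over a sorted array of ints - the probe position is interpolated from the key instead of always taking the midpoint; the function name is just a placeholder:
Code:
// Interpolation search in a sorted int array of length n.
// Instead of probing the middle (binary search), probe where the key
// "should" be if the values were spread uniformly between a[lo] and a[hi].
// Returns the index of key, or -1 if it is not present.
// Expected O(log log n) probes on uniform data, O(n) worst case.
static int interpolation_search(const int *a, int n, int key) {
   int lo = 0, hi = n - 1;
   while (lo <= hi && key >= a[lo] && key <= a[hi]) {
      if (a[hi] == a[lo])                       // avoid division by zero
         return (a[lo] == key) ? lo : -1;
      // interpolate the probe position between lo and hi
      long long span = (long long)a[hi] - a[lo];
      int pos = lo + (int)(((long long)key - a[lo]) * (hi - lo) / span);
      if (a[pos] == key)      return pos;
      else if (a[pos] < key)  lo = pos + 1;
      else                    hi = pos - 1;
   }
   return -1;
}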
Blog about bitmath: bitmath.blogspot.nl
Check the haroldbot thread for the supported commands and syntax.
You can use haroldbot from this website.

Offline AngelFish

  • Is this my custom title?
  • Administrator
  • LV12 Extreme Poster (Next: 5000)
  • ************
  • Posts: 3242
  • Rating: +270/-27
  • I'm a Fishbot
    • View Profile


More of an idea than a specific algorithm, but Secure Multi-party Computation. Quite possibly one of my favorite pieces.

Blum Blum Shub (no free links, sorry). It's a cryptographic random number generator of the form x_{n+1} = x_n² mod M, where M is the product of two large primes. Each iteration can produce a max of two bits (!) of random numbers, which can be done via the following: (parity(x_n)*(x_n & 0x01))*(parity(x_{n+1})*(x_{n+1} & 0x01)). Alternatively, you can just take the straight parity or the least significant bit of the output and get one bit per iteration.
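A toy sketch of the recurrence, taking the least-significant-bit route described above; the moduli here (10007 and 10039, both primes congruent to 3 mod 4) are of course far too small to be cryptographically meaningful and are just there to keep the example self-contained:
Code:
#include <stdint.h>

// Toy Blum Blum Shub: x_{n+1} = x_n^2 mod M, M = p*q with p, q prime
// and both congruent to 3 mod 4. Emits the least significant bit of
// each state; the parity bit could be used instead, as described above.
static uint64_t bbs_state = 3;                    // seed, coprime to M
static const uint64_t BBS_M = (uint64_t)10007 * 10039;

static int bbs_next_bit(void) {
   bbs_state = (bbs_state * bbs_state) % BBS_M;   // one squaring step
   return (int)(bbs_state & 1);
}

Real use would need M on the order of a thousand bits and therefore a big-integer library, plus a properly chosen random seed coprime to M.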
∂²Ψ    -(2m(V(x)-E)Ψ
---  = -------------
∂x²        ℏ²Ψ

Offline Scipi

  • Omni Kitten Meow~ =^ω^=
  • LV10 31337 u53r (Next: 2000)
  • **********
  • Posts: 1547
  • Rating: +192/-3
  • Meow :3
    • View Profile
    • ScipiSoftware
Well, what would you know, I do have something to add here. :D

This is something I just found today and am trying out.

  • Neural Networks are similar to cellular automata in that they exhibit emergent behavior; the difference is that neural networks learn and evolve with each successive generation. They're useful in AI programming as well as pattern recognition (a toy sketch follows).
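To make the "learning" part concrete, here is a toy sketch of a single artificial neuron (a perceptron) trained on the AND function. A real network chains many of these in layers, but the "nudge the weights to reduce the error" idea is the same; the learning rate and epoch count are just illustrative choices:
Code:
#include <stdio.h>

// A single perceptron learning the AND function with the classic
// perceptron learning rule: weight += rate * error * input.
int main(void) {
   double w[2] = {0.0, 0.0}, bias = 0.0, rate = 0.1;
   int inputs[4][2] = {{0,0},{0,1},{1,0},{1,1}};
   int target[4]    = { 0,    0,    0,    1  };      // AND truth table

   for (int epoch = 0; epoch < 100; epoch++) {
      for (int i = 0; i < 4; i++) {
         double sum = w[0]*inputs[i][0] + w[1]*inputs[i][1] + bias;
         int out = (sum > 0.0) ? 1 : 0;              // step activation
         int err = target[i] - out;
         w[0] += rate * err * inputs[i][0];          // nudge the weights
         w[1] += rate * err * inputs[i][1];
         bias += rate * err;
      }
   }
   for (int i = 0; i < 4; i++) {
      double sum = w[0]*inputs[i][0] + w[1]*inputs[i][1] + bias;
      printf("%d AND %d -> %d\n", inputs[i][0], inputs[i][1], sum > 0.0 ? 1 : 0);
   }
   return 0;
}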

Edit:

This article might be more relevant: http://en.wikipedia.org/wiki/Artificial_neural_network
« Last Edit: March 08, 2013, 03:17:40 pm by HOMER-16 »

Imma Cat! =^_^= :3 (It's an emoticon now!)
Spoiler For Things I find interesting:
Spoiler For AI Programming:
Spoiler For Shameless advertising:

Spoiler For OldSig:





Spoiler For IMPORTANT NEWS!:
Late last night, Quebec was invaded by a group calling themselves, "Omnimaga". Not much is known about these mysterious people except that they all carried calculators of some kind and they all seemed to converge on one house in particular. Experts estimate that the combined power of their fabled calculators is greater than all the worlds super computers put together. The group seems to be holding out in the home of a certain DJ_O, who the Omnimagians claim to be their founder. Such power has put the world at a standstill with everyone waiting to see what the Omnimagians will do...

Wait... This just in, the Omnimagians have sent the UN a list of demands that must be met or else the world will be "submitted to the wrath of Netham45's Lobster Army". Such demands include >9001 crates of peanuts, sacrificial blue lobsters, and a wide assortment of cherry flavored items. With such computing power stored in the hands of such people, we can only hope these demands are met.

In the wake of these events, we can only ask, Why? Why do these people make these demands, what caused them to gather, and what are their future plans...

Offline harold

  • LV5 Advanced (Next: 300)
  • *****
  • Posts: 226
  • Rating: +41/-3
    • View Profile
  • Simulated Annealing is a nice trick that works very well for how simple it is (in its basic form - it gets complicated when you make it fancy). It combines well with other tricks, since it's really more of a "framework-type algorithm" than a "mathy algorithm", so there are lots of "blanks" that you get to fill in. It's also attractive because it becomes arbitrarily good just by throwing more time at it - which means that on one side, you can make it quick if you don't really care that much about how good the answer is, and on the other side, you can get a very good answer just by waiting a long time. And it's the same code for both, so you can even make the speed a user input. That's a rare thing in algorithms. (A toy sketch follows this list.)
  • A* is very often a good way to speed up an exhaustive search. It's like BFS, but instead of a queue it uses a priority queue, sorted by "distance from root + estimated distance to goal". What makes it particularly useful is that it's guaranteed to find the optimal answer when the estimated distance to the goal is never higher than the actual distance (heuristics satisfying this condition are called "admissible"). The maximum over a bunch of admissible heuristics is still admissible, so you can often make the search faster by making up more heuristics (unless evaluating the heuristics takes too much time, of course). If your heuristic is accidentally not admissible, that's one of the worst bugs to diagnose, so be careful.
    [I thought about putting A* in the list when I created the thread but had thought that too many people would already know it, but maybe not everyone does so here it is]
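A toy sketch of simulated annealing with the "blanks" filled in by the simplest possible choices: the cost function is a small polynomial with two local minima, the neighbour move is a random step, and the cooling schedule is geometric. All of these are placeholders you would swap out for a real problem:
Code:
#include <stdlib.h>
#include <stdio.h>
#include <math.h>

// Cost function to minimize: has a local minimum near x = 1.1
// and the global minimum near x = -1.3.
static double cost(double x) { return x*x*x*x - 3*x*x + x; }

int main(void) {
   srand(12345);
   double x = 0.0, best = x;
   double temp = 1.0;                       // initial "temperature"

   while (temp > 1e-4) {
      // neighbour move: a small random step, clamped to [-3, 3]
      double cand = x + ((double)rand() / RAND_MAX - 0.5);
      if (cand < -3.0) cand = -3.0;
      if (cand >  3.0) cand =  3.0;

      double delta = cost(cand) - cost(x);
      // always accept improvements; accept worsenings with
      // probability exp(-delta/temp) so we can escape local minima
      if (delta < 0.0 || (double)rand() / RAND_MAX < exp(-delta / temp))
         x = cand;
      if (cost(x) < cost(best)) best = x;

      temp *= 0.999;                        // geometric cooling schedule
   }
   printf("minimum near x = %f, cost = %f\n", best, cost(best));
   return 0;
}

Run it a few times and it usually settles near x ≈ -1.3; cool faster and it is more likely to get stuck in the local minimum near x ≈ 1.1 - exactly the speed/quality trade-off described above.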
Blog about bitmath: bitmath.blogspot.nl
Check the haroldbot thread for the supported commands and syntax.
You can use haroldbot from this website.