05 May 11

If you have a web site with a search function, you will rapidly realize that most mortals are terrible typists. Many searches contain misspelled words, and users will expect these searches to magically work. This magic is often done using Levenshtein distance. In this article, I’ll compare two ways of finding the closest matching word in a large dictionary.

by mlb 14 years ago
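
For reference, the Levenshtein distance mentioned above can be sketched as a small dynamic program. This is only a brute-force illustration and not necessarily either of the two approaches the linked article compares; the names `levenshtein` and `closest_word` are made up for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance where insert, delete and substitute each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def closest_word(query: str, dictionary: list[str]) -> str:
    """Scan every word and keep the one with the smallest distance."""
    return min(dictionary, key=lambda word: levenshtein(query, word))


print(closest_word("mispelled", ["misspelled", "magic", "search"]))
```

Scanning the whole dictionary costs one full distance computation per word, which is presumably why a large dictionary calls for something smarter.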

27 Aug 10

Pattern matching algorithm used in GNU Grep (see: http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html)

by mlb 15 years ago

06 Aug 10

The purpose of this note is to illustrate how the ordinary standards of randomness have little to do with the type of randomness required for cryptographic purposes. That is, there are really two standards of randomness.

by mlb 15 years ago

02 Jul 10

“When approaching the string comparison optimization problem, what we would like to do is to provide effective and efficient ways to rule out most of the candidate strings. We may refer to it as a “disqualifying comparison” - it lets us move faster down the search tree or move faster along the hash bucket linked list, until reaching the final string comparison in the search, keeping in mind that even the most efficient hash structure would probably waste a substantial amount of its time and cycles in string comparison.”

by mlb 15 years ago
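
One way to picture the “disqualifying comparison” from the quote above is a chain of cheap checks that reject most candidates before any full string comparison runs. The sketch below assumes two such checks, a stored length and a stored hash; the quote does not say which checks the article uses, and `Key`, `same_key` and `find_in_bucket` are hypothetical names.

```python
class Key:
    """A string bundled with a precomputed length and hash (assumed layout)."""
    __slots__ = ("text", "length", "digest")

    def __init__(self, text: str):
        self.text = text
        self.length = len(text)
        self.digest = hash(text)


def same_key(a: Key, b: Key) -> bool:
    # Cheap disqualifying checks first: most non-matches stop here.
    if a.length != b.length:
        return False
    if a.digest != b.digest:
        return False
    # Only candidates that survive pay for the full comparison.
    return a.text == b.text


def find_in_bucket(bucket: list[Key], wanted: Key):
    """Walk one hash bucket and return the matching entry, or None."""
    for candidate in bucket:
        if same_key(candidate, wanted):
            return candidate
    return None


bucket = [Key("alpha"), Key("beta"), Key("gamma")]
print(find_in_bucket(bucket, Key("beta")).text)
```

Python's own string equality already short-circuits on length, so this is purely illustrative; the idea matters more in languages where the comparison is a byte-by-byte loop.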


25 Jun 10

That’s the crazy thing about malloc implementations. They all claim to be awesome in every way. So the burning question is: is this better than what I’m using today? I don’t know, but I’d sure like to!

by mlb 15 years ago

15 Mar 10

Most credit card numbers are validated using an algorithm called the “Luhn check”. This is a very simple algorithm that doubles every second digit, counting from the right, and sums the resulting digits to see if the total is divisible by 10.

by mlb 16 years ago
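
A minimal sketch of the Luhn check described above, assuming the number arrives as a string that may contain spaces or dashes; the function name `luhn_valid` is chosen for this example only.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn check."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit; every second digit gets doubled.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))
```

Passing the check only means the number is well formed; it says nothing about whether the card exists.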

27 Jan 10

A new algorithm is proposed for removing large objects from digital images. The challenge is to fill in the hole that is left behind in a visually plausible way. A. Criminisi, P. Pérez and K. Toyama

by mlb 16 years ago

14 Jan 10

Based on Donald Knuth’s concept of literate programming, LiteratePrograms is a collection of code samples displayed in an easy-to-read way, collaboratively edited and debugged, and all released under the liberal MIT/X11 License (see Copyrights) so that anyone can use our code and text for any purpose.

Code on LiteratePrograms is organized in a variety of ways using categories: by subject area, by language, by environment, and so on.

by mlb 16 years ago

13 Nov 09

This book is supposed to teach methods of numerical computing that are practical, efficient, and (insofar as possible) elegant.

by mlb 16 years ago

07 Nov 09

Documentation on how to make stereograms (also known as Magic Eye Images).

by mlb 16 years ago

This article briefly explains how Shazam works. Shazam is a service that takes a short sample of music, and identifies the song.

by mlb 16 years ago

20 Oct 09

17 Oct 09

Bubble Sort; Insertion Sort; Median-of-three Quicksort; Multiple Link List Sort; Shell Sort

by mlb 16 years ago saved 2 times


This page has visualizations of some comparison based sorting algorithms.

by mlb 16 years ago

Very interesting thoughts about the fact that sometimes it pays more to add more data than to fine-tune the weights of your fancy machine-learning algorithm.

by mlb 16 years ago

The graph-based SPEAR algorithm (Spamming-resistant Expertise Analysis and Ranking) is a new technique to measure the expertise of users by analyzing their activities. The focus is on the ability of users to find new, high-quality information on the Internet. At the same time, the algorithm has been shown to be very resistant to spamming attacks.

by mlb 16 years ago