Surname Saturday – The Soundex

In our post “How Do I Call Thee? Let Me Count The Names…” we talked about the need to widen your searches so you can find your ancestors even when you don’t really know how their names were spelled. One of the ways that many database search sites like Ancestry.com and Rootsweb.com perform these wider searches is by using the SOUNDEX index.

So What IS the SOUNDEX?

The SOUNDEX is a surname index developed to help locate individuals in the Census records when their names were misspelled or varied from the conventional spelling. This ingenious index assigns each surname a code made up of the first letter of the surname followed by a three-digit number.

In the 1930s many states did not have uniform or long-standing birth records. Censuses were often used to provide official proof of age when people wanted to claim Social Security benefits. The SOUNDEX was a WPA project in the 1930s that made it easier and faster for the Census Bureau to locate these records.1

Knowing what we know today about the inaccuracy of Census records for birth dates, this concept seems laughable! But out of this necessity was born a very useful tool in our surname searches!

How Do You Find Your SOUNDEX?

Creating your SOUNDEX code for each surname is a delicate process.

To illustrate the rules, we’ll use one of my family’s surnames: Compton

Step 1: First letter of the surname


For my surname Compton, the first part of the SOUNDEX code will be the letter C.

Step 2: Remove all vowels and a few consonants too!


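From the rest of the surname (everything after the first letter), remove the following letters: A, E, I, O, U, Y, H, W. The first letter always stays, even if it is one of these. So for Compton we’re left with: C-MPTN.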

Step 3: The three consonants following the first letter of the surname are assigned a numerical value; all other consonants are dropped.


After our initial letter (C) our remaining consonants are MPTN. Since we only want the first three we’re left with: MPT. Now we assign the numerical value according to the following matrix:

  • 1 = B, P, F, V
  • 2 = C, S, G, J, K, Q, X, Z
  • 3 = D, T
  • 4 = L
  • 5 = M, N
  • 6 = R

That gives us C-513 for the SOUNDEX code for the surname Compton.
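For readers who like to see the steps written out, here is a minimal Python sketch of Steps 1 through 3 exactly as described above. The function name three_step_code and the variable names are made up for illustration, and the sketch deliberately ignores the extra rules covered in the next section, so it will only be right for "easy" names like Compton.

```python
# Letter-to-digit table copied from the matrix above.
GROUPS = {"1": "BPFV", "2": "CSGJKQXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}


def three_step_code(surname):
    """Hypothetical helper: applies Steps 1-3 only, none of the extra rules."""
    letters = [ch for ch in surname.upper() if ch.isalpha()]
    # Step 1: keep the first letter of the surname.
    first = letters[0]
    # Step 2: drop A, E, I, O, U, Y, H, W (keep only letters that have a digit).
    kept = [ch for ch in letters[1:] if ch in CODES]
    # Step 3: turn the first three remaining consonants into digits.
    digits = "".join(CODES[ch] for ch in kept[:3])
    # Pad with zeroes so there are always three digits.
    return first + "-" + digits.ljust(3, "0")


print(three_step_code("Compton"))  # C-513
```

A name with doubled letters, such as Harris, will come out wrong with this simple version, which is exactly why the additional rules exist.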

OK! Compton was an easy one. But there are other rules for finding your SOUNDEX code that are important.

The Other Rules

  1. Double consonants (such as the RR in Harris) or adjacent consonants that would be assigned the same numerical value (such as the CKS in Jackson) are only counted once. The numerical value still has to be three digits, so you fill in with zeroes until you get to the required number. That means our SOUNDEX codes will be H-620 and J-250, respectively.
  2. To confuse matters more, consonants that share the same numerical value but are separated by a vowel (even though we ignore all vowels) are counted separately. For example: Herman becomes H-RMN, and because the M and N (both “5” letters) are separated by an A, it is coded H-655.
  3. If “double” consonants are separated by an H or a W the consonant is only counted once. About.com uses the surname Ashcraft to illustrate this point. Ashcraft becomes A-SCRFT, but S and C are both “2” letters, and since they are separated only by an H we count them once: A-261.2
  4. Surnames that have a prefix may be indexed either with the prefix taken into account or as if the prefix was dropped, so check both codes. This applies to names like Van Meter and DePetris.
  5. Mc and Mac are not treated as prefixes, so surnames like McKim or MacIntire never drop them!
  6. Members of religious groups who were referred to as “Sister” or “Brother” were often indexed with Sister or Brother as their surname, and therefore can be found under S-236 and B-636, respectively.3
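Rules 1 through 3 are easier to follow in code than in words, so here is a minimal Python sketch of them. It is only an illustration under a few assumptions: the function name soundex is made up, the code is printed in the letter-hyphen-digits style used in this post, and prefix handling (rules 4 and 5) is left to you, since that is a judgment call rather than a mechanical step.

```python
# Letter-to-digit table from the matrix earlier in the post.
GROUPS = {"1": "BPFV", "2": "CSGJKQXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}


def soundex(surname):
    """Hypothetical helper: first letter plus three digits, following rules 1-3."""
    letters = [ch for ch in surname.upper() if ch.isalpha()]
    if not letters:
        raise ValueError("surname must contain at least one letter")
    first = letters[0]
    digits = []
    # Remember the value of the previous letter so repeats can be skipped.
    previous = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            # Rule 3: H and W are ignored and do NOT separate repeated values.
            continue
        code = CODES.get(ch, "")
        if code == "":
            # Rule 2: a vowel (or Y) is dropped, but it does separate repeats.
            previous = ""
            continue
        if code != previous:
            # Rule 1: adjacent letters with the same value count only once.
            digits.append(code)
        previous = code
    # Keep three digits, padding with zeroes when the name runs out.
    return first + "-" + "".join(digits)[:3].ljust(3, "0")


for name in ["Compton", "Harris", "Jackson", "Herman", "Ashcraft"]:
    print(name, soundex(name))
```

For a prefixed surname, one way to follow rule 4 is simply to run the sketch twice, once on the full name and once without the prefix: soundex("Van Meter") gives V-536 and soundex("Meter") gives M-360, and either could be where your ancestor was filed.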
