New York State Identification and Intelligence System
From Wikipedia, the free encyclopedia
The New York State Identification and Intelligence System Phonetic Code, commonly known as NYSIIS, is a phonetic algorithm devised in 1970 as part of the New York State Identification and Intelligence System (now a part of the New York State Division of Criminal Justice Services). It is reported to be 2.7% more accurate than the traditional Soundex algorithm.
The algorithm, as described in Name Search Techniques, New York State Identification and Intelligence System Special Report No. 1, by Robert L. Taft, is:
- Translate first characters of name: MAC → MCC, KN → NN, K → C, PH → FF, PF → FF, SCH → SSS
- Translate last characters of name: EE → Y, IE → Y, DT, RT, RD, NT, ND → D
- First character of key = first character of name.
- Translate the remaining characters by the following rules, advancing one character at a time:
  - EV → AF; otherwise A, E, I, O, U → A
  - Q → G, Z → S, M → N
  - KN → NN; otherwise K → C
  - SCH → SSS, PH → FF
  - H → the previous character, if the previous or next character is not a vowel
  - W → the previous character, if the previous character is a vowel
  - Add the current character to the key if it differs from the last key character.
- If last character is S, remove it.
- If last characters are AY, replace with Y.
- If last character is A, remove it.
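
The steps above can be sketched in Python roughly as follows. This is a minimal illustration of the rules as listed here, not a reference implementation: the names `nysiis`, `PREFIXES`, `SUFFIXES`, `SINGLE` and `VOWELS` are made up for the example, and several edge cases the list leaves open are settled by assumption (non-letters are dropped, an H at the end of the name counts as having a non-vowel next character, the key is never truncated to a fixed length, and a one-character key is never emptied).

```python
VOWELS = frozenset("AEIOU")

# Rule tables taken from the list above; the names are illustrative only.
PREFIXES = (("MAC", "MCC"), ("KN", "NN"), ("K", "C"),
            ("PH", "FF"), ("PF", "FF"), ("SCH", "SSS"))
SUFFIXES = (("EE", "Y"), ("IE", "Y"), ("DT", "D"), ("RT", "D"),
            ("RD", "D"), ("NT", "D"), ("ND", "D"))
SINGLE = {"Q": "G", "Z": "S", "M": "N"}


def nysiis(name: str) -> str:
    """Return a NYSIIS-style key for `name` (sketch of the listed rules)."""
    # Assumption: keep only the letters A-Z, upper-cased.
    name = "".join(ch for ch in name.upper() if "A" <= ch <= "Z")
    if not name:
        return ""

    # Translate the first characters of the name (first matching rule only).
    for old, new in PREFIXES:
        if name.startswith(old):
            name = new + name[len(old):]
            break

    # Translate the last characters of the name (first matching rule only).
    for old, new in SUFFIXES:
        if name.endswith(old):
            name = name[:-len(old)] + new
            break

    chars = list(name)
    key = chars[0]  # first character of key = first character of name

    # Every replacement has the same length as what it replaces,
    # so the list length stays fixed during the scan.
    for i in range(1, len(chars)):
        cur = chars[i]
        rest = "".join(chars[i:i + 3])
        prev = chars[i - 1]
        nxt = chars[i + 1] if i + 1 < len(chars) else ""

        if rest.startswith("EV"):
            repl = "AF"
        elif cur in VOWELS:
            repl = "A"
        elif cur in SINGLE:
            repl = SINGLE[cur]
        elif rest.startswith("KN"):
            repl = "NN"
        elif cur == "K":
            repl = "C"
        elif rest.startswith("SCH"):
            repl = "SSS"
        elif rest.startswith("PH"):
            repl = "FF"
        elif cur == "H" and (prev not in VOWELS or nxt not in VOWELS):
            repl = prev  # assumption: a missing next character is a non-vowel
        elif cur == "W" and prev in VOWELS:
            repl = prev
        else:
            repl = cur

        chars[i:i + len(repl)] = repl
        # Add the current character unless it repeats the last key character.
        if chars[i] != key[-1]:
            key += chars[i]

    # Final clean-up of the key's ending.
    if len(key) > 1 and key.endswith("S"):
        key = key[:-1]
    if key.endswith("AY"):
        key = key[:-2] + "Y"
    if len(key) > 1 and key.endswith("A"):
        key = key[:-1]
    return key


print(nysiis("Knight"))    # NAGT
print(nysiis("Schmidt"))   # SNAD
print(nysiis("Stevens") == nysiis("Stephens"))  # True
```

Under these assumptions, differently spelled names that sound alike, such as Stevens and Stephens, reduce to the same key (STAFAN), which is what makes the code usable for name matching.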
External links
- NIST Dictionary of Algorithms and Data Structures entry, including pointers to several implementations: http://www.nist.gov/dads/HTML/nysiis.html
- Sample coder, using a variant of the algorithm: http://www.dropby.com/indexLF.html?content=/NYSIIS.html
- Python implementation of the dropby variant: http://metagram.webreply.com/downloads/nysiis.py
- Simple Online NYSIIS Utility with GPL Source: http://www.utilitymill.com/utility/nysiis