Kryptos K4 is not a Vigenere Cipher

The fourth passage of the Kryptos sculpture has been unsolved for more than 30 years. If you are one of the professional cryptanalysts or amateur enthusiasts who have attempted to solve this puzzle, you may be wondering whether Kryptos passage 4 (K4) can be solved by a double-key Vigenere cipher. The answer seems to be no. Kryptos cipher 4 is not a straightforward Vigenere cipher.

I’ll walk through how I attempted to brute force the keys for a Vigenere cipher to solve the 4th Kryptos message. I included my Python code for anyone who wants to modify it. But at this point, I can safely say that K4 will not be solved by a straightforward Vigenere cipher with “KRYPTOS” as the primary key.

Built by Jim Sanborn, the sculpture sits on the grounds of the CIA headquarters in Langley, Virginia. The sculpture has four messages, three of which have been solved. In past years, Sanborn has provided clues. We know the plaintext will include the terms “BERLINCLOCK”, “EAST”, and “NORTHEAST”. We know the first two messages were encrypted with Vigenere ciphers. We know the third was encrypted with a transposition. Beyond this, there seems to be speculation on blogs but not much progress.

Outline

I’ll start by reviewing the Vigenere cipher to give some intuition for how it works. Next, I’ll discuss my short Python code that automatically decodes a ciphertext using two keys and a Vigenere cipher. After that, I’ll show how I used a natural language toolkit to build a list of possible keys. Next, I’ll describe how I used this system to do a brute-force search over all secondary keys. Finally, I’ll review how I used this system to do other tests.

For example, I tested ciphertexts suggested by members of the community that performed transposition first. I also tested whether Vigenere could be applied first, followed by transposition. I found one possible candidate but I could not decipher the resulting text. Over the next months, I will perform a brute force search for all possible primary and secondary keys to rule out Vigenere altogether. I will keep this blog updated with the results. 

Vigenere Review

Since the first two Kryptos codes were encoded with Vigenere ciphers, I hoped that the fourth code was encrypted the same way. For Kryptos, the Vigenere cipher requires two keys to build the encryption table.

The first key establishes the ‘alphabet’ that makes up the first row (row 0) of the table. The first letters of the alphabet are the letters of the keyword, followed by the other letters of the alphabet, minus the letters in the keyword. Say the key is “KRYPTOS”. The alphabet is then “KRYPTOSABCDEFGHIJLMNQUVWXZ”. Notice how K, R, Y, P, T, O, and S are not in their original places but at the front of the alphabet.
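The alphabet construction can be sketched in a few lines. This is a minimal illustration, not the code used for the tests below; the helper name `keyed_alphabet` is my own, and dropping repeated keyword letters is an assumption that only matters for keywords other than “KRYPTOS”.

```python
import string

def keyed_alphabet(keyword):
    """Return the keyword's letters (first occurrence only) followed by the unused letters."""
    # dict.fromkeys keeps insertion order and silently drops repeated letters.
    return "".join(dict.fromkeys(keyword.upper() + string.ascii_uppercase))

print(keyed_alphabet("KRYPTOS"))  # KRYPTOSABCDEFGHIJLMNQUVWXZ
```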

The second key establishes the first letter of each remaining row (rows 1 through n, where n is the length of the second key). Each letter of the second key determines the shift of the alphabet for that row. For the first Kryptos cipher, the second key was “PALIMPSEST”. For row 1, we start with the P, then the rest of the alphabet, wrapping around to the front. Row 1 is “PTOSABCDEFGHIJLMNQUVWXZKRY”. We continue this for the remaining letters in the second key.
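To make the row construction concrete, here is a small sketch that prints the whole table for “KRYPTOS” and “PALIMPSEST”. The function name `build_rows` is hypothetical; the keyed alphabet is hard-coded from the paragraph above.

```python
def build_rows(alphabet, key2):
    """Row 0 is the keyed alphabet; row i (i >= 1) starts at the i-th letter of key2."""
    rows = [alphabet]
    for letter in key2:
        shift = alphabet.index(letter)
        # Rotate the keyed alphabet so that it begins with this key letter.
        rows.append(alphabet[shift:] + alphabet[:shift])
    return rows

alphabet = "KRYPTOSABCDEFGHIJLMNQUVWXZ"
for number, row in enumerate(build_rows(alphabet, "PALIMPSEST")):
    print(f"{number:2d} {row}")
```

Row 1 of the printed table should match the “PTOSABCDEFGHIJLMNQUVWXZKRY” given above.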

Figure: the PALIMPSEST table

As Dr. Bill Houck says when explaining the Kryptos solutions, we start by finding the length of the second key, which sets the period of the encoding. Since PALIMPSEST has 10 letters, we set the period to 10 and label the letters of the ciphertext 1 through 10, starting over at 1 after every tenth letter.

For every letter in the ciphertext, we look at the row that corresponds to its number. For example, the first letter of the ciphertext is E. We look for the E in row 1. Going to the top of the table (row 0), we see this is in the same column as B. E decodes to B.

The second letter is M. In row 2, we look for M. We follow this up the column and see that this decodes to E in row 0. M decodes to E. And so on. 
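The lookup described above (find the ciphertext letter in its row, then read the letter at the top of that column) can be written out directly. This is an illustrative sketch with names of my own choosing; it uses the first ten letters of the K1 ciphertext quoted later in this post.

```python
ALPHABET = "KRYPTOSABCDEFGHIJLMNQUVWXZ"  # keyed alphabet from "KRYPTOS"

def decode(ciphertext, key2, alphabet=ALPHABET):
    plaintext = []
    for position, cipher_letter in enumerate(ciphertext):
        # Letters are labeled 1..period cyclically, so the key letter repeats.
        key_letter = key2[position % len(key2)]
        shift = alphabet.index(key_letter)
        row = alphabet[shift:] + alphabet[:shift]  # the row that starts with key_letter
        column = row.index(cipher_letter)          # find the ciphertext letter in that row
        plaintext.append(alphabet[column])         # read row 0 in the same column
    return "".join(plaintext)

print(decode("EM", "PALIMPSEST"))          # BE
print(decode("EMUFPHZLRF", "PALIMPSEST"))  # BETWEENSUB
```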

To solve a Vigenere encryption, we need two keys to build the table. The problem is that doing this by hand takes a long time. Time to automate!

An Automated Vigenere Decoder

I couldn’t find an online program that would let me quickly try lots of key pairs for a Vigenere cipher that works in this way, so I decided to build my own automatic Vigenere cipher decoder using Python. 

There are some online programs that let you try one key at a time like the Keyed Vigenere Cipher from Rumkin. There is also some GitHub work here and there that should do the same thing but looks a little more complicated.

I use the first key to build the alphabet. I use the second key to build the rows of the table, with the correct offset. With the table, I decode the ciphertext, returning the plaintext. (If you’re testing a lot of primary keys, you can build the alphabet first in a separate function.)

import string

def vigenere(k1, k2, c):
    ## code to generate alphabet
    key1_list = list(k1)
    abc_list = list(string.ascii_lowercase)
    abc_list_edit = [x for x in abc_list if x not in key1_list]
    abc = key1_list + abc_list_edit

    ## code to generate tableaux
    rows = []
    for e in list(k2):
        i = abc.index(e)
        row = abc[i:] + abc[:i]
        rows.append(row)

    ## the full table
    rows.insert(0, abc)

    ## precompute
    period = len(k2)
    len_c = len(c)
    c_num = [(i % period) + 1 for i in range(0, len_c)]

    ## gen plaintext
    p = ""
    for e, n in zip(c, c_num):
        ind = rows[n].index(e)
        p += rows[0][ind]
    return p

I tested this code with Kryptos cipher 1 under the known keys “KRYPTOS” and “PALIMPSEST”.

key1 = "kryptos"
key2 = "palimpsest"
c1 = "EMUFPHZLRFAXYUSDJKZLDKRNSHGNFIVJYQTQUXQBQVYUVLLTREVJYQTMKYRDMFD"
c1 = c1.lower()
plain = vigenere(key1, key2, c1)
print(plain)
>> betweensubtleshadingandtheabsenceoflightliesthenuanceofiqlusion

And I tested with cipher 2, which has keys “KRYPTOS” and “ABSCISSA”. The extra line of code is to get rid of the question marks.

import re
key1 = "kryptos"
key2 = "abscissa"
c2 = "VFPJUDEEHZWETZYVGWHKKQETGFQJNCEGGWHKK?DQMCPFQZDQMMIAGPFXHQRLG\
TIMVMZJANQLVKQEDAGDVFRPJUNGEUNAQZGZLECGYUXUEENJTBJLBQCRTBJDFHRR\
YIZETKZEMVDUFKSJHKFWHKUWQLSZFTIHHDDDUVH?DWKBFUFPWNTDFIYCUQZERE\
EVLDKFEZMOQQJLTTUGSYQPFEUNLAVIDXFLGGTEZ?FKZBSFDQVGOGIPUFXHHDRKF\
FHQNTGPUAECNUVPDJMQCLQUMUNEDFQELZZVRRGKFFVOEEXBDMVPNFQXEZLGRE\
DNQFMPNZGLFLPMRJQYALMGNUVPDXVKPDQUMEBEDMHDAFMJGZNUPLGEWJLLAETG"
c2 = c2.lower()
emptyStr = ""
c2 = re.sub(r'[^\w\s]',emptyStr,c2)
plain = vigenere(key1, key2, c2)
print(plain)
>> itwastotallyinvisiblehowsthatpossibletheyusedtheearthsmagneticfieldxtheinformationwasgatheredandtransmittedundergruundtoanunknownlocationxdoeslangleyknowaboutthistheyshoulditsburiedouttheresomewherexwhoknowstheexactlocationonlywwthiswashislastmessagexthirtyeightdegreesfiftysevenminutessixpointfivesecondsnorthseventysevendegreeseightminutesfortyfoursecondswestidbyrows

Both worked! Now we aren’t restricted to doing this by hand, but we still have no idea what the two keys for K4 are.

Automating Keyword Tests

My first guess was that the primary key that determines the alphabet was “KRYPTOS” since it was the primary key for the first and second ciphers. As for the secondary key, I had no idea where to start. I felt as though it must be a word between 4 and 17 letters long (being generous), most likely a noun. How many words does English have in this range? Too many to test by hand!

Time to use the Natural Language Toolkit (nltk), which plugs into Python. Used for all sorts of language processing, it lets me download and process corpora, including word lists.

from nltk.corpus import words
wordlist = words.words()
testkeys = [w for w in wordlist if len(w) >3 and len(w)<18] # length test
table = str.maketrans('', '', string.punctuation)
test_keys2 = [w.translate(table) for w in testkeys] # remove punct
test_keys3 = [w for w in test_keys2 if w.isalpha()] # remove numbers
test_keys4 = [w.lower() for w in test_keys3] # to lower case

After filtering for word length (between 4 and 17 letters), removing punctuation, removing numbers, and formatting, my final list of possible keys had 233,383 English words.

K4 is not Vigenere with "KRYPTOS"

With a dictionary of possible keywords, I can apply the double-key Vigenere cipher method and examine the output. But how would I know if the plaintext output by my Vigenere decoder was correct? Thankfully, Sanborn has already hinted that “BERLINCLOCK” is in the plaintext. This phrase can serve as a flag: if it appears in the plaintext, I’ve found the solution. Here’s a short program that takes the primary key, my list of possible secondary keys, the ciphertext, and the flag.

def testwords_kryptos(k1, tk_list, c, flag):
    for k2 in tk_list:
        plain = vigenere(k1, k2, c)
        if flag in plain:
            print("***success***")
            print(plain)
            print("k2: ", k2)
            return plain
    print("nope")

This program works as an oracle: I give it candidate primary and secondary keys, and it tells me if the decrypted text is a match. I ran this program on the first two Kryptos ciphers to see if it worked. It did! And almost immediately. 

I ran it on the fourth cipher... and I quickly got a ‘nope’.

I even printed the decryption for every word in my test bank to confirm that it was actually processing all 200k+ words, and it was. But no: “KRYPTOS” plus any word in my list between 4 and 17 letters long does not decrypt to anything containing “BERLINCLOCK”.

Transposition then Vigenere?

Now, it seems clear that K4 will not be decoded by a double-key Vigenere with “KRYPTOS” as the primary key. But perhaps transposition is involved. I checked one popular solution but it did not work.

Karl Wang’s blog suggests that passage 4 can be solved by a transposition of the rows, in the same fashion that passage 3 was decoded. After he walks through the transposition process, he suggests two possible ciphertexts that could produce the plaintext under a double-key Vigenere cipher with the “KRYPTOS” primary key.

Namely, he suggests the ciphertext “EIHHVBGLCYRVONIVSMJYQGUOMOTNLIDEUQBTDBEUXODMWBYEPCMENOXFRBECAERKAOYJQDAUEGBTPYLAUKXBFIASBGPBGOBLI” under 49 columns, or “IVSMJYQYLAUKXBEIHHVBGKAOYJQDTDBEUXOCMENOXFGUOMOTNFIASBGPLCYRVONAUEGBTPDMWBYEPRBECAERLIDEUQBBGOBLI” under 21 columns.

I plugged these candidate ciphertexts into my oracle, testing 233k+ words as the secondary key and found no valid result. Therefore, no, this transposition followed by a double-key Vigenere with KRYPTOS will not generate the plaintext. Perhaps a different transposition method is necessary.

Vigenere then Transposition?

Perhaps double-key Vigenere is applied first, with the primary key “KRYPTOS” and then some transposition method is applied? I added a component in my program that tested all possible plaintexts to see if they had the right letters to produce “BERLINCLOCK,” “EAST,” and “NORTHEAST.” With “KRYPTOS” as the primary key, only one keyword out of all 233k+ produced such an output: “YOUNG”.
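A letter-availability check of this kind can be sketched with `collections.Counter`. This is not the original component: the name `has_letters_for` is hypothetical, and I am assuming the three hints must be spellable at the same time (their letter counts are added together) rather than one at a time.

```python
from collections import Counter

HINTS = ("BERLINCLOCK", "EAST", "NORTHEAST")

def has_letters_for(text, words=HINTS):
    """True if text has enough of each letter to spell all the words simultaneously."""
    needed = Counter("".join(words).upper())
    available = Counter(text.upper())
    # A transposition only reorders letters, so every needed letter must already
    # be present in the candidate text at least as often as the hints require.
    return all(available[letter] >= count for letter, count in needed.items())

# The "KRYPTOS" + "YOUNG" output quoted later in this post.
young_output = ("PPOBBPNDQRNFGGMIDLIUUAVBCMIBBSEUEGMYMCRNHIEGXXQTTQYEOXBBDFTYLAG"
                "KIRLCEOWABADEOOSEYEHLYEUKUBOJJAVOV")
print(has_letters_for(young_output))
print(has_letters_for("BERLINCLOCK"))  # False: no A, S, T, or H for the other hints
```

Passing this check is only a necessary condition: it says the letters are available, not that any particular transposition arranges them into readable text.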

The word “YOUNG” does not sound quite like the other secondary keys from K1 and K2. The only cryptography-related reference I found was in course notes saying that the Kama Sutra of Vatsyayana lists cryptography as the forty-fourth and forty-fifth of the sixty-four arts men and women should know and practice: “Even young maids should study this Kama Sutra…” This seems like a bit of a stretch, but “YOUNG” is the only keyword that produces an output that could be permuted into the hints Sanborn gave.

The output of “KRYPTOS” + “YOUNG” is “PPOBBPNDQRNFGGMIDLIUUAVBCMIBBSEUEGMYMCRNHIEGXXQTTQYEOXBBDFTYLAGKIRLCEOWABADEOOSEYEHLYEUKUBOJJAVOV.”

I attempted the same transposition method that Karl used, but the result was still jumbled.

Testing all words as primary and secondary key

I woke up the next day with the intention of ruling out a straightforward Vigenere cipher altogether. To do that, I threw out the idea that the first key was “KRYPTOS”. What if the first key was something else entirely? To test this, I created a slightly altered function that tested all 233,383 words as key 1 AND key 2.

This program did not finish immediately. In fact, I estimate it will take about 2 months (until approx. 10/2021) to rule out all words for both keys. I will update this blog with the results.
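For readers who want to try a similar search, here is a rough sketch of one way to structure it. It is not the altered function I ran; `keyed_alphabet`, `decode`, and `search_pairs` are illustrative names. Two choices are assumptions on my part: the keyed alphabet is built once per primary key, and repeated letters in the primary key are dropped so the alphabet always has 26 letters (the `vigenere` function above would keep duplicates for a key such as “abscissa”).

```python
import string

def keyed_alphabet(keyword):
    # Assumption: repeated keyword letters are dropped, keeping the first occurrence.
    return "".join(dict.fromkeys(keyword + string.ascii_lowercase))

def decode(alphabet, key2, ciphertext):
    # Index arithmetic equivalent to the table lookup: step back by the key letter's position.
    position = {letter: i for i, letter in enumerate(alphabet)}
    shifts = [position[k] for k in key2]
    return "".join(
        alphabet[(position[c] - shifts[i % len(shifts)]) % len(alphabet)]
        for i, c in enumerate(ciphertext)
    )

def search_pairs(words, ciphertext, flag):
    """Return every (key1, key2) pair whose decryption contains the flag."""
    hits = []
    for k1 in words:
        alphabet = keyed_alphabet(k1)  # built once per primary key
        for k2 in words:
            if flag in decode(alphabet, k2, ciphertext):
                hits.append((k1, k2))
    return hits

# Sanity check on the first ten letters of K1, with a toy word list.
hits = search_pairs(["kryptos", "palimpsest", "abscissa"], "emufphzlrf", "between")
print(hits)

# Size of the full search: every word paired with every word.
n_words = 233_383
print(f"{n_words ** 2:,} key pairs")
```

The pair count comes to roughly 5.4 × 10^10, which is why the run time is measured in months rather than minutes.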

What Does This Mean?

My work does not rule out the presence of a Vigenere cipher altogether. It simply suggests that Kryptos passage 4 cannot be solved with Vigenere alone.

Maybe some other form of transposition is involved. Maybe the “KRYPTOS” + “YOUNG” text can be arranged into something useful. Or maybe a different cipher method is needed altogether.

A limitation: if the encoding method is a double-key Vigenere, the secondary key may simply not be in the dictionary I used. But since “palimpsest” and “abscissa” were both in the dictionary, I’m guessing it is pretty comprehensive. Further, most estimates say English has around 170,000 words, and my dictionary has 233,383 entries.

Thanks to Elonka for her great blog on all things Kryptos. I highly recommend checking it out if you want to learn more.