
Solving the 2014 DBIR Puzzle Challenge

Written on: May 6, 2014

dbir2014_52

Intro

This year’s challenge was quite…well…challenging. Unfortunately Andrij, Will, and I were not able to repeat last year’s win and had to settle for second place. Frankly, at one point we weren’t sure we were going to finish at all, so we’ll take it! Read on to see our approach to finding the clues and solving the puzzle – and all of the frustrating missteps along the way.

Day 0 – Tuesday, April 22

This year I once again teamed up with my colleagues Andrij (Andy) and Will. Coming off our win last year we were ready for a repeat (Oh how wrong we were!).

dbir2014_5

dbir2014_0

I happened to be up with my 6-month-old daughter at 4 am that Tuesday morning when I saw an email from Verizon with a link to download the pre-release, insider report. Like a careless user reacting to an email from their friendly “IT Helppdesc” I clicked the link and blindly downloaded whatever was at the other end.

I took a look at the front cover, didn’t find anything interesting and immediately turned my attention to the back cover where I found some “hidden” text:

dbir2014_2

The first thing that caught my eye was my last name. As humbled as I was to be named an honorary analyst, seeing my name in the first clue was a bit unsettling and a sign of things to come. I did a quick once-over of the rest of the data. It was in Veris DB format, though I confirmed the incident_id was not valid.

The github binary entry led to https://isitchristmas.com which, like http://beesbeesbees.com, ended up not being a clue.

dbir2014_18

dbir_2014_3

The answer to “Are we going to win the DBIR challenge this year?”

Too sleep deprived to continue, I dumped the clue in an email, sent it off to Will and Andy, and went back to bed.

I woke up later that morning to an email from Andy saying he had found a clue in the first letter of each sentence of the second “notes” field: “hash the whole report first clue”.

dbir_2014_4
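For anyone retracing this step, reading the first letter of each sentence can be sketched in a few lines of Python. This is only an illustration: the `acrostic` helper and the sample text are made up for the example, and the real notes field from the report is not reproduced here.

```python
import re


def acrostic(text: str) -> str:
    """Return the first letter of each sentence in text, lowercased."""
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return "".join(s[0].lower() for s in sentences if s and s[0].isalpha())


# Hypothetical stand-in for a notes field; NOT the text from the report.
sample_notes = (
    "Having fun yet? Always read the notes. "
    "Some clues hide in plain sight. Here is one."
)

print(acrostic(sample_notes))  # prints "hash"
```

The sentence splitting is deliberately naive (abbreviations such as "e.g." would trip it up), which is fine for a short notes field but would need more care on longer prose.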

We calculated the MD5 of the report and Google’d the resulting hash but found nothing. Then we saw this tweet…

dbir2014_6

It seemed we had another day to wait but there was one more thing we needed to do. To avoid making the same mistake as last year and missing a potential shortcut to the win, we checked our friend IMA Hintz’s blogspot page and found this…

dbir_2014_7

The first picture is a shout to fellow puzzle aficionado and one of this year’s co-winners David Schuetz (@DarthNull). The second pic made it obvious we weren’t going to get any help here. At least we had already figured out the first clue and knew where to begin once the challenge officially kicked off.

Day 1 – Wednesday, April 23

dbir2014_8

The next day I was up early, grabbed the “official launch” report, verified the hash was the same, and repeated the Google search. This time I found the following Pastebin post:

dbir2014_1
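Hashing the report and checking that the pre-release and launch copies match, as described above, can be sketched as follows. This is a minimal sketch using Python's standard `hashlib`; the helper names and the file names in the usage note are assumptions for illustration, not the names of the files we actually downloaded.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_md5(path_a: str, path_b: str) -> bool:
    """True when both files hash to the same MD5 digest."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

With two local copies, `same_md5("prerelease.pdf", "launch.pdf")` (hypothetical file names) would confirm they are byte-for-byte the same report, and the digest from `md5_of_file` is the string to paste into a search engine.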

I recognized the hyphen-separated words as all appearing in the content of the report so I did a few things. First, I highlighted all of their appearances to see if it could be a proximity- or content-based clue.

dbir2014_9

I also noted the letter count (how many letters per word) and frequency count (how many times each word appeared in the report). I wasn’t yet sure what to do with any of this data so I started by focusing on the word proximity approach (my first of many mistakes).
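The letter count and frequency count described above can be produced with a short script rather than by hand. The sketch below is an assumption about how one might do it, not the method we used at the time; the sample text and clue words are invented, and the real Pastebin words are not reproduced.

```python
import re
from collections import Counter


def word_stats(report_text, clue_words):
    """Return (word, letter_count, occurrences) for each clue word."""
    # Lowercase and keep only runs of letters so punctuation is ignored.
    tokens = re.findall(r"[a-z]+", report_text.lower())
    counts = Counter(tokens)
    return [(word, len(word), counts[word.lower()]) for word in clue_words]


def frequency_string(report_text, clue_words):
    """Join the occurrence counts with hyphens, in clue-word order."""
    stats = word_stats(report_text, clue_words)
    return "-".join(str(count) for _, _, count in stats)


# Hypothetical report text and clue words, for illustration only.
sample_text = "The breach was a breach of data. Data theft is theft of data."
sample_words = ["breach", "data", "theft"]

print(word_stats(sample_text, sample_words))
print(frequency_string(sample_text, sample_words))  # prints "2-3-2"
```

Counting whole tokens (rather than substrings) matters here: a plain substring search would count "data" inside "database", which could throw the numbers off.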

We spent some time trying to figure out what these words had in common. We also took this opportunity to get hung up on both of the other items in the Pastebin post. First, the A4 hash… Why was it there? Does Verizon release an A4 formatted version in Europe that we need to get our hands on? Wait, A4 top/bottom margins are wider than letter format. Could this be a clue to convert the report into A4 page size and reveal some additional content in the margins? Now you may be saying “but if they originally converted the report from A4 to letter, the extra bit of margins would have been lost”. But you’re forgetting, you can’t think logically when caught up in the hysteria that is the DBIR puzzle challenge…where even the craziest of ideas appear as though they could lead to the promised land (and by “promised land” I mean a nominal prize and minor Twitter recognition).

Ok, that didn’t pan out, so we thought maybe, just maybe, “A4” stood for the A4 Threat Model developed by the Verizon RISK team. But even so, what could it be a hash of? In the end it didn’t matter because it was a dead end. Quite frankly they could have labeled the hash “Z9” instead of “A4” and I still would have found a way to look too deeply into it.

When I wasn’t obsessing over the hash on that first day, I was homed in on the second sentence… “Good luck in your quest, yo”. What an odd choice of words. Why add the “, yo”? Was this sentence an anagram? So many word possibilities presented themselves. …question…, …link…, …go to… Nope, it was another dead end.

Then we made one of the worst discoveries ever…the Fake DBIR blog and Twitter account.

dbir_2014_13

Both the Twitter account and the blog had the following statement:

Absolutely not part of any DBIR crypto clues and 100% unofficial account. #promise

Naturally we read that as:

Absolutely part of the DBIR crypto clues and 100% official account. #promise

The blog had a post titled “Not the second clue” which we of course read as “Definitely…” well you get the idea.

dbir2014_15

I spent the rest of my day trying to figure out what this collection of nonsense might be. Letter distribution was similar to English so I figured it must be some sort of foreign language, encoding, or cipher. There were really only 4 important words I should have keyed in on … NOT THE SECOND CLUE. [the all CAPS represents present-day me yelling at Day 1 me]

At this point I think the following tweet was directed at us:

dbir2014_22

So that’s how we concluded our first day. A flash-in-the-pan start that didn’t gain any true momentum. For good measure I sent the team a summary of what little we had so far, including the word frequencies and called it a night.

dbir2014_10

Day 2 – Thursday, April 24

We started the next day in much the same manner as we ended the previous – looking at the initial few clues with no new ideas. Finally, common sense prevailed and Will took my original word frequency and Google’d it (dashes included). That led to the next clue (and the bane of our existence for the next several days) … the website canadastate.ca

dbir2014_11

We figured what better way to get started than to click the “Get Started” button which presented us with an application form.

dbir2014_16

We followed the instructions (sans video submission) and received the following response:

dbir2014_17

Ok, this was good — we had a defined goal with parameters. We needed to find 5 clues, all related to the canadastate.ca website which were intended to narrow down a data set referred to as an “open-source security incident database”. We knew this was a reference to the Veris Community Database which was available on Github as a csv file:

dbir2014_19

It was our assumption that these clues could be found and applied to the spreadsheet in any order so we turned our attention to the website, divvied up the content, and quickly uncovered several clues:

Clue 1: actor=external

The website had 4 sample video lessons, the first of which was a tutorial on Math Programming. In the upper left-hand corner of the embedded screenshot was a visible “actor=external” clue. This meant the actor.external column of the Veris DB spreadsheet should be set to 1, reducing the data set to 1802 records and getting us one step closer to completing the puzzle.

dbir2014_20

Clue 2: victim.state=CA

The second video lesson was a course on the “Use of Business Lingo”. While watching the video we noticed the clue victim.state=CA scroll across the bottom.

dbir2014_21

Applying this filter to our Veris csv file left us with 194 records. 2 clues down, 3 to go.

Clue 3: action.physical.location.victim.work.area = 1

We noticed a phone number at the bottom of the page, called it, and got our third clue which told us to set the action.physical.location.victim.work.area column value to 1. That left us with 26 records in our Veris data set and 2 clues remaining.
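We applied these filters in a spreadsheet, but the same narrowing can be sketched in Python with the standard `csv` module. Everything below is illustrative: the four sample rows are invented, the column names are written the way they appear in this post, and the real Veris csv may encode values differently (for example True/False rather than 1/0), so treat the comparisons as assumptions.

```python
import csv
import io

# Hypothetical sample in the spirit of the Veris csv; NOT real incident data.
SAMPLE_CSV = """incident_id,actor.external,victim.state,action.physical.location.victim.work.area
a1,1,CA,1
a2,1,CA,0
a3,1,NY,1
a4,0,CA,1
"""

# (column, required value) pairs for the first three clues.
FILTERS = [
    ("actor.external", "1"),
    ("victim.state", "CA"),
    ("action.physical.location.victim.work.area", "1"),
]


def apply_filters(rows, filters):
    """Apply each filter in turn, reporting how many records survive."""
    remaining = list(rows)
    for column, wanted in filters:
        remaining = [r for r in remaining if r.get(column, "").strip() == wanted]
        print(f"{column}={wanted}: {len(remaining)} records left")
    return remaining


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
result = apply_filters(rows, FILTERS)
```

Because each filter is a simple equality test on its own column, the order of application does not change the final set, only the intermediate counts printed along the way.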

Although we wanted to find the remaining clues and solve the puzzle, we had to call it quits for the night. We ended the day pretty happy with our progress and confident we would probably have this thing wrapped up by the weekend.

Day 3 – Friday, April 25

We spent Friday scouring the website for clues and following many potential clues that led to dead ends. Here’s a taste:

Dead End #1 – Threatverica.ca

In the student testimonials section, one of the students is said to be working at threatvertica.ca.

dbir2014_23

The site revealed very little…all of the links were dead and there were no apparent clues in the visible text. We noticed an obvious misspelling in the heading of the page (“Soutions”) and thought maybe it was intentional, so that the same 13-digit word frequency that led us to canadastate.ca could be used as a key to count/extract letters from the page content (a long shot, I know), but it turned out to be nothing.

We found the site content on Github but this revealed nothing of interest either.

dbir2014_24

This site was used to host the April Fool’s DBIR, which we took another look at, but again found nothing.

dbir2014_25

The site ended up being a complete dead end.

Dead End #2 – Fight song

Like last year, this DBIR challenge contained a music remake, though this one was without video and far less enjoyable. We transcribed the lyrics (as best we could). Despite quickly ruling it out as a clue, it may just be the worst Rush remake ever and therefore required each of us to listen to it at least 50 times. See for yourself…


Lyrics (we think):
We are can-state born and bred
with all the knowledge from A to Zed
We never fight alone
We dominate the (?) zone
And if you are wanting a fight
we’ll crush you while polite
Go Can-State!

Dead End #3 – Semper Lorem Hat Society Facebook Page

Perhaps one of the more frustrating false leads was the Facebook page. Clicking the icon at the bottom of the page led to a Facebook page for the Semper Lorem Hat Society, the “honor society for alumni of Canada State University”. We submitted for membership and while we waited for a response, examined the only available content — photos from the July 2013 alumni bash. 

dbir2014_50

dbir2014_26

We searched for the origin of these images via Google to determine whether any had been modified in any way. We also checked to see if together they indicated some sort of clue. Neither panned out. We eventually received confirmation of our membership to the Facebook group, but this didn’t reveal any new content, nor did examining the profiles of the other members. It ended up being a dead end that wasted some valuable time.

There were several other dead ends and I won’t cover them all here, but the following screenshots represent our review of the site in its entirety (caution: final clue spoiler alert in images 3 & 4).

dbir2014_46 dbir2014_47

dbir2014_48 dbir2014_49

The one good thing about chasing down some of these false leads was that we were able to narrow down the scope to what we believed to be the keys to the last two clues.

Promising Lead #1 – Video to Colonel Haberdasher

The fourth video posted to the canadastate.ca site was a lecture given by an old friend, Col Henry Haberdasher. We located the Colonel’s GitHub site and noticed two things: a branch (master) containing his personal home directory and another branch (gh-pages) containing the canadastate.ca web root.

dbir2014_27

One file in particular from the web root caught our eye: grades.csv.gpg.

dbir2014_28

The commit history “encrypted those grades so nobody can get at them” made it apparent that we needed to decrypt the contents of this file using a valid gpg private key. We turned our attention to the Colonel’s home directory and found it.

dbir2014_29

We imported it and decrypted the file.

dbir2014_30
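For reference, the import-and-decrypt step boils down to two `gpg` invocations. The sketch below only builds the command lines (and offers a helper to run them); the key file name `haberdasher_private.asc` is a made-up placeholder, since the actual key file name is only visible in the screenshot above.

```python
import subprocess


def gpg_commands(private_key_path, encrypted_path, output_path):
    """Build the gpg commands to import a key and decrypt a file."""
    return [
        # Add the private key to the local keyring.
        ["gpg", "--import", private_key_path],
        # Decrypt the file and write the plaintext to output_path.
        ["gpg", "--output", output_path, "--decrypt", encrypted_path],
    ]


def run_all(commands):
    """Run each command in order, stopping on the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Hypothetical key file name; grades.csv.gpg is the file named in the post.
commands = gpg_commands("haberdasher_private.asc", "grades.csv.gpg", "grades.csv")
for command in commands:
    print(" ".join(command))
```

Calling `run_all(commands)` would execute them, assuming `gpg` is installed and the key has no passphrase (or an agent is available to prompt for one).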

Opening the spreadsheet, our eyes were immediately drawn to the “Middle_Init” column with the thought that it held some encoded message.

dbir2014_31

We figured we needed to sort and/or filter the data to reveal the message, but every sort and filter we applied got us nowhere. We went so far as to write a script to identify and filter on only the valid Canadian Social Insurance Numbers (of which there were 13). It also revealed nothing.

dbir2014_32
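Our actual script isn't shown here, but the check it performed is straightforward: a Canadian Social Insurance Number is nine digits and must pass the Luhn checksum. A minimal sketch, with the function name chosen for this example:

```python
def is_valid_sin(value: str) -> bool:
    """Check a Canadian SIN: nine digits that satisfy the Luhn checksum."""
    # Keep only digits so "046 454 286" and "046-454-286" both work.
    digits = [int(ch) for ch in value if ch.isdigit()]
    if len(digits) != 9:
        return False
    total = 0
    for index, digit in enumerate(digits):
        # Double every second digit (2nd, 4th, 6th, 8th from the left).
        if index % 2 == 1:
            digit *= 2
            # A two-digit result contributes the sum of its digits.
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(is_valid_sin("046 454 286"))  # prints True (a commonly used example SIN)
print(is_valid_sin("123 456 789"))  # prints False
```

Filtering the decrypted spreadsheet then comes down to keeping the rows whose SIN column passes `is_valid_sin`, which is how we arrived at the 13 valid numbers mentioned above.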

We must have spent at least an hour or two rearranging the content to no avail. Nevertheless, we were pretty convinced one of the last two clues was contained within this spreadsheet.

Promising Lead #2 – The Business School Dinner Menu

The other item that we were convinced contained (or led to) a clue was the College of Business Dinner menu available for download directly from the website.

dbir2014_33

The wines on the dinner menu were all confirmed valid and although we couldn’t find the food items on any public website (I love a Grilled Scallop but had no idea what Tarte Tatin or Spring Ramps were), they all seemed to be legitimate dishes. The capitalization of some of the words in the menu seemed odd so we tried extracting those letters to see if they revealed a clue, but saw nothing.

Both of the images were determined to be unmodified and benign – one was a stock photo and the other was from Wikipedia.

We considered the text at the bottom right (Where, When, Who…) could also be a clue, though nothing in particular seemed out of place or obvious. Nevertheless, it was good for at least an hour of distraction while we attempted to re-arrange the letters into something meaningful.

What really caught our eye was the odd spacing/indentation/font of the text to the right of the tasting menu “EVENT ONLY…”. It definitely didn’t conform to the style of the menu so we focused our attention here. 127 CAD also seemed to be an oddly specific amount though we ruled out currency conversion as that’s too dynamic to use in a puzzle. We thought maybe CAD stood for something else but turned up nothing. That left the peculiar formatting though the more we stared at it, the less sure we became that it was actually a clue.

We also looked at the commit history for this file and noticed there was a previous version:

dbir2014_34

This proved to be another considerable distraction while we spent time trying to figure out what the differences in content could mean. The file size of the newer version was also considerably larger, prompting me to open both files in a hex editor and look for hidden content… turning up nothing.

The other thing we noted from the Github commit history is that the most recent version of the dinner menu was posted along with a pdf of the DBIR cover.

dbir2014_43

The pdf only contained an image of the cover so there was no possibility of hidden text or content that could be extracted. Looking at them side-by-side also didn’t appear to reveal anything of interest.

dbir2014_44

We ended our Friday with these two potential leads but no additional clues. Unfortunately, none of us had any time to spend on the puzzle over the weekend and our upcoming week was pretty busy so we knew our opportunity to solve it was limited.

Day 6 – Monday, April 28

As expected, the weekend yielded no time to devote to puzzle solving and thus no progress. Unfortunately Monday proved to be too busy a day to devote any time to it either. During the day, we saw a tweet from the VZDBIR account saying that a good hint would be forthcoming to anyone who had submitted their canadastate.ca application.

dbir2014_45

Clue 4 – asset.assets.Media = 1

The 5 pm hint was as follows:

dbir2014_36

We immediately realized we did not account for the +/- in the grades when we originally sorted the spreadsheet. Doing so (and then sorting by last name) revealed the following clue:

dbir2014_35
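The catch with the grades is that a plain alphabetical sort puts "A" before "A+" and "A-", so the +/- has to be ranked explicitly. A minimal sketch of that idea follows; only the `Middle_Init` column name comes from the spreadsheet as described in this post, while `Grade`, `Last_Name`, the sort direction, and the sample rows are assumptions for illustration.

```python
# Explicit ranking so that A+ > A > A- > B+ and so on.
GRADE_ORDER = ["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D", "D-", "F"]
GRADE_RANK = {grade: rank for rank, grade in enumerate(GRADE_ORDER)}


def read_message(rows):
    """Sort by grade (honoring +/-), then last name, and join the initials."""
    ordered = sorted(rows, key=lambda r: (GRADE_RANK[r["Grade"]], r["Last_Name"]))
    return "".join(r["Middle_Init"] for r in ordered)


# Hypothetical rows; NOT the contents of grades.csv.
sample_rows = [
    {"Last_Name": "Davis", "Grade": "A-", "Middle_Init": "T"},
    {"Last_Name": "Clark", "Grade": "A", "Middle_Init": "N"},
    {"Last_Name": "Baker", "Grade": "A+", "Middle_Init": "I"},
    {"Last_Name": "Adams", "Grade": "A+", "Middle_Init": "H"},
]

print(read_message(sample_rows))  # prints "HINT"
```

Sorting the same rows on the raw grade string would yield "NHIT" instead, which is the kind of near-miss gibberish that sent us in circles on Friday.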

Ok, down to 4 records in our Veris data set and 1 clue remaining. Our Veris data set now looked as follows:

dbir2014_39

I was still convinced the final clue was in the dinner menu but unfortunately we didn’t have time to pursue it. A bit later that day, our closest competitors, David Schuetz and Alex Pinto, announced that they had teamed up and submitted their winning solution, which was subsequently confirmed by Verizon. No first place for us this year.

The coming days were equally busy for us so we had all but given up on solving this year’s puzzle. When one of us had a free moment we would try something else – overlaying files to see if their content lined up to reveal a clue, applying the original 13-word frequency count to various content to see if it uncovered hidden messages – but nothing seemed to lead to a clue as obvious as those we had found so far. Out of time and ideas, we figured we were done.

Day 9 – Thursday, May 1

By now I had pretty much come to terms with the fact that we weren’t going to solve it when I got a DM from David Schuetz asking if we were any closer. I responded with several of the things we had tried and that although we thought the dinner menu was the key, couldn’t find a definitive clue. We traded several more messages about what we had seen, including the following when we overlaid the menu and the DBIR cover.

dbir2014_37

The fact that they overlaid perfectly was definitely telling, but again, we saw no obvious clue. At first we thought “Event: Insider Misuse” could be a clue but based on the Veris data set, that made no sense. We noticed the overlap between the words “insider & only”, “miscellaneous & small” and “physical & business” and thought it could possibly be “insider only” or “only small business”, but we discounted those as well. The latter seemed the most promising, but as I relayed to David, we discounted it because although there is a seemingly relevant column in the Veris data set (victim.employee_count), there is a corresponding value of “small” which did not appear in our current data set.

dbir2014_38

Our current 4-item data set had the following remaining values for employee_count.

dbir2014_40

Had we found this clue in a different order we would have seen (and filtered on) the value “small”, which would have completely changed the data set.

dbir2014_41

Of course, we could have made the assumption that we should filter on organizations with employee_counts of “small”, “101 to 1000”, “11 to 100” and “1 to 10”, but we thought this left too much to interpretation. For this reason and the fact that the clue was not in the same format as the other four, we discounted its validity.

Nonetheless, David encouraged us to submit our solution (and who were we to argue with this year’s winner?).

Clue 5 – victim.employee_count = small

Unconvinced, we applied the final filter to the four remaining items (removing the entries with organizational sizes of “unknown”) which reduced the data set to two items. We submitted our solution with proof of discovery for each clue and shortly after received confirmation that we were the 2nd team to solve this year’s puzzle. Phew!
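To show why the final clue felt ambiguous, here are the three readings of “small” side by side. The four sample records and their values are hypothetical stand-ins (the real ones are in the screenshots above), and the helper names are invented for this sketch.

```python
COLUMN = "victim.employee_count"

# The "small business" buckets we considered when interpreting the clue.
SMALL_BUCKETS = {"small", "101 to 1000", "11 to 100", "1 to 10"}


def exact_small(rows):
    """Reading 1: keep only rows whose value is literally 'small'."""
    return [r for r in rows if r[COLUMN].lower() == "small"]


def small_buckets(rows):
    """Reading 2: keep rows whose value falls in any small-sized bucket."""
    return [r for r in rows if r[COLUMN].lower() in SMALL_BUCKETS]


def drop_unknown(rows):
    """Reading 3: simply remove rows with an unknown size."""
    return [r for r in rows if r[COLUMN].lower() != "unknown"]


# Hypothetical remaining records; NOT the actual four incidents.
sample_rows = [
    {"incident_id": "r1", COLUMN: "Unknown"},
    {"incident_id": "r2", COLUMN: "11 to 100"},
    {"incident_id": "r3", COLUMN: "Unknown"},
    {"incident_id": "r4", COLUMN: "101 to 1000"},
]

for reading in (exact_small, small_buckets, drop_unknown):
    ids = [r["incident_id"] for r in reading(sample_rows)]
    print(f"{reading.__name__}: {ids}")
```

On this sample the literal reading leaves nothing at all, while the other two happen to agree, which mirrors why we hesitated: the answer depends on which interpretation the puzzle author had in mind.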

dbir2014_51

Conclusion

I must say we were a bit disappointed with the ambiguity of the final clue. The fact that it left room for interpretation and was considerably different in format from the other clues made it too easy to dismiss. Then again, we probably shouldn’t have over-thought it so much. Nevertheless, we’re glad that we were able to once again solve the puzzle, win or not.

Final clue aside, every year the DBIR puzzle promises some serious critical thinking and this year was no exception. It was a challenge and a lot of fun. A huge congrats to this year’s winning team, David Schuetz and Alex Pinto, as well as this year’s third place finisher, Michael Oglesby.

The DBIR puzzle is a great team exercise and if you haven’t tried it, I strongly recommend recruiting some friends or co-workers and participating next year. It’s more fun when there’s a lot of competition! Until then, crank some Rush, grill up some scallops or some squab, and enjoy the quest, yo!

Mike Czumak, Andrij “Andy” Kuzyszyn, and Will Pustorino

