Index of /pub/archives/ftp-sites/ftp.lanl.gov/public/pflarr

 Name                                         Last modified      Size  Description
 Parent Directory                                                  -   
 2013-03-02.txt                               2013-09-30 00:00  467   
 2013-03-03.txt                               2013-09-26 00:00  192   
 2013-03-04.txt                               2013-09-26 00:00  206   
 2013-03-05.txt                               2013-09-26 00:00   97   
 2013-03-06.txt                               2013-09-26 00:00   92   
 2013-03-07.gz                                2013-09-26 00:00  3.2G  
 2013-03-07.txt                               2013-09-26 00:00  105   
 2013-03-08.gz                                2013-09-26 00:00  2.5G  
 2013-03-08.txt                               2013-09-26 00:00   95   
 2013-03-09.gz                                2013-09-26 00:00  1.6G  
 2013-03-09.txt                               2013-09-26 00:00   73   
 2013-03-10.gz                                2013-09-26 00:00  1.4G  
 2013-03-10.txt                               2013-09-26 00:00   71   
 2013-03-11.gz                                2013-09-26 00:00  3.0G  
 2013-03-11.txt                               2013-09-26 00:00  112   
 2013-03-12.gz                                2013-09-26 00:00  3.0G  
 2013-03-12.txt                               2013-09-26 00:00  126   
 2013-03-13.gz                                2013-09-26 00:00  3.0G  
 2013-03-13.txt                               2013-09-26 00:00  147   
 2013-03-14.gz                                2013-10-10 00:00  3.0G  
 2013-03-14.txt                               2013-12-09 09:12   65   
 2013-03-15.gz                                2013-10-10 00:00  2.2G  
 2013-03-15.txt                               2013-10-10 00:00   75   
 2013-03-17.gz                                2013-10-10 00:00  1.2G  
 2013-03-17.txt                               2013-10-10 00:00   70   
 2013-03-18.gz                                2013-10-10 00:00  2.3G  
 2013-03-18.txt                               2013-10-10 00:00   88   
 2013-03-19.gz                                2013-10-10 00:00  2.4G  
 2013-03-19.txt                               2013-10-10 00:00   82   
 2013-03-20.gz                                2013-10-15 00:00  2.4G  
 2013-03-20.txt                               2013-10-10 00:00   93   
 2013-03-21.gz                                2013-10-15 00:00  2.5G  
 2013-03-21.txt                               2013-10-10 00:00   79   
 2013-03-22.gz                                2013-10-16 00:00  1.9G  
 2013-03-22.txt                               2013-12-09 09:14   96   
 OEL_6r5.iso                                  2014-02-07 10:44  3.6G  
 README                                       2013-12-09 09:23  3.7K

NOTE: This file may change without notice! Be aware of its timestamp on the FTP server. Other files may change too.

=== TRAINING DATA ===
All data from February 2013 is considered training data. Apart from being anonymized, it is untouched.

=== Cases ===
Each day corresponds to one of the cases mentioned in the Summary pdf. Each case has multiple days that follow the case guidelines, typically at least four.
Each day is entirely independent for the inserted intrusions. Callbacks signaling one day will not continue into the next.

There is some variety within each case. The number of domains looked up varies from instance to instance, some behaviours are not consistent, non-uniform hosts may result in non-uniform behaviour, and attacks may simply proceed differently.

== Case 1 ==
3/2, 3/3, 3/4, 3/9, 3/10, 3/16

== Case 2 ==
3/5, 3/6, 3/7, 3/8, 3/11, 3/12, 3/13

== Case 3 ==
3/14, 3/15, 3/17, 3/18, 3/19, 3/20, 3/21

== Case 4 ==
3/22, plus all the prior days without utilizing the hints.

=== HINT FILES ===
For each day in March, there should be a .txt file along with the data for that day. The hint files generally have two sections, a 'hint', and an 'answer'. The 'hint' sections tell you information you can use to try to narrow down your search for the answer. The 'answer' sections are everything you need to know to find everything 'malicious' that was inserted. This is usually just the domains involved, as just about everything else can be derived from that.

=== ODDITIES ===
1. The Feb. data was originally parsed with 'tcp save states' turned on. This means you may see occasional TCP transactions from wildly different time periods pop up. Those are absent from the March data, and you can generally ignore TCP DNS transactions anyway.
2. I skipped March 1st for logistical reasons, for now.
3. The early hint files had more information, like full timelines. That isn't practical for me to do in the long run, but I thought it would be useful to see for at least one day. That day captures the fundamental pattern pretty well.
4. I used the domain 'dormer.wad' to identify hosts actively used on a given day. Its presence does not imply compromise, but compromise implies its presence. On the off chance that you associate certain domains with compromised hosts across multiple cases and this domain pops up, ignore it.
5. There is no February 9th data.
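
As a rough illustration of oddity 4, the sketch below collects the clients that looked up 'dormer.wad' on a given day. It is not part of the data set tooling. The record layout (timestamp, client, question, answer), the function name active_clients, and the choice to match the name exactly rather than including subdomains are all assumptions made for the example.

```python
def active_clients(records, marker="dormer.wad"):
    """Return the set of clients that looked up the marker domain.

    `records` is assumed to be an iterable of
    (timestamp, client, question, answer) tuples; this layout is
    hypothetical and not taken from the data files themselves.
    """
    return {
        client
        for _ts, client, question, _answer in records
        # Normalise case and any trailing root dot before comparing.
        if question.lower().rstrip(".") == marker
    }


if __name__ == "__main__":
    sample = [
        (0.0, "10.0.0.1", "dormer.wad", "192.0.2.1"),
        (1.0, "10.0.0.2", "example.com", "192.0.2.2"),
        (2.0, "10.0.0.3", "Dormer.wad.", "192.0.2.1"),
    ]
    print(sorted(active_clients(sample)))
```

Since the presence of the domain does not imply compromise, the resulting set is only a list of hosts in use that day, not a list of suspects.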

=== Case Solving Example ===
I've been asked many times for an example of how I would solve one of these cases, given just the DNS data as you have it. The trouble with that question is that the problem is interesting precisely because I don't know how to solve it. When we do this in the real world, we generally start with more of a hint than I'm giving for the cases. Working backwards from there, we can then root out a little bit more from these logs about what is going on, and who else is infected.

Though I can't tell you how I would solve any of this, I can tell you where I might start.
1. The vital information here is who is talking to whom, and when.
2. The DNS servers don't really matter, especially given our setup.
3. Given 1 & 2, I'd reduce the data to a timestamp, the client asking the question, the question, and the answer.
3a. In real life, bad guys might be using DNS for more than its basic purpose, so record types other than A records would be interesting. They're not here.
4. Get rid of the baseline, periodic noise, carefully. Remember that beaconing ends up being baseline, periodic noise.
5. Now you can start figuring out which clients are being used by a human, and when.
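
Steps 3 and 4 can be sketched roughly as follows. This is one possible starting point under stated assumptions, not the method used to build or solve the cases. The tab-separated input layout (timestamp, client, server, question, answer), the function names, and the thresholds min_count and max_cv are all hypothetical. Periodicity is judged by a simple coefficient of variation of the gaps between queries, which is a stand-in heuristic rather than anything prescribed here.

```python
from collections import defaultdict
from statistics import mean, pstdev


def reduce_record(line):
    """Step 3: keep only timestamp, client, question, and answer.

    Assumes a hypothetical tab-separated layout:
    timestamp, client, server, question, answer.
    """
    ts, client, _server, question, answer = line.rstrip("\n").split("\t")
    return float(ts), client, question.lower().rstrip("."), answer


def periodic_pairs(records, min_count=5, max_cv=0.1):
    """Step 4: find (client, question) pairs queried at near-constant intervals.

    Returns a dict mapping each flagged pair to its mean interval in seconds.
    Pairs are flagged for review, not deleted, because beaconing looks
    exactly like this kind of baseline noise.
    """
    times = defaultdict(list)
    for ts, client, question, _answer in records:
        times[(client, question)].append(ts)

    flagged = {}
    for pair, stamps in times.items():
        if len(stamps) < min_count:
            continue  # too few lookups to judge periodicity
        stamps.sort()
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        average = mean(gaps)
        # Low relative spread in the gaps means the lookups are regular.
        if average > 0 and pstdev(gaps) / average <= max_cv:
            flagged[pair] = average
    return flagged


if __name__ == "__main__":
    lines = [f"{t}\t10.0.0.1\t10.0.0.53\tupdate.example.\t192.0.2.9"
             for t in range(0, 360, 60)]
    records = [reduce_record(line) for line in lines]
    print(periodic_pairs(records))
```

Keeping the flagged pairs in a separate list, instead of dropping them outright, is one way to honour the "carefully" in step 4.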

The above steps aren't necessarily easy problems in and of themselves. I have some ideas from here, but I don't really want to pollute the thinking about this problem any more than I already have.