Old Dominion University - Visual Investigator

VAST 2011 Challenge
Mini-Challenge 3 - Investigation into Terrorist Activity

Authors and Affiliations:

Matthew Kelly, Old Dominion University, me@matkelly.com
Dr. Michele C. Weigle, Old Dominion University, Faculty Advisor, mweigle@cs.odu.edu


All of the tools used were created specifically for this project. The languages and systems used include Python (http://www.python.org/), PHP (http://php.net/), MySQL (http://www.mysql.com/), phpMyAdmin (http://www.phpmyadmin.net), XAMPP (http://www.apachefriends.org/en/xampp.html) and JavaScript, which were combined to create the investigation system. The system was created in June 2011 at Old Dominion University in Norfolk, Virginia. Much of the programming relied on GET requests to iframes within a page, so the tool could readily be exposed as an API for other investigative projects of this nature.
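As a rough illustration of the GET-based design, the sketch below builds the kind of URL a page could load into an iframe. The host, script name and the `term` parameter are hypothetical placeholders, not the tool's actual endpoints.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the real script names and parameters are not documented here.
BASE_URL = "http://localhost/investigator/search.php"

def build_search_url(term, base_url=BASE_URL):
    """Return a GET URL that carries the search term as a query parameter."""
    query = urlencode({"term": term})  # escapes spaces and special characters
    return f"{base_url}?{query}"

def iframe_markup(url):
    """Return minimal HTML that loads the given URL inside an iframe."""
    return f'<iframe src="{url}"></iframe>'

search_url = build_search_url("stolen")
markup = iframe_markup(search_url)
```

Because every request is a plain URL, another project could call the same pages directly without going through the interface.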




The video can be viewed here or accessed directly here.



MC 3.1 Potential Threats: Identify any imminent terrorist threats in the Vastopolis metropolitan area. Provide detailed information on the threat or threats (e.g. who, what, where, when, and how) so that officials can conduct counterintelligence activities. Also, provide a list of the evidential documents supporting your answer.


Our analysis revealed that the group "Citizens for the Ethical Treatment of Lab Mice" poses an imminent threat: it intends to release engineered airborne microbes on passengers aboard a plane at some point in mid-May, exploiting insufficient airport security.


The imminent threat became apparent to us by examining the source documents and noting repeated mentions of the group "Citizens for the Ethical Treatment of Lab Mice". As documented in Article 00008, this group seeks vengeance on those who have tested on mice. The livestock deaths (Article 02385), which were determined to be caused by microbes (Article 04085), shortly precede the robbery at Vast University (Article 01785), in which expensive equipment capable of easily modifying existing organisms to create new biological hazards was stolen from Professor Patino. Patino is renowned for his familiarity with dangerous microbes (Article 03212), which would make the equipment useful to The Citizens.


The above was determined by a series of searches using the Visual Investigator tool, supplemented by direct queries to a database loaded with the article data, for simplicity and to work within the time constraints of the project. The initial task was to find a pivot from which to start developing the investigative corpus. We did this by choosing a variety of words that reflected our familiarity with the domain of terrorism. The words initially tried were "bomb", "kill", "assassinate" and "die", until we realized that these words were more likely to describe terrorism or events that had already happened than imminent threats. With this in mind, the first word we felt would signal suspicious activity was "stolen", which we used as the pivot word for further investigation. The search for instances of "stolen" brought Professor Patino into the picture. Patino has had previous conflict with The Citizens, who likely stole the lab equipment to use for microbial manipulation. The livestock deaths were attributed to microbes, but Article 03740 reassured readers that the microbes were not transmissible to humans. The pieces come together in the sub-corpus formed by the articles above, which points to The Citizens as the group most likely to be involved in an imminent terror threat: they have a motive and have formally stated (Article 00008) that they want people, not mice, to be "tested on".



Fig. 1 The starting interface asks for a search term to be entered.


The sequence of articles shows continuity:

Mass Animal Deaths - 02385
April 1, 2011

Manufacturing Dangerous Microbes - 03212
April 11, 2011

Update on Animal Deaths - 03740
Trespassers on-site of animals' deaths
April 14, 2011

CDC Publication on Bioterrorism - 03040
April 18, 2011

Animal Deaths in City Caused by spore forming microbes - 04085
April 20, 2011

Robbery at Vast University - 01785
April 26, 2011

Animal Activist Threatens Press - 00008
May 9, 2011



Fig. 2 After we searched for "stolen", the database returned information on the word, including its TF-IDF and access to the articles containing the word.


Further investigation of the corpus showed that there had been recent issues with food and water services on airlines, particularly potential contamination of goods provided to passengers (Article 00028). Paired with the option to check baggage curbside for domestic flights (Article 00152), this would give The Citizens a suitable opportunity to carry out the action we have hypothesized. Because an engineered airborne microbe would spread readily in the close conditions of a plane, this setting would also maximize the likelihood of passengers retaining the microbe (Article 03040) and prevent detection until it was too late.



Fig. 3 After clicking the "See Articles" button, we are returned a list of articles with brief information about each article, an option to add it to our corpus and an option to search for similar articles.


We created a tool from scratch for this investigative task and dubbed it "Visual Investigator". The code combines Python, PHP, JavaScript and MySQL with the XAMPP package, which eases deployment of an Apache/MySQL stack onto a Windows machine. The database was loaded with the article data and pre-processed to quantify term frequency and limit query latency. We used the tool by entering search terms we thought applicable to the problem into a search field. The tool then queried the database, calculated the Term Frequency-Inverse Document Frequency (TF-IDF) of the term against the corpus of all provided articles, sorted the results and let the user continue the investigation with this data. From there, we chose the term we wished to investigate, which returned a list of articles with title and content correlated, along with a means to retain selected articles in a sub-corpus, i.e. a collection, built progressively while using the application, that is a subset of the entire article corpus. We repeated this procedure, iteratively examining the chosen articles, until we found the commonality binding them together, which led to our hypothesis about The Citizens' motive.
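The search step can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's actual code: it assumes the common TF-IDF form (the term's share of a document's tokens multiplied by log(N / documents containing the term)), keeps everything in memory instead of MySQL, and uses invented placeholder texts and hypothetical function names.

```python
import math
import re

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tf_idf_scores(term, documents):
    """Return {doc_id: TF-IDF score} for every document containing `term`.

    `documents` maps a document id to its raw text. TF is the term's share of
    the document's tokens; IDF is log(N / number of documents with the term).
    """
    term = term.lower()
    tokenized = {doc_id: tokenize(text) for doc_id, text in documents.items()}
    matching = [doc_id for doc_id, tokens in tokenized.items() if term in tokens]
    if not matching:
        return {}
    idf = math.log(len(documents) / len(matching))
    return {
        doc_id: (tokenized[doc_id].count(term) / len(tokenized[doc_id])) * idf
        for doc_id in matching
    }

def rank_articles(term, documents):
    """Return ids of documents containing `term`, highest TF-IDF first."""
    scores = tf_idf_scores(term, documents)
    return sorted(scores, key=scores.get, reverse=True)

# Invented placeholder texts, not excerpts from the challenge articles.
toy_corpus = {
    "a": "equipment stolen from lab",
    "b": "stolen stolen goods",
    "c": "weather report today",
}

sub_corpus = set()                  # articles retained for closer reading
ranked = rank_articles("stolen", toy_corpus)
sub_corpus.add(ranked[0])           # keep the top-ranked article
```

Pre-computing the token counts once, as the database pre-processing step does, would avoid re-tokenizing every article on each query.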



Fig. 4 An article of interest is found and retained.


A similarity engine and accompanying interactive visualization were also created for the Visual Investigator; they apply a rudimentary similarity algorithm that relates articles to one another through the similarity of their keywords. This basis for similarity proved insufficient for associating articles in the clustered form we had hoped for. This was not for lack of commonality in the articles' content; rather, article similarity requires a more domain-specific algorithm than word commonality. A better approach would have been to develop some sort of directed graph that exhibits the causal relationships between articles.
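The exact similarity measure is not specified above. As one plausible reading of "similarity of keywords", the sketch below scores article pairs by Jaccard overlap of their keyword sets. The stopword list, the length filter and the function names are assumptions made for illustration, and the sample texts are invented.

```python
import re

# Small illustrative stopword list; an assumption, not the tool's actual list.
STOPWORDS = {"the", "and", "for", "that", "with", "was", "were", "from"}

def keywords(text):
    """Return the set of lowercase words longer than two letters, minus stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOPWORDS}

def jaccard(a, b):
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def rank_similar(target_id, documents):
    """Return (doc_id, similarity) pairs for all other documents, most similar first."""
    target = keywords(documents[target_id])
    scored = [
        (doc_id, jaccard(target, keywords(text)))
        for doc_id, text in documents.items()
        if doc_id != target_id
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Invented placeholder texts, not excerpts from the challenge articles.
toy_docs = {
    "x": "microbes killed livestock",
    "y": "livestock deaths blamed on microbes",
    "z": "airport baggage rules",
}

similar_to_x = rank_similar("x", toy_docs)
```

A measure like this rewards shared vocabulary only, which is consistent with the limitation noted above: two articles can be causally linked while sharing few words.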




Fig. 5 After clicking the "similar" button, the user is shown an interface where articles' relevance is depicted spatially, with the most relevant articles being closest to the bull's-eye.
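One simple way to produce a layout like the one in Fig. 5 is to map each article's relevance score to a distance from the center. The sketch below is an assumption about how such a mapping could work, not the tool's actual layout code: scores are taken to lie in [0, 1], angles are spread evenly, and `max_radius` is an arbitrary illustrative value.

```python
import math

def bullseye_positions(scored, max_radius=100.0):
    """Map (doc_id, relevance) pairs to (x, y) points around a center at (0, 0).

    Relevance is clamped to [0, 1]; a score of 1 lands on the bull's-eye and a
    score of 0 lands on the outer ring. Angles are spaced evenly so that the
    points do not overlap along a single ray.
    """
    positions = {}
    count = len(scored)
    for index, (doc_id, relevance) in enumerate(scored):
        relevance = min(max(relevance, 0.0), 1.0)
        radius = (1.0 - relevance) * max_radius
        angle = 2 * math.pi * index / count
        positions[doc_id] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions

# Hypothetical relevance scores for two articles.
layout = bullseye_positions([("y", 0.4), ("z", 0.0)])
```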