Tuesday, December 18, 2018

Doing vulnerability assessment of my own code...It's bad.

Introduction:
I took the "Application Security: for Hackers and Developers" class at Derbycon this year. More about the course is here: https://www.vdalabs.com/expert-training/security-training-courses/security-for-hackers-and-developers/ Videos are also available on Pluralsight. Anyways, the class is focused on searching for vulnerabilities using various methods such as static analysis, dynamic analysis, binary analysis, fuzzing, etc.

For a class I was taking this semester, I decided to write my final paper on different techniques used to review code and find security issues. I also decided to look at a project I worked on a long time ago for the hands-on part.

The project is runthelabs (can be found here: https://github.com/BoredHackerBlog/runthelabs). It is a Python+Flask based web app that takes in a JSON configuration file and creates a virtual environment. It uses Minimega (minimega.org), KVM, OpenVSwitch, and NoVNC. It also uses SQLite for holding data.

The point of it is that a teacher creates a JSON file with virtual environment specifications, uploads it, and starts the lab. The teacher can copy and send NoVNC links to students/groups so they can VNC into a VM and work on whatever. More info here: https://github.com/BoredHackerBlog/runthelabs/tree/master/documentation

When I put the project on Github, I knew it could have some kind of injection vulnerability. The code was written so long ago and I never got around to updating everything. (Laziness is not good for security.)

Goal/Testing Purpose and Scope:
The goal of this testing is to apply code review and security testing techniques and find security issues in my project. The scope is just my application/code. Third-party code and issues related to Minimega, KVM, OpenVSwitch, etc. are out of scope.

Software Internals:
api.py is the main Flask app. There is a WebUI and an API way of interacting with the app.
config.py stores config information (paths to files, etc.)
dbcontrol.py is responsible for interacting with the SQLite DB.
mmcontrol.py is responsible for executing commands related to minimega, iptables, and openvswitch.
mmstart.py parses the JSON file uploaded by the admin and uses mmcontrol to set things up.

There are two user roles: one is admin and the other is student/unauthenticated.

Here's what the admin does: uploads a JSON config file and starts the lab (which turns on VMs and sets up networking). Optionally, the admin can turn the whole lab off, reboot VMs, change VNC passwords, and finally, share NoVNC links with the students.

VNC can be accessed via RealVNC or other VNC software with the correct port and password, or via NoVNC.

Unauthenticated user: they can check server status (whether it's up or not; not very useful) and access VMs via VNC, if they have the URL or the password+port.

The software uses SQLite DB to keep track of VM name, password, and port.

Port 1337 is used for WebUI and API. Port 1338 is used for Websockify/NoVNC.

Testing Setup:
To set up a testing environment, I needed one server to run the web app and two machines: one for static analysis/dynamic analysis/hacking and the other for the Admin/Teacher role.

Static Analysis:
Static analysis is analyzing the code without running it. Here are useful OWASP links: https://www.owasp.org/index.php/Static_Code_Analysis & https://www.owasp.org/index.php/Source_Code_Analysis_Tools

I started by using bandit (https://github.com/PyCQA/bandit) to scan my code. Here are some of the issues:

  • Subprocess module is in use
  • Hardcoded password
  • Use of md5 function (used to generate VNC password, not a vuln)
  • Binding to all interfaces
  • Starting a process with shell (using os.system)
  • Starting a process with shell, with possible injection (it detects when external variables are used)
I also used Python-Taint (https://github.com/python-security/pyt). It found that I was using a URL parameter as an input for SQL queries.

These tools would definitely be useful on a larger project, and they did find useful things here.

I am also doing manual analysis. OWASP has guides on how to do a code review (https://www.owasp.org/index.php/Category:OWASP_Code_Review_Project) and I'm using those as well. 

Here's what OWASP recommends focusing on:

OWASP also recommends looking at inputs and data flow. They have more recommendations, but I want to focus on the vulnerability areas listed below and on inputs.

  • Data Validation: There isn't any. json.loads is used, which validates that the upload is JSON; however, that doesn't really matter. For starting a lab, the JSON structure has to be correct, but the values don't have to be. Whether something is an int, string, etc. isn't checked. The uploaded file isn't saved on disk either.
  • Authentication: Admin has to login to use WebUI or API. api.py has a hardcoded password.
  • Session Management: Basic auth is used, so there isn't any.
  • Authorization: N/A
  • Cryptography: The WebUI/API access does not use SSL/TLS, and neither does the NoVNC connection. If someone was eavesdropping, they could get credentials.
  • Error Handling: Yes! I'm doing try-except and returning a generic error message. Also, inside the try block, I'm doing if checks and returning generic messages. It's not perfect; there are some flaws.
  • Logging: None
  • Security Configuration: N/A
  • Network Architecture: The web app does bind to all interfaces.
As for inputs, only the admin has input capabilities. They can upload a JSON file, reboot VMs, and change VM VNC passwords.


Here's the example config file:

For JSON file upload, it's done through /upload. The file is assigned to the labconfig variable. When the lab is turned on (through /on), startlab() is called, which creates a db and calls mmstart.startmm(labconfig). startmm calls mmcontrol.start_mm(), which starts Minimega. After that, the JSON file is processed: first gre is looked at, then dhcp, then internet, and finally the VMs. For each VM, mmcontrol.vm_config ends up being called, which runs an os.system statement with networking info. Throughout the JSON processing, there are several places a command could be injected.

For VM reboot, here's what ends up being run when vmname is supplied via a GET request:
mmcontrol.vm_reboot(vmname, dbcontrol.get_password(vmname)), and in vm_reboot() this statement is executed first: os.system(minimega_cmd + "vm kill " + vm_name). Injection can happen here.
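The vulnerable pattern, and one way to harden it, can be sketched like this (the minimega path and function names here are hypothetical, not the actual runthelabs code):

```python
import re

def safe_vm_name(vm_name):
    # Whitelist validation: only allow alphanumerics, dash, and underscore,
    # which blocks shell metacharacters like ; | & $ ( ) entirely.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", vm_name):
        raise ValueError("invalid vm name: %r" % vm_name)
    return vm_name

def build_vm_kill_args(vm_name, minimega_cmd="/usr/local/bin/minimega -e"):
    # Building an argument list for subprocess.run() instead of concatenating
    # a shell string for os.system() means the shell never parses user input.
    return minimega_cmd.split() + ["vm", "kill", safe_vm_name(vm_name)]
```

In the real app, the os.system concatenation would be replaced by something like subprocess.run(build_vm_kill_args(vmname), check=True).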

For VM VNC password reset, mmcontrol.set_password() is used, which executes os.system(minimega_cmd + "vm qmp " + vm_name + " " + json.dumps(vncpwcmd)) first. In this case, the injection could occur in the middle of the statement.

Dynamic Analysis:
For dynamic analysis, the application has to be running. I used OWASP ZAP, Subgraph Vega, and Nikto. They didn't actually find anything useful, which was expected; however, Nikto did guess the hardcoded admin/admin login.

I started doing manual analysis. I would use Burp but it really isn't needed for now.

First, I uploaded a random file, which didn't do anything. Then I uploaded a random JSON file, and it was accepted. The labs obviously won't start since it's not a valid config file. After that, I took the example config file and injected commands. Here's what the new file looks like:


The command injection worked:


After this, I uploaded the example JSON configuration which was included and started the lab the way it should be so I can try to mess with GET parameters.

First the reboot:


It worked:


Next, the VNC password reset:


That also worked:


Another issue with this software is cross-site request forgery. Since I'm not using session management or any other security, when a request is made via another webpage, if the admin is logged in, the request will get processed.

For example: <img src="http://10.0.0.53:1337/reboot/tc2" width="0" height="0" border="0"> embedded in another HTML page does cause a reboot for tc2. Of course, since I wasn't doing any checking to see if the VM name is valid, the attacker does not need to know a real VM name to execute commands.

Here's my new PoC code; the asdf VM doesn't exist:



I logged into the admin account on another machine and opened the poc webpage:

On my "hacker" machine:


Dynamic analysis helps with confirming/validating some of the findings in static analysis.

Findings:

  1. Binding to all network interfaces
    1. This is bad because, depending on the network configuration, the WebUI can be accessed from inside the VMs
  2. Command injection
    1. Bad but only admin can do it (unless CSRF is used)
  3. Lack of data validation
    1. I should have validated everything in the JSON file and even the data from GET requests. For example, if vmname is supplied for a reboot, I should check whether that VM exists. I should also make sure I only allow a-z, A-Z, 0-9 as input chars.
  4. Bad session management
    1. I probably shouldn't have used basic auth. Flask (or modules on top of Flask) has session management mechanisms that I could have used. CSRF token should be used. There are other web app protections that could be used as well.
  5. Cross-site request forgery
    1. CSRF token should be used. 
  6. Lack of cryptography
    1. The way I imagined this web app would be used didn't require adding ssl/tls protection but it's still something I wanted to point out.
  7. Error handling could be better
    1. Error messages are generic. More detailed messages would be useful. Also, more error checking should be done. For example, if someone starts a lab with a bad JSON file, the code still starts the minimega binary. That should not happen. The return value from os.system should be checked too.
  8. Lack of logging
    1. I should have been logging some stuff, mainly errors. 
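Findings 4 and 5 go together: with real sessions in place, a session-bound CSRF token is straightforward. A minimal, framework-free sketch of the idea (function names are mine, not from runthelabs; in practice Flask's session object or an extension like Flask-WTF would handle this):

```python
import hmac
import secrets

def issue_csrf_token(session):
    # Store a random token server-side and embed the same value in each
    # form; a cross-site page can trigger requests but can't read the token.
    token = secrets.token_hex(16)
    session["csrf_token"] = token
    return token

def check_csrf_token(session, submitted):
    expected = session.get("csrf_token", "")
    # hmac.compare_digest is a constant-time comparison
    return bool(expected) and hmac.compare_digest(expected, submitted)
```

A hidden img tag like the PoC above can't supply the token, so the forged request would be rejected.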
Basically, there are three ways to get root on the system running runthelabs. A non-admin user can use CSRF w/ command injection. A malicious admin can use various command injection points. Finally, a MITM attack can be used to capture admin credentials and those could be used to execute a command injection attack.

Conclusion:
It's possible that I may have missed something. Static and dynamic analysis both definitely were useful. OWASP is a great resource on code review. Also, this Github awesome list is very useful: https://github.com/mre/awesome-static-analysis

The security issues occurred due to laziness, and the risk/chances of exploitation were low. Also, I accepted the risk of possible command injection by the admin when I was programming. The impact is high, though, since you can get root pretty easily with CSRF or as a malicious admin.

Monday, November 5, 2018

Computer usage and health

Introduction:
If you're in infosec or any other computer-focused job such as sysadmin or programmer, you may be spending a lot of time on a computer and/or sitting at a desk all day (or at least more than the average human being). This may come with health problems related to, but not limited to, the hands, eyes, back, and neck. In this post, I'll try to provide tools and tips that may help limit injuries or pain.

Disclaimer: I am not a doctor. This blog post does not provide any cures. Check links in the resources section for more information. 

Tools/Tips:
One of the main things you do when using a computer is staring at your screen. I'm not sure if this affects your vision long-term or not, but it may certainly cause strain or dry eyes. There are applications you can install to remind you to look away or take a break from staring at your screen. These applications include Workrave (http://www.workrave.org/) and Eyeleo (http://eyeleo.com/). There are more if you check AlternativeTo (https://alternativeto.net/). I have used Workrave in the past, but currently, I use Eyeleo. Depending on your settings, Eyeleo will give you a popup suggesting various eye movements (rolling your eyes, for example) or looking away. Workrave also gives you popups about doing exercises at your desk. You may have to tune the time settings to make sure the popups don't get too annoying and you can still remain productive.

You can also adjust your screen brightness level depending on the light level around you. Android phones and tablets usually have an option to adjust brightness based on the sensor included on the phone. Another thing you can do is use a blue light filter option on your devices. Again, Android phones and tablets may include this option in their settings as well. For Windows, I use f.lux (https://justgetflux.com/), which is pretty popular. It will adjust your screen color based on the time of day. 

Finally, you can get glasses designed specifically for computer users. I'm pretty sure they protect your eyes from blue light; besides that, I'm not 100% sure what else is different about them. Additional features or protections may depend on the manufacturer, I guess.

"Protect Ya Neck" - Wu Tang Clan
You may get neck issues depending on how your monitor is positioned/angled. Make sure your monitor is in front of you and you don't have to keep your neck tilted or twisted to view it. Position the monitor so you're not getting glare or reflection. Keep your monitor clean as well. Keep in mind the height and distance of the monitor compared to your eye level. You shouldn't have to bend your neck down to view what's on the screen. Check the resources for more information on setting up your monitor. 

Pay attention to your posture when using a computer. Make sure you're not cutting off or reducing blood circulation to your hands because of the way you're using the keyboard. Try to keep your back straight. Keep your forearms and wrists aligned with the keyboard and mouse. Your feet should be flat against the ground. Check the links in the resources for a diagram. 


Instead of sitting all day, you can also stand at your desk. Adjustable standing desks exist, and you can also buy kits that convert your desk into an adjustable standing desk. If you do use a standing desk, make sure not to stand ALL day; switch between sitting and standing. Also, if you're standing, use an anti-fatigue mat to stand on.

An ergonomic keyboard and mouse may make your hands more comfortable when using a computer. They can be used with your natural hand position and may help prevent carpal tunnel (not sure how true that is). There are many options out there when it comes to ergonomic keyboards and mice; you may just have to test and find what feels most comfy to you.

Some workplaces may have people in charge of ergonomics/human factor or occupational health. I've worked at a place that has had people like that. They can help make sure your work environment is comfortable and safe. Check with HR. 

That's all! Hopefully, some of this information was useful to whoever is reading this. The links in the resources are probably more helpful.

Resources:
http://www.workrave.org/
http://eyeleo.com/
https://alternativeto.net/software/eyeleo/
https://alternativeto.net/software/workrave/
https://justgetflux.com/
https://www.digitaltrends.com/mobile/how-to-use-blue-light-filter-phone/
https://ergo-plus.com/office-ergonomics-position-computer-monitor/
https://www.ccohs.ca/oshanswers/ergonomics/office/monitor_positioning.html
https://www.spineuniverse.com/wellness/ergonomics/workstation-ergonomic-tips-computer-monitors-posture
http://www.healthycomputing.com/office/setup/monitor/
https://lifesworkpt.com/2018/03/proper-computer-posture/
https://www.wikihow.com/Sit-at-a-Computer
http://ergonomictrends.com/proper-sitting-posture-computer-experts/
https://www.mayoclinic.org/healthy-lifestyle/adult-health/in-depth/standing-workstation/art-20088544
https://www.uclahealth.org/safety/sitting-to-standing-workstations
https://www.doityourself.com/stry/choosing-the-best-ergonomic-keyboard-five--essential-features

Friday, June 1, 2018

Extracting winner info from gameplay video with OpenCV and Tesseract OCR

A long time ago, while I was on Youtube, I came across a video of Trials Fusion gameplay on a channel named "CaptainSparklez." In this video, two players are racing each other in Trials Fusion. There is a playlist full of videos of CaptainSparklez and NFEN playing Trials Fusion. I thought it would be cool to see if I could use OpenCV and OCR to figure out who won which maps, or at least who won a map.

Typically, the winner information is presented as shown below:



(from video: https://www.youtube.com/watch?v=W528UyfC42k)

As you can see in the screenshot above, there is a specific area where map and winner information is presented. To extract that specific information, I used Python, OpenCV, and Tesseract OCR. OpenCV will allow me to process the videos and Tesseract OCR will be able to extract the information for me.

OpenCV allows you to view a file frame by frame and the code example is shown here: https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_gui/py_video_display/py_video_display.html#playing-video-from-file

Here's what it looks like when I process the video:



In the screenshot above, we have the 720p video (after some testing, I found that 720p was probably the best choice for this) and a frame extracted from that video. Since I'm just interested in the area of the frame that contains the map and winner information, I decided to crop that area.

This post has more information on how to crop: https://www.pyimagesearch.com/2014/01/20/basic-image-manipulations-in-python-and-opencv-resizing-scaling-rotating-and-cropping/
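Since a decoded frame from OpenCV is just a NumPy array of shape (height, width, channels), cropping is plain array slicing. A quick sketch with a stand-in frame (the coordinates here are made up for illustration, not the ones I actually used):

```python
import numpy as np

frame = np.zeros((720, 1280, 3), dtype=np.uint8)  # stand-in for a decoded 720p frame
y1, y2, x1, x2 = 280, 440, 320, 960               # hypothetical banner region
banner = frame[y1:y2, x1:x2]                      # same slicing works on a real cv2 frame
print(banner.shape)  # (160, 640, 3)
```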

After cropping the frame appears as shown below:



After I had the cropping part working correctly, I moved on to OCR with Tesseract. OpenCV allows you to convert the frame to grayscale or black and white. I tested OCR with both grayscale and black-and-white frames, and it did not make much of a difference in the OCR output.

To have Tesseract OCR analyze the frame, the following has to be done:

from PIL import Image
import tesserocr

image = Image.fromarray(frame)  # frame is the NumPy array from OpenCV
str_out = tesserocr.image_to_text(image)

Here are some of the issues I ran into while doing OCR:

  • You'll be extracting information about the current race, such as time, player name, and faults.
  • Processing each frame is a bad idea and wastes a lot of resources
  • Looking for "wins" and "Player" in the OCR output is a good idea, but you may see them in every frame for that one map (for example, frame 99 then frame 100 with similar OCR output)
  • If you look for "wins" and "Player" and skip frames, you may end up skipping too much and missing results
  • Sometimes winner players information is picked up but not the map information
Here's the process I ended up using for extracting information as accurately as I can:
I process every 50 frames. If OCR of the frame contains "wins" then I process OCR data further. Check each line from OCR output, if the line contains "wins" and "Player" then figure out the player who won. If the line does not contain "wins" and "Player" then it's obviously a map name. Concatenate map name, player name, and video name into one variable and print it. There is another part that keeps track of frame number that we successfully extracted the map and the player information out of. Full extraction in OCR frame only happens if the new frame is 500 frames after the last successful extraction frame number. This is implemented so after 50 frames, we don't re-extract the same map and player information but at the same time, since we're processing every 50th frame, we won't miss an outcome of a map. The process is hard to explain so definitely look at the code. 

The results:
165 videos were processed. 670 maps were seen by the script.
NFEN won 291 maps. CaptainSparklez won 371 maps. Mark/YYFakieDualCom (script didn't look for this username) won 8 maps.

Script and the results are posted on Github: https://github.com/BoredHackerBlog/TrialsFusionOCR

Anyways, this was a fun and interesting side project for me. There are definitely ways to improve it. For example, OCR isn't perfect, and Tesseract could have been trained to extract better data. Also, the code could probably be optimized or rewritten in a different language like C++ or Golang for better performance.

Thursday, May 24, 2018

Providing a shell for CTFs

Introduction
I was working on creating a CTF for students. I'm using CTFd since it's so simple to set up and use. You can check it out here: https://github.com/CTFd/CTFd/wiki/Getting-Started

Anyways, I've played some CTFs and have noticed that some of them provide web shell or SSH access. For example, pwnable.kr (http://pwnable.kr) provides SSH access. PicoCTF (https://github.com/picoCTF/picoCTF) also has a module for providing a shell. I also noticed that PentesterLab (https://pentesterlab.com) provides a web shell as well. TAMUctf also has a plugin for CTFd here: https://github.com/tamuctf/ctfd-shell-plugin.

Providing a shell to players allows you to easily provide binaries, tools, and other files to them. Also, you can have a category of challenges (for example: how to use the CLI or do basic things on Linux) specific to the container you provide.

Goal
My goal is basically to provide a web shell into a Docker container for users. I'll be adding users to the Docker container, just like some other CTFs/plugins do.
I also want to do it from inside of CTFd. This way, the players don't have to register on another site or do anything like that. (My first idea was to make them register on another web server…) Finally, I also want this to be simple for me.

Here’s my Github repo with the plugin I wrote and my Dockerfile: https://github.com/BoredHackerBlog/ctfdwettyshell 

Setup
For my setup, I've obviously decided to have a Docker container. I'm using Ubuntu 16.04. For the web shell, I've decided to go with wetty. You can check it out here: https://github.com/krishnasrinivas/wetty. It's very simple to use.

First, I made my Docker image, which includes wetty. Here's the Dockerfile:

The Dockerfile can obviously be modified to include more tools and custom scripts. 

After the file was created, I used "docker build" to build the image (mine is called wettytest, since I was just testing). I started up the image by running the "docker run" command.

Now I needed to add a plugin to CTFd. The goal of this plugin was to:
See if a user has been assigned creds or not
  1. If the user has been assigned creds
    1. Return creds
  2. If the user hasn’t been assigned creds
    1. Randomly generate username and password
    2. Add user to the docker container
    3. Return creds
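In sketch form, the plugin logic looks roughly like this (the container commands are commented out, and the names and prefix are placeholders, not my exact code):

```python
import secrets
import string
# import subprocess

user_creds = {}  # team_id -> (username, password); in-memory only, lost on restart

def get_or_create_creds(team_id, container_name="wettytest"):
    if team_id in user_creds:             # creds already assigned: just return them
        return user_creds[team_id]
    alphabet = string.ascii_lowercase + string.digits
    username = "ctf" + "".join(secrets.choice(alphabet) for _ in range(8))
    password = "".join(secrets.choice(alphabet) for _ in range(12))
    # Add the user inside the running container, e.g.:
    # subprocess.run(["docker", "exec", container_name, "useradd", "-m", username], check=True)
    # subprocess.run(["docker", "exec", container_name, "chpasswd"],
    #                input=f"{username}:{password}".encode(), check=True)
    user_creds[team_id] = (username, password)
    return username, password
```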
More information about creating CTFd plugin is here: https://github.com/CTFd/CTFd/wiki/Plugins

I didn’t exactly follow the guidelines provided by CTFd for this. For some reason, I couldn’t get my script to add a table and store values in the database. I finally gave up and went with storing the login information in a dictionary. 

Here’s what the plugin script looks like:

If the plugin is used then container_name string and return_info string obviously need to change to match your setup. 

I also added a page via the Admin page on CTFd which redirects to /docker. It looks like this:

When that page is added, a link to Docker page should show up on non-admin pages like below:

Results
It works for my needs.
When the player visits the docker page, this is presented:

And the player is able to log in:


Issues
There are several issues with doing this that I haven’t had the need to fix yet. 

If a user changes their password while in the container and forgets about it, they lose access to it. If their password is accidentally leaked or shared, they can’t change it. 

Login info is lost if the CTFd web app is restarted. Since I was having trouble storing login info in the CTFd database, I went with a dictionary, and I'm not persisting that data anywhere or reloading it. When CTFd is restarted and the player visits /docker, a new user is created.

The Docker container is allowed to access the internet. Depending on your setup and your users, you may or may not want that.

Resources
https://github.com/BoredHackerBlog/ctfdwettyshell 
https://github.com/CTFd/ 
https://github.com/CTFd/CTFd/wiki/Plugins 
https://github.com/tamuctf/ctfd-shell-plugin 
https://github.com/picoCTF/picoCTF 
https://pentesterlab.com 
http://pwnable.kr 
https://github.com/krishnasrinivas/wetty 

Tuesday, January 16, 2018

Digital Forensics and Law

This paper was written in Fall 2014 for the FIS 41500 - Forensic Science and the Law class at IUPUI. I am not a legal expert, so the paper may not be very accurate. It is old as well, and laws may have changed.
I found it when going through old documents so I'm posting it here.

Digital Forensics and the Law

Introduction

In the past few decades, the use of computers and the internet has been on the rise. There are more than 2.5 billion internet users worldwide. [1] [2] More and more people are using social media today. There is also a rise in "Internet of Things" devices, which are home appliances that connect to the internet, such as thermostats. [3] Crimes involving technology are also on the rise.
Digital forensics, or cyber forensics, is a relatively new forensic science field that addresses issues related to the use of technology in crime. Digital forensics experts commonly deal with child pornography, corporate espionage, and copyright infringement. [4] There are digital forensics journals and conferences. Digital forensics experts exist in academia, the military, law enforcement, and intelligence agencies. It's a very broad field, and it's becoming more relevant to the crimes committed these days. It is important for the criminal justice system to keep up with the changes. This paper will discuss the history of digital forensics, evidence gathering and processing techniques, problems with digital forensics, and most importantly, the law and how it relates to digital forensics.

History

Florida was the first state to pass laws regarding computer crimes: in 1978, it passed the Florida Computer Crimes Act. [5] This act addressed the fact that computer-related crimes were on the rise and caused harm to the government as well as the private sector. The act focused on offenses committed against intellectual property, computer equipment or supplies, and computer users. Offenses against intellectual property included unauthorized modification, destruction, and disclosure of someone else's computer system or network. The act also counted modification of computer equipment or supplies as an offense. Finally, the act counted accessing anyone else's computer, or conducting denial of service, as an offense. The offenses had to be committed willfully and knowingly. Penalties for offenses ranged from misdemeanor to second-degree felony. [6] With the increase in computer crimes, Congress passed the Computer Fraud and Abuse Act in 1986.
The Computer Fraud and Abuse Act mainly focused on punishing unauthorized access. The CFAA criminalized unauthorized access to national security information and exceeding authorized access. For example, someone hacking into an IUPUI financial computer to get financial information would be punished, and someone who works in the financial department exceeding their privileges to get information would also be punished. Accessing government computers, such as computers owned by a government department or agency, is also punishable. The CFAA also dealt with malicious code, computer attacks, and password theft. Transmission of code, a program, or a command that damaged a protected computer was punishable. Knowingly transmitting computer viruses that are designed to cause damage would fit under this. A protected computer is defined as a computer that is used by a financial institution or the US government, or a computer used for interstate or foreign commerce or communication, including computers outside of the US. [7] After the CFAA was passed, the FBI had the authority to conduct computer crime investigations. In 1992, the FBI created a special team for digital forensics called the Computer Analysis and Response Team. CART collected and investigated digital evidence. [8] CART also provides training to digital forensics experts.

Digital Forensics

CART and forensics departments in other agencies most likely had standard operating procedures for dealing with evidence collection and examination. In 1998, the Scientific Working Group on Digital Evidence was formed. SWGDE was formed to allow communication and cooperation between people who work in digital forensics. SWGDE also publishes papers on best practices for digital evidence collection and examination. [9]
Collection of digital evidence is a bit more complicated than it is in other forensic fields. The first challenge is finding the storage devices. With advances in technology, large amounts of data can be stored in a small device. Finding these storage devices can be fairly difficult because they can be disguised to look like unsuspicious things, such as pens or flashlights. The second challenge is posed by how a computer fundamentally works. A computer has persistent storage (the hard drive) and temporary storage (RAM). The challenging part of collection is when the computer is turned on. When a computer is on and being used, it puts data into RAM, which is only temporarily there. As soon as the computer is turned off, all the temporary data, which could be evidence, permanently goes away. [10]
When the computer is off, the examiner collects data by cloning the hard drive or the storage device. The examiner attaches the storage device to a write-blocker and attaches the write-blocker to the collection device. Write-blockers prevent data contamination by blocking the examiner or the collection device from writing to the storage device. After the drive is cloned, the examiner has to check the integrity of the clone. Integrity is usually checked by using two or more checksum algorithms. A checksum algorithm generates an alphanumeric string such as "eb61eead90e3b899c6bcbe27ac581660." The string generated is comparable to DNA; it's unique to that data. The examiner runs the checksum algorithm on the clone and on the suspect's hard drive to check if they match. If the checksums match, then the data was not contaminated. [11] Examiners usually create another clone of the original clone to work with.
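The integrity check is easy to demonstrate with Python's hashlib; running two independent algorithms over the same bytes, as described above, means a collision in one digest would not go unnoticed (the byte strings here are stand-ins for real disk images):

```python
import hashlib

def image_checksums(data):
    # Two independent digests of the same bytes
    return hashlib.md5(data).hexdigest(), hashlib.sha256(data).hexdigest()

original = b"raw bytes read from the suspect drive"
clone = b"raw bytes read from the suspect drive"
assert image_checksums(original) == image_checksums(clone)  # clone verified
```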
The hard drive clone is first automatically examined by existing tools, such as Forensic Toolkit and EnCase. These tools automatically recover deleted files and separate different data for the examiner. The examiner can usually search the hard drive for specific words or file types (such as videos or pictures) that are relevant to the case. Examining the evidence this way is usually very time consuming. Depending on the case, examiners use a different method that is less involved. The hashing algorithms discussed earlier can also be applied to individual files, not just a whole hard drive.
Since the algorithms can be applied to files, the examiner is able to build a database of unique hash values. This is just like the Automated Fingerprint Identification System (AFIS) or the National Integrated Ballistic Identification Network (NIBIN). An examiner typically uses a whitelist of hash values to identify known good files. The software runs the algorithm on each individual file and compares the results with this whitelist. If the hash value is in the whitelist, the file is known and good. An examiner can also create a database or list of known bad files, such as child pornography or confidential files. [12] The examining software can apply the algorithm to all the individual files, and if the hash value of a file matches the bad list, the software can alert the examiner. This method allows the examiner to quickly identify known bad files or the files they are looking for.
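That matching step is essentially a pair of set lookups. A simplified sketch (the file contents and hash sets here are illustrative, not from any real database):

```python
import hashlib

KNOWN_GOOD = {hashlib.md5(b"stock operating system file").hexdigest()}  # whitelist
KNOWN_BAD = {hashlib.md5(b"known contraband file").hexdigest()}         # blacklist

def triage(file_bytes):
    digest = hashlib.md5(file_bytes).hexdigest()
    if digest in KNOWN_BAD:
        return "alert"    # flag for the examiner
    if digest in KNOWN_GOOD:
        return "skip"     # known good, no need to review
    return "review"       # unknown file, review manually
```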
Since digital forensics includes many sub-disciplines, the examination method covered in this paper isn't used by all of them. It is, however, used for most of the cases that law enforcement handles.

Rules, exceptions, and case laws

4th amendment

In most cases, the 4th Amendment and its exceptions apply to digital evidence. For example, the person collecting evidence needs a warrant to seize or search it, and the warrant should include the scope of the search. An investigator can get a warrant to search a laptop, but cannot come in and start searching other computers. The investigator may also have to specify the part of the laptop they want to search. For example, a computer can have multiple folders that are used for storage; an investigator who gets a warrant to search a specific user’s folder cannot start searching irrelevant folders for evidence. The evidence an investigator searches for has to be related to the probable cause. If an investigator is looking for evidence of child pornography, they cannot start searching for evidence of digital piracy.
Sometimes computers are shared between multiple users, and the 4th Amendment still applies. One user can consent to a search, but the investigator still may not search another user’s folder if that user has taken measures to protect it. A user can implement access control on their folders or use a password-protected login; that user still has a reasonable expectation of privacy. If a user decides to share their folder via a peer-to-peer share or a Windows share, however, they no longer have a reasonable expectation of privacy.

Border search exception

Another important rule that applies to computer searches and seizures is the border search exception. The border search exception is not really an exception to the 4th Amendment; the rule exists to protect the country and its people from external threats. Inside the United States, the authorities need probable cause to search your devices. If you are coming into the United States, however, the authorities have the right to search your devices without any reason or warrant, even if you are a citizen. This rule has been used multiple times to search electronic devices. [13] [14]

Admissibility

The best evidence rule also applies to digital evidence. The best evidence rule (Federal Rules of Evidence 1002) states: “An original writing, recording, or photograph is required in order to prove its content unless these rules or a federal statute provides otherwise.” This rule limits testimony or description of the evidence and allows introduction of the original evidence. It applies to digital evidence because writings, recordings, and photographs can now be stored on a computer or any other digital device. Another rule, Rule 1003, also applies to digital evidence because of the way it is collected and processed. Rule 1003 states: “A duplicate is admissible to the same extent as the original unless a genuine question is raised about the original’s authenticity or the circumstances make it unfair to admit the duplicate.” The digital forensic examiner usually makes at least two duplicates of the original evidence, and according to this rule they are admissible, as long as they were authenticated via a hashing algorithm. [15]
Rule 803, titled Exceptions to the Rule Against Hearsay, also applies to digital evidence. Courts have accepted digital evidence as “business records” under the business records exception rule, assuming that the evidence was gathered correctly and is accurate. [16]

Case laws

One of the sources of law is case decisions, or precedent. It’s important to look at digital forensics cases and how they have changed the field.
U.S. v. Bonallo
One of the main issues with digital evidence is authenticity. Although the evidence collected can be considered authentic and verified because of the hashing, someone could always claim that the original data was tampered with or that evidence was implanted. An argument can be made that because computer evidence is susceptible to tampering or implantation, it should not be admissible.
Daniel Bonallo was an employee at American Data Services. He worked on Automated Teller Machines and Bank Card Management Systems. American Data Services handled data processing for The Oregon Bank. In 1985, the bank started receiving complaints of unknown withdrawals, and realized that someone had made 40 fraudulent transactions. Bonallo was the main suspect because he was an expert on ATM and bank systems. Most of the fraudulent transactions happened at ATMs located between Bonallo’s house and the American Data building, and Bonallo also completed transactions at an ATM outside the American Data building.
The person who took over Bonallo’s tasks testified that he found a “fraud program” on Bonallo’s computer. This program allowed a person to access an ATM’s file system and change transaction records, although he also said that the program had legitimate uses. Bonallo denied making the transactions and denied the existence of the “fraud program,” claiming that he was framed.
Bonallo made the argument discussed previously: he claimed the evidence was not trustworthy because someone had altered it to frame him. Bonallo failed to provide evidence that supported his alteration claim. The evidence was in fact admissible because “The fact that it is possible to alter data contained in a computer is plainly insufficient to establish untrustworthiness. The mere possibility that the logs may have been altered goes only to the weight of the evidence not its admissibility.”
U.S. v. Glasser
U.S. v. Glasser is similar to U.S. v. Bonallo. In this case, the focus is on the security of a system and how it relates to alteration of records. Jodi Glasser was charged with embezzlement and making false entries in bank records, and the trustworthiness of the records was questioned. The evidence, computer printouts, was admissible because “The existence of an air-tight security system is not, however, a prerequisite to the admissibility of computer printouts. If such a prerequisite did exist, it would become virtually impossible to admit computer generated records; the party opposing admission would have to show only that a better security system was feasible.” [17]
U.S. v. Crist
In 2005, a landlord hired two people, Jeremy Sell and Kirk Sell, to remove a tenant’s possessions. The tenant, Robert Crist, was unable to pay the landlord, so the landlord requested the removal of his belongings. Jeremy and Kirk Sell started putting most of Crist’s belongings on the curb. In August, Jeremy Sell gave a computer that belonged to Crist to his friend Seth Hipple. When Crist came back home, he asked where his computer was. The Sells decided not to tell Crist that his computer had been given to Hipple, so Crist called the police and reported that his computer was stolen.
After getting the computer, Hipple searched for things he could delete. While cleaning the computer, Hipple came across child pornography, deleted the folder containing it, and turned the computer off. A few days after this incident, Hipple contacted the police department and reported what he did. Detective Michael Cotton was assigned to the investigation. The detective was told who the computer belonged to, and he also knew that it had been reported stolen. The detective contacted the Attorney General’s office to have the computer forensically analyzed. The Attorney General’s office took the computer and gave it to David Bushwash, an agent in the computer forensics department, to examine.
Bushwash followed standard forensic examination procedures and created a clone image of the hard drive, then started analyzing the clone image with EnCase. EnCase ran the MD5 hashing algorithm on all the files the clone image contained and matched the results against known hash lists. Some of the files matched a known child pornography list. After the analysis, Bushwash reported to Detective Cotton, and Crist was charged with possession of child pornography. Crist filed a motion to suppress the evidence. One of the main issues in this case was that the forensic examiner ran the MD5 hashing algorithm on the files: the court held that the act of running the hashing algorithm was itself a search. The court ordered all the evidence obtained from the forensic examination suppressed. [18]
In re Boucher
Software manufacturers have been making software that allows a person to easily encrypt their data. For example, Microsoft started including a full-disk encryption feature, BitLocker, in some versions of its operating systems starting in 2007. Free and reliable encryption software was available before 2007, but it was generally hard to set up or use, and many encryption tools did not let you encrypt the whole hard drive; instead they let you encrypt folders or specific files. Microsoft’s BitLocker allowed users to set up full-disk encryption in just a few clicks. This is different from password protection, since password protection is only in play when the computer is running: a password-protected computer can be turned off and an examiner can still access all the files. An examiner now had the challenge of dealing with fully or partially encrypted data.
Sebastien Boucher was traveling from Canada to the United States. At the border station his car was inspected; this is allowed because of the border search exception mentioned previously. The officer doing the inspection saw that there was a laptop in the car. When the officer opened the laptop, it was not password-protected. The officer searched the computer for images and videos and found around 40,000 images, some with pornographic file names. The officer asked Boucher if his laptop contained any child pornography; Boucher replied that he was not sure. The officer noticed that some file names looked like they could refer to child pornography, so he called an Immigration and Customs Enforcement agent who had training in identifying child pornography. The ICE agent examined the laptop and determined that the computer did contain child pornography. The agent read Boucher his Miranda rights, and Boucher waived them. The agent asked Boucher about the file “2yo getting raped during diaper change,” and Boucher replied that he downloads many pornographic files, but when he sees a file containing child pornography, he deletes it. At the agent’s request, Boucher showed him where the files were stored. The files were stored on drive Z, and no password was required to access them.
After doing more examination, the agent called the US Attorney’s office and then seized the laptop, shutting it down in the process. When the laptop was provided to an examiner, they noticed that they were unable to access the Z drive because it was encrypted. A grand jury sent a subpoena to Boucher telling him to provide the password. Boucher moved to quash the subpoena because it violated his 5th Amendment rights. The magistrate judge handling the case agreed that providing the passphrase/password to the encrypted hard drive would be self-incrimination: a passphrase is considered a product of the mind and testimonial communication, therefore he did not have to give it up. This issue is often compared with a combination lock versus a traditional lock with a key. [19] A key already exists as a physical object, but the combination to a lock can exist only in your mind.
In 2009, a different court ruled that Boucher did have to give up the passphrase. It was ruled that the act of production (of the passphrase) could not be used to prove that Boucher owned the laptop or the data; the government would now need an alternative way to prove that Boucher owned the laptop. [20]
Another rule that can apply to encryption cases is the foregone conclusion exception. Under the foregone conclusion exception, the 5th Amendment no longer applies because the evidence is not considered testimonial: if the government already knows that the files exist, and exactly where they exist, the foregone conclusion exception holds. [21]

Famous cases with digital evidence

BTK Killer
The BTK Killer, Dennis Rader, killed ten people in Kansas. Rader sent letters to the police asking whether a floppy disk could be traced or not. The police published the answer in the newspaper, saying “Rex, it will be OK,” which told Rader that it was safe to do so. Rader sent a floppy disk to KSAS-TV. The floppy disk contained a file with the message “this is a test.” The examiners recovered deleted files from the floppy disk and looked at metadata to discover that the author was “Dennis.” Other recovered data indicated that the computer used to write the file was owned by Christ Lutheran Church. After an internet search, the police determined that Rader was part of the church. The police also had evidence that the killer drove a black Jeep Cherokee, and an investigator noticed that Rader had the same car. In this case, digital evidence examination helped the police track down the killer. [22]
Caylee Anthony Case
This case is a perfect example of some of the problems that can occur with digital evidence examination. John Bradley was one of the witnesses who conducted the digital evidence examination. The prosecution said that the word “chloroform” was searched for 84 times. Bradley later discovered that the software didn’t correctly handle the forensic data, and the search had only been made once. During cross-examination Bradley also agreed with the defense that there were two different accounts on the computer and there was no way to tell who searched for “chloroform.” Forensic software can sometimes cause issues, but this can be avoided by validating the tool and applying the Daubert standards: the tool has to be tested by other people, has to be peer-reviewed, has to be checked for error rates or potential problems, and has to be generally accepted by the community. [23] A lab could run the examination tool against a known dataset and check whether the output matches the expected results. Even if a tool is valid and accepted, it is good to have internal testing. [24]
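The internal validation idea above can be sketched as a small harness: run the tool over a reference dataset whose correct answers are already known, and report the error rate. This is an illustrative sketch only; the function name `validate_tool` and the tiny reference dataset are my own invention, and real tool validation covers far more than hashing.

```python
import hashlib

def validate_tool(tool, reference_dataset):
    """Run a candidate hashing tool over a reference dataset with
    known-correct answers and return its error rate (0.0 = perfect).
    `tool` is any function mapping bytes -> hex digest;
    `reference_dataset` maps sample bytes to the expected digest."""
    errors = sum(1 for data, expected in reference_dataset.items()
                 if tool(data) != expected)
    return errors / len(reference_dataset)

# Build a tiny reference set with digests computed by a trusted baseline.
reference = {data: hashlib.md5(data).hexdigest()
             for data in (b"sample one", b"sample two", b"sample three")}
```

A measured error rate is exactly the kind of quantity the Daubert factors ask for: it can be documented, repeated by other labs, and compared against an acceptance threshold.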
After the trial was over, the police reported that they had never looked at the Firefox browser’s history; they had only looked at the Internet Explorer history. When the police did search the Firefox history, they saw that someone had searched for “foolproof suffocation” and clicked on an article about committing suicide. [25]

Problems and challenges

One of the main problems is encryption. Encrypting data has become easier over the past few years; it can be accomplished by clicking a few buttons. Examiners now either have a limited amount of data or have to go through the legal system to get the unencrypted data. This is not to say that encryption is a bad thing or that only criminals use it.
Cloud computing and cloud storage are another issue. “Cloud” refers to things being done on remote servers. If someone is using cloud storage, they are storing their data on some remote server, such as Dropbox or Box. There are service providers who monitor users to see if anything illegal has been transmitted. Apple’s iCloud service has keyword and phrase detection: when a keyword or phrase is detected in an email, it automatically deletes it. [26]
Anti-forensics and covert channels are also a problem. Although both of these things have legitimate uses, they have been used to do illegal things. Tor, or The Onion Router, allows a user to browse the web anonymously; it also allows a person to set up an anonymous server. Silk Road was an online website, similar to Amazon or eBay, which was used to sell illegal services or goods, mainly drugs. [27] There are also sites similar to Silk Road that are used to sell weapons. Last year, the FBI was able to take down Silk Road and arrest the administrator of the website. [28] Digital forensics is probably the only field that has an anti-forensics side. Digital forensics can be used by a criminal to recover confidential data or gather intelligence for an attack; anti-forensics techniques were developed to deal with this issue, and those same methods can also be used by criminals. As discussed before, encryption is a good way to stop digital forensics. A criminal could also securely delete their files, hide data using methods such as steganography, or even directly attack the forensic software. This is only a big issue when the criminals are very sophisticated. [29]

Conclusion

In this paper, I discussed how digital forensics works; the laws and case laws that apply to digital evidence; famous cases that involved digital evidence; and problems with digital evidence.
From what I’ve researched for this paper, I believe that the courts are doing a good job dealing with digital evidence. Digital evidence is not too different from other evidence: most of the time, the 4th Amendment and the Federal Rules of Evidence apply to it, and case laws only apply in rare cases. With the rise in use of the internet, cloud storage, and encryption, it will be interesting to see what other changes come to this field.

Resources

"Anti-forensic techniques." from http://www.forensicswiki.org/wiki/Anti-forensic_techniques.
          
"Computer Fraud and Abuse Act (CFAA)." from https://ilt.eff.org/index.php/Computer_Fraud_and_Abuse_Act_(CFAA).
          
"EnScript to create EnCase v7 hash set from text file." from http://www.forensickb.com/2014/02/enscript-to-create-encase-v7-hash-set.html.
          
"General Information: Florida Computer Crimes Act." from http://docweb.cns.ufl.edu/docs/d0010/d0010.html.
"In re Grand Jury Subpoena to Sebastien Boucher ". from http://volokh.com/files/BoucherDCT.1.pdf.
          
"Infographic: The Growth Of The Internet Of Things." from http://www.theconnectivist.com/2014/05/infographic-the-growth-of-the-internet-of-things/.
          
"INTERNET USAGE STATISTICS." from http://www.internetworldstats.com/stats.htm.
          
"Internet Users." from http://www.internetlivestats.com/internet-users/.
          
"MD5SUM AND MD5: VALIDATING THE EVIDENCE COLLECTED." from http://codeidol.com/community/security/md5sum-and-md5-validating-the-evidence-collected/22551/.
          
"Probably the First U. S. Legislation against Computer Crimes." from http://www.historyofinformation.com/expanded.php?id=3888.
          
"SWGDE History." from https://www.swgde.org/pdf/2003-01-22%20SWGDE%20History.pdf.
          
"UNITED STATES OF AMERICA v. ROBERT ELLSWORTH CRIST, III, Defendant." from http://www.volokh.com/files/USA_v._Crist,_order-1.pdf.
"UNITED STATES of America, Plaintiff-Appellee, v. Jodi GLASSER, Defendant-Appellant." UNITED STATES v. GLASSER. from http://leagle.com/decision/19852326773F2d1553_12113.xml/UNITED%20STATES%20v.%20GLASSER.
(2008). "Chapter 5. Evidence Collection." Electronic Crime Scene Investigation: A Guide for First Responders, Second Edition. from http://www.nij.gov/publications/ecrime-guide-219941/ch5-evidence-collection/Pages/welcome.aspx.
          
(2012). "Casey Anthony detectives overlooked Google search for "fool-proof" suffocation methods, sheriff says." from http://www.cbsnews.com/news/casey-anthony-detectives-overlooked-google-search-for-fool-proof-suffocation-methods-sheriff-says/.
          
(2013). "Piecing Together Digital Evidence The Computer Analysis Response Team." from http://www.fbi.gov/news/stories/2013/january/piecing-together-digital-evidence/piecing-together-digital-evidence.
          
Brunty, J. (2011). "Validation of Forensic Tools and Software: A Quick Guide for the Digital Forensic Examiner." from http://www.dfinews.com/articles/2011/03/validation-forensic-tools-and-software-quick-guide-digital-forensic-examiner.
          
Gal Shpantzer, Daniel J. R. "Legal Aspects of Digital Forensics." from http://euro.ecom.cmu.edu/program/law/08-732/Evidence/RyanShpantzer.pdf.
          
Ingold, J. (2012). "Why criminals should always use combination safes." from http://blogs.denverpost.com/crime/2012/01/05/why-criminals-should-always-use-combination-safes/3343/.
          
Kabay, M. E. "A Brief History of Computer Crime: An Introduction for Students." from http://www.mekabay.com/overviews/history.pdf.
          
Leger, D. L. (2014). "How FBI brought down cyber-underworld site Silk Road." from http://www.usatoday.com/story/news/nation/2013/10/21/fbi-cracks-silk-road/2984921/.
          
Marsico, C. V. (2005). "COMPUTER EVIDENCE V. DAUBERT: THE COMING CONFLICT." CERIAS Tech Report 2005-17. from https://www.cerias.purdue.edu/bookshelf/archive/2005-17.pdf.
          
ROSEN, R. J. (2014). "'The Floppy Did Me In' The story of how police used a floppy disk to catch the BTK killer.". from http://www.theatlantic.com/technology/archive/2014/01/the-floppy-did-me-in/283132/.
          
Slane, C. (2013). "Yes, Your Cloud Storage Provider Is Looking For Child Porn In Your Files." from http://www.huffingtonpost.com/cassie-slane/yes-your-cloud-storage-pr_b_2853948.html.
          
Sparkes, M. (2014). "Drugs, guns and assassination: Silk Road successor Utopia shut down by police." from http://www.telegraph.co.uk/technology/internet/10635359/Drugs-guns-and-assassination-Silk-Road-successor-Utopia-shut-down-by-police.html.
          
STELLIN, S. (2013). "The Border Is a Back Door for U.S. Device Searches." from http://www.nytimes.com/2013/09/10/business/the-border-is-a-back-door-for-us-device-searches.html?pagewanted=all&_r=0.
          
Vijayan, J. (2014). "Justices let stand appeals court decision on border searches of laptops." from http://www.computerworld.com/article/2487636/data-privacy/justices-let-stand-appeals-court-decision-on-border-searches-of-laptops.html.
          
Villasenor, J. "Can the Government Force Suspects To Decrypt Incriminating Files?". from http://www.slate.com/articles/technology/future_tense/2012/03/encrypted_files_child_pornography_and_the_fifth_amendment_.html.
          
Walker, C. "Computer Forensics: Bringing the Evidence to Court." from http://www.infosecwriters.com/text_resources/pdf/Computer_Forensics_to_Court.pdf.
          

Cases

· U.S. v. Bonallo
· U.S. v. Crist
· U.S. v. Glasser
· In re Boucher
· BTK Killer
· Caylee Anthony Case

Rules/Law

· 4th amendment
· 5th amendment
· Border Search exception
· Business record exception
· Computer Fraud and Abuse Act
· Federal Rules of Evidence rules 803, 1002, 1003
· Florida Computer Crimes Act



[1] http://www.internetworldstats.com/stats.htm
[2] http://www.internetlivestats.com/internet-users/
[3] http://www.theconnectivist.com/2014/05/infographic-the-growth-of-the-internet-of-things/
[4] http://www.mekabay.com/overviews/history.pdf
[5] http://www.historyofinformation.com/expanded.php?id=3888
[6] http://docweb.cns.ufl.edu/docs/d0010/d0010.html
[7] https://ilt.eff.org/index.php/Computer_Fraud_and_Abuse_Act_(CFAA)
[8] http://www.fbi.gov/news/stories/2013/january/piecing-together-digital-evidence/piecing-together-digital-evidence
[9] https://www.swgde.org/pdf/2003-01-22%20SWGDE%20History.pdf
[10] http://www.nij.gov/publications/ecrime-guide-219941/ch5-evidence-collection/Pages/welcome.aspx
[11] http://codeidol.com/community/security/md5sum-and-md5-validating-the-evidence-collected/22551/
[12] http://www.forensickb.com/2014/02/enscript-to-create-encase-v7-hash-set.html
[13] http://www.computerworld.com/article/2487636/data-privacy/justices-let-stand-appeals-court-decision-on-border-searches-of-laptops.html
[14] http://www.nytimes.com/2013/09/10/business/the-border-is-a-back-door-for-us-device-searches.html?pagewanted=all&_r=0
[15] http://euro.ecom.cmu.edu/program/law/08-732/Evidence/RyanShpantzer.pdf
[16] http://www.infosecwriters.com/text_resources/pdf/Computer_Forensics_to_Court.pdf
[17] http://leagle.com/decision/19852326773F2d1553_12113.xml/UNITED%20STATES%20v.%20GLASSER
[18] http://www.volokh.com/files/USA_v._Crist,_order-1.pdf
[19] http://blogs.denverpost.com/crime/2012/01/05/why-criminals-should-always-use-combination-safes/3343/
[20] http://volokh.com/files/BoucherDCT.1.pdf
[21] http://www.slate.com/articles/technology/future_tense/2012/03/encrypted_files_child_pornography_and_the_fifth_amendment_.html
[22] http://www.theatlantic.com/technology/archive/2014/01/the-floppy-did-me-in/283132/
[23] https://www.cerias.purdue.edu/bookshelf/archive/2005-17.pdf
[24] http://www.dfinews.com/articles/2011/03/validation-forensic-tools-and-software-quick-guide-digital-forensic-examiner
[25] http://www.cbsnews.com/news/casey-anthony-detectives-overlooked-google-search-for-fool-proof-suffocation-methods-sheriff-says/
[26] http://www.huffingtonpost.com/cassie-slane/yes-your-cloud-storage-pr_b_2853948.html
[27] http://www.telegraph.co.uk/technology/internet/10635359/Drugs-guns-and-assassination-Silk-Road-successor-Utopia-shut-down-by-police.html
[28] http://www.usatoday.com/story/news/nation/2013/10/21/fbi-cracks-silk-road/2984921/
[29] http://www.forensicswiki.org/wiki/Anti-forensic_techniques