Title: Project design by: Jennifer Murdoch University of Victoria, Canada
1 CSI: Computer Science Investigations (Learn Java
and Solve a Crime in the Same Day!)
- Project design by Jennifer Murdoch (University
of Victoria, Canada)
- Lecturer
- IEEE Member since 2004
3 Computer Science and Forensics: A Real World
Application
- Acoustical gunshot analysis,
- audio enhancement,
- voiceprint analysis,
- voice elimination,
- aural steganography,
- evidence authentication, and
- Acoustic Simulation for Scene Reconstruction
4 Computer Science and Forensics: A Real World
Application (Case Study: Rootkits)
- Competitive corporations, crime organizations,
and terrorists are using these tools to attack
networks and steal data.
- While customer data theft can cost a company
millions, insider threats are the major problem.
More than 70 percent of a company's value may be
held in its intellectual property assets - a
prime target for competitive intelligence
gathering.
- Rootkits can be used to steal information without
detection, which is what makes them so dangerous.
Rootkits are designed to stay hidden for years,
so that they have continued access to
information. Many techniques used by rootkits
were pioneered by virus developers in the early
'90s.
- While Unix systems continue to be targeted,
rootkits have rapidly evolved to target
ubiquitous Windows machines.
Taken from Hoglund, Greg. "pwned." Information
Security Magazine, 01 Sept. 2007.
5 Computer Science and Forensics: A Real World
Application (Case Study: IRC Bots)
- From all indications, something bad had happened.
After installing an intrusion prevention system,
the security team at UW Medicine spotted several
machines trying to communicate with an IRC botnet
server in France. Cindy Jenkins, a security
engineer and computer forensics expert at the
medical and research organization, immediately
went on a hunt for clues behind the suspicious
activity.
- Hours spent combing through images of the hard
drives from the infected PCs turned up the
attackers' tools: an IRC bot, a rootkit and an
FTP server. Passive network scanning detected
more compromised systems. To save time, Jenkins
made hash sets--digital fingerprints--of the
malware so she could look just for the hash sets
when inspecting additional images. She determined
the machines were infected 18 to 24 months
earlier--before the security measures were
installed.
Taken from Savage, Marcia. "CSI for the CISO."
Information Security Magazine, 01 Sept. 2007.
6 Computer Science and Forensics: A Real World
Application (Case Study: Steganography)
- You're confident a trusted employee can't steal
research information on your company's new
anti-cancer drug or plans for its next
acquisition. Physical and logical controls
monitor just about everything that leaves the
building or the network, even encrypted messages
sent to unauthorized recipients. But what about
the message hidden in the family vacation photo
he emailed to his "cousin"? Steganography has
just bypassed all your defenses.
- Steganography (from the Greek root "steganos,"
meaning covered or secret), or stego, is the
technique of hiding data in a host file.
Historically, it's been within the purview of the
military, criminals and researchers. In recent
years, however, it's drawn a lot of interest from
the business community, and with good reason. It
is also believed that Osama bin Laden uses
steganography to hide maps, instructions, and
photographs of terrorist targets on sport chat
rooms, pornographic bulletin boards, and other
websites.
Taken from Cole, Eric. "More Than Meets The
Eye." Information Security Magazine, 01 Nov.
2006.
7 Computer Science and Forensics: A Real World
Application (Case Study: Blind Software Scam)
- After being sued for negligence, our client was
about to settle a multi-million dollar suit and
re-write their entire software package because
the plaintiff was charging that installation of
the software in question had permanently
damaged/erased existing files, that the
irreplaceable data was not recoverable by any
means, and that he could not access files in a
specific software application critical to running
his business.
- Computer forensic scientists were able to
restructure and reformat all the files needed for
the claimant's specific software application and
reprogram data. Using electronic data discovery,
forensic and analysis applications, the
scientists discovered that the software
installation had not caused the data loss and
determined the plaintiff had manually erased the
allegedly lost data! When shown the evidence, the
plaintiff dropped the suit and was promptly
countersued.
Taken from http://www.computerforensicsint.com/case-studies.html
(March 2008)
8 Computer Science and Forensics: A Real World
Application (Case Study: Couldn't Cover His
Tracks)
- After finding pornography downloaded on its
network server and a number of individual office
computers, our client began to build a case for
employee dismissal. Computer forensic scientists
were hired to locate any deleted files and verify
certain illicit and non-work related contents of
the hard drives in question. Forensic technicians
were able to locate spy software, illegal
file-sharing software, pornography, and
information pertaining to a personal side
business. Both the CEO and the network
administrator were dismissed as a result of the
investigation.
Taken from http://www.computerforensicsint.com/case-studies.html
(March 2008)
9 Computer Science and Audio Forensics: A Real
World Application (Case Study: JFK Assassination)
- In 1979, the House Select Committee on
Assassinations (HSCA) concluded that there was a
high probability that two gunmen fired at
President John F. Kennedy, and therefore,
Kennedy was probably assassinated as a result of
a conspiracy.
- Their conclusion, which contradicted the 1964
Warren Commission's conclusion that Lee Harvey
Oswald alone killed President Kennedy, was based
largely on an acoustical analysis of an
eight-second segment of a Dallas police recording
made of radio transmissions presumed to have
originated from a motorcycle within the
presidential motorcade. Although the
static-filled recording contained no audible
sounds that could be distinguished as being
gunshots, two acoustic research groups concluded
that the recording contained four impulse sounds,
which they believed were probable gunshots.
- According to these experts, three of the
gunshots originated from the southeastern-most
sixth floor window of the Texas School Book
Depository, while a fourth gunshot originated
from the southeast corner of the stockade fence
atop the grassy knoll. The probability of a
grassy knoll shot was believed to be 95 percent.
Taken from Myers, Dale K. "Epipolar Geometric
Analysis of Amateur Films Related to Acoustics
Evidence in the John F. Kennedy Assassination."
Secrets of a Homicide: JFK Assassination.
Milford, MI: Oak Cliff Press, Inc., 2007.
10 The Case Under Investigation...
- Bank of Good Cents (Toronto, Canada) has been
robbed!!
- YOU have been hired to crack the case.
- 2 security guards are the prime suspects:
- Alice (day shift)
- Bob (night shift)
- Two items stolen:
- Worthless vase
- Over $100,000 CDN from a bank vault
- QUESTION: Who stole which loot?
11 The Case Under Investigation: The Evidence
- Simultaneous recordings from 5 microphones with
known coordinates around the periphery of the
bank vault room where the crime took place.
- 5 audio recordings from the day shift
- 5 audio recordings from the night shift
- Our perpetrators keep an even pace!
- Each recording is 13.5 seconds long and contains
exactly 27 footsteps.
- Each footstep is equally spaced, hence, each 0.5
second segment contains 1 footstep.
12 The Case Under Investigation...
13 Sound in the Real World
14 Sound in the Digital World
[Figure: waveform plot with axes Time (seconds) and Amplitude (-1 to 1)]
double[] samples = { 0.0029602954191717277,
0.002838221381267739, 0.00271614734336375,
0.0025025177770317698, 0.0023194067201757866,
0.0025025177770317698, 0.0027466658528397473,
0.00271614734336375, 0.0025025177770317698,
0.0024414807580797754, 0.0023194067201757866,
0.002380443739127781, 0.002227851191747795,
0.002349925229651784, -0.0016174810022278512,
0.002533036286507767, -0.0010376293221839045,
-0.00173955504013184, -0.017364501953125,
0.05694753868221076 };
15 Sound in the Digital World: An Analogy to
Digital Images
The image on the left is an example of a low
resolution (poor quality) image of 72 ppi (pixels
per inch). The image on the right is an
example of a higher resolution (better quality)
image of 330 ppi.
16 Sound in the Digital World: An Analogy to
Digital Video
TIME
17 Digital Sound Editing and Analysis Tools
- Digital audio recording, editing, and analysis
software is used in music production, film
scoring, and television and post production.
- It is also used by forensic scientists to analyze
acoustic evidence.
18 Digital Sound Editing and Analysis Tools
- Pro Tools is an example of a highly advanced
platform for sound recording and editing used in
professional industry.
19 Digital Sound Editing and Analysis Tools
- You will be using a digital sound editor called
Audacity in order to visually display and explore
acoustic forensic evidence.
- Download FREE from http://audacity.sourceforge.net/
- (MAC, PC, and LINUX compatible)
20 The Dimensions of Sound in the Digital World
- (A) Sampling Frequency
- Audio Sample
- A single value representing a sound signal at a
specific point in time.
- Audio samples are typically represented using a
floating-point (decimal) value in the range -1 to
1.
- Sampling Frequency / Rate
- The number of samples per second taken from a
real-world (continuous) signal to make a digital
(discrete) representation of the signal.
- Measured in Hertz (Hz): 1 Hz = 1 sample /
second
- CD-quality audio typically uses 44.1 kHz (44,100
samples per second).
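The link between a sample's position in an array and its time in the recording can be sketched in a few lines of Java. This is only an illustration: the class name `SampleTiming` and its methods are hypothetical (they are not part of the provided project classes), and the 44,100 Hz rate is simply the CD-quality figure mentioned on this slide.

```java
// Hypothetical helper, not part of audioForensics.jar.
class SampleTiming {
    /** Time in seconds at which the sample with the given index was taken. */
    static double indexToSeconds(int sampleIndex, double sampleRateHz) {
        return sampleIndex / sampleRateHz;
    }

    /** Index of the sample closest to the given time in seconds. */
    static int secondsToIndex(double seconds, double sampleRateHz) {
        return (int) Math.round(seconds * sampleRateHz);
    }

    public static void main(String[] args) {
        double cdRateHz = 44100.0; // CD-quality sampling rate
        System.out.println("Sample 44100 occurs at "
                + indexToSeconds(44100, cdRateHz) + " s");
        System.out.println("0.5 s corresponds to sample index "
                + secondsToIndex(0.5, cdRateHz));
    }
}
```

The same conversion lets you translate a sample offset found by a program into a time you can check by eye in a sound editor.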
21 The Dimensions of Sound in the Digital World
- (B) Sample Size
- Sample Size / Sample Depth
- The number of bits used to encode (or represent)
each audio sample.
- Related to the number of possible values a sound
sample can have
- An audio sample of N bits could have 1 of 2^N
possible values (from -1 to 1).
- CD-quality audio typically uses a sample size of
16 bits.
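To see what "1 of 2^N possible values" means in practice, here is a small Java sketch. It assumes a simplified uniform quantizer that spreads the 2^N levels evenly across -1 to 1. Real audio formats usually store signed integers instead, and the class name `SampleDepth` is hypothetical.

```java
// Hypothetical helper illustrating sample size; a simplified model only.
class SampleDepth {
    /** Number of distinct values an N-bit sample can take: 2^N. */
    static long levels(int bits) {
        return 1L << bits;
    }

    /**
     * Snap a sample in [-1, 1] to the nearest of 2^N evenly spaced
     * levels and return that level as a value in [-1, 1].
     */
    static double quantize(double sample, int bits) {
        long steps = levels(bits) - 1;               // gaps between levels
        long index = Math.round((sample + 1.0) / 2.0 * steps);
        return index * 2.0 / steps - 1.0;
    }

    public static void main(String[] args) {
        System.out.println("4-bit levels:  " + levels(4));
        System.out.println("16-bit levels: " + levels(16));
        System.out.println("0.3 at 4 bits  -> " + quantize(0.3, 4));
        System.out.println("0.3 at 16 bits -> " + quantize(0.3, 16));
    }
}
```

With more bits the quantized value lands closer to the original sample, which is why a larger sample size gives a more faithful recording.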
22 The Dimensions of Sound in the Digital World
- (C) Number of Channels
- If an audio recording contains only 1 channel, it
is referred to as monophonic.
- Typical music files are composed of 2 channels:
right and left (stereophonic sound).
- Surround sound maps several channels of audio to
a set of speakers positioned around a room.
- For example, 5.1 channel surround sound uses a
total of 6 channels of audio, and hence, 6
speakers:
- 5 regular audio channels
- 1 low-frequency effects channel (subwoofer)
23 Sample Calculation - Determining Digital Audio
File Sizes
- What is the approximate file size in megabytes
(MB) for a 10 second uncompressed mono (i.e. 1
channel) audio recording that uses a sampling
rate of 15 kHz and a sample size of 4 bits?
24 Sample Calculation - Determining Digital Audio
File Sizes (APPROACH)
- What is the approximate file size in megabytes
(MB) for a 10 second uncompressed mono (i.e. 1
channel) audio recording that uses a sampling
rate of 15 kHz and a sample size of 4 bits?
# samples = (1 channel) x (10 seconds / channel)
x (15,000 samples / second)
File size in bits = (# samples) x (4 bits / sample)
Convert this to MB using the following:
1 byte = 8 bits
1 MB = 1024 KB = 1024 x 1024 bytes
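The approach on this slide can be sketched as a short Java program. The class and method names (`AudioFileSize`, `sizeInBits`, `bitsToMegabytes`) are made up for illustration, and the sketch uses the convention that 1 MB is 1024 x 1024 bytes.

```java
// Hypothetical helper for the file-size calculation; names are illustrative.
class AudioFileSize {
    /** Total size in bits of an uncompressed recording. */
    static long sizeInBits(int channels, double seconds,
                           int sampleRateHz, int bitsPerSample) {
        long samples = Math.round(channels * seconds * sampleRateHz);
        return samples * bitsPerSample;
    }

    /** Convert bits to megabytes (1 byte = 8 bits, 1 MB = 1024 * 1024 bytes). */
    static double bitsToMegabytes(long bits) {
        double bytes = bits / 8.0;
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // 1 channel, 10 seconds, 15 kHz sampling rate, 4 bits per sample
        long bits = sizeInBits(1, 10.0, 15000, 4);
        System.out.printf("%d bits = %.4f MB%n", bits, bitsToMegabytes(bits));
    }
}
```

For the numbers in the question this works out to 150,000 samples, 600,000 bits, 75,000 bytes, or roughly 0.07 MB.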
26 The Bank Vault Room
[Figure: floor plan of the bank vault room showing the DOOR, the VASE, the CASH VAULT, and microphones MIC 1 to MIC 5]
27 Time Difference Of Arrival (TDOA)
[Figure: waveforms of the Mic 1 Recording and the Mic 4 Recording]
- The perpetrator is walking closer to Mic 4.
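A TDOA value says how much farther the sound source is from one microphone than from the other. The Java sketch below shows that relationship. It assumes a speed of sound of about 343 m/s (dry air at roughly 20 degrees C) and defines TDOA as "arrival time at mic A minus arrival time at mic B". Both the sign convention and the class name `TdoaDistance` are assumptions made for this example, not something the project specifies.

```java
// Hypothetical helper; the sign convention is an assumption of this sketch.
class TdoaDistance {
    // Approximate speed of sound in dry air at about 20 degrees C.
    static final double SPEED_OF_SOUND_M_PER_S = 343.0;

    /**
     * tdoaSeconds = (arrival time at mic A) - (arrival time at mic B).
     * Returns distance(source, A) - distance(source, B) in metres.
     */
    static double pathDifferenceMetres(double tdoaSeconds) {
        return tdoaSeconds * SPEED_OF_SOUND_M_PER_S;
    }

    /** Name of the microphone the sound reached first. */
    static String closerMic(double tdoaSeconds, String micA, String micB) {
        if (tdoaSeconds > 0) return micB; // B heard it first
        if (tdoaSeconds < 0) return micA; // A heard it first
        return "equidistant";
    }

    public static void main(String[] args) {
        double tdoa = 0.01; // Mic 1 hears the footstep 10 ms after Mic 4
        System.out.println("Closer to: " + closerMic(tdoa, "Mic 1", "Mic 4"));
        System.out.println("Path difference (m): " + pathDifferenceMetres(tdoa));
    }
}
```

A single TDOA only narrows the source down to a curve of possible positions, which is why several microphone pairs are needed to pin down one location.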
28 The Case Under Investigation: Analyzing the
Evidence
- Analyze the audio recordings to determine TDOA
between pairs of microphones.
- Use TDOA values to derive the position of the
perpetrator in the bank.
- Repeat this process over time to plot a
trajectory (path) of the perpetrator's movements
in order to reconstruct the crime.
29 The Case Under Investigation: Snapshot of 4
Footsteps Recorded by 2 Mics
30 (1) Computing TDOA between a Pair of Audio
Recordings
- Through a group Problem-Solving Session (PSS) and
Project Milestone 1, you will explore two
different approaches for computing TDOA between a
pair of microphones:
- Threshold-Based Approach
- Cross-Correlation Approach
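To give a feel for the two approaches before the PSS, here is a rough Java sketch of each, working on plain `double[]` arrays rather than the project's Sound class. Everything in it is an assumption made for illustration: the class name `TdoaEstimators`, the method names, the threshold of 0.5, the tiny hand-made sample arrays, and the convention that a positive lag means the first recording heard the sound later. It is not the milestone solution.

```java
// Illustrative sketch only; names, threshold, and data are hypothetical.
class TdoaEstimators {
    /** Index of the first sample whose absolute amplitude reaches the threshold, or -1. */
    static int firstOnset(double[] samples, double threshold) {
        for (int i = 0; i < samples.length; i++) {
            if (Math.abs(samples[i]) >= threshold) return i;
        }
        return -1;
    }

    /** Threshold-based lag in samples: onset in a minus onset in b. */
    static int thresholdLag(double[] a, double[] b, double threshold) {
        int onsetA = firstOnset(a, threshold);
        int onsetB = firstOnset(b, threshold);
        if (onsetA < 0 || onsetB < 0) {
            throw new IllegalArgumentException("no sample reaches the threshold");
        }
        return onsetA - onsetB;
    }

    /**
     * Cross-correlation lag: the shift in [-maxLag, maxLag] that maximises
     * the sum of a[i + lag] * b[i]. Positive means a is delayed relative to b.
     */
    static int crossCorrelationLag(double[] a, double[] b, int maxLag) {
        int bestLag = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int lag = -maxLag; lag <= maxLag; lag++) {
            double score = 0.0;
            for (int i = 0; i < b.length; i++) {
                int j = i + lag;
                if (j >= 0 && j < a.length) score += a[j] * b[i];
            }
            if (score > bestScore) {
                bestScore = score;
                bestLag = lag;
            }
        }
        return bestLag;
    }

    /** Convert a lag in samples to seconds. */
    static double lagToSeconds(int lagSamples, double sampleRateHz) {
        return lagSamples / sampleRateHz;
    }

    public static void main(String[] args) {
        // Synthetic footstep: Mic 4 hears it 3 samples before Mic 1.
        double[] mic1 = {0, 0, 0, 0, 0, 0.1, 0.9, 0.4, 0.1, 0};
        double[] mic4 = {0, 0, 0.1, 0.9, 0.4, 0.1, 0, 0, 0, 0};
        System.out.println("Threshold lag: " + thresholdLag(mic1, mic4, 0.5));
        System.out.println("Cross-correlation lag: " + crossCorrelationLag(mic1, mic4, 5));
    }
}
```

On this clean synthetic data both methods report a lag of 3 samples. The threshold method looks at a single sample, while cross-correlation compares the whole shape of the two signals, so they can behave differently on quiet or noisy recordings.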
31 (1) Computing TDOA between a Pair of Audio
Recordings
- 2 Approaches - How do we decide?
- We consider...
- Accuracy of computed TDOA values,
- Robustness of the algorithm (does it work for
any set of audio recordings, or only audio
recorded under ideal conditions),
- How fast it executes, and
- How easy it is to implement.
32 (1) Robustness to High and Low Amplitude Sound
Recordings
- Which do you think is generally easier to
analyze? Why?
33 (1) Robustness to Clean and Noisy Sound
Recordings
- EVERY DAY: "Noise" usually refers to unwanted
sound or noise pollution.
- SIGNAL ANALYSIS: "Noise" is a technical term
referring to data without meaning - it is data
that is not being used to transmit a signal, but
is instead produced as an unwanted by-product
from some other process.
- Within audio signals, noise may be introduced
- by the environment (e.g. recording a
conversation in a busy shopping mall), or
- by the recording system itself (e.g. noise
often heard as a 'hum' or 'hiss' that is
introduced by bad microphones, sound cables, or
amplifiers).
34 (1) Robustness to Clean and Noisy Sound
Recordings
- The amount of noise present in a signal is often
expressed as a Signal-to-Noise Ratio (SNR).
- The success and accuracy achieved by any (audio)
signal analysis is usually related to the amount
of noise present in the signal.
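SNR is commonly reported in decibels as 10 x log10(signal power / noise power), where the power of a block of samples is the average of their squared values. The Java sketch below applies that definition to two small hand-made arrays. The class name `SignalToNoise` and the sample data are hypothetical, and it assumes you already have the signal and the noise as separate arrays, which is rarely true of a real recording.

```java
// Hypothetical helper illustrating the decibel form of SNR.
class SignalToNoise {
    /** Average of the squared sample values. */
    static double meanPower(double[] samples) {
        double sum = 0.0;
        for (double s : samples) {
            sum += s * s;
        }
        return sum / samples.length;
    }

    /** SNR in decibels: 10 * log10(signal power / noise power). */
    static double snrDecibels(double[] signal, double[] noise) {
        return 10.0 * Math.log10(meanPower(signal) / meanPower(noise));
    }

    public static void main(String[] args) {
        double[] signal = {1.0, -1.0, 1.0, -1.0};
        double[] noise = {0.1, -0.1, 0.1, -0.1};
        System.out.printf("SNR = %.1f dB%n", snrDecibels(signal, noise));
    }
}
```

Here the signal amplitude is ten times the noise amplitude, so the power ratio is about 100 and the SNR is about 20 dB. A higher SNR means a cleaner recording.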
35 (1) Robustness to Clean and Noisy Sound
Recordings
36 (1) Other Factors Affecting the Robustness of
Audio Analysis Techniques
- Echoes
- Reverberation
- Other audio sources
- Time delays / latency
39 (2) Use TDOA Values to Derive Position of the
Perpetrator in the Bank
- You will be given Java classes that will:
- Translate TDOA values into X-Y coordinates,
- Plot visually a set of X-Y coordinates to show a
trajectory.
40 The Case Under Investigation...
42 Java Source Code
- 3 types of Java classes you will work with:
- Classes provided by Sun
- e.g. String, Scanner, etc.
- Classes provided by other programmers
- e.g. Sound, SoundSourceLocalizer, Visualizer
- Classes provided by YOU
43 Classes Provided To You: Written By Other
Programmers
- The following classes are provided to you in a
JAR (Java ARchive) file called audioForensics.jar:
- Sound Class
- Represents a sound and provides methods for easy
playback, creation, and manipulation of digital
audio files.
- SoundSourceLocalization Class
- Provides methods that compute a 2D position from
TDOA values between 5 microphones.
- Visualizer Class
- Provides methods that create a graphical 2D plot
of trajectory (X-Y) data.
44 Using Java Application Programming Interface
(API) Documentation
- A manual that tells programmers about the
purpose of a Java class, and the purpose and
usage of the methods provided by that class (i.e.
method names, number and types of parameters,
return types, etc.).
- Explains the interface between the source code
you write, and the source code provided by other
programmers.
45 [Figure: annotated API documentation page showing the Class Name, Description of Class, Class Constructor(s), and List of Method Names, Descriptions, Usage Info. Click on a method name for more detailed info.]
46 [Figure: more detailed info about method use (parameters and return value), annotated with Method Signature, Static Method, Description of Parameters, and Description of Return Value]
47 API Documentation Available to You
- Classes provided by Sun
- e.g. String, Scanner, etc.
- http://java.sun.com/javase/6/docs/api/
- Classes provided in audioForensics.jar file that
are written by other programmers
- Sound Class
- SoundSourceLocalization Class
- Visualizer Class
- Documentation available via project description
48 Using the classes found in audioForensics.jar
- You must tell the Java compiler and Java Virtual
Machine (JVM) where to find the additional
classes you are using that have been provided to
you.
- Specify this on the command line by setting the
classpath.
49 Using the classes found in audioForensics.jar
(PC)
- Use the following to compile your Java source
code (assuming a filename of "SoundCheck.java")
from the command line of a PC:
- javac -classpath audioForensics.jar;. SoundCheck.java
- Use the following to execute your Java
application from the command line of a PC:
- java -classpath audioForensics.jar;. SoundCheck
50 Using the classes found in audioForensics.jar
(MAC and LINUX)
- Use the following to compile your Java source
code (assuming a filename of "SoundCheck.java")
from the terminal of a MAC or LINUX machine:
- javac -classpath audioForensics.jar:. SoundCheck.java
- Use the following to execute your Java
application from the terminal of a MAC or LINUX
machine:
- java -classpath audioForensics.jar:. SoundCheck
51 Using the Sound class
- The Sound class extends the SimpleSound class.
- When you instantiate a Sound object (using
new), you can access the methods of the Sound
class as well as the methods of the SimpleSound
class.
- Use the Java API Documentation provided for the
Sound and SimpleSound classes to see what methods
are available to you.
- Code tracing: SoundCheck.java
52 Using the Audacity Sound Editor for Debugging
Your Programs
- Audacity may be used to extract approximate
timing information from the footstep recordings
(for comparison with your program output).
- i.e. You can determine approximate footstep onset
times, and approximate TDOA values between
recordings.
53 Using the Audacity Sound Editor for Debugging
Your Programs
[Figure: Audacity view of the Mic 1 Recording and the Mic 4 Recording, annotated with "Select a Time Range", "Footstep Onset", and "TDOA Estimate"]
55 EXTREME Programming
56 Pair Programming
- Pair programming is a software development
technique in which two programmers work together
at one keyboard.
- One programmer (driver) types in source code,
while the other (navigator) reviews each line of
code as it's typed in.
- The two programmers switch roles frequently.
- Pair programming IS NOT partnered programming.
57 Pair Programming
- Periodically in labs, you will be engaging in
exercises using pair programming.
- The pair you form during labs will also be your
pair for the CSI Audio Forensics project.
58 Pair Programming
- Benefits of Pair Programming
- Improved design quality
- Overcoming difficult problems better and faster
- Fun and efficient learning and training
- Improved morale (It's fun and social!!)
- Increased discipline and better time management
- Enhanced team-building and communication
- Improved coding style / readability
- Benefits to software development companies
59 Pair Programming
- Risks Associated With Pair Programming
- Coordination of work times
- Potential conflict
- Developer egos and intimidation
- Developer work preferences
60 Pair Programming