Title: Encoding, Validation and Verification
1Encoding, Validation and Verification
2Introduction
- This presentation covers the following
- Data encoding
- Data validation
- Data verification
3Data encoding
- This is a method of changing the way we represent data.
- We do this to standardise the data we are dealing with.
- The original data is not stored, only the representation of it.
4Data encoding
- Some codes are easier to work out than others
- MON TUE WED JAN FEB MAR
- For some, you will need a key.
- VXCORBLA
- FDFOCGRE
Key:
VX  = Vauxhall   FD  = Ford
COR = Corsa      FOC = Focus
BLA = Black      GRE = Green
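A minimal sketch of how a key like the one above could be applied, assuming every code is exactly 8 characters: 2 for the make, 3 for the model and 3 for the colour. The dictionary and function names are made up for illustration.

```python
# Hypothetical key, copied from the slide: make, model and colour codes.
MAKES = {"VX": "Vauxhall", "FD": "Ford"}
MODELS = {"COR": "Corsa", "FOC": "Focus"}
COLOURS = {"BLA": "Black", "GRE": "Green"}


def decode_car(code: str) -> tuple[str, str, str]:
    """Split an 8-character code into make (2), model (3) and colour (3)."""
    if len(code) != 8:
        raise ValueError(f"expected 8 characters, got {len(code)}")
    make, model, colour = code[:2], code[2:5], code[5:]
    # A KeyError here means the code uses something that is not in the key.
    return MAKES[make], MODELS[model], COLOURS[colour]


print(decode_car("VXCORBLA"))  # ('Vauxhall', 'Corsa', 'Black')
print(decode_car("FDFOCGRE"))  # ('Ford', 'Focus', 'Green')
```

Without the key, the same strings are just eight letters, which is why the key has to travel with the data.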
5Take note
- Create your own encoded data with a key so someone else will understand how it works.
- Looking at someone else's, are there any limitations with their code? Could it cause any problems or confusion?
- Explain to that person why you think it is either fine or needs some improvement.
6Problems with encoding
- If you encode data it may become less accurate.
- You may end up limiting the possible number of data entries.
- For example, cars come in lots of different colours, but if you limit the choices to Red, Blue, Black, Silver, etc., you prevent the actual colours being entered.
- Star Silver and Lightning Silver are different, but encoding may regard them both as silver.
- This would be inaccurate and the validity of your data could be questioned.
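The loss of detail described above can be shown with a small sketch. The three-letter colour codes and the rule of using the last word of the colour name are assumptions made for this example only.

```python
def encode_colour(colour: str) -> str:
    """Reduce a manufacturer's colour name to a basic three-letter code."""
    # Hypothetical list of allowed choices.
    basic = {"red": "RED", "blue": "BLU", "black": "BLA", "silver": "SIL"}
    # Use the last word of the name, so "Star Silver" is treated as "silver".
    last_word = colour.strip().lower().split()[-1]
    return basic[last_word]


for name in ("Star Silver", "Lightning Silver"):
    print(name, "->", encode_colour(name))
# Both names come out as SIL, so the original shade cannot be recovered.
```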
7Problems with encoding
- Asking people questions often returns different responses.
- Did you enjoy the race?
- It was good
- It was alright...got a bit boring in places
- Fantastic...I am glad he won.
- Responses can be similar but not always the same. This means that we sometimes have to apply a judgement on how best to record the response.
- If we had a scale from 1-4 (1 = good, 4 = rubbish) then where would we put the comments?
- Again, if more than one person is collecting the data, will their judgements be the same?
8Problems with encoding
- Another problem occurs when you come across some data that won't fit in with your encoding system.
- This means that you have to re-encode your data, which takes time and can also lead to mistakes being made.
- If inaccuracies do occur, how do you know if that data is incorrect? People might still assume the data is fine, which could lead to more problems!
9Encoding Good Stuff!
- Computers have a limited storage capacity.
- If you encode data you can reduce the amount of storage space needed. When you are dealing with thousands of records the space saved is huge!
- Also, it can be quicker to enter coded data. It doesn't have to be less accurate either.
- For example, M = Male, F = Female.
- A computer can also carry out validation checks on the encoded data to make sure it is valid.
- For example, if it is not M or F then there must be a mistake.
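As a rough illustration of validating an encoded field, the sketch below accepts only the two codes from the example above. The names VALID_CODES and decode_gender are invented for this sketch.

```python
# The two codes from the slide and what they stand for.
VALID_CODES = {"M": "Male", "F": "Female"}


def decode_gender(code: str) -> str:
    """Return the meaning of a code, rejecting anything that is not M or F."""
    if code not in VALID_CODES:
        raise ValueError(f"invalid code {code!r}: expected M or F")
    return VALID_CODES[code]


print(decode_gender("M"))  # Male
print(decode_gender("F"))  # Female
```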
10Take note
- What is meant by encoding data?
- Describe three advantages of encoding data.
- Describe three disadvantages of encoding data.
- Give an example of how data can be encoded.
- Give two situations where the encoding of data is
appropriate. For each situation, explain why
data needs to be encoded.
11Validation
- Validating data can be done using the following methods:
- Range check
- Type check
- Presence check
- Length check
- Lookup check
- Picture check
- Check digit
12Range Check
- A range check is very simple.
- It sets a lower and an upper boundary between which a value must fall.
- For instance, 0-100. The number 50 would be accepted as it falls within the boundaries, but the number 101 would exceed the upper boundary and thus be rejected.
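A range check can be sketched in a couple of lines. Treating both boundaries as allowed values (so 0 and 100 are accepted) is an assumption here, as the slide does not say.

```python
def in_range(value: float, lower: float = 0, upper: float = 100) -> bool:
    """Return True when value lies between lower and upper, inclusive."""
    return lower <= value <= upper


print(in_range(50))   # True
print(in_range(101))  # False
```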
13Type Check
- This check prevents data of the wrong type from being submitted.
- For example, entering the word "two" into a field which was expecting a numerical value would return an error, as "two" is in text format.
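One possible way to sketch a type check is to try converting the typed text to a whole number and see whether that fails. The function name is made up for illustration, and the sketch assumes the field expects a whole number rather than a decimal.

```python
def is_whole_number(text: str) -> bool:
    """Return True when text can be read as a whole number."""
    try:
        int(text)
    except ValueError:
        # int() raises ValueError for text such as "two" or "2.5".
        return False
    return True


print(is_whole_number("2"))    # True
print(is_whole_number("two"))  # False
```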
14Presence Check
- You come across these all the time on websites which ask for certain information to be included.
- The system will insist that you enter these pieces of data before proceeding to the next section.
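A presence check might look like the sketch below. Treating a field that contains only spaces as empty is a design choice made for this example; a simpler check would let a single press of the space bar through.

```python
from typing import Optional


def is_present(value: Optional[str]) -> bool:
    """Return True when the field holds something other than blank space."""
    return value is not None and value.strip() != ""


print(is_present("Mouse"))  # True
print(is_present(""))       # False
print(is_present("   "))    # False
```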
15Length Check
- Length checks prevent more characters being entered than are allowed.
- The word "shoe" has a length of 4.
- If we set the limit to 4 then "shoes" wouldn't be allowed.
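The shoe example can be sketched as follows, with the limit of 4 taken from the slide and the function name invented for illustration.

```python
def within_length(text: str, max_length: int = 4) -> bool:
    """Return True when text has no more than max_length characters."""
    return len(text) <= max_length


print(within_length("shoe"))   # True
print(within_length("shoes"))  # False
```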
16Lookup Check
- A lookup check takes a value and compares it to a set of values in another table.
- If a match is made then a result is returned.
- If no match is made then an error is returned.
- An example of this would be entering a student's test score into a field and the system returning the student's grade.
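The test score example could be sketched like this. The grade boundaries and the 0-100 score range are invented for illustration; a real system would use its own table.

```python
# Hypothetical grade boundaries: (minimum score, grade), highest first.
GRADE_TABLE = [(80, "A"), (70, "B"), (60, "C"), (50, "D"), (40, "E"), (0, "U")]


def lookup_grade(score: int) -> str:
    """Return the grade for a score, or raise an error when nothing matches."""
    if 0 <= score <= 100:
        for minimum, grade in GRADE_TABLE:
            if score >= minimum:
                return grade
    # No row in the table matched, so report an error instead of guessing.
    raise LookupError(f"no grade found for score {score}")


print(lookup_grade(73))  # B
print(lookup_grade(35))  # U
```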
17Picture Check
- Also known as an Input Mask or Format Check.
- This type of check ensures data is entered in a predefined way.
- A good example of this is when dealing with dates.
- There are many ways to submit a date
- 01/Jan/2008
- 01/01/2008
- 1/1/08
- Etc
- A Picture check will define how the date must be
entered.
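A picture check for dates can be sketched with a regular expression. The DD/MM/YYYY picture is an assumption chosen for this example. Note that the check only looks at the shape of the text, so an impossible date such as 99/99/2008 would still pass.

```python
import re

# Picture: two digits, slash, two digits, slash, four digits.
DATE_PICTURE = re.compile(r"[0-9]{2}/[0-9]{2}/[0-9]{4}")


def matches_picture(text: str) -> bool:
    """Return True when the whole of text fits the DD/MM/YYYY picture."""
    return DATE_PICTURE.fullmatch(text) is not None


print(matches_picture("01/01/2008"))   # True
print(matches_picture("01/Jan/2008"))  # False
print(matches_picture("1/1/08"))       # False
```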
18Check Digit
- A check digit is a value which is worked out by performing a calculation on a number and is then added to the end of that number.
- ISBN numbers have check digits.
- The ISBN for the text book is
- 978-0-340-95828-5
- The check digit is 5.
- Before 2007, when ISBN numbers had 10 digits, the check digit was calculated using modulus-11.
- New 13-digit ISBN numbers are calculated using the modulus-10 method.
19Modulus-11
Remove the check digit. Then write out the numbers in a table like this. The code starts at 2 and increments by 1, going from right to left.
ISBN      0   3   4   0   9   5   8   2   8
Code     10   9   8   7   6   5   4   3   2
Multiply each number by the code below it.
ISBN      0   3   4   0   9   5   8   2   8
Product   0  27  32   0  54  25  32   6  16
Add up all the products. 0 + 27 + 32 + 0 + 54 + 25 + 32 + 6 + 16 = 192
Divide the total by 11. 192 / 11 = 17 remainder 5
Take the remainder from 11. Check digit = 11 - 5 = 6
If the remainder is 0 the check digit is 0. If the remainder is 1 then the check digit is X.
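The 10-digit calculation above can be sketched as a short function. The function name is made up for illustration, and the input is assumed to be the ISBN with its check digit already removed (hyphens are ignored).

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Work out the check digit for the first nine digits of a 10-digit ISBN."""
    digits = [int(ch) for ch in first_nine if ch in "0123456789"]
    if len(digits) != 9:
        raise ValueError("expected nine digits")
    # Codes run 10, 9, ..., 2 from left to right (2 upwards from the right).
    total = sum(d * code for d, code in zip(digits, range(10, 1, -1)))
    remainder = total % 11
    if remainder == 0:
        return "0"
    if remainder == 1:
        return "X"
    return str(11 - remainder)


print(isbn10_check_digit("0-340-95828"))  # 6
```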
20Modulus-10
Remove the check digit. Then write out the numbers in a table like this. From right to left, alternate the weighting code between 3 and 1.
ISBN      9   7   8   0   3   4   0   9   5   8   2   8
Code      1   3   1   3   1   3   1   3   1   3   1   3
Multiply each number by the code below it.
ISBN      9   7   8   0   3   4   0   9   5   8   2   8
Product   9  21   8   0   3  12   0  27   5  24   2  24
Add up all the products. 9 + 21 + 8 + 0 + 3 + 12 + 0 + 27 + 5 + 24 + 2 + 24 = 135
Divide the total by 10. 135 / 10 = 13 remainder 5
Take the remainder from 10. Check digit = 10 - 5 = 5
If the remainder is 0 the check digit is 0. A 13-digit ISBN never uses X as its check digit.
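The 13-digit calculation above (codes alternating between 1 and 3, then dividing by 10) can be sketched in the same way. The function name is made up for illustration, and the input is assumed to be the first twelve digits with the check digit removed.

```python
def isbn13_check_digit(first_twelve: str) -> str:
    """Work out the check digit for the first twelve digits of a 13-digit ISBN."""
    digits = [int(ch) for ch in first_twelve if ch in "0123456789"]
    if len(digits) != 12:
        raise ValueError("expected twelve digits")
    # Codes alternate 1, 3, 1, 3, ... from the left, so the rightmost digit gets 3.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    remainder = total % 10
    # A remainder of 0 gives a check digit of 0 rather than 10.
    return str((10 - remainder) % 10)


print(isbn13_check_digit("978034095828"))  # 5
```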
21Take note
- In a spreadsheet, try creating a working Check Digit Checker.
- The spreadsheet should be able to calculate a check digit using the ISBN number and then compare the result with the actual check digit.
- It should say whether it is valid or not.
- To work out a remainder use the MOD() function.
22Take note
- Use modulus-11 on these ISBN numbers.
- For numbers with incorrect check digits, replace them with correct ones.
- 1-854-87918-9
- 0-552-77109-X
- 0-330-28414-3
- 0-330-34742-X
- 0-330-35183-3
23Verification
- Verification is not making sure that data is correct, but rather making sure data hasn't been changed in any way.
- There are two ways of carrying out verification checks:
- Double Entry
- Manual verification
24Double Entry
- Basically, entering data twice.
- For example, some websites ask you to type in your email address twice. This lowers the risk of entering an address incorrectly.
- If the emails do not match, the website will ask you to check them.
- However, if you enter the email address incorrectly both times and make the same mistake, then the website will miss the mistake!
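Double entry can be sketched as a simple comparison of the two typed values. Ignoring spaces at either end is an assumption made for this example, and the email addresses are invented.

```python
def double_entry_matches(first: str, second: str) -> bool:
    """Return True when both entries are the same, ignoring spaces at the ends."""
    return first.strip() == second.strip()


# A typing slip in one entry is caught.
print(double_entry_matches("mickey@example.com", "mickey@exmaple.com"))  # False
# The same slip made twice is not caught.
print(double_entry_matches("mickey@exmaple.com", "mickey@exmaple.com"))  # True
```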
25Manual verification
- This is like proofreading. A person may read data from a paper source and then type it into a computer system.
- Humans aren't very reliable and often make mistakes.
- Common mistakes include
- Transcription errors
- Transposition errors
26Transcription Errors
- This may involve pressing the wrong key accidentally.
- For example,
- Surname: Mouse entered as Mowse or Mouce
27Transposition Errors
- This is where two characters have been accidentally reversed.
- For example,
- Surname: Mouse entered as Muose or Moues
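To show the difference between the two kinds of mistake, the sketch below tests whether a typed value is the intended value with two neighbouring characters swapped. The function name is made up, and it assumes a transposition always involves characters that sit next to each other.

```python
def is_transposition(intended: str, typed: str) -> bool:
    """Return True when typed is intended with one adjacent pair swapped."""
    if len(intended) != len(typed) or intended == typed:
        return False
    # Positions where the two strings disagree.
    diffs = [i for i, (a, b) in enumerate(zip(intended, typed)) if a != b]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and intended[diffs[0]] == typed[diffs[1]]
        and intended[diffs[1]] == typed[diffs[0]]
    )


print(is_transposition("Mouse", "Muose"))  # True
print(is_transposition("Mouse", "Moues"))  # True
print(is_transposition("Mouse", "Mowse"))  # False (a transcription error)
```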
28Accuracy
- Just because we make use of validation and verification checks doesn't mean data is accurate.
- For example, a wrong number could still pass a range check, and a presence check can be passed because someone pressed the space bar in the field.
29Take note
- Describe two methods of verification.
- Give two disadvantages of double entry verification.
- Give one advantage of manual verification.
- Explain why verification and validation cannot ensure that data is entered accurately, then explain why they are useful despite these problems.