# DATA AND DATA COLLECTION - PowerPoint PPT Presentation

Slides: 44
Provided by: JustDj
Transcript and Presenter's Notes

1
DATA AND DATA COLLECTION
• Lecture 3

2
What is STATISTICS?
• Statistics is a discipline which is concerned
with
• designing experiments and other data collection,
• summarising information to aid understanding,
• drawing conclusions from data, and
• estimating the present or predicting the future.

3
What is STATISTICS?
• A branch of applied mathematics concerned with
• the collection and interpretation of quantitative data, and
• the use of probability theory to estimate
population parameters.

4
What is STATISTICS?
• "I like to think of statistics as the science of
learning from data."
• Jon Kettenring, 1997, President, American
Statistical Association

5
| Jun-07 | GSE All Share Index |
|---|---|
| 1 | 5,225.80 |
| 4 | 5,226.04 |
| 5 | 5,226.40 |
| 6 | 5,226.72 |
| 7 | 5,226.92 |
| 8 | 5,237.34 |
| 9 | 5,237.74 |
| 10 | 5,238.90 |
| 13 | 5,238.99 |
| 14 | 5,250.31 |
| 15 | 5,258.52 |
| 18 | 5,263.37 |
| 19 | 5,263.58 |

8
What is Data?
• It is a collection of facts from which meaningful
conclusions can be drawn.
• Examples
• names,
• numbers,
• text,
• graphics,
• decimals.
• The singular form is datum and the plural form is
data.

9
Types of Data
• Qualitative
• Quantitative

10
Qualitative
• Qualitative data is not provided numerically.
• It is non-numeric data.
• E.g. colour, race, geographical region, industry,
sex, type of car, place of birth, etc.
• Qualitative data may also be referred to as
categorical.

11
Quantitative
• Quantitative data is given numerically; it is numeric
data.
• This can be further categorised into
• Discrete
• Continuous

12
Quantitative
• Discrete data are numeric data that have a finite
(or countably infinite) number of possible values and represent counts.
• E.g. a finite subset of the counting numbers, such as 1, 2, 3,
4 and 5, or
• how many students were present on a given day.
• Discrete data are represented by integers.
E.g. number of firms listed on the Ghana
Stock Exchange.

13
Quantitative
• Continuous quantitative data can take infinitely many
possible values.
• They can be represented by real numbers.
• They are continuous, with no gaps or
interruptions.
• Physically measurable quantities such as length,
volume, time and mass are generally considered
continuous.
• At the physical level, especially for mass, this
may not be true.
• E.g. company profit, height, mass and length.

14
Quantitative
• The structure and nature of data will greatly
affect the choice of analysis method.
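
As a rough illustration of how the type of data drives the choice of summary, the sketch below labels a column as qualitative, discrete or continuous and then picks a mode or a mean accordingly. The function names, the sample values and the rule of using Python types as a stand-in for the data type are all assumptions made for this example; they are not part of the lecture.

```python
from statistics import mean, mode


def classify(values):
    """Label a column as 'qualitative', 'discrete' or 'continuous'.

    Assumption: strings stand for categories, ints for counts,
    and any other real numbers for measurements.
    """
    if all(isinstance(v, str) for v in values):
        return "qualitative"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "discrete"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "continuous"
    raise ValueError("mixed or unsupported value types")


def summarise(values):
    """Return (data type, summary): the mode for categories, the mean for numbers."""
    kind = classify(values)
    if kind == "qualitative":
        return kind, mode(values)
    return kind, mean(values)


# Hypothetical columns, one per data type.
car_types = ["saloon", "pickup", "saloon", "bus"]
students_present = [42, 39, 42, 40]
heights_m = [1.62, 1.75, 1.80]

for column in (car_types, students_present, heights_m):
    print(summarise(column))
```

A mean is meaningless for the car types, which is why the qualitative branch falls back to the most frequent category.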

15
Data-Cross Sectional
• Data sets may also be described as
• cross-sectional
• time series
• Cross-sectional data refers to data collected by
observing many subjects (such as individuals,
firms or countries/regions) at the same point of
time, or without regard to differences in time.
• A cross-sectional data set contains
observations on multiple phenomena observed at a
single point in time.

16
Data-Cross Sectional
• The values of the data points have meaning, but
the ordering of the data points does not.
• Analysis of cross-sectional data usually consists
of comparing the differences among the subjects.
E.g.

17
Data-Time Series
• Time series data is a sequence of numerical data
points in successive order, usually occurring in
uniform intervals.
• A sequence of numbers collected at regular
intervals over a period of time.
• Stated in yet another way, time series data is a
data set containing observations on a single
phenomenon observed over multiple time periods.

18
Data-Time Series
• In time series data, both the values and the
ordering of the data points have meaning.
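
To see why ordering matters in a time series, the sketch below takes the Jun-07 GSE All Share Index figures from the earlier slide and computes the change from one observation to the next. The variable and function names are assumptions made for this example. Shuffling the rows would change every difference, whereas a cross-sectional summary such as the mean would be unaffected.

```python
# (day of month, index value) pairs copied from the Jun-07 GSE All Share Index slide.
gse_index = [
    (1, 5225.80), (4, 5226.04), (5, 5226.40), (6, 5226.72), (7, 5226.92),
    (8, 5237.34), (9, 5237.74), (10, 5238.90), (13, 5238.99), (14, 5250.31),
    (15, 5258.52), (18, 5263.37), (19, 5263.58),
]


def changes(series):
    """Return (day, change from the previous observation) in time order."""
    ordered = sorted(series)  # sort by day so each difference follows time
    return [
        (day, round(value - prev_value, 2))
        for (_, prev_value), (day, value) in zip(ordered, ordered[1:])
    ]


daily = changes(gse_index)
largest = max(daily, key=lambda pair: pair[1])
print(daily)
print("largest rise:", largest)
```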

19
Data Panel
• A data set containing observations on multiple
phenomena observed over multiple time periods is
called panel data.
• The second dimension of the data may be something
other than time.
• For example, when there is a sample of groups, such as company
subsidiaries, and several observations from every
group, the data are panel data.

20
Data Panel
• Whereas time series and cross-sectional data are
both one-dimensional, panel data sets are
two-dimensional.
• Some data sets could possess more than two
dimensions.
• In such cases the term used is
multi-dimensional panel data.
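
One way to picture the two dimensions of panel data is a table keyed by (subject, period). The sketch below uses made-up profit figures for three hypothetical firms; the names `panel`, `cross_section` and `time_series` are assumptions for this example. Fixing the period gives a cross-section, and fixing the subject gives a time series.

```python
# Hypothetical panel: annual profit (arbitrary units) for three firms.
panel = {
    ("Firm A", 2005): 12.0, ("Firm A", 2006): 13.5, ("Firm A", 2007): 15.1,
    ("Firm B", 2005): 7.2, ("Firm B", 2006): 7.9, ("Firm B", 2007): 8.4,
    ("Firm C", 2005): 20.3, ("Firm C", 2006): 21.0, ("Firm C", 2007): 19.8,
}


def cross_section(data, period):
    """All subjects at one point in time (ordering carries no meaning)."""
    return {subject: value for (subject, t), value in data.items() if t == period}


def time_series(data, subject):
    """One subject over all periods, sorted so the ordering reflects time."""
    return sorted((t, value) for (s, t), value in data.items() if s == subject)


print(cross_section(panel, 2006))
print(time_series(panel, "Firm B"))
```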

21
Source of Data
• Primary
• Secondary.

22
Source-Primary
• Primary data is gathered specifically for a
research project.
• Data collected from the original source.
• Examples include data from
• focus groups,
• telephone surveys,
• interviews,
• questionnaires.

23
Source-Primary
• Advantages of primary data
• Collection based on researcher's need
• Control over measurement selection and execution
• Timeliness of the data can be controlled
• Representativeness of the data can be ensured

24
Source-Primary
• Advantages of primary data
• Type of information desired can be directly
determined by the design of the questions
• Collected to fit the specific purpose
• Data are current
• Secrecy can be maintained

25
Source-Primary
• Disadvantages of primary data
• Expensive
• Time-consuming
• Quality declines if interviews are lengthy
• Reluctance to participate in lengthy interviews

26
Source-Secondary
• Secondary data is information that has already
been collected and is available to the public.
• Examples
• population statistics from the Ghana Statistical
Service (GSS) Census Office,
• economic indicators from the GSS,
• trading data from the Ghana Stock Exchange (GSE),
• information in government documents,
• industry and trade journals.

27
Source-Secondary
• Data contained in published accounts of
organisations.
• Many businesses and organisations also collect
information about their customers or clients
(such as where they live), and this is also
considered secondary data.

28
Types of Secondary Data

29
Source-Secondary
• Advantages of secondary data
• Little cost or time required to access data
(inexpensive)
• Not confined to immediate level or unit of
analysis
• Available more quickly
• Several sources are available

30
Source-Secondary
• Advantages of secondary data
• Saves time and money if on target
• Aids in determining direction for primary data
collection
• Pinpoints the kinds of people to approach
• Serves as a basis of comparison for other data

31
Source-Secondary
• Disadvantages of secondary data
• Information may be outdated
• May not be suitable
• Methodology for collection may be inappropriate.
• May not be on target with the research problem
• Quality and accuracy of data may pose a problem

32
Evaluating Secondary Data
• Overall suitability
• Precise suitability
• Costs and benefits

33
Evaluating Secondary Data
• Overall suitability
• Does the data set contain the information you
require to answer your research question(s)?
• Do the measures used match those you require?
• Is the data set a proxy for the data you really
need?
• Does the data set cover the population which is
the subject of your research?

34
Evaluating Secondary Data
• Overall suitability
• Can data about the population which is the
subject of your research be separated from
unwanted data?
• Are the data sufficiently up to date?
• Are data available for all the variables you
require to answer your research question(s)?

35
Evaluating Secondary Data
• Precise suitability
• How reliable is the data set you are thinking of
using?
• How credible is the data source?
• Is the methodology clearly described?
• If sampling was used what was the procedure and
what were the associated sampling errors and
response rates?

36
Evaluating Secondary Data
• Precise suitability
• Who was responsible for collecting or recording
the data?
• (For surveys) is a copy of the questionnaire or
interview checklist included?
• (For compiled data) are you clear how the data
were analysed and compiled?

37
Evaluating Secondary Data
• Precise suitability
• Are the data likely to contain measurement bias?
• What was the original purpose for which the data
were collected?
• Who was the target audience and what was their
relationship to the data collector or compiler?

38
Evaluating Secondary Data
• Precise suitability
• Have there been any documented changes in the way
the data are measured or recorded, including
definition changes?
• How consistent are the data obtained from this
source when compared with data from other
sources?

39
Evaluating Secondary Data
• Costs and benefits
• Are you happy that the data have been recorded
accurately?
• What are the financial and time costs of
obtaining these data?
• Have the data already been entered into a
computer?
• Do the overall benefits of using this secondary
data source outweigh the associated costs?

40
Methods of Data Collection
• Census
• Survey

41
Sample Selection
• Population
• Sample frame
• Sample size
• Sampling error
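
The four terms above can be tied together in a small sketch. The population values, the seed and all variable names are assumptions made for this example, and it takes the sample frame to be the list of units a sample can be drawn from, with the sampling error being the gap between the sample estimate and the population value.

```python
import random
from statistics import mean

# Hypothetical population: one measurement for each of 100 units.
population = {unit_id: float(unit_id) for unit_id in range(1, 101)}

# Sample frame: the list of units we can actually select from (here, every unit).
sample_frame = sorted(population)

sample_size = 10
rng = random.Random(2007)  # fixed seed so the run is repeatable
sample_ids = rng.sample(sample_frame, sample_size)  # drawn without replacement

population_mean = mean(population.values())
sample_mean = mean(population[i] for i in sample_ids)

# Sampling error: how far the sample estimate falls from the population value.
sampling_error = sample_mean - population_mean

print("population mean:", population_mean)
print("sample mean:", sample_mean)
print("sampling error:", sampling_error)
```

In practice the population mean is unknown, so the sampling error can only be estimated; it is computed directly here because the whole population is made up.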

42
Principles of Sampling
• Probability
• Non-Probability

43
Methods of Sampling
• Random Sampling
• Purposive
• Stratified sampling
• Systematic sampling
• Multi-stage and multi-phase
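
Three of the probability methods listed above can be sketched with Python's standard `random` module. The function names, the made-up frame of 20 firms and the sector labels are assumptions for this example; purposive and multi-stage sampling are left out because they depend on judgement or on a nested design.

```python
import random


def simple_random_sample(frame, n, rng):
    """Every unit has the same chance of selection; drawn without replacement."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Random start, then every k-th unit, where k = len(frame) // n."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]


def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Split the frame into strata, then draw a simple random sample in each."""
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}


# Hypothetical sample frame: 20 firms, alternating between two sectors.
frame = [(f"firm{i:02d}", "finance" if i % 2 else "mining") for i in range(20)]
rng = random.Random(3)  # fixed seed so the run is repeatable

print(simple_random_sample(frame, 5, rng))
print(systematic_sample(frame, 5, rng))
print(stratified_sample(frame, lambda unit: unit[1], 3, rng))
```

Stratifying by sector guarantees that both sectors appear in the sample, which a simple random draw of the same size does not.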