Sound Music MIDI in Multimedia - PowerPoint PPT Presentation


Computer Representation of Sound, speech and MIDI – PowerPoint PPT presentation

Updated: 12 October 2015
Slides: 42
Provided by: bimray8729



Title: Sound Music MIDI in Multimedia

  • Mr. Bimal Kumar Ray
  • Dept. of Information Science and Telecommunication
  • Ravenshaw University

What is Sound/Audio
  • The perception of sound by human beings is a very
    complex process.
  • The detector which receives and interprets the
  • Sound is the combination of both high and low
    pressure which is propagated through the air
    medium in the form of wave.
  • Sound is a physical phenomenon produced by the
    vibration of matter and transmitted as waves.
  • Sound can be periodic or non-periodic.
  • Sound is a mechanical wave, that is, an
    oscillation of pressure transmitted through a
    solid, liquid, or gas, composed of frequencies
    within the range of hearing.
  • To create sound, your computer feeds an
    electrical signal at a certain frequency to the
    speaker.
  • Every sound is composed of waves of many
    different frequencies and shapes. The simplest
    sound we can hear is a sine wave.

What is Sound/Audio
  • Sound waves can be characterised by the
    following:
  • Period, pitch, volume, frequency, amplitude,
    bandwidth, sampling, loudness and dynamic range.
  • Period: the interval at which a periodic signal
    repeats regularly.
  • Pitch: a perception of sound by human beings. It
    measures how high the sound is as perceived by a
    listener.
  • Volume: the height of each peak in the sound
    wave.
  • Frequency (sometimes referred to as pitch): the
    number of wave cycles per second, determined by
    the distance between the peaks. The greater the
    distance, the lower the sound.

What is Sound/Audio
  • Loudness: an important perceptual quality of
    sound, also called volume.
  • Amplitude is the measure of sound levels. For a
    digital sound, amplitude is the sample value.
  • The reason that sounds have different loudness is
    that they carry different amounts of power. The
    unit of power is the watt.
  • Sound level is measured in the unit bel or, more
    commonly, decibel (dB). Examples:
  • 160 dB Jet engine
  • 130 dB Large orchestra
  • 100 dB Car/bus on highway
  • 70 dB Voice conversation
  • 50 dB Quiet residential areas
  • 30 dB Very soft whisper
  • 20 dB Sound studio
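
As a rough illustration of the decibel scale above: a level in dB is ten times the base-10 logarithm of a ratio of two powers. The sketch below is not from the slides; the function names and the choice of reference power are assumptions made for the example.

```python
import math


def power_ratio_to_db(power, reference_power):
    """Return the level of `power` relative to `reference_power`, in dB."""
    return 10.0 * math.log10(power / reference_power)


def db_to_power_ratio(level_db):
    """Return how many times more power a level of `level_db` dB carries."""
    return 10.0 ** (level_db / 10.0)


# A sound carrying 100 times the reference power is 20 dB above it.
print(power_ratio_to_db(100.0, 1.0))

# The 100 dB dynamic range of an orchestra is a power ratio of 10**10.
print(db_to_power_ratio(100.0))
```

Because the scale is logarithmic, every additional 10 dB corresponds to ten times more power.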

What is Sound/Audio
  • To include sound in a multimedia application, the
    sound waves must be converted from analog to
    digital form
  • This conversion is called sampling: every
    fraction of a second, a sample of the sound is
    recorded in digital bits
  • Two factors affect the quality of digitized sound
  • Sample rate: the number of times per second the
    sample is taken
  • Most common sampling rates are 11.025, 22.05,
    and 44.1 kHz
  • Sample size: the amount of information stored
    about the sample
  • Most common sample sizes are 8 and 16 bit

What is Sound/Audio
  • Dynamic range means the change in sound levels.
  • For example, a large orchestra can reach 130dB at
    its climax and drop to as low as 30dB at its
    softest, giving a range of 100dB.
  • Bandwidth is the range of frequencies a device
    can produce, or a human can hear
  • FM radio: 50 Hz to 15 kHz
  • Children's ears: 20 Hz to 20 kHz
  • Older ears: 50 Hz to 10 kHz
  • Ultrasound: above 20 kHz
  • Hypersound: above 1 GHz

Computer Representation of Sound
  • Sound waves are continuous while computers are
    good at handling discrete numbers.
  • In order to store a sound wave in a computer,
    samples of the wave are taken.
  • Each sample is represented by a number, the
  • This process is known as digitisation.
  • This method of digitising sound is known as pulse
    code modulation (PCM).
  • One of the most popular sampling rates for high
    quality sound is 44100 Hz.
  • Another aspect we need to consider is the
    resolution, i.e., the number of bits used to
    represent a sample.
  • 16 bits are used for each sample in high quality
    sound.
  • Different sound cards have different capabilities
    for processing digital sounds.
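
The digitisation steps above can be sketched in a few lines. This is a minimal illustration, not the presenter's code: it samples a pure sine tone and rounds each sample to a signed integer of the chosen resolution. The function name, the 440 Hz test tone and the signed-integer convention are all assumptions made for the example.

```python
import math


def sample_sine_pcm(frequency_hz, sample_rate_hz, duration_s, bits=16):
    """Sample a sine wave and quantise each sample to a signed integer."""
    # Largest positive value for the resolution: 32767 for 16 bits.
    max_amplitude = 2 ** (bits - 1) - 1
    n_samples = round(sample_rate_hz * duration_s)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                      # time of the n-th sample
        value = math.sin(2 * math.pi * frequency_hz * t)
        samples.append(round(value * max_amplitude))
    return samples


# 10 ms of a 440 Hz tone at 44100 Hz, 16 bits per sample.
samples = sample_sine_pcm(440, 44100, 0.01, bits=16)
print(len(samples))        # number of samples taken
print(samples[:5])         # first few quantised values
```

Lowering `bits` to 8 keeps the same number of samples but stores each one more coarsely, which is the resolution trade-off described above.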

Computer Representation of Sound
  • Recording and Digitising sound
  • An analogue-to-digital converter (ADC) converts
    the analogue sound signal into digital samples.
  • A digital signal processor (DSP) processes the
    sample, e.g. filtering, modulation, compression,
    and so on.
  • Playing back sound
  • A digital signal processor processes the sample,
    e.g. decompression, demodulation, and so on.
  • A digital-to-analogue converter (DAC) converts
    the digital samples into sound signal.

Quality vs File Size
  • The size of a digital recording depends on the
    sampling rate, resolution and number of channels.
  • S = R x (b/8) x C x D
  • S: file size, in bytes
  • R: sampling rate, in samples per second
  • b: resolution, in bits
  • C: channels (1 = mono, 2 = stereo)
  • D: recording duration, in seconds
  • Higher sampling rate and higher resolution give
    higher quality but bigger file size.

Quality vs File Size
  • For example, if we record 10 seconds of stereo
    music at 44.1kHz, 16 bits, the size will be
  • S = 44100 x (16/8) x 2 x 10
  • = 1,764,000 bytes
  • = 1722.7 Kbytes
  • = 1.68 Mbytes
  • Note: 1 Kbyte = 1024 bytes
  • 1 Mbyte = 1024 Kbytes
  • High quality sound files are very big, however,
    the file size can be reduced by compression.
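
The worked example above can be checked with a short script. This is a sketch only; the function name is hypothetical, and it computes the size of the uncompressed sample data without any file header.

```python
def pcm_size_bytes(sample_rate_hz, bits_per_sample, channels, duration_s):
    """Size of uncompressed audio data: S = R x (b/8) x C x D."""
    return sample_rate_hz * bits_per_sample * channels * duration_s // 8


# 10 seconds of stereo music at 44.1 kHz, 16 bits.
size = pcm_size_bytes(44100, 16, 2, 10)
print(size)                          # bytes
print(round(size / 1024, 1))         # Kbytes (1 Kbyte = 1024 bytes)
print(round(size / 1024 / 1024, 2))  # Mbytes (1 Mbyte = 1024 Kbytes)
```

Halving the sampling rate, using 8-bit samples, or recording in mono each halves the result.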

Audio File Formats
  • The most commonly used digital sound format in
    Windows systems is .wav files.
  • Sound is stored in .wav as digital samples known
    as Pulse Code Modulation
  • Each .wav file has a header containing
    information of the file.
  • type of format, e.g., PCM or other modulations
  • size of the data
  • number of channels
  • samples per second
  • bytes per sample
  • There is usually no compression in .wav files.
  • Other formats may use different compression
    techniques to reduce file size.
  • .vox uses Adaptive Differential Pulse Code
    Modulation (ADPCM)
  • .mp3: MPEG-1 Audio Layer 3.
  • RealAudio file is a proprietary format. (.ra
    .ram .rm)
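
The header fields listed above can be inspected with Python's standard `wave` module. The sketch below is an illustration, not part of the slides: it writes a tiny silent stereo file into an in-memory buffer (an assumption made so the example needs no file on disk) and then reads the header back.

```python
import io
import wave

# Write a short, silent 16-bit stereo recording into memory.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(2)        # number of channels
    writer.setsampwidth(2)        # bytes per sample (16 bits)
    writer.setframerate(44100)    # samples per second
    writer.writeframes(b"\x00\x00" * 2 * 441)  # 441 stereo frames

# Read the header information back.
buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    header = {
        "channels": reader.getnchannels(),
        "bytes_per_sample": reader.getsampwidth(),
        "samples_per_second": reader.getframerate(),
        "frames": reader.getnframes(),
        "compression": reader.getcomptype(),
    }

print(header)
```

The compression type reported for a plain PCM file is `NONE`, which matches the note that .wav files usually carry no compression.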

Audio File Formats
  • WMA: Windows Media Audio (.wma)
  • Windows Media Audio is a Microsoft file format
    for encoding digital audio files, similar to MP3
    though it can compress files at a higher rate
    than MP3.
  • MOV (movie): basically a video format where the
    pictures are omitted.
  • RIFF: Resource Interchange File Format
  • a Microsoft-developed format capable of
    handling digital audio and MIDI.
  • SDMI (Secure Digital Music Initiative)
  • Designed to protect against most
    forms of unauthorised copying
  • SND (sound): limited to 8 bits, with
    interpreters for the PC available.
  • Ogg (.ogg): Ogg is a container format for
    compressed audio, comparable to other formats
    used to store and play digital music. It
    typically carries Vorbis, an audio compression
    scheme designed to be contained in Ogg.

Audio File Formats
  • AIFF: Audio Interchange File Format is mostly
    used by Silicon Graphics. AIFF files are easily
    converted to other file formats, but can be quite
    large. One minute of 16-bit stereo audio sampled
    at 44.1 kHz usually takes up about 10 megabytes.
  • Dolby Digital Surround Sound: also known as AC3
    (Audio Coding), or Dolby 5.1 (where .1 indicates
    the subwoofer bass channel). Dolby Digital has
    been chosen as the standard sound technology for
    DVD (digital video disk) and HDTV
    (high-definition television).
  • Dolby Digital Surround Sound digital track on
    film: a digitally encoded system of 6 separate
    and independent surround sound channels for 6
    speakers: front (left/right), rear (left/right),
    front center and subwoofer.
  • MIDI: Musical Instrument Digital Interface
  • A MIDI representation of a sound includes
    values for the note's pitch, length, and volume.
  • It can also include additional
    characteristics, such as attack and delay time.

  • Music can be described in a symbolic way.
  • Music is the art of arranging tones in an orderly
    sequence so that they produce a sound.
  • Music is an art form whose medium is sound and
  • The common elements of music are pitch, notes,
    scales, tempo, etc.
  • Any sound may be represented in that way, including

Musical Instrument Digital Interface
  • The MIDI interface between electronic musical
    instruments and computers is a small piece of
    equipment that plugs directly into the computer's
    serial port and allows the transmission of music
  • MIDI allows a set of different musical
    instruments to exchange musical information.
  • The MIDI protocol is an entire music description
    language in binary form.

Musical Instrument Digital Interface
  • In MIDI, each word describing an action of a
    musical performance is assigned a specific
    binary code.
  • MIDI data is communicated digitally through a
    production system as a string of MIDI messages.
  • MIDI is a standard control language and hardware
  • MIDI allows electronic musical instruments and
    devices to communicate real-time and
    non-real-time performance and control data.

Musical Instrument Digital Interface
  • The MIDI interface has 2 different components:
  • 1. Hardware to connect the equipment
  • Defines the physical connection of musical
    instruments.
  • Specifies the MIDI cable and processes the
    electrical signals received over the cable.
  • 2. Data format
  • Encodes the information to be processed by the
    hardware.
  • The MIDI data format does not include the
    encoding of individual sampling values, as audio
    formats do.

MIDI Connection
  • A computer can control output of individual MIDI
  • A MIDI device communicates with other MIDI
    devices over channels.
  • Musical data transmitted over a channel are
    reproduced in the synthesizer at the receiving
    end.
  • Synthesizer: an electronic musical instrument,
    typically operated by a keyboard, producing
    sounds by generating and combining signals of
    different frequencies.
  • The computer can use the same interface to
    receive, store, and process encoded musical data.
  • A computer uses the MIDI interface to control
    instruments for playout.

  • The microprocessor communicates with the keyboard
    to know what notes the musician is playing and
    with the control panel to know what commands the
    musician wants to send to the microprocessor.
  • Pressing keys on the keyboard signals the
    microprocessor what notes to play and how long to
    play them.
  • The sound generator produces an audio signal.
  • The sound generator changes the quality of the
    sound, for example pitch, loudness, notes and
    tone.

  • Sequencer
  • replays a sequence of MIDI messages
  • MIDI Interface
  • connects a group of MIDI devices together
  • Sound Sampler
  • records sound, then replays it on request
  • Can perform transposition (shifting) of one base
    sample to produce different pitches
  • Can take the average of several samples, then
    produce a unique-quality interpolated output
    sound.
  • Control Panel
  • controls all the MIDI devices
  • Memory: stores all information for sound

  • Keyboard (MIDI I/O)
  • i. Note Polyphony
  • Nowadays, most keyboards have polyphony
  • ii. Touch response
  • A keyboard can sense different levels of input
  • Keyboard synthesizer: keyboard plus synthesizer
  • has real-time audio output
  • Some keyboard synthesizers support DSP
    (Digital Signal Processing),
    which makes more effects available, e.g. echo
    and chorus
  • you can then compose and make music
    with just a keyboard
  • Guitar, Flute, Violin, Drumset

  • Controllers
  • Numbered controllers
  • e.g. volume panel
  • Continuous Controllers
  • You can roll the controller to get a particular
  • e.g. modulation wheel
  • On/Off Controllers
  • can send two different values (e.g. 0/127)
  • e.g. foot pedal (sustain pedal)
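
As a sketch of how such controllers appear on the wire, a controller change is sent as a three-byte Control Change message: a status byte, the controller number and the value. The controller numbers used below (1 for the modulation wheel, 64 for the sustain pedal) come from the standard MIDI controller assignments, not from these slides, and the function name is hypothetical.

```python
MODULATION_WHEEL = 1   # continuous controller, values 0..127
SUSTAIN_PEDAL = 64     # on/off controller, sent as 0 or 127


def control_change(channel, controller, value):
    """Build a 3-byte Control Change message for a channel (0..15)."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")
    if not 0 <= controller <= 127 or not 0 <= value <= 127:
        raise ValueError("controller and value must be in 0..127")
    # 0xB0 is the Control Change message type; the low 4 bits carry the channel.
    return bytes([0xB0 | channel, controller, value])


pedal_down = control_change(0, SUSTAIN_PEDAL, 127)
pedal_up = control_change(0, SUSTAIN_PEDAL, 0)
wheel_half = control_change(0, MODULATION_WHEEL, 64)
print(pedal_down.hex(), pedal_up.hex(), wheel_half.hex())
```

An on/off controller only ever sends the two extreme values, while a continuous controller can send any value in between.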

  • MIDI uses a specific data format for each
  • MIDI data format is digital and data are group of
  • The message is transmitted to the systems
    connected to the computer.
  • When a musician plays a key, the MIDI interface
    generates a MIDI message that defines the start
    of each strike and its intensity.
  • When the musician releases the key, another
    message is created and transmitted.
  • Messages are assigned to channels.
  • - a channel is a separate path through which
    signals can flow.
  • Devices are set to respond to particular channels
  • Every message (except system messages) has a
    channel number, which is stored in bits 0..3 of
    the status byte
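
The key press and release described above correspond to Note On and Note Off messages. The sketch below builds both as raw bytes; the helper names are hypothetical, and the note number 60 (middle C) and velocity 100 are example values chosen for illustration.

```python
def note_on(channel, note, velocity):
    """Key pressed: status 0x9n, then note number and strike intensity."""
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])


def note_off(channel, note, velocity=0):
    """Key released: status 0x8n, then note number and release velocity."""
    return bytes([0x80 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])


# Middle C struck on channel 0 with intensity 100, then released.
press = note_on(0, 60, 100)
release = note_off(0, 60)
print(press.hex(), release.hex())

# The channel number sits in bits 0..3 of the status byte.
print(note_on(5, 60, 100)[0] & 0x0F)
```

Only the events are sent, not the sound itself, which is why a MIDI recording is far smaller than the sampled audio of the same performance.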

  • 1. MIDI channel messages have 4 modes
  • Mode 1: Omni On, Poly; usually for testing
  • Mode 2: Omni On, Mono; has little purpose
  • Mode 3: Omni Off, Poly; for general purpose
  • Mode 4: Omni Off, Mono; for general purpose
  • where
  • i. Omni On/Off
  • whether to respond to all messages regardless of
    their channel
  • ii. Poly/Mono
  • respond to multiple/single notes per channel
  • 2. Channel Voice Messages
  • Carry the musical component of a piece. A message
    usually has 2 types of byte
  • i. status byte
  • the 4 most significant bits identify the
    message type,
  • the 4 least significant bits identify which
    channel is to be affected
  • ii. data byte
  • the most significant bit is 0,
    indicating a data byte.
  • The rest are data bits
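
The byte layout above can be decoded with simple bit operations. This is a minimal sketch with hypothetical function names; the table of message type names follows the standard MIDI channel voice messages rather than anything listed on the slides.

```python
MESSAGE_TYPES = {
    0x8: "Note Off",
    0x9: "Note On",
    0xA: "Polyphonic Key Pressure",
    0xB: "Control Change",
    0xC: "Program Change",
    0xD: "Channel Pressure",
    0xE: "Pitch Bend",
}


def is_status_byte(byte):
    """A status byte has its most significant bit set; a data byte does not."""
    return (byte & 0x80) != 0


def split_status(byte):
    """Return (message type, channel) from a channel message status byte."""
    message_type = (byte >> 4) & 0x0F   # 4 most significant bits
    channel = byte & 0x0F               # 4 least significant bits
    return message_type, channel


message_type, channel = split_status(0x93)
print(MESSAGE_TYPES[message_type], "on channel", channel)
print(is_status_byte(0x93), is_status_byte(0x3C))
```

Because data bytes always have the top bit clear, each one carries 7 usable bits, giving values from 0 to 127.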

  • Real-time System Messages
  • Start
  • 1st byte: status byte 11111010
  • directs slave devices to start playback from time
    0
  • Stop
  • 1st byte: status byte 11111100
  • directs slave devices to stop playback
  • song position value doesn't change,
    so playback can be resumed at the place where it
    stopped with the Continue message
  • Continue
  • 1st byte: status byte 11111011
  • directs slave devices to start playback from the
    present song position value

  • System Reset
  • 1st byte: status byte 11111111
  • devices will return the control values to default
  • e.g. reset MIDI mode / program number assigned to
  • System Exclusive messages
  • The MIDI specification cannot address every
    unique need of each MIDI device
  • so it leaves room for device-specific data
  • SysEx messages are unique to a specific
  • 1st byte: status byte 11110000
  • 2nd byte: manufacturer ID,
  • e.g. 1 = Sequential, 67 = Yamaha
  • 3rd byte (onwards): data byte(s)
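
The status bytes listed for the system messages can be collected in a small lookup table. The sketch below uses hypothetical helper names. One detail is not on the slides: a System Exclusive message is closed with the end-of-exclusive byte 11110111 (0xF7), which is taken here from the MIDI specification.

```python
SYSTEM_MESSAGES = {
    0b11111010: "Start",
    0b11111011: "Continue",
    0b11111100: "Stop",
    0b11111111: "System Reset",
    0b11110000: "System Exclusive",
}

YAMAHA_ID = 67  # manufacturer ID quoted on the slide


def build_sysex(manufacturer_id, data):
    """Build a SysEx message: 0xF0, manufacturer ID, data bytes, 0xF7."""
    if any(not 0 <= b <= 127 for b in [manufacturer_id, *data]):
        raise ValueError("ID and data bytes must be in 0..127")
    return bytes([0xF0, manufacturer_id, *data, 0xF7])


print(SYSTEM_MESSAGES[0xFA])                       # name for status byte 0xFA
print(build_sysex(YAMAHA_ID, [1, 2, 3]).hex())     # example payload bytes
```

The payload bytes `[1, 2, 3]` are placeholders; the meaning of real SysEx data is defined by each manufacturer.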

  • Music Recording and Performance Application.
  • Records MIDI messages as they enter the
    computer from other MIDI devices, then stores,
    edits and plays back the messages in performance.

This is a daisy-chain network, where devices are
connected serially.
  • Recording software
  • e.g. Sony Sound Forge, Sonar, Cool Edit Pro, etc.
  • Much more efficient than using tape recording
  • Can redo recording process
  • Can easily do editing
  • Also allows effects (reverb, echo, chorus etc)

  • Musical Notations and Printing Application
  • Writing music using traditional musical notation.
  • The user can then play back the music using a
    performance program or print the music on paper
    for live performance or publication.
  • Music Education Application
  • Synthesizer Patch Editor and Librarians
  • Storage of information about different
    synthesizer patches in the computer's memory and
    editing of patches on the computer.

  • 1.Studio Production
  • recording, playback, and editing
  • creative control/effect can be added
  • 2. Making score
  • with score editing software, MIDI is excellent in
    making score
  • some MIDI software provide function of auto
  • 3. Learning
  • You can write a MIDI orchestra, which is always
    available to practice with you
  • 4. Commercial products
  • mobile phone ring tones, music box music..
  • 5. Musical Analysis
  • MIDI has detailed parameters for every input note
  • It is useful for doing research
  • For example, a pianist can input his performance
    with a MIDI keyboard, then we can analyze his
    performance style by the parameters

Introduction in Speech
  • The expression of the ability to express
    thoughts and feelings by articulate (fluent)
  • Speech is our basic communication tool.
  • Speech: the power of speaking; oral
    communication.
  • We have been hoping to be able to communicate
    with machines using speech.
  • Speech output deals with the machine generation
    of speech.
  • Voiced speech signals have an almost periodic
    structure over a certain time interval.
  • The spectrum of some sounds has characteristic
    maxima that normally involve up to five

Speech Generation
  • Speech generation is a very interesting field for
    multimedia systems.
  • Speech recognition is the foundation of
    human-computer interaction using speech.
  • Speech generation is real-time signal generation.
  • Speech must be understandable and sound natural.
  • Speech recognition in different contexts
  • Dependent or independent on the speaker.
  • Discrete (individual) words or continuous speech.
  • Small vocabulary or large vocabulary.
  • In quiet environment or noisy environment.

Digital Speech
Speech Generation
  • A major challenge in speech output is how to
    generate these signals in real time for a speech
    output system to be able, for instance, to
    convert text to speech automatically.
  • The most important technical terms used in
    relation to speech output include the speech
    basic frequency, which means the lowest periodic
    signal component in the speech signal.
  • A voiced sound is generated by oscillations of
    the vocal cords. The characters M, W, and L are
    examples.
  • Unvoiced sounds are generated with the vocal
    cords open, for example, F and S.

[Figure labels: parameter analyzer, reference patterns, language model, comparison and decision algorithm]
Voiced and Unvoiced Speech
Speech Synthesis
  • Speech synthesis generates speech with the
    desired properties (pitch, speed, loudness etc.)
  • Speech synthesis has been widely used for
    text-to-speech systems and different telephone
  • The easiest and most often used speech synthesis
    method is waveform concatenation.

Increase the pitch without changing the speed
Speech Analysis
  • The primary quality characteristic of each speech
    recognition session is the probability of
    recognizing a word correctly. A word is always
    recognized only with a certain probability.
  • Speech analysis can serve to analyze who is
    speaking, that is, to recognize a speaker for
    identification and verification.
  • The computer identifies and verifies fingerprint,

Speech Transmission
  • Speech processing and speech transmission
    technology are expanding fields of active
  • Speech transmission is a field relating to highly
    efficient encoding of speech signals to enable
    low-rate data transmission over network.
  • New challenges arise from the anywhere, anytime
    paradigm of mobile communications and from
    Internet-based transmission protocols, such as
    Voice over IP.
  • "Advances in Digital Speech Transmission" provides
    an up-to-date overview of the field, including
    topics such as speech coding in heterogeneous
    communication networks, wideband coding, and the
    quality assessment of wideband speech.