IT Essentials II Network Operating Systems - PowerPoint PPT Presentation


Slides: 25
Provided by: richar509



IT Essentials II Network Operating Systems
  • Chapter 13
  • Troubleshooting the Operating System

Identifying Problems
  • Most problems can be assigned to one of the
    following causes:
  • Hardware: a component has malfunctioned, or is
    expected but not present.
  • Kernel: a bug or a lack of functionality (e.g. a
    module not loaded into the system kernel)
    sometimes causes problems of ambiguous origin.
  • Application software: user-level application
    software or command utilities may behave
    strangely, or simply crash.
  • Configuration: system services or application
    software may be misconfigured.
  • User error: one of the most frequent sources of
    error conditions is users attempting to do
    something the wrong way.
  • Any of the above can be further categorized as:
  • Consistent: a problem that occurs reliably and
    demonstrably, again and again.
  • Inconsistent: a problem that occurs only
    sporadically, or under indeterminate conditions.

Identifying Application Problems
  • Common signs of application bugs are:
  • Failure to execute:
  • The program won't start at all; the main file
    might not have execute permission.
  • Or it seems to start but fails to initialize
    entirely, and either exits or stalls partway.
  • Program crashes: data is not saved.
  • Sometimes error messages are recorded.
  • Sometimes a core file is left behind, indicating
    the application itself suffered a catastrophic
    failure.
  • A variant of this is a locked-up program: the
    application is left running but unable to proceed.
  • Resource exhaustion:
  • Refers primarily to CPU time, memory, and disk
    space.
  • An application may consume too much memory and
    ultimately begin to swap so badly that the whole
    system is affected.
  • Program-specific misbehaviour:
  • Has to do with the running program itself.
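
Two of the signs above (a missing execute permission and disk exhaustion) can be checked from a script. The following is a minimal sketch, not part of the original slides; the function names check_program and disk_usage_percent are hypothetical and chosen only for illustration.

```python
import os
import shutil


def check_program(path):
    """Return a list of findings explaining why a program may fail to start."""
    findings = []
    if not os.path.exists(path):
        findings.append("file does not exist")
    elif not os.access(path, os.X_OK):
        # The main file lacks execute permission for the current user.
        findings.append("file is not executable by the current user")
    return findings


def disk_usage_percent(mount_point="/"):
    """Return the percentage of disk space in use on the given file system."""
    usage = shutil.disk_usage(mount_point)
    return 100.0 * usage.used / usage.total
```

An empty list from check_program means neither of these two causes applies; it does not prove the program is healthy.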

Configuration Problems
  • Can present themselves in many ways, e.g. poor
    screen resolution when running a high-end monitor
    and graphics card; the Xconfigurator program may
    need to be run.
  • Programs that depend on networking services are
    particularly liable to cause problems.
  • The first place to look is in the configuration
    file.
  • If a configuration problem only happens to one
    person in a group, it is likely to be caused by
    something that person did.

System Tools and Utilities
  • Utilities return information about how the system
    or a file should be configured, but they do not
    indicate which exact file or system configuration
    is at fault.
  • setserial: utility that provides information
    about, and sets options for, the serial ports on
    the system.
  • lpq: command that helps resolve printing
    problems; it displays all the jobs that are
    waiting to be printed.
  • ifconfig: entered at the shell to return the
    current network interface configuration of the
    system.
  • route: displays or sets the routing information
    of the system.
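
As a rough illustration, the output of these utilities can be gathered in one pass. This is a sketch under assumptions: the helper names are hypothetical, the "-n" argument to route assumes the Linux net-tools version, and setserial is left out because it needs a device argument. Utilities that are not installed are skipped rather than treated as errors.

```python
import shutil
import subprocess


def run_if_available(command, *args):
    """Run a utility and return its output, or None if it is missing or hangs."""
    if shutil.which(command) is None:
        return None
    try:
        result = subprocess.run(
            [command, *args], capture_output=True, text=True, timeout=10
        )
    except subprocess.TimeoutExpired:
        return None
    return result.stdout


def collect_diagnostics():
    """Gather output from the utilities named in the slide, where present."""
    return {
        "ifconfig": run_if_available("ifconfig"),
        "route": run_if_available("route", "-n"),  # -n: skip name lookups
        "lpq": run_if_available("lpq"),
    }
```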

Fixing Persistent Problems and Log Files
  • Most log files are located in the /var/log
    directory or a subdirectory of it.
  • Log files can be used for:
  • Monitoring system loads: servers need to handle
    requests efficiently. Log files can be used to
    determine which requests might be causing the
    server to run slowly.
  • Intrusion attempts and detection: examination of
    system log files can help in finding out how and
    where the intrusion occurred, as well as what
    changes the attacker made to the system.
  • Normal system functioning: log files can be
    examined to ensure the system is functioning
    normally. If something is wrong, the information
    in the log files can help identify and eliminate
    the problem.
  • Missing entries: log files with missing entries
    can indicate that something on the server is not
    functioning properly or is misconfigured.
  • Error messages: many log files contain error
    messages that can be used to locate and identify
    problems or misconfiguration within the server.
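
A log file can be scanned for error messages with a few lines of code. The sketch below is illustrative only: the keyword pattern is an assumption, and the log file to read (for example one under /var/log) depends on the distribution.

```python
import re

# Assumed keywords; real logs may need a different pattern.
ERROR_PATTERN = re.compile(r"\b(error|fail(ed|ure)?|denied)\b", re.IGNORECASE)


def find_error_lines(log_path):
    """Return (line_number, text) pairs for lines that look like errors."""
    matches = []
    with open(log_path, "r", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if ERROR_PATTERN.search(line):
                matches.append((number, line.rstrip("\n")))
    return matches
```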

Ftab and Lilo Boot Errors
  • The dmesg command displays the recent kernel
    messages, also known as the kernel ring buffer.
    Variants of this command can be used to find
    details about drivers, for example.
  • The LILO boot loader is the first piece of code
    that takes control of the boot process from the
    BIOS. It loads the Linux kernel and then passes
    control entirely to it. If LILO is not working
    properly, the system won't boot. Some of the LILO
    error codes are:
  • No code displayed: LILO hasn't loaded.
  • L error-code: LILO has started to boot but is
    unable to load the second-stage boot loader
    (error-code is a two-digit number generated by
    the BIOS).
  • LI: LILO has started and the first- and
    second-stage loaders have been loaded, but the
    second-stage loader won't run.
  • LI101010: LILO has been loaded and is running
    properly, but it cannot locate the kernel image.
  • LIL: the first- and second-stage loaders were
    loaded successfully and are running, but LILO is
    unable to read the information it needs to work.
  • LIL?: the second-stage boot loader has been
    loaded, but at an incorrect address.
  • LILO: LILO has loaded and is running, so LILO
    itself is not what is preventing the system from
    booting.
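
The error codes above amount to a lookup table, which can be sketched as follows. The meanings are taken from the slide's own descriptions; the function name and the assumption that a BIOS error code follows the letter "L" after a space are illustrative.

```python
# Meanings as described in the slide, keyed by the letters shown on screen.
LILO_PROMPT_MEANINGS = {
    "": "LILO has not loaded at all.",
    "L": "First stage started, but the second-stage loader could not be loaded.",
    "LI": "Second-stage loader was loaded but will not run.",
    "LIL": "Both stages are running, but LILO cannot read the information it needs.",
    "LIL?": "Second-stage loader was loaded at an incorrect address.",
    "LILO": "LILO loaded and is running; the boot problem lies elsewhere.",
}


def explain_lilo_prompt(screen_text):
    """Map what is visible at the LILO prompt to a likely meaning."""
    text = screen_text.strip()
    if text.startswith("LI1010"):
        return "LILO is running but cannot locate the kernel image."
    # Assumes any BIOS error code is separated from the letters by a space.
    token = text.split()[0] if text else ""
    return LILO_PROMPT_MEANINGS.get(token, "Unrecognized LILO prompt.")
```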

Booting Without LILO
  • When LILO fails completely, the following can be
    used:
  • LOADLIN: a DOS utility.
  • It comes with the installation CDs and is located
    in the dosutils directory.
  • To use it, you need a DOS partition or a DOS boot
    disk, a copy of LOADLIN.EXE, and a copy of the
    Linux kernel.
  • Boot from a raw kernel on a floppy:
  • The kernel is copied to a floppy disk using the
    dd if=vmlinuz of=/dev/fd0 command, where vmlinuz
    is the name of the kernel.
  • LILO on a floppy:
  • The most preferred method, as it is the fastest.
  • To install LILO on a floppy, edit the lilo.conf
    file, changing the boot line to boot=/dev/fd0.
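
The lilo.conf edit described in the last bullet can be sketched as a small text transformation. This is only an illustration: set_boot_device is a hypothetical helper that works on the file's text and leaves every other line untouched. After changing the real file, LILO has to be reinstalled by running the lilo command for the change to take effect.

```python
def set_boot_device(conf_text, device="/dev/fd0"):
    """Return lilo.conf text with its boot= line pointing at the given device."""
    lines = []
    replaced = False
    for line in conf_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key == "boot" and "=" in line and not replaced:
            lines.append(f"boot={device}")
            replaced = True
        else:
            lines.append(line)
    if not replaced:
        # No boot line was present, so add one at the top.
        lines.insert(0, f"boot={device}")
    return "\n".join(lines) + "\n"
```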

Emergency Boot System
  • Linux provides an emergency copy of LILO that can
    boot the system if the original fails.
  • This is known as the Emergency Boot System.
  • To use this copy of LILO, configuration changes
    must be made in lilo.conf, as follows:
  • Change where the regular disk's root partition is
    mounted.
  • Mount it somewhere in the emergency boot system,
    such as /mnt/std.
  • Check whether the /boot directory is on its own
    partition; if so, mount it instead of, or in
    addition to, the root partition.
  • Lastly, change the kernel images and other boot
    options to their normal values, i.e. the boot and
    root options should point to the regular hard
    disk.

LILO Bootlabel
  • An example of the GUI configuration screen in
    which root mount is

Emergency Boot Disks
  • There are various types of Linux boot disks:
  • Linux installation disks: included in the media
    used to install the OS in the first place. At the
    lilo prompt, type linux rescue.
  • Tom's Root/Boot Disk (tomsrtbt): has a bootable
    root file system, is downloadable from the
    Internet, and will fit onto a floppy disk.
  • ZipSlack: available for Slackware Linux. It can
    be installed on a small partition or on a
    removable drive; it is slightly too big to fit on
    a floppy.
  • Demo Linux or SuSE Evaluation: one of the better
    emergency boot disk utilities available because
    it is the most complete; it must be burned onto a
    disc.
  • Custom boot disk: required if the system being
    worked on contains hardware that needs special
    drivers or has other special requirements.
  • The simplest method of creation is to modify an
    existing boot disk by adding the required extras.

Package Dependency Problems
  • Some packages require other packages or libraries
    in order to run.
  • Linux will usually notify users if a package has
    unmet dependencies.
  • A few examples of events that can cause
    dependency problems and conflicts are:
  • Missing libraries or support programs: libraries
    are a type of support code used by many programs
    as if they were part of the program itself.
  • Incompatible libraries or support programs:
    different versions of libraries and support
    programs are available, corresponding to current
    and past versions of the installed programs. The
    correct version therefore needs to be used.
  • Duplicate files or features: these can cause
    programs to not function correctly.
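
The first two kinds of problem (missing and incompatible libraries) can be illustrated with a toy check. This is a simplified sketch, not how a real package manager resolves dependencies: it compares exact version strings only, and the function name and data layout are hypothetical.

```python
def find_dependency_problems(required, installed):
    """Compare needed versions against installed ones.

    required:  {name: version needed by the package}
    installed: {name: version present on the system}
    """
    problems = []
    for name, wanted in sorted(required.items()):
        if name not in installed:
            problems.append(f"missing: {name} {wanted}")
        elif installed[name] != wanted:
            problems.append(
                f"incompatible: {name} {installed[name]} (need {wanted})"
            )
    return problems
```

For example, a package needing hypothetical libraries libfoo 1.2 and libbar 2.0 on a system that has only libfoo 1.1 would be reported with one incompatible and one missing entry.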

Solutions to Package Dependency Problems
  • Force the installation:
  • If the error is on a package for which the user
    has manually compiled the source code, the
    installation can be forced.
  • Note: xxxxxxxx.rpm represents any RPM package.
  • Modify the system: the correct and recommended
    solution is to modify the system so that it has
    the dependencies the package needs to run.
  • Rebuild the problem package from source code: in
    some instances it may be necessary to rebuild the
    package from source code if dependency error
    messages are showing up.
  • Some dependency problems are caused by the
    program having been recompiled and its
    dependencies changing.
  • To rebuild an RPM, call rpm with the --rebuild
    option.

Application Failure
  • Application failures can be difficult to spot, as
    they don't present in an obvious way. No error
    message will be given outlining the problem.
    Problems include

Troubleshooting Loss of Network connectivity
  • The most basic networking problem is the
    inability of two computers to communicate. This
    can be due to a hardware and/or software problem.
  • First rule: check for physical connectivity.
  • Ensure cables are properly plugged in at both
    ends,
  • the network adapter is functioning (check the
    link light),
  • the hub status lights are on,
  • and no simple hardware malfunctions have
    happened.
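
Once the physical checks pass, a quick software-level test can show whether two computers can actually communicate. The sketch below is an assumption-laden illustration rather than part of the slides: it tries a TCP connection to a host and port of your choosing, and can_connect is a hypothetical helper name.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A False result does not say which layer failed; it only confirms that the two machines could not complete a connection on that port.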

Operator Error

TCP/IP Utilities and Troubleshooting Steps

Connectivity testing: Ping, Traceroute

Other Tools
  • Windows 2000 diagnostic tools:
  • Netdiag: runs a standard set of network tests and
    generates a report of the results.
  • Pathping: a combination of the ping command and
    the tracert command.
  • Wake-On-LAN (WOL): enables an administrator to
    power up a computer by sending a signal (magic
    packet) to its WOL-capable NIC.
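
A magic packet consists of six 0xFF bytes followed by the target NIC's MAC address repeated sixteen times, usually sent as a UDP broadcast. The following is a minimal sketch; the function names are hypothetical, and the broadcast address and port 9 are common defaults assumed here rather than values from the slides.

```python
import socket


def build_magic_packet(mac_address):
    """Build a Wake-On-LAN magic packet for a MAC such as 00:11:22:33:44:55."""
    hex_digits = mac_address.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("expected a 48-bit MAC address")
    mac_bytes = bytes.fromhex(hex_digits)
    # Six 0xFF bytes, then the MAC address repeated sixteen times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac_address, broadcast="255.255.255.255", port=9):
    """Broadcast the magic packet over UDP (address and port are assumptions)."""
    packet = build_magic_packet(mac_address)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```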

Disaster Recovery

Risk Analysis

RAID Redundancy
  • As well as disks, other components in the server
    can be configured for redundancy, including:
  • Power supplies
  • UPS
  • Cooling fans
  • Network interface adapters
  • Processors

Clustering
  • A cluster is a group of independent computers
    working together as a single system. There are
    two main types:
  • Shared-device model: applications running in a
    cluster can access any hardware resource
    connected to any node/server in the cluster.
  • Nothing-shared model: each node has ownership of
    a resource, so there is no competition for the
    resource.
  • Clustering is used to ensure mission-critical
    applications and resources are as highly
    available as possible.
  • Advantages of clustering:
  • Fault tolerance: a cluster can withstand the
    failure of components, up to a complete computer,
    without impacting its capability to support
    mission-critical applications.
  • High availability: the cluster will not be
    unavailable for reasons such as maintenance,
    upgrades, or configuration changes. A correctly
    configured cluster will come very close to 100%
    availability.
  • Scalability: resources can be added to the
    cluster transparently to the system users.
  • Easier manageability: servers can be managed as a
    group, dramatically reducing the number and
    amount of management tasks when compared to an
    equivalent number of standalone servers.
  • Disadvantages of clustering:
  • Clusters can be significantly more expensive than
    the equivalent standalone servers, due to the
    additional software and specialized hardware.
  • More complex to set up than a standalone server.

Hot Swapping, Warm Swapping and Hot Spares
  • Hot swap (also known as hot pluggable): the
    capability to add and remove components from a
    computer while it's running and have the
    operating system automatically recognize the
    change.
  • E.g. hard drives; particularly useful in
    conjunction with a hardware RAID controller.
  • Hot spare: a component kept on hand in case of an
    equipment failure.
  • Examples include disk drives, RAID controllers,
    NICs, and any other critical component that could
    be used to replace a failed component.
  • In some mission-critical environments, an entire
    server can be designated as a hot spare.
  • Warm swap: a compromise between hot swap and hot
    spare, generally done in conjunction with a hard
    drive failure.
  • The disk array must be shut down before the drive
    can be replaced. This stops all I/O for that
    array, so users cannot access the system.
  • Referred to as a warm swap because the server
    does not have to be powered down to replace the
    drive.

Disaster Recovery Steps

Testing the Plan

The Facility Goes Down
  • The facility may go down due to:
  • a natural disaster such as an earthquake,
  • sabotage such as a bomb,
  • or even just an extended power outage.
  • A place to resume critical business activities
    may be required: a disaster-recovery site. Two
    types of disaster-recovery sites are commonly
    used in the industry.