Chapter 6 Distributed File Systems Summary

Provided by: GSU50
Learn more at: https://www.cs.gsu.edu

1
Chapter 6 Distributed File Systems Summary
  • Bernard Chen 2007
  • CSc 8230

2
Outline
  • Summary
  • Load balancing based on mining access patterns
    for Distributed File System
  • References

3
Summary
  • This chapter presents the fundamental concepts
    and design issues for the implementation of DFS

4
6.1 DFS Characteristics
  • A DFS is characterized by
  • how the dispersion and multiplicity of files and
    users are made transparent
  • how caching and replication are supported to
    increase performance
  • what file sharing semantics are assumed
  • whether the system is scalable and fault tolerant

5
6.2 DFS Design
  • The design addresses the DFS characteristics
    listed in 6.1
  • Hierarchical file structure
  • File mounting protocol
  • Distribution of state information between server
    and clients: stateless or stateful server
  • File access

6
6.3 Transactions and Concurrency Control
  • A transaction processing system includes
  • Transaction Manager
  • Scheduler
  • Object Manager
  • Serializability
  • Concurrency control
  • Two-phase locking
  • Timestamp ordering
  • Optimistic concurrency control
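
Of the three concurrency control approaches listed, timestamp ordering is the easiest to show in a few lines. The following is a minimal sketch of the basic rule only, assuming integer transaction timestamps; the class name `TimestampOrderedObject` and its fields are hypothetical and do not come from the slides.

```python
class TimestampOrderedObject:
    """One data object under basic timestamp ordering (illustrative only)."""

    def __init__(self, value=None):
        self.value = value
        self.read_ts = 0   # largest timestamp of any transaction that read the object
        self.write_ts = 0  # timestamp of the transaction that last wrote the object

    def read(self, ts):
        # A read is too late if a younger transaction has already written.
        if ts < self.write_ts:
            raise RuntimeError(f"abort T{ts}: read arrives after write by T{self.write_ts}")
        self.read_ts = max(self.read_ts, ts)
        return self.value

    def write(self, ts, value):
        # A write is too late if a younger transaction has already read or written.
        if ts < self.read_ts or ts < self.write_ts:
            raise RuntimeError(f"abort T{ts}: write arrives too late")
        self.value = value
        self.write_ts = ts


obj = TimestampOrderedObject("v0")
obj.write(1, "v1")      # T1 writes
print(obj.read(3))      # T3 reads the value written by T1
try:
    obj.write(2, "v2")  # T2 is older than reader T3, so it is rejected
except RuntimeError as err:
    print(err)
```

A rejected transaction would normally be restarted with a fresh timestamp; that step is left out here.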

7
6.4 Data and file Replication
  • Architecture
  • One-copy serializability
  • Quorum voting
  • Gossip update

8
Outline
  • Summary
  • Load balancing based on mining access patterns
    for Distributed File System
  • References

9
Load Balancing
  • Load balancing for distributed systems is the
    mapping or remapping of work to different
    processors
  • Most load-balancing applications redistribute
    the workload among multiple processors to
    speed up computational tasks
  • However, most of this work addresses
    computational tasks rather than storage
    systems (Dasgupta, 1997).

10
Load Balancing of DFS File Servers
  • Load balancing of DFS file servers means placing
    file systems onto different servers so as to
    provide the best service for end users while
    making the best use of available resources.
  • The algorithm avoids directing all file access
    hits to the same file server

11
Data Mining: Association Rules
  • Association rule techniques determine
    correlations within a set of items.
  • For example, given the transaction history of a
    grocery store, association rules may show that
    customers who buy milk are also likely to buy
    bread
  • Applied to a DFS, they may show that a user who
    accesses file1 is also likely to access file2
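
The file1/file2 example can be made concrete with the two standard measures for a rule X → Y: support (the fraction of sessions containing both X and Y) and confidence (the fraction of sessions containing X that also contain Y). The sketch below mines pairwise rules only; the function name `pair_rules`, the session data, and the thresholds are made up for illustration.

```python
from itertools import combinations


def pair_rules(sessions, min_support, min_confidence):
    """Return rules (x, y, support, confidence) meaning 'x implies y'."""
    n = len(sessions)
    item_count = {}
    pair_count = {}
    for session in sessions:
        items = set(session)
        for a in items:
            item_count[a] = item_count.get(a, 0) + 1
        for a, b in combinations(sorted(items), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Test the rule in both directions; confidence depends on the left side.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_count[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


# Hypothetical access sessions: each list is the set of files one user touched.
sessions = [
    ["file1", "file2"],
    ["file1", "file2", "file3"],
    ["file1"],
    ["file3"],
]
rules = pair_rules(sessions, min_support=0.5, min_confidence=0.6)
for x, y, support, confidence in rules:
    print(f"{x} -> {y}  support={support:.2f}  confidence={confidence:.2f}")
```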

12
Graph Analysis: Coloring Problem
  • The coloring problem consists of assigning a
    color to each vertex so that the two vertices of
    every incompatible pair receive different colors.
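
A simple way to produce such a coloring is the greedy heuristic sketched below, which visits vertices in order of decreasing degree and gives each one the smallest color not used by its neighbors. This is an illustration, not necessarily the method used by the tool described later, and greedy coloring does not always use the minimum number of colors. The function name and the sample graph are hypothetical.

```python
def greedy_coloring(vertices, incompatible_pairs):
    """Assign integer colors so that both ends of every pair differ."""
    neighbors = {v: set() for v in vertices}
    for a, b in incompatible_pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    color = {}
    # Color the most constrained (highest-degree) vertices first.
    for v in sorted(vertices, key=lambda v: len(neighbors[v]), reverse=True):
        used = {color[u] for u in neighbors[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


vertices = ["A", "B", "C", "D"]
pairs = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
coloring = greedy_coloring(vertices, pairs)
print(coloring)
```

In this sample, A, B, and C form a triangle, so at least three colors are required.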

13
MAPDFS Load Balancing Tool (by IBM)
  • It consists of 3 major steps
  • Monitoring the DFS cell and statistically
    analyzing the load on each server
  • Mining association rules from previously
    collected data
  • Combining and analyzing the results of the first
    two steps to recommend fileset moves

14
Step 1
  • The number of access requests for each
    read-write fileset is pulled from raw DFS screen
    dumps, along with the server name and the fileset
    name, and stored in a file with a timestamp that
    records the time of the cell's snapshot.
  • After this operation is completed, we can apply
    statistical analysis to determine which servers
    are overloaded and which ones are underutilized.

15
Step 1 cont.
  • We separate all DFS servers into the following
    three groups and label them according to their
    load level
  • TO: underutilized servers
  • FROM: overutilized servers
  • STAY: servers remaining within the
    user-specified threshold
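
The slides do not say how the threshold is applied, so the following sketch makes an assumption: the threshold is a fractional deviation from the mean load across all servers. Servers above the band are labeled FROM, servers below it TO, and the rest STAY. The function name `classify_servers` and the sample loads are hypothetical.

```python
from statistics import mean


def classify_servers(load_by_server, threshold):
    """Group servers by load relative to the mean (assumed interpretation)."""
    average = mean(load_by_server.values())
    upper = average * (1 + threshold)
    lower = average * (1 - threshold)

    groups = {"TO": [], "FROM": [], "STAY": []}
    for server, load in sorted(load_by_server.items()):
        if load > upper:
            groups["FROM"].append(server)   # overutilized: filesets move away
        elif load < lower:
            groups["TO"].append(server)     # underutilized: can accept filesets
        else:
            groups["STAY"].append(server)   # within the threshold band
    return groups


# Hypothetical access-request counts taken from one cell snapshot.
loads = {"srv1": 900, "srv2": 100, "srv3": 500}
groups = classify_servers(loads, threshold=0.2)
print(groups)
```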

16
Step 1 cont.

17
Step 2
  • Identify a candidate fileset, or a group of
    filesets, to be moved from each of the overloaded
    FROM servers.
  • Use association rules to uncover the underlying
    file access patterns that dominate within each
    server as well as across the DFS cell.
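
Once rules have been mined, separating the patterns that hold within a server from those that span the cell only requires the current fileset placement. The sketch below assumes each rule is a pair of fileset names and that a `server_of` mapping is available; both names, and the sample data, are hypothetical.

```python
def split_rules(rules, server_of):
    """Split fileset rules into within-server and cross-server lists."""
    within_server = []
    cross_server = []
    for a, b in rules:
        if server_of[a] == server_of[b]:
            within_server.append((a, b))
        else:
            cross_server.append((a, b))
    return within_server, cross_server


# Hypothetical mined rules and current placement of filesets on servers.
rules = [("fs_a", "fs_b"), ("fs_a", "fs_c")]
server_of = {"fs_a": "srv1", "fs_b": "srv1", "fs_c": "srv2"}
within_server, cross_server = split_rules(rules, server_of)
print(within_server)
print(cross_server)
```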

18
Step 3
  • At this stage we combine the knowledge gained
    from the first two steps (the statistical
    analysis of the DFS cell and the mined
    association rules) to generate the final decision

19
Step 3 cont.
  • Filesets identified by the mined association
    rules become the vertices of a graph, and the
    association rules become its edges.
  • A vertex coloring approach was chosen to carry
    out this step of the load balancing process.
  • Each color indicates a server

20
Step 3 cont.
  • The decision on whether the filesets of one
    color can be moved to a particular server is
    made after answering the following two
    questions:
  • 1. Does the collection of Cross Server
    Association Rules contain an entry that connects
    one of the filesets of this color to a fileset
    on the target server?
  • 2. Does the move of the current color of
    filesets push this server beyond the Threshold
    parameter into the FROM servers category?
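
The two questions translate directly into two checks. The slides do not state how the answers are combined, so this sketch assumes that a "yes" to either question rules the move out. The function name `evaluate_move`, the `from_limit` parameter (the load above which a server counts as FROM), and the sample data are all hypothetical.

```python
def evaluate_move(color_filesets, color_load, target_filesets, target_load,
                  cross_server_rules, from_limit):
    """Answer both questions for moving one color of filesets to a target server."""
    color_filesets = set(color_filesets)
    target_filesets = set(target_filesets)

    # Question 1: does a cross-server rule link this color to the target server?
    linked_to_target = any(
        (a in color_filesets and b in target_filesets)
        or (b in color_filesets and a in target_filesets)
        for a, b in cross_server_rules
    )
    # Question 2: would the added load push the target into the FROM category?
    pushes_into_from = target_load + color_load > from_limit

    return {
        "linked_to_target": linked_to_target,
        "pushes_into_from": pushes_into_from,
        # Assumption: the move is allowed only if both answers are "no".
        "move_allowed": not linked_to_target and not pushes_into_from,
    }


result = evaluate_move(
    color_filesets={"fs_a", "fs_b"}, color_load=150,
    target_filesets={"fs_x"}, target_load=300,
    cross_server_rules=[("fs_a", "fs_y")],
    from_limit=600,
)
print(result)
```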

21
Outline
  • Summary
  • Load balancing based on mining access patterns
    for Distributed File System
  • References

22
References
  • A. Glagoleva and A. Sathaye, "A Load Balancing
    Tool Based on Mining Access Patterns for
    Distributed File System Servers," Proceedings of
    the 35th Hawaii International Conference on
    System Sciences, 2002.
  • Pallab Dasgupta, A.K. Majumder, and P.
    Bhattacharya, "V_THR: An Adaptive Load Balancing
    Algorithm," Journal of Parallel and Distributed
    Computing 42, 1997, 101-108.
  • An Overview of DFS,
    http://www.transarc.com/Library/documentation/dce/1.1/dfs_admin_gd_1.html
  • An Overview of NFS,
    http://www.rs6000.ibm.com/doc_link/en_US/a_doc_lib/aixbman/commadmn/nfs_intro.htm
  • Alexandra Glagoleva and Archana Sathaye, "Load
    Balancing Distributed File System Servers: A Rule
    Based Approach," Proceedings of the 13th
    International Conference on System Research,
    Information and Cybernetics, Baden-Baden, Germany,
    July 30, 2001.