POSH Python Object Sharing - PowerPoint PPT Presentation

About This Presentation
Title:

POSH Python Object Sharing

Description:

Communication through standard Python container objects favors threads ... The GIL is here to stay, so making threads scale better is hard ... – PowerPoint PPT presentation

Number of Views:72
Avg rating:3.0/5.0
Slides: 19
Provided by: steffenvi
Category:
Tags: posh | object | python | sharing

less

Transcript and Presenter's Notes

Title: POSH Python Object Sharing


1
POSHPython Object Sharing
  • Steffen Viken Valvåg
  • In collaboration with
  • Kjetil Jacobsen
  • Åge Kvalnes

University of Tromsø, Norway Sponsored by Fast
Search Transfer
2
Python Execution Model
3
Python Threading Model
Thread A
Thread B
Byte codes
  • Each thread executes a separate sequence of byte
    codes
  • All threads must contend for one global
    interpreter lock

4
Example Matrix Multiplication
  • Performs a matrix multiplication A B x C
  • The work is split between several worker threads
  • The application runs on a machine with 8 CPUs
  • Threads do not scale for multiple CPUs due to
    lock contention on the GIL

5
Workaround Processes
Process A
Process B
IPC
  • Each process has its own interpreter lock
  • Requires inter-process communication, using e.g.
    message passing by means of pipes

6
Matrix Multiplication using Processes
  • A master process distributes the input matrices
    to a set of worker processes
  • Each worker process computes some part of the
    output matrix, and returns its result to the
    master
  • The master process assembles the final result
    matrix
  • More communication, and more complex pattern than
    using threads

7
Ways Ahead
  • Communication through standard Python container
    objects favors threads
  • Scalability on multiprocessor architectures
    favors processes
  • The GIL is here to stay, so making threads scale
    better is hard
  • However, there might be room for improvement of
    inter-process communication mechanisms

8
Using Shared Memory for IPC
Process B
Process A
Shared Memory
  • Processes communicate by accessing a shared
    memory region
  • Requires explicit synchronization and data
    marshalling, imposes a flat data structure

9
Using POSH for IPC
Process B
Process A
Shared Memory
  • Allocates regular Python objects in shared memory
  • Shared objects are accessed transparently through
    regular method calls
  • IPC is done by modifying shared, mutable objects

10
Complications
  • Processes must synchronize their access to
    shared, mutable objects (just like threads)
  • Explicit synchronization of critical regions must
    be possible, while implicit synchronization upon
    accessing shared objects is desireable
  • Pythons regular garbage collection algorithm is
    inadequate for shared objects, which may be
    referenced by multiple processes

11
Proxy Objects
X.method1()
X
return value
Shared Object
Proxy Object
  • Provides transparent access to a shared object by
    forwarding all attribute accesses and method
    calls
  • Provides a single entry point to a shared object,
    where synchronization policies may be enforced

12
Multi-Process Garbage Collection
  • Must account for references from all live
    processes
  • Must stay up-to-date when processes fork, as this
    may create new references to shared objects
  • Should be able to handle abnormal process
    termination without leaking shared objects

13
Garbage Collection in POSH
Process A
Shared Memory
Process B
X
M
Y
L
Shared Object
Regular Python reference
Reference from a process to a shared object (type
I)
Reference from one shared object to another (type
II)
14
Garbage Collection Details
  • POSH creates at most one proxy object per process
    for any given shared object
  • Shared objects are always referenced through
    their proxy objects
  • A bitmap in each shared object records the
    processes that have a corresponding proxy object.
    This tracks references of type I (from a
    process)
  • A separate count in each shared object records
    the number of references to the object from other
    shared objects. This tracks references of type
    II
  • Shared objects are deleted when there are no
    references to them of either type

15
Performance
  • Performing a matrix multiplication A B x C
    using POSH
  • The work is split between several worker
    processes
  • The application runs on a machine with 8 CPUs
  • More overhead, but scales for multiple CPUs

16
Summary
  • Python uses a global interpreter lock (GIL) to
    serialize execution of byte codes
  • This entails a lack of scalability on
    multiprocessor architectures for CPU-intensive
    multi-threaded apps
  • However, threads offer an attractive programming
    model, with implicit communication
  • Processes shared memory reduce IPC overheads,
    but normally impose flat data structures and
    require data marshalling
  • POSH uses processes shared memory to offer a
    programming model similar to threads, with the
    scalability of processes

17
Availability
  • Open source, hosted at SourceForge
  • http//poshmodule.sf.net/
  • Still not very stable
  • Developers wanted

18
Example Usage
import posh class Stuff(object)
pass posh.allow_sharing(Stuff,
posh.generic_init) mystuff posh.share(Stuff())
def worker1() mystuff.money 0 def
worker2() mystuff.debt 100000 def
worker3() mystuff.balance mystuff.money -
mystuff.debt for w in worker1, worker2,
worker3 posh.forkcall(w) posh.waitall()
Write a Comment
User Comments (0)
About PowerShow.com