1
Async Workgroup Update
  • Barthold Lichtenbelt

2
Goals
  • Provide synchronization framework for OpenGL
  • Provide base functionality as defined in NV_fence
    and GL2_async_core
  • Build a framework for future, more complex,
    functionality, some of which is discussed in
    GL2_async_core
  • Initially support CPU <-> GPU synchronization
  • Support synchronization across multiple OpenGL
    contexts
  • Resulted in GL_ARB_sync spec
  • Finished April 2006
  • Posted draft to opengl.org for feedback
  • Not quite official ARB extension yet

3
Functionality overview
  • ARB_sync provides synchronization primitives
  • Can be tested, set and waited upon
  • Specifically, a Fence Synchronization Object
    and corresponding Fence command
  • Fence completion allows for partial glFinish
  • All commands prior to the fence are forced to
    complete before control is returned to caller
  • Fence Sync Objects can be shared across contexts
  • Allows for synchronization of OpenGL command
    streams across contexts
  • New data type GLtime represents intervals in
    nanoseconds
  • 64 bit integer, same encoding as UST counter in
    OpenML
  • Accuracy implementation dependent, precision in
    nanoseconds
  • If you have used the Windows Event model, this
    will feel familiar

4
Synchronization model in ARB_sync 1/2
  • A sync object is a primitive used for
    synchronization between CPU and GPU, CPU, or
    something else.
  • Sync object has state: type, condition, status
  • A sync object's status can be signaled or
    non-signaled
  • When created, status is signaled unless a flag is
    set, in which case it is non-signaled
  • A fence sync object is a specific type of sync
    object
  • Provides partial finish semantics
  • Only type of sync object currently defined
  • A fence is a token inserted in the GL command
    stream
  • A sync object is not inserted into the command
    stream
  • Fence has no state
  • A fence is associated with a fence sync object.
  • Multiple fences can be associated with the same
    sync object
  • When a fence is inserted in the command stream,
    the status of its sync object is set to
    non-signaled
  • A fence, once completed, will set the status of
    its sync object to signaled

5
Synchronization model in ARB_sync 2/2
  • A wait function waits on a sync object, not on a
    fence
  • A poll function polls a sync object, not a fence
  • A wait function called on a sync object in the
    non-signaled state will block. It unblocks when
    the sync object transitions to the signaled
    state.
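
The model on the two slides above can be sketched as a small, runnable simulation. This is only an illustration of the described semantics, not an OpenGL binding: the names SyncObject, CommandStream, draw, fence, execute_next and client_wait are hypothetical, and the "GPU" is simply a queue that is drained while the caller waits.

```python
from collections import deque


class SyncObject:
    """Hypothetical stand-in for a fence sync object: it only holds a status."""

    def __init__(self, signaled=True):
        # Per the slides, a new sync object is signaled unless a flag says otherwise.
        self.signaled = signaled


class CommandStream:
    """Toy command stream; commands take effect only when the 'GPU' executes them."""

    def __init__(self):
        self.queue = deque()
        self.log = []

    def draw(self, name):
        self.queue.append(("draw", name))

    def fence(self, sync):
        # Inserting a fence sets the status of its sync object to non-signaled.
        sync.signaled = False
        self.queue.append(("fence", sync))

    def execute_next(self):
        kind, payload = self.queue.popleft()
        if kind == "draw":
            self.log.append(payload)
        else:
            # A completed fence sets its sync object to signaled.
            payload.signaled = True

    def client_wait(self, sync):
        # The wait is on the sync object, not on the fence. A real wait would
        # block; here the queue is drained until the status becomes signaled.
        while not sync.signaled and self.queue:
            self.execute_next()


stream = CommandStream()
sync = SyncObject()
stream.draw("A")
stream.fence(sync)
stream.draw("B")
stream.client_wait(sync)  # partial finish: "A" has completed, "B" is still queued
```

After the wait returns, only the commands issued before the fence have run, which is the "partial glFinish" behavior the slides describe.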

6
Example: RTT (render to texture) with two contexts
  • Context A
  • sync_objectA = glCreateSync(attrib)
  • <render to texture that context B needs>
  • glFence(sync_objectA)
  • glFlush() // prevent deadlock
  • Context B
  • glClientWaitSync(sync_objectA, 0, GL_FOREVER)
  • glBindTexture(...) // Just rendered
  • <render using texture>
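
The two-context hand-off above can be imitated with two threads. This is a rough analogy under stated assumptions, not GL code: a threading.Event stands in for the shared sync object (created non-signaled), a dict stands in for the shared texture, and the function names context_a and context_b are hypothetical.

```python
import threading

sync_object_a = threading.Event()  # stand-in for the shared sync object, non-signaled
texture = {}                       # stand-in for the texture both contexts can see
result = []


def context_a():
    texture["pixels"] = "rendered by A"  # <render to texture that context B needs>
    sync_object_a.set()                  # the fence completes: status becomes signaled


def context_b():
    sync_object_a.wait()                 # like a client wait with an infinite timeout
    result.append(texture["pixels"])     # safe: A's rendering has completed


b = threading.Thread(target=context_b)
a = threading.Thread(target=context_a)
b.start()  # B starts first on purpose and must wait for A
a.start()
a.join()
b.join()
```

The model has no equivalent of glFlush(). In the slide, the flush ensures the fence is actually submitted, so context B cannot end up waiting forever on a command that never reaches the GPU.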

7
OS specific functionality
  • Convert sync object to the window system native
    event primitive
  • Allows applications to synchronize all events in
    a system using one API
  • All operations on <sync> are reflected in OS
    event and vice-versa
  • Both <sync> and the OS event are valid to use in
    your code
  • On Windows, convert to an Event
  • HANDLE wglConvertSyncToEvent(object sync)
  • Need to specify, when sync object is created,
    that it can be converted to OS event
  • Separate extension WGL_ARB_sync_event
  • On Unix, convert to a file descriptor, X event or
    semaphore?
  • Still TBD

8
Possible future functionality
  • Add a WaitForMultipleSync(uint sync_objects, ...)
    command
  • Synchronize with multiple sync objects at once
  • Add a payload to a fence
  • For example, the time it completed
  • Allow one GPU stream to wait for another GPU
    stream
  • WaitSync(sync_object)
  • A sync object whose status will pulse with every
    vblank
  • A sync object that can signal when data binding
    has completed
  • As opposed to when rendering has completed using
    the data

9
Example: Streaming video processing
  • Loop
  • Draw frame 1 // To a FBO, for example
  • glFence(sync_object1) // Inserts a fence in the
    command stream
  • Draw frame 2
  • glFence(sync_object2)
  • while (glClientWaitSync(sync_object1, 0, 0) !=
    GL_ALREADY_SIGNALED)
  • <Do some useful work> // App uses CPU cycles
    instead of blocking
  • Read back data in frame 1
  • while (glClientWaitSync(sync_object2, 0, 0) !=
    GL_ALREADY_SIGNALED)
  • <Do some useful work> // App uses CPU cycles
    instead of blocking
  • Read back data in frame 2

10
Variation with asynchronous read back
  • Loop
  • Draw frame 1 // To a FBO, for example
  • Read back frame 1 into PBO 1 // Asynchronous
    readback
  • glFence(sync_object1) // Inserts a fence in the
    command stream
  • Draw frame 2
  • Read back frame 2 into PBO 2
  • glFence(sync_object2)
  • while (glClientWaitSync(sync_object1, 0, 0) !=
    GL_ALREADY_SIGNALED)
  • <Do some useful work> // App uses CPU cycles
    instead of blocking
  • glMapBuffer() // Access the data of frame 1 in
    PBO 1
  • while (glClientWaitSync(sync_object2, 0, 0) !=
    GL_ALREADY_SIGNALED)
  • <Do some useful work> // App uses CPU cycles
    instead of blocking
  • glMapBuffer() // Access the data of frame 2 in
    PBO 2
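
The polling pattern used in both streaming examples can be sketched as follows. This is a toy model: FakeFence and process_two_frames are hypothetical names, GPU progress is faked by counting polls, and returning TIMEOUT_EXPIRED for a zero-timeout poll on a non-signaled object is an assumption (the slides only say that the call returns the status).

```python
ALREADY_SIGNALED = "ALREADY_SIGNALED"
TIMEOUT_EXPIRED = "TIMEOUT_EXPIRED"


class FakeFence:
    """Hypothetical fence sync that becomes signaled after a fixed number of polls."""

    def __init__(self, polls_needed):
        self.remaining = polls_needed

    def client_wait(self, timeout=0):
        # With timeout 0 the call never blocks; it only reports the status.
        if self.remaining == 0:
            return ALREADY_SIGNALED
        self.remaining -= 1  # pretend the GPU made some progress meanwhile
        return TIMEOUT_EXPIRED


def process_two_frames(sync1, sync2):
    useful_work = 0
    read_back = []
    for frame, sync in ((1, sync1), (2, sync2)):
        # Each frame is guarded by its own sync object.
        while sync.client_wait(0) != ALREADY_SIGNALED:
            useful_work += 1      # <Do some useful work> instead of blocking
        read_back.append(frame)   # read back (or map the PBO) only once signaled
    return useful_work, read_back


work, frames = process_two_frames(FakeFence(3), FakeFence(1))
```

The CPU does four units of other work here (three while frame 1 is pending, one while frame 2 is pending) and touches each frame only after its own fence has completed.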

11
Differences with GL_NV_fence
  • No separation of sync objects and fences in
    NV_fence
  • NV version only has fence objects
  • Fence object has state
  • Creation of sync object and inserting a fence in
    one command
  • SetFenceNV creates and inserts a fence (old
    object model)
  • NV fence objects are not shared across contexts

12
API Overview 1/2
  • Create a sync attribute object
  • object CreateSyncAttrib()
  • SYNC_TYPE: has to be FENCE
  • SYNC_CONDITION: has to be
    SYNC_PRIOR_COMMANDS_COMPLETE
  • SYNC_STATUS: SIGNALED or UNSIGNALED
  • Create the sync object
  • object CreateSync(object attrib)
  • Insert a fence, associated with a sync object,
    into command stream
  • void Fence(object sync)

13
API Overview 2/2
  • Wait or test the status of a fence sync object
  • enum ClientWaitSync(object sync, uint flags,
    time timeout)
  • Blocks until sync is signaled or the timeout
    expires
  • If timeout == 0, does not block, returns the
    status of sync
  • If timeout == FOREVER, call does not time out
  • Optionally will flush before blocking
  • Returns one of 3 values: ALREADY_SIGNALED,
    TIMEOUT_EXPIRED, CONDITION_SATISFIED
  • Signal or unsignal a sync object
  • void SignalSync(object sync, enum mode)
  • If status transitions from unsignaled to
    signaled, ClientWaitSync will unblock
