## Queues (In and Out of Order)
## Learning Objectives
* Learn about out-of-order and in-order execution
* Learn about how to use in-order queues for maximal performance
#### Out-of-order execution
![SYCL](../common-revealjs/images/out_of_order_execution.png "SYCL")
* SYCL `queue`s are by default out-of-order.
* This means commands may be overlapped, re-ordered, and executed concurrently, provided dependencies are honoured to ensure consistency.
#### In-order execution
![SYCL](../common-revealjs/images/in_order_execution.png "SYCL")
* SYCL `queue`s can be configured to be in-order.
* This means commands execute strictly in the order they were enqueued.
#### Using an out-of-order queue (USM)
    // tasks: a container of command groups, each callable with a sycl::handler&
    sycl::queue Q;
    for (const auto &task : tasks)
      Q.submit(task);
    Q.wait();
* All commands can execute concurrently if hardware allows it
* See exercise for real world data
#### Using an out-of-order queue (USM): how to handle dependencies?
    sycl::queue Q;
    auto e1 = Q.submit(taskA);
    auto e2 = Q.submit([&](sycl::handler &cgh) {
      cgh.depends_on(e1);        // B starts only after A has finished
      taskB();
    });
    auto e3 = Q.submit([&](sycl::handler &cgh) {
      cgh.depends_on(e1);        // C also waits for A, but may overlap with B
      taskC();
    });
    Q.submit([&](sycl::handler &cgh) {
      cgh.depends_on({e2, e3});  // D waits for both B and C
      taskD();
    }).wait();
* Define an ordering manually (error prone)
* Scheduling done automatically (maximize concurrency)
* Dynamic scheduling (potentially higher latency overhead than manually crafted scheduling)
#### Using an in-order queue (USM)
    // The property is an object, so it must be constructed: in_order{}
    sycl::queue Q{sycl::property::queue::in_order{}};
    for (const auto &task : tasks)
      Q.submit(task);
    Q.wait();
* All commands are executed serially
* Ease of programming (commands in the same queue cannot race with each other)
* Potentially lower latency than out-of-order queues
* Doesn't allow concurrency, potentially suboptimal performance
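To put a number on the "potentially suboptimal performance" point, here is a toy cost model in plain C++. It is not SYCL and not how any runtime is implemented: `Task`, `in_order_finish_time`, `out_of_order_finish_time` and `diamond` are hypothetical names, every task is assumed to cost one time unit, and the out-of-order case assumes the hardware can run every ready task at once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model: a task has a cost (arbitrary time units) and the
// indices of the earlier tasks it depends on.
struct Task {
  int cost;
  std::vector<std::size_t> deps;
};

// In-order queue: each task waits for the previous one, so the costs add up.
int in_order_finish_time(const std::vector<Task> &tasks) {
  int total = 0;
  for (const Task &t : tasks) total += t.cost;
  return total;
}

// Out-of-order queue with unlimited parallel hardware (an assumption):
// a task starts as soon as its last dependency has finished.
int out_of_order_finish_time(const std::vector<Task> &tasks) {
  std::vector<int> finish(tasks.size(), 0);
  int makespan = 0;
  for (std::size_t i = 0; i < tasks.size(); ++i) {
    int start = 0;
    for (std::size_t d : tasks[i].deps) {
      assert(d < i);  // dependencies must refer to earlier submissions
      start = std::max(start, finish[d]);
    }
    finish[i] = start + tasks[i].cost;
    makespan = std::max(makespan, finish[i]);
  }
  return makespan;
}

// The diamond A -> {B, C} -> D, with each task costing one unit.
std::vector<Task> diamond() {
  return {Task{1, {}}, Task{1, {0}}, Task{1, {0}}, Task{1, {1, 2}}};
}
```

Under these assumptions the diamond takes 4 units in order and 3 units out of order, because B and C can overlap. Real gains depend on the device and on whether the kernels leave resources free for each other.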
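#### Using multiple in-order queues to enable concurrency in kernel execution (USM)
    // Each task is paired with the queue it has been mapped to
    std::vector<std::pair<sycl::queue, std::function<void(sycl::handler &)>>> queues_tasks;
    for (auto &[Q, task] : queues_tasks)
      Q.submit(task);
    for (auto &[Q, _] : queues_tasks)
      Q.wait();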
* Manual Scheduling (tasks need to be mapped to queues)
* Allows concurrency between queues
* Painful to extract full concurrency
## Note on buffer/accessors
* Best of both worlds
* Buffer / Accessors automatically handle dependencies for you (no need for `depends_on`)
* Hence, using in or out-of-order queues with buffer/accessors will not change the semantics of your program!
* Can use out-of-order without any drawbacks.
#### Exercise
Code_Exercises/In_Order_Queue/source_queue_benchmarking.cpp
- Transform the serial in-order scheduling to allow concurrent execution (using an out-of-order queue or multiple in-order queues)
- Measure the speedup
#### Exercise
Code_Exercises/In_Order_Queue/source_vector_add.cpp
![SYCL](../common-revealjs/images/in_order_diamond_data_flow.png "SYCL")
Take the diamond data flow graph we implemented in the last exercise and convert it to use an in-order `queue`.