## Queues (In and Out of Order)
## Learning Objectives
* Learn about out-of-order and in-order execution
* Learn how to use in-order queues for maximal performance
#### Out-of-order execution
![SYCL](../common-revealjs/images/out_of_order_execution.png "SYCL")
* SYCL `queue`s are out-of-order by default.
* This means commands may be overlapped, re-ordered, and executed concurrently, provided dependencies are honoured to ensure consistency.
#### In-order execution
![SYCL](../common-revealjs/images/in_order_execution.png "SYCL")
* SYCL `queue`s can be configured to be in-order.
* This means commands must execute strictly in the order in which they were enqueued.
#### Using an out-of-order queue (USM)
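    // `tasks` is a container of command group functions
    sycl::queue Q;
    for (const std::function<void(sycl::handler &)> &task : tasks)
      Q.submit(task);
    Q.wait();  // block until every submitted command has finished

* All commands can execute concurrently if the hardware allows it
* See the exercise for real-world data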
#### Using an out-of-order queue (USM): how to handle dependencies?
    sycl::queue Q;
    auto e1 = Q.submit(taskA);
    auto e2 = Q.submit([&](sycl::handler &cgh) {
      cgh.depends_on(e1);  // B starts only after A has completed
      taskB(cgh);
    });
    auto e3 = Q.submit([&](sycl::handler &cgh) {
      cgh.depends_on(e1);  // C also waits for A, but may overlap with B
      taskC(cgh);
    });
    Q.submit([&](sycl::handler &cgh) {
      cgh.depends_on({e2, e3});  // D waits for both B and C
      taskD(cgh);
    }).wait();

* Dependencies must be declared manually (error prone)
* Scheduling is done automatically by the runtime (maximizes concurrency)
* Scheduling is dynamic (potentially higher latency overhead than a manually crafted schedule)
#### Using an in-order queue (USM)
    sycl::queue Q{sycl::property::queue::in_order{}};
    for (const std::function<void(sycl::handler &)> &task : tasks)
      Q.submit(task);
    Q.wait();

* All commands are executed serially
* Ease of programming (commands on the queue cannot race with each other)
* Potentially lower latency than out-of-order queues
* Doesn't allow concurrency, so performance is potentially suboptimal
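
The trade-off between serial execution and concurrency can be made concrete with a small timing model. The sketch below is plain standard C++, not SYCL: the `Command` struct, the function names, and the durations are all hypothetical, and the out-of-order case assumes idealised hardware with unlimited parallelism and no scheduling overhead. It reuses the diamond dependency graph (A, then B and C, then D) from the earlier slide.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a command: a duration in arbitrary time units and
// the indices of the earlier commands it depends on.
struct Command {
  int duration;
  std::vector<std::size_t> deps;
};

// In-order queue: each command waits for the previous one, so the total
// time is the sum of all durations.
int in_order_makespan(const std::vector<Command>& cmds) {
  int total = 0;
  for (const Command& c : cmds) total += c.duration;
  return total;
}

// Out-of-order queue (idealised): a command starts as soon as all of its
// dependencies have finished, so the total time is the length of the
// longest dependency chain.
int out_of_order_makespan(const std::vector<Command>& cmds) {
  std::vector<int> finish(cmds.size(), 0);
  int makespan = 0;
  for (std::size_t i = 0; i < cmds.size(); ++i) {
    int start = 0;
    for (std::size_t d : cmds[i].deps) {
      assert(d < i);  // dependencies must refer to earlier commands
      start = std::max(start, finish[d]);
    }
    finish[i] = start + cmds[i].duration;
    makespan = std::max(makespan, finish[i]);
  }
  return makespan;
}

// Diamond graph A -> {B, C} -> D with made-up durations.
std::vector<Command> diamond() {
  return {
      Command{1, {}},      // A
      Command{2, {0}},     // B depends on A
      Command{3, {0}},     // C depends on A
      Command{1, {1, 2}},  // D depends on B and C
  };
}
```

With these made-up durations, `in_order_makespan(diamond())` gives 7 time units and `out_of_order_makespan(diamond())` gives 5, because B and C overlap. Real hardware adds scheduling latency and has finite resources, so the exercise measurements are the reference, not this model.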
#### Using multiple in-order queues to enable concurrency in kernel execution (USM)
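    // Each task is paired with the queue it has been mapped to
    std::vector<std::pair<sycl::queue, std::function<void(sycl::handler &)>>> queues_tasks;
    for (auto &[Q, task] : queues_tasks)
      Q.submit(task);

    for (auto &[Q, _] : queues_tasks)
      Q.wait();

* Manual scheduling (tasks need to be mapped to queues)
* Allows concurrency between queues
* Painful to extract full concurrency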
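
Why the mapping of tasks to queues matters can be shown with a small model. This is a sketch in plain standard C++, not SYCL: `QueueMapping` and `multi_queue_makespan` are hypothetical names, the durations are made up, and it assumes that each queue runs its own tasks serially, that queues run fully concurrently, and that tasks on different queues are independent.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Each inner vector holds the (made-up) durations of the tasks mapped to
// one queue.
using QueueMapping = std::vector<std::vector<int>>;

// Queues run concurrently and tasks within a queue run serially, so the
// total time is that of the most heavily loaded queue.
int multi_queue_makespan(const QueueMapping& mapping) {
  int makespan = 0;
  for (const std::vector<int>& queue_tasks : mapping) {
    int load = std::accumulate(queue_tasks.begin(), queue_tasks.end(), 0);
    makespan = std::max(makespan, load);
  }
  return makespan;
}
```

For four tasks with durations 4, 3, 2 and 1, a single queue takes 10 time units, the mapping `{{4, 3}, {2, 1}}` takes 7, and the balanced mapping `{{4, 1}, {3, 2}}` takes 5. Finding a good mapping by hand is the painful part.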
## Note on buffer/accessors
* Best of both worlds
* Buffers and accessors automatically handle dependencies for you (no need for `depends_on`)
* Hence, using in-order or out-of-order queues with buffers and accessors will not change the semantics of your program!
* You can use an out-of-order queue without having to manage dependencies manually.
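
The idea behind automatic dependency handling can be sketched as follows. This is a simplified model in plain standard C++, not how any particular SYCL runtime is implemented: the `Access` struct and the function names are hypothetical, and buffers are identified by plain strings. Each command declares which buffers it reads and writes, much as accessors do, and a command must wait for any earlier command with a conflicting access.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical description of a command: the buffers it reads and writes.
struct Access {
  std::set<std::string> reads;
  std::set<std::string> writes;
};

// True if the two sets share at least one buffer name.
static bool overlaps(const std::set<std::string>& a,
                     const std::set<std::string>& b) {
  for (const std::string& x : a)
    if (b.count(x) != 0) return true;
  return false;
}

// For each command, collect the earlier commands it must wait for. Two
// commands conflict if one writes a buffer the other reads or writes.
std::vector<std::set<std::size_t>> derive_dependencies(
    const std::vector<Access>& cmds) {
  std::vector<std::set<std::size_t>> deps(cmds.size());
  for (std::size_t i = 0; i < cmds.size(); ++i) {
    for (std::size_t j = 0; j < i; ++j) {
      bool raw = overlaps(cmds[j].writes, cmds[i].reads);   // read after write
      bool war = overlaps(cmds[j].reads, cmds[i].writes);   // write after read
      bool waw = overlaps(cmds[j].writes, cmds[i].writes);  // write after write
      if (raw || war || waw) deps[i].insert(j);
    }
  }
  return deps;
}

// Diamond data flow: A writes a; B and C read a; D reads the results of both.
std::vector<Access> diamond_accesses() {
  return {
      Access{std::set<std::string>{}, std::set<std::string>{"a"}},          // A
      Access{std::set<std::string>{"a"}, std::set<std::string>{"b"}},       // B
      Access{std::set<std::string>{"a"}, std::set<std::string>{"c"}},       // C
      Access{std::set<std::string>{"b", "c"}, std::set<std::string>{"d"}},  // D
  };
}
```

Calling `derive_dependencies(diamond_accesses())` recovers the diamond: B and C each depend on A, and D depends on B and C, without any of these edges being written by hand.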
## Questions
#### Exercise
Code_Exercises/In_Order_Queue/source_queue_benchmarking.cpp
- Transform the serial in-order scheduling to allow concurrent execution (using an out-of-order queue or multiple in-order queues)
- Measure the speedup
#### Exercise
Code_Exercises/In_Order_Queue/source_vector_add.cpp
![SYCL](../common-revealjs/images/in_order_diamond_data_flow.png "SYCL")
Take the diamond data flow graph we implemented in the last exercise and convert it to use an in-order `queue`.