## Learning Objectives
* Learn about the USM and buffer/accessor models for managing data
* Learn how to allocate, transfer and free memory using USM.
* Learn how a buffer synchronizes data
* Learn how to access data in a kernel function
#### Memory Models
* In SYCL there are two models for managing data:
  * The buffer/accessor model.
  * The USM (unified shared memory) model.
* The model you choose affects how you enqueue kernel functions.
#### CPU and GPU Memory
* A GPU has its own memory, separate from CPU memory.
* In order for the GPU to use memory from the CPU, the following actions must take place (either explicitly or implicitly):
  * Memory allocation on the GPU.
  * Data migration from the CPU to the allocation on the GPU.
  * Some computation on the GPU.
  * Migration of the result back to the CPU.
#### CPU and GPU Memory
* Memory transfers between CPU and GPU are a bottleneck.
* We want to minimize these transfers, when possible.
#### USM Allocation Types
* There are different ways USM memory can be allocated: host, device and shared.
![SYCL](../common-revealjs/images/Figure6-1bookUSMtypes.png "SYCL")
(from book)
#### Using USM - Malloc Device
    // Allocate memory on device
    T *device_ptr = sycl::malloc_device<T>(n, myQueue);
    // Copy data to device; wait so the copy finishes before the device uses it
    myQueue.memcpy(device_ptr, cpu_ptr, n * sizeof(T)).wait();
    // ...
    // Do some computation on device
    // ...
    // Copy data back to CPU
    myQueue.memcpy(result_ptr, device_ptr, n * sizeof(T)).wait();
    // Free allocated data
    sycl::free(device_ptr, myQueue);

* Free USM allocations with `sycl::free` once they are no longer needed, to avoid memory leaks.
#### Using USM - Malloc Shared
    // Allocate shared memory
    T *shared_ptr = sycl::malloc_shared<T>(n, myQueue);
    // Shared memory can be accessed on host as well as device
    for (size_t i = 0; i < n; ++i)
      shared_ptr[i] = i;
    // ...
    // Do some computation on device
    // ...
    // Free allocated data once all device work using it has completed
    sycl::free(shared_ptr, myQueue);

* Shared memory is accessible on host and device.
* Performance of shared memory accesses may be poor depending on platform.
#### SYCL Buffers & Accessors
* SYCL provides an API which takes care of allocations and `memcpy`s, as well as dependency analysis between kernels.
#### SYCL Buffers & Accessors
* The buffer/accessor model separates the storage and access of data
* A SYCL buffer manages data across the host and any number of devices
* A SYCL accessor requests access to data on the host or on a device for a specific SYCL kernel function
* Accessors are also used to access data within a SYCL kernel function
* This means they are declared in the host code but captured by and then accessed within a SYCL kernel function
#### SYCL Buffers & Accessors
* A SYCL buffer can be constructed with a pointer to host memory
* For the lifetime of the buffer this memory is owned by the SYCL runtime
* When a buffer object is constructed it will not allocate or copy to device memory at first
* This will only happen once the SYCL runtime knows the data needs to be accessed and where it needs to be accessed
![Buffer Host Memory](../common-revealjs/images/buffer-hostmemory.png "Buffer Host Memory")
#### SYCL Buffers & Accessors
* Constructing an accessor specifies a request to access the data managed by the buffer
* There is a range of different types of accessor which provide different ways to access data
![Buffer Host Memory Accessor](../common-revealjs/images/buffer-hostmemory-accessor.png "Buffer Host Memory Accessor")
#### SYCL Buffers & Accessors
* When an accessor is constructed it is associated with a command group via the handler object
* This connects the buffer that is being accessed, the way in which it’s being accessed and the device that the command group is being submitted to
![Buffer Host Memory Accessor CG](../common-revealjs/images/buffer-hostmemory-accessor-cg.png "Buffer Host Memory Accessor CG")
#### SYCL Buffers & Accessors
* Once the SYCL scheduler selects the command group to be executed it must first satisfy its data dependencies
* If necessary, this includes allocating and copying the data to the device accessing that data
* If the most recent copy of the data is already on the device then the runtime will not copy again
![Buffer Host Memory Accessor CG Device](../common-revealjs/images/buffer-hostmemory-accessor-cg-device.png "Buffer Host Memory Accessor CG Device")
#### SYCL Buffers & Accessors
* Data will remain in device memory after kernels finish executing until another accessor requests access on a different device or on the host
* When the buffer object is destroyed it will wait for any outstanding work that is accessing the data to complete and then copy back to the original host memory
![Buffer Destroyed](../common-revealjs/images/buffer-destroyed.png "Buffer Destroyed")
#### SYCL Buffers & Accessors
    T var = 42;
    {
      // Create buffer pointing to var.
      auto buf = sycl::buffer{&var, sycl::range<1>{1}};
      // ...
      // Do some computation on device. Use accessors to access buffer
      // ...
    } // var updated here
    assert(var != 42);

* A buffer is associated with a type, range and dimensionality. Dimensionality must be 1, 2 or 3.
* Usually type and dimensionality can be inferred using class template argument deduction (CTAD).
* If a buffer is associated with some allocation in host memory, the host memory is only guaranteed to be updated once the buffer is destroyed, for example when it goes out of scope.
#### Accessor class
![Accessor Types](../common-revealjs/images/accessor-types.png "Accessor Types")
#### Accessor class
* There are many different ways to use the `accessor` class:
  * Accessing data on a device.
  * Accessing data immediately in the host application.
  * Allocating local memory.
* For now we are going to focus on accessing data on a device.
#### Constructing an accessor
    auto acc = sycl::accessor{bufA, cgh};

* There are many ways to construct an `accessor`.
* The `accessor` class supports CTAD so it's not
necessary to specify all of the template arguments.
* The most common way to construct an `accessor` is from
a `buffer` and a `handler` associated with the command
group function you are within.
* The element type and dimensionality are inferred from
the `buffer`.
* The `access_mode` is defaulted to
`access_mode::read_write`.
#### Specifying the access mode
    auto readAcc = sycl::accessor{bufA, cgh, sycl::read_only};
    auto writeAcc = sycl::accessor{bufB, cgh, sycl::write_only};

* When constructing an `accessor` you will likely also want to specify the `access_mode`.
* You can do this by passing one of the CTAD deduction tags:
  * `read_only` will result in `access_mode::read`.
  * `write_only` will result in `access_mode::write`.
#### Specifying no initialization
    auto acc = sycl::accessor{buf, cgh, sycl::no_init};

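* When constructing an `accessor` you may also want to discard the original data of a `buffer`, for example when the kernel overwrites every element.
* You can do this by passing the `no_init` property, which tells the runtime it does not need to copy the existing contents to the device.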
#### Using Accessors
    T varA = 42;
    T varB = 0;
    {
      // Create one buffer per host variable.
      auto bufA = sycl::buffer{&varA, sycl::range<1>{1}};
      auto bufB = sycl::buffer{&varB, sycl::range<1>{1}};
      q.submit([&](sycl::handler &cgh) {
        auto accA = sycl::accessor{bufA, cgh, sycl::read_only};
        auto accB = sycl::accessor{bufB, cgh, sycl::no_init};
        cgh.single_task<mykernel>(...); // Do some work
      });
    } // varB updated here
    assert(varB != 0); // assumes the kernel writes a non-zero value

* Buffers and accessors take care of memory
migration, as well as dependency analysis.
* More to come later on dependencies.
#### operator[]
    gpuQueue.submit([&](sycl::handler &cgh){
      auto inA = sycl::accessor{bufA, cgh, sycl::read_only};
      auto inB = sycl::accessor{bufB, cgh, sycl::read_only};
      auto out = sycl::accessor{bufO, cgh, sycl::write_only};
      cgh.single_task<mykernel>([=]{
        out[0] = inA[0] + inB[0];
      });
    });

* As well as specifying data dependencies an `accessor`
can also be used to access the data from within a kernel
function.
* You can do this by calling `operator[]` on the
`accessor`.
* A USM pointer is indexed with a single integral offset (such as a `size_t`), whereas `operator[]` for accessors can take a multi-dimensional `sycl::id` or a `size_t`.
#### Exercise
Code_Exercises/Managing_Data/source
Implement a SYCL application that adds two variables
and returns the result using:
1. The USM memory model
2. The buffer/accessor memory model.