It seems that when creating a new Thrust vector, all elements are 0 by default. I just want to confirm that this will always be the case.
If so, is there a way to bypass the constructor responsible for this behavior for additional speed (since for some vectors I don't need them to have an initial value, e.g. if their raw pointers are being passed to cuBLAS as an output)?
thrust::device_vector constructs the elements it contains using the supplied allocator, just like std::vector. It's possible to control what the allocator does when the vector asks it to construct an element.
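To make that hook concrete, here is a minimal host-side sketch using std::vector as a stand-in for device_vector, on the assumption (stated above) that both route element construction through the allocator. The names counting_allocator, construct_calls and count_constructs are hypothetical and exist only for this illustration; they are not part of Thrust or the standard library.

```cpp
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical counter: number of construct() calls seen so far.
std::size_t& construct_calls() {
    static std::size_t n = 0;
    return n;
}

// Hypothetical minimal allocator that records every construct() request.
template <typename T>
struct counting_allocator {
    using value_type = T;

    counting_allocator() = default;
    template <typename U>
    counting_allocator(const counting_allocator<U>&) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }

    // The container calls this (through std::allocator_traits) per element.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        ++construct_calls();
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }
};

template <typename T, typename U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&) { return false; }

// Creates a vector of n ints and returns how many construct() calls it made.
std::size_t count_constructs(std::size_t n) {
    construct_calls() = 0;
    std::vector<int, counting_allocator<int>> v(n);
    return construct_calls();
}
```

Because the vector asks the allocator once per element, replacing the body of construct changes what happens to every element, which is the lever the answer relies on.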
Use a custom allocator to avoid default-initialization of vector elements:
    #include <thrust/device_malloc_allocator.h>
    #include <thrust/device_vector.h>

    // uninitialized_allocator is an allocator which
    // derives from device_malloc_allocator and which
    // has a no-op construct member function
    template<typename T>
    struct uninitialized_allocator
      : thrust::device_malloc_allocator<T>
    {
      // note that construct is annotated as
      // a __host__ __device__ function
      __host__ __device__
      void construct(T *p)
      {
        // no-op
      }
    };

    // to make a device_vector which does not initialize its elements,
    // use uninitialized_allocator as the 2nd template parameter
    typedef thrust::device_vector<float, uninitialized_allocator<float> > uninitialized_vector;
You will still incur the cost of a kernel launch to invoke uninitialized_allocator::construct, but that kernel will be a no-op which will retire quickly. What you're really interested in is avoiding the memory bandwidth required to fill the array, which this solution does.
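The same idea can be tried on the host without a GPU. The sketch below is an analogue, not Thrust code: default_init_allocator, write_squares and make_squares are hypothetical names, and write_squares merely stands in for a library routine (such as a cuBLAS call) that writes its result through a raw pointer. The allocator default-initializes elements instead of value-initializing them, so for float no zero fill is performed before the buffer is overwritten.

```cpp
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical host-side counterpart of uninitialized_allocator.
template <typename T>
struct default_init_allocator {
    using value_type = T;

    default_init_allocator() = default;
    template <typename U>
    default_init_allocator(const default_init_allocator<U>&) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }

    // Default construction: default-initialize only, so a float is not zeroed.
    template <typename U>
    void construct(U* p) { ::new (static_cast<void*>(p)) U; }

    // Any other construction forwards its arguments as usual.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }
};

template <typename T, typename U>
bool operator==(const default_init_allocator<T>&, const default_init_allocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const default_init_allocator<T>&, const default_init_allocator<U>&) { return false; }

// Stand-in for a routine that fills a raw output buffer.
void write_squares(float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = static_cast<float>(i * i);
}

// Allocates n floats without filling them, then lets the routine write them.
std::vector<float, default_init_allocator<float>> make_squares(std::size_t n) {
    std::vector<float, default_init_allocator<float>> v(n);  // contents indeterminate here
    write_squares(v.data(), v.size());
    return v;
}
```

As with the device version, the elements must be written before they are read, since their initial contents are indeterminate.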
There's a complete example code here.
Note that this technique requires Thrust 1.7 or better.