STLdoc
STLdocumentation
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Groups Pages
Classes | Typedefs | Enumerations | Functions | Variables
__gnu_parallel Namespace Reference

GNU parallel code for public use. More...

Classes

struct  _QSBThreadLocal
 Information local to one thread in the parallel quicksort run. More...
 
class  _EqualFromLess
 Constructs predicate for equality from strict weak ordering predicate. More...
 
class  __unary_negate
 Similar to std::unary_negate, but giving the argument types explicitly. More...
 
class  __binder1st
 Similar to std::binder1st, but giving the argument types explicitly. More...
 
class  __binder2nd
 Similar to std::binder2nd, but giving the argument types explicitly. More...
 
struct  _EqualTo
 Similar to std::equal_to, but allows two different types. More...
 
struct  _Less
 Similar to std::less, but allows two different types. More...
 
struct  _Less< _Tp, _Tp >
 
struct  _Plus
 Similar to std::plus, but allows two different types. More...
 
struct  _Plus< _Tp, _Tp, _Tp >
 
struct  _Multiplies
 Similar to std::multiplies, but allows two different types. More...
 
struct  _Multiplies< _Tp, _Tp, _Tp >
 
class  _PseudoSequenceIterator
 _Iterator associated with __gnu_parallel::_PseudoSequence. If features the usual random-access iterator functionality. More...
 
class  _PseudoSequence
 Sequence that conceptually consists of multiple copies of the same element. The copies are not stored explicitly, of course. More...
 
struct  __generic_find_selector
 Base class of all __gnu_parallel::__find_template selectors. More...
 
struct  __find_if_selector
 Test predicate on a single element, used for std::find() and std::find_if (). More...
 
struct  __adjacent_find_selector
 Test predicate on two adjacent elements. More...
 
struct  __mismatch_selector
 Test inverted predicate on a single element. More...
 
struct  __find_first_of_selector
 Test predicate on several elements. More...
 
struct  __generic_for_each_selector
 Generic __selector for embarrassingly parallel functions. More...
 
struct  __for_each_selector
 std::for_each() selector. More...
 
struct  __generate_selector
 std::generate() selector. More...
 
struct  __fill_selector
 std::fill() selector. More...
 
struct  __transform1_selector
 std::transform() __selector, one input sequence variant. More...
 
struct  __transform2_selector
 std::transform() __selector, two input sequences variant. More...
 
struct  __replace_selector
 std::replace() selector. More...
 
struct  __replace_if_selector
 std::replace() selector. More...
 
struct  __count_selector
 std::count() selector. More...
 
struct  __count_if_selector
 std::count_if () selector. More...
 
struct  __accumulate_selector
 std::accumulate() selector. More...
 
struct  __inner_product_selector
 std::inner_product() selector. More...
 
struct  __identity_selector
 Selector that just returns the passed iterator. More...
 
struct  __adjacent_difference_selector
 Selector that returns the difference between two adjacent __elements. More...
 
struct  _Nothing
 Functor doing nothing. More...
 
struct  _DummyReduct
 Reduction function doing nothing. More...
 
struct  __min_element_reduct
 Reduction for finding the maximum element, using a comparator. More...
 
struct  __max_element_reduct
 Reduction for finding the maximum element, using a comparator. More...
 
struct  __accumulate_binop_reduct
 General reduction, using a binary operator. More...
 
class  _IteratorPair
 A pair of iterators. The usual iterator operations are applied to both child iterators. More...
 
class  _IteratorTriple
 A triple of iterators. The usual iterator operations are applied to all three child iterators. More...
 
class  _LoserTreeBase
 Guarded loser/tournament tree. More...
 
class  _LoserTree
 Stable _LoserTree variant. More...
 
class  _LoserTree< false, _Tp, _Compare >
 Unstable _LoserTree variant. More...
 
class  _LoserTreePointerBase
 Base class of _Loser Tree implementation using pointers. More...
 
class  _LoserTreePointer
 Stable _LoserTree implementation. More...
 
class  _LoserTreePointer< false, _Tp, _Compare >
 Unstable _LoserTree implementation. More...
 
class  _LoserTreeUnguardedBase
 Base class for unguarded _LoserTree implementation. More...
 
class  _LoserTreeUnguarded
 Stable implementation of unguarded _LoserTree. More...
 
class  _LoserTreeUnguarded< false, _Tp, _Compare >
 Non-Stable implementation of unguarded _LoserTree. More...
 
class  _LoserTreePointerUnguardedBase
 Unguarded loser tree, keeping only pointers to the elements in the tree structure. More...
 
class  _LoserTreePointerUnguarded
 Stable unguarded _LoserTree variant storing pointers. More...
 
class  _LoserTreePointerUnguarded< false, _Tp, _Compare >
 Unstable unguarded _LoserTree variant storing pointers. More...
 
class  _Lexicographic
 Compare __a pair of types lexicographically, ascending. More...
 
class  _LexicographicReverse
 Compare __a pair of types lexicographically, descending. More...
 
class  _GuardedIterator
 _Iterator wrapper supporting an implicit supremum at the end of the sequence, dominating all comparisons. More...
 
class  _UnguardedIterator
 
struct  _LoserTreeTraits
 Traits for determining whether the loser tree should use pointers or copies. More...
 
struct  __multiway_merge_3_variant_sentinel_switch
 Switch for 3-way merging with __sentinels turned off. More...
 
struct  __multiway_merge_3_variant_sentinel_switch< true, _RAIterIterator, _RAIter3, _DifferenceTp, _Compare >
 Switch for 3-way merging with __sentinels turned on. More...
 
struct  __multiway_merge_4_variant_sentinel_switch
 Switch for 4-way merging with __sentinels turned off. More...
 
struct  __multiway_merge_4_variant_sentinel_switch< true, _RAIterIterator, _RAIter3, _DifferenceTp, _Compare >
 Switch for 4-way merging with __sentinels turned on. More...
 
struct  __multiway_merge_k_variant_sentinel_switch
 Switch for k-way merging with __sentinels turned on. More...
 
struct  __multiway_merge_k_variant_sentinel_switch< false, __stable, _RAIterIterator, _RAIter3, _DifferenceTp, _Compare >
 Switch for k-way merging with __sentinels turned off. More...
 
struct  _SamplingSorter
 Stable sorting functor. More...
 
struct  _SamplingSorter< false, _RAIter, _StrictWeakOrdering >
 Non-__stable sorting functor. More...
 
struct  _Piece
 Subsequence description. More...
 
struct  _PMWMSSortingData
 Data accessed by all threads. More...
 
struct  _SplitConsistently
 Split consistently. More...
 
struct  _SplitConsistently< true, _RAIter, _Compare, _SortingPlacesIterator >
 Split by exact splitting. More...
 
struct  _SplitConsistently< false, _RAIter, _Compare, _SortingPlacesIterator >
 Split by sampling. More...
 
struct  __possibly_stable_sort
 
struct  __possibly_stable_sort< true, _RAIter, _Compare >
 
struct  __possibly_stable_sort< false, _RAIter, _Compare >
 
struct  __possibly_stable_multiway_merge
 
struct  __possibly_stable_multiway_merge< true, Seq_RAIter, _RAIter, _Compare, _DiffType >
 
struct  __possibly_stable_multiway_merge< false, Seq_RAIter, _RAIter, _Compare, _DiffType >
 
class  _RestrictedBoundedConcurrentQueue
 Double-ended queue of bounded size, allowing lock-free atomic access. push_front() and pop_front() must not be called concurrently to each other, while pop_back() can be called concurrently at all times. empty(), size(), and top() are intentionally not provided. Calling them would not make sense in a concurrent setting. More...
 
class  _RandomNumber
 Random number generator, based on the Mersenne twister. More...
 
struct  _DRandomShufflingGlobalData
 Data known to every thread participating in __gnu_parallel::__parallel_random_shuffle(). More...
 
struct  _DRSSorterPU
 Local data for a thread participating in __gnu_parallel::__parallel_random_shuffle(). More...
 
struct  __symmetric_difference_func
 
struct  __difference_func
 
struct  __intersection_func
 
struct  __union_func
 
struct  _Settings
 
struct  sequential_tag
 Forces sequential execution at compile time. More...
 
struct  parallel_tag
 Recommends parallel execution at compile time, optionally using a user-specified number of threads. More...
 
struct  default_parallel_tag
 Recommends parallel execution using the default parallel algorithm. More...
 
struct  balanced_tag
 Recommends parallel execution using dynamic load-balancing at compile time. More...
 
struct  unbalanced_tag
 Recommends parallel execution using static load-balancing at compile time. More...
 
struct  omp_loop_tag
 Recommends parallel execution using OpenMP dynamic load-balancing at compile time. More...
 
struct  omp_loop_static_tag
 Recommends parallel execution using OpenMP static load-balancing at compile time. More...
 
struct  find_tag
 Base class for for std::find() variants. More...
 
struct  exact_tag
 Forces parallel merging with exact splitting, at compile time. More...
 
struct  sampling_tag
 Forces parallel merging with exact splitting, at compile time. More...
 
struct  multiway_mergesort_tag
 Forces parallel sorting using multiway mergesort at compile time. More...
 
struct  multiway_mergesort_exact_tag
 Forces parallel sorting using multiway mergesort with exact splitting at compile time. More...
 
struct  multiway_mergesort_sampling_tag
 Forces parallel sorting using multiway mergesort with splitting by sampling at compile time. More...
 
struct  quicksort_tag
 Forces parallel sorting using unbalanced quicksort at compile time. More...
 
struct  balanced_quicksort_tag
 Forces parallel sorting using balanced quicksort at compile time. More...
 
struct  growing_blocks_tag
 Selects the growing block size variant for std::find(). More...
 
struct  constant_size_blocks_tag
 Selects the constant block size variant for std::find(). More...
 
struct  equal_split_tag
 Selects the equal splitting variant for std::find(). More...
 
struct  _Job
 One __job for a certain thread. More...
 

Typedefs

typedef unsigned short _BinIndex
 Type to hold the index of a bin. More...
 
typedef uint64_t _SequenceIndex
 Unsigned integer to index __elements. The total number of elements for each algorithm must fit into this type. More...
 
typedef uint16_t _ThreadIndex
 Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit into this type. More...
 
typedef int64_t _CASable
 Longest compare-and-swappable integer type on this platform. More...
 

Enumerations

enum  _Parallelism {
  sequential, parallel_unbalanced, parallel_balanced, parallel_omp_loop,
  parallel_omp_loop_static, parallel_taskqueue
}
 Run-time equivalents for the compile-time tags. More...
 
enum  _AlgorithmStrategy { heuristic, force_sequential, force_parallel }
 Strategies for run-time algorithm selection: More...
 
enum  _SortAlgorithm { MWMS, QS, QS_BALANCED }
 Sorting algorithms: More...
 
enum  _MultiwayMergeAlgorithm { LOSER_TREE }
 Merging algorithms: More...
 
enum  _PartialSumAlgorithm { RECURSIVE, LINEAR }
 Partial sum algorithms: recursive, linear. More...
 
enum  _SplittingAlgorithm { SAMPLING, EXACT }
 Sorting/merging algorithms: sampling, __exact. More...
 
enum  _FindAlgorithm { GROWING_BLOCKS, CONSTANT_SIZE_BLOCKS, EQUAL_SPLIT }
 Find algorithms: More...
 

Functions

template<typename _RAIter , typename _Compare >
std::iterator_traits< _RAIter >
::difference_type 
__qsb_divide (_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
 Balanced quicksort divide step. More...
 
template<typename _RAIter , typename _Compare >
void __qsb_conquer (_QSBThreadLocal< _RAIter > **__tls, _RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __iam, _ThreadIndex __num_threads, bool __parent_wait)
 Quicksort conquer step. More...
 
template<typename _RAIter , typename _Compare >
void __qsb_local_sort_with_helping (_QSBThreadLocal< _RAIter > **__tls, _Compare &__comp, _ThreadIndex __iam, bool __wait)
 Quicksort step doing load-balanced local sort. More...
 
template<typename _RAIter , typename _Compare >
void __parallel_sort_qsb (_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
 Top-level quicksort routine. More...
 
_ThreadIndex __get_max_threads ()
 
bool __is_parallel (const _Parallelism __p)
 
template<typename _Size >
_Size __rd_log2 (_Size __n)
 Calculates the rounded-down logarithm of __n for base 2. More...
 
_CASable __encode2 (int __a, int __b)
 Encode two integers into one gnu_parallel::_CASable. More...
 
void __decode2 (_CASable __x, int &__a, int &__b)
 Decode two integers from one gnu_parallel::_CASable. More...
 
template<typename _Tp >
const _Tp & min (const _Tp &__a, const _Tp &__b)
 Equivalent to std::min. More...
 
template<typename _Tp >
const _Tp & max (const _Tp &__a, const _Tp &__b)
 Equivalent to std::max. More...
 
template<typename _RAIter , typename _Compare >
_RAIter __median_of_three_iterators (_RAIter __a, _RAIter __b, _RAIter __c, _Compare __comp)
 Compute the median of three referenced elements, according to __comp. More...
 
template<typename _IIter , typename _Compare >
bool __is_sorted (_IIter __begin, _IIter __end, _Compare __comp)
 Check whether [__begin, __end) is sorted according to __comp. More...
 
template<typename _Tp >
_Tp __add_omp (volatile _Tp *__ptr, _Tp __addend)
 
template<typename _Tp >
_Tp __fetch_and_add (volatile _Tp *__ptr, _Tp __addend)
 Add a value to a variable, atomically. More...
 
template<typename _Tp >
bool __cas_omp (volatile _Tp *__ptr, _Tp __comparand, _Tp __replacement)
 
template<typename _Tp >
bool __compare_and_swap (volatile _Tp *__ptr, _Tp __comparand, _Tp __replacement)
 Compare-and-swap. More...
 
void __yield ()
 Yield control to another thread, without waiting for the end of the time slice. More...
 
template<typename _DifferenceType , typename _OutputIterator >
_OutputIterator __equally_split (_DifferenceType __n, _ThreadIndex __num_threads, _OutputIterator __s)
 function to split a sequence into parts of almost equal size. More...
 
template<typename _DifferenceType >
_DifferenceType __equally_split_point (_DifferenceType __n, _ThreadIndex __num_threads, _ThreadIndex __thread_no)
 function to split a sequence into parts of almost equal size. More...
 
template<typename _RAIter1 , typename _RAIter2 , typename _Pred , typename _Selector >
std::pair< _RAIter1, _RAIter2 > __find_template (_RAIter1 __begin1, _RAIter1 __end1, _RAIter2 __begin2, _Pred __pred, _Selector __selector)
 Parallel std::find, switch for different algorithms. More...
 
template<typename _IIter , typename _UserOp , typename _Functionality , typename _Red , typename _Result >
_UserOp __for_each_template_random_access (_IIter __begin, _IIter __end, _UserOp __user_op, _Functionality &__functionality, _Red __reduction, _Result __reduction_start, _Result &__output, typename std::iterator_traits< _IIter >::difference_type __bound, _Parallelism __parallelism_tag)
 Chose the desired algorithm by evaluating __parallelism_tag. More...
 
template<typename _IIter >
void __shrink_and_double (std::vector< _IIter > &__os_starts, size_t &__count_to_two, size_t &__range_length, const bool __make_twice)
 Shrinks and doubles the ranges. More...
 
template<typename _IIter >
void __shrink (std::vector< _IIter > &__os_starts, size_t &__count_to_two, size_t &__range_length)
 Combines two ranges into one and thus halves the number of ranges. More...
 
template<typename _IIter , typename _FunctorType >
size_t list_partition (const _IIter __begin, const _IIter __end, _IIter *__starts, size_t *__lengths, const int __num_parts, _FunctorType &__f, int __oversampling=0)
 Splits a sequence given by input iterators into parts of almost equal size. More...
 
template<typename _RAIter1 , typename _RAIter2 , typename _OutputIterator , typename _DifferenceTp , typename _Compare >
_OutputIterator __merge_advance_usual (_RAIter1 &__begin1, _RAIter1 __end1, _RAIter2 &__begin2, _RAIter2 __end2, _OutputIterator __target, _DifferenceTp __max_length, _Compare __comp)
 Merge routine being able to merge only the __max_length smallest elements. More...
 
template<typename _RAIter1 , typename _RAIter2 , typename _OutputIterator , typename _DifferenceTp , typename _Compare >
_OutputIterator __merge_advance_movc (_RAIter1 &__begin1, _RAIter1 __end1, _RAIter2 &__begin2, _RAIter2 __end2, _OutputIterator __target, _DifferenceTp __max_length, _Compare __comp)
 Merge routine being able to merge only the __max_length smallest elements. More...
 
template<typename _RAIter1 , typename _RAIter2 , typename _OutputIterator , typename _DifferenceTp , typename _Compare >
_OutputIterator __merge_advance (_RAIter1 &__begin1, _RAIter1 __end1, _RAIter2 &__begin2, _RAIter2 __end2, _OutputIterator __target, _DifferenceTp __max_length, _Compare __comp)
 Merge routine being able to merge only the __max_length smallest elements. More...
 
template<typename _RAIter1 , typename _RAIter2 , typename _RAIter3 , typename _Compare >
_RAIter3 __parallel_merge_advance (_RAIter1 &__begin1, _RAIter1 __end1, _RAIter2 &__begin2, _RAIter2 __end2, _RAIter3 __target, typename std::iterator_traits< _RAIter1 >::difference_type __max_length, _Compare __comp)
 Merge routine fallback to sequential in case the iterators of the two input sequences are of different type. More...
 
template<typename _RAIter1 , typename _RAIter3 , typename _Compare >
_RAIter3 __parallel_merge_advance (_RAIter1 &__begin1, _RAIter1 __end1, _RAIter1 &__begin2, _RAIter1 __end2, _RAIter3 __target, typename std::iterator_traits< _RAIter1 >::difference_type __max_length, _Compare __comp)
 Parallel merge routine being able to merge only the __max_length smallest elements. More...
 
template<typename _RanSeqs , typename _RankType , typename _RankIterator , typename _Compare >
void multiseq_partition (_RanSeqs __begin_seqs, _RanSeqs __end_seqs, _RankType __rank, _RankIterator __begin_offsets, _Compare __comp=std::less< typename std::iterator_traits< typename std::iterator_traits< _RanSeqs >::value_type::first_type >::value_type >())
 Splits several sorted sequences at a certain global __rank, resulting in a splitting point for each sequence. The sequences are passed via a sequence of random-access iterator pairs, none of the sequences may be empty. If there are several equal elements across the split, the ones on the __left side will be chosen from sequences with smaller number. More...
 
template<typename _Tp , typename _RanSeqs , typename _RankType , typename _Compare >
_Tp multiseq_selection (_RanSeqs __begin_seqs, _RanSeqs __end_seqs, _RankType __rank, _RankType &__offset, _Compare __comp=std::less< _Tp >())
 Selects the element at a certain global __rank from several sorted sequences. More...
 
template<template< typename RAI, typename C > class iterator, typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Compare >
_RAIter3 multiway_merge_3_variant (_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, _DifferenceTp __length, _Compare __comp)
 Highly efficient 3-way merging procedure. More...
 
template<template< typename RAI, typename C > class iterator, typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Compare >
_RAIter3 multiway_merge_4_variant (_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, _DifferenceTp __length, _Compare __comp)
 Highly efficient 4-way merging procedure. More...
 
template<typename _LT , typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Compare >
_RAIter3 multiway_merge_loser_tree (_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, _DifferenceTp __length, _Compare __comp)
 Multi-way merging procedure for a high branching factor, guarded case. More...
 
template<typename _LT , typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Compare >
_RAIter3 multiway_merge_loser_tree_unguarded (_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
 Multi-way merging procedure for a high branching factor, unguarded case. More...
 
template<typename UnguardedLoserTree , typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Compare >
_RAIter3 multiway_merge_loser_tree_sentinel (_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
 Multi-way merging procedure for a high branching factor, requiring sentinels to exist. More...
 
template<bool __stable, bool __sentinels, typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Compare >
_RAIter3 __sequential_multiway_merge (_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
 Sequential multi-way merging switch. More...
 
template<bool __stable, typename _RAIterIterator , typename _Compare , typename _DifferenceType >
void multiway_merge_sampling_splitting (_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _DifferenceType __length, _DifferenceType __total_length, _Compare __comp, std::vector< std::pair< _DifferenceType, _DifferenceType > > *__pieces)
 Sampling based splitting for parallel multiway-merge routine. More...
 
template<bool __stable, typename _RAIterIterator , typename _Compare , typename _DifferenceType >
void multiway_merge_exact_splitting (_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _DifferenceType __length, _DifferenceType __total_length, _Compare __comp, std::vector< std::pair< _DifferenceType, _DifferenceType > > *__pieces)
 Exact splitting for parallel multiway-merge routine. More...
 
template<bool __stable, bool __sentinels, typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Splitter , typename _Compare >
_RAIter3 parallel_multiway_merge (_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, _Splitter __splitter, _DifferenceTp __length, _Compare __comp, _ThreadIndex __num_threads)
 Parallel multi-way merge routine. More...
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut multiway_merge (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, __gnu_parallel::sequential_tag)
 Multiway Merge Frontend. More...
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut multiway_merge (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, __gnu_parallel::exact_tag __tag)
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut multiway_merge (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, __gnu_parallel::sampling_tag __tag)
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut multiway_merge (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, parallel_tag __tag=parallel_tag(0))
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut multiway_merge (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, default_parallel_tag __tag)
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut stable_multiway_merge (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, __gnu_parallel::sequential_tag)
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut stable_multiway_merge (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, __gnu_parallel::exact_tag __tag)
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut stable_multiway_merge (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, sampling_tag __tag)
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut stable_multiway_merge (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, parallel_tag __tag=parallel_tag(0))
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut stable_multiway_merge (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, default_parallel_tag __tag)
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut multiway_merge_sentinels (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, __gnu_parallel::sequential_tag)
 Multiway Merge Frontend. More...
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut multiway_merge_sentinels (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, __gnu_parallel::exact_tag __tag)
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut multiway_merge_sentinels (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, sampling_tag __tag)
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut multiway_merge_sentinels (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, parallel_tag __tag=parallel_tag(0))
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut multiway_merge_sentinels (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, default_parallel_tag __tag)
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut stable_multiway_merge_sentinels (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, __gnu_parallel::sequential_tag)
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut stable_multiway_merge_sentinels (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, __gnu_parallel::exact_tag __tag)
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut stable_multiway_merge_sentinels (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, sampling_tag __tag)
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut stable_multiway_merge_sentinels (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, parallel_tag __tag=parallel_tag(0))
 
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut stable_multiway_merge_sentinels (_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, default_parallel_tag __tag)
 
template<typename _RAIter , typename _DifferenceTp >
void __determine_samples (_PMWMSSortingData< _RAIter > *__sd, _DifferenceTp __num_samples)
 Select _M_samples from a sequence. More...
 
template<bool __stable, bool __exact, typename _RAIter , typename _Compare >
void parallel_sort_mwms_pu (_PMWMSSortingData< _RAIter > *__sd, _Compare &__comp)
 PMWMS code executed by each thread. More...
 
template<bool __stable, bool __exact, typename _RAIter , typename _Compare >
void parallel_sort_mwms (_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
 PMWMS main call. More...
 
template<typename _RAIter , typename _Op , typename _Fu , typename _Red , typename _Result >
_Op __for_each_template_random_access_omp_loop (_RAIter __begin, _RAIter __end, _Op __o, _Fu &__f, _Red __r, _Result __base, _Result &__output, typename std::iterator_traits< _RAIter >::difference_type __bound)
 Embarrassingly parallel algorithm for random access iterators, using an OpenMP for loop. More...
 
template<typename _RAIter , typename _Op , typename _Fu , typename _Red , typename _Result >
_Op __for_each_template_random_access_omp_loop_static (_RAIter __begin, _RAIter __end, _Op __o, _Fu &__f, _Red __r, _Result __base, _Result &__output, typename std::iterator_traits< _RAIter >::difference_type __bound)
 Embarrassingly parallel algorithm for random access iterators, using an OpenMP for loop with static scheduling. More...
 
template<typename _RAIter , typename _Op , typename _Fu , typename _Red , typename _Result >
_Op __for_each_template_random_access_ed (_RAIter __begin, _RAIter __end, _Op __o, _Fu &__f, _Red __r, _Result __base, _Result &__output, typename std::iterator_traits< _RAIter >::difference_type __bound)
 Embarrassingly parallel algorithm for random access iterators, using hand-crafted parallelization by equal splitting the work. More...
 
template<typename _IIter , typename _OutputIterator , typename _BinaryOperation >
_OutputIterator __parallel_partial_sum_basecase (_IIter __begin, _IIter __end, _OutputIterator __result, _BinaryOperation __bin_op, typename std::iterator_traits< _IIter >::value_type __value)
 Base case prefix sum routine. More...
 
template<typename _IIter , typename _OutputIterator , typename _BinaryOperation >
_OutputIterator __parallel_partial_sum_linear (_IIter __begin, _IIter __end, _OutputIterator __result, _BinaryOperation __bin_op, typename std::iterator_traits< _IIter >::difference_type __n)
 Parallel partial sum implementation, two-phase approach, no recursion. More...
 
template<typename _IIter , typename _OutputIterator , typename _BinaryOperation >
_OutputIterator __parallel_partial_sum (_IIter __begin, _IIter __end, _OutputIterator __result, _BinaryOperation __bin_op)
 Parallel partial sum front-__end. More...
 
template<typename _RAIter , typename _Predicate >
std::iterator_traits< _RAIter >
::difference_type 
__parallel_partition (_RAIter __begin, _RAIter __end, _Predicate __pred, _ThreadIndex __num_threads)
 Parallel implementation of std::partition. More...
 
template<typename _RAIter , typename _Compare >
void __parallel_nth_element (_RAIter __begin, _RAIter __nth, _RAIter __end, _Compare __comp)
 Parallel implementation of std::nth_element(). More...
 
template<typename _RAIter , typename _Compare >
void __parallel_partial_sort (_RAIter __begin, _RAIter __middle, _RAIter __end, _Compare __comp)
 Parallel implementation of std::partial_sort(). More...
 
template<typename _RAIter , typename _Compare >
std::iterator_traits< _RAIter >
::difference_type 
__parallel_sort_qs_divide (_RAIter __begin, _RAIter __end, _Compare __comp, typename std::iterator_traits< _RAIter >::difference_type __pivot_rank, typename std::iterator_traits< _RAIter >::difference_type __num_samples, _ThreadIndex __num_threads)
 Unbalanced quicksort divide step. More...
 
template<typename _RAIter , typename _Compare >
void __parallel_sort_qs_conquer (_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
 Unbalanced quicksort conquer step. More...
 
template<typename _RAIter , typename _Compare >
void __parallel_sort_qs (_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
 Unbalanced quicksort main call. More...
 
template<typename _RandomNumberGenerator >
int __random_number_pow2 (int __logp, _RandomNumberGenerator &__rng)
 Generate a random number in [0,2^__logp). More...
 
template<typename _RAIter , typename _RandomNumberGenerator >
void __parallel_random_shuffle_drs_pu (_DRSSorterPU< _RAIter, _RandomNumberGenerator > *__pus)
 Random shuffle code executed by each thread. More...
 
template<typename _Tp >
_Tp __round_up_to_pow2 (_Tp __x)
 Round up to the next greater power of 2. More...
 
template<typename _RAIter , typename _RandomNumberGenerator >
void __parallel_random_shuffle_drs (_RAIter __begin, _RAIter __end, typename std::iterator_traits< _RAIter >::difference_type __n, _ThreadIndex __num_threads, _RandomNumberGenerator &__rng)
 Main parallel random shuffle step. More...
 
template<typename _RAIter , typename _RandomNumberGenerator >
void __sequential_random_shuffle (_RAIter __begin, _RAIter __end, _RandomNumberGenerator &__rng)
 Sequential cache-efficient random shuffle. More...
 
template<typename _RAIter , typename _RandomNumberGenerator >
void __parallel_random_shuffle (_RAIter __begin, _RAIter __end, _RandomNumberGenerator __rng=_RandomNumber())
 Parallel random public call. More...
 
template<typename _RAIter , typename _DifferenceTp >
void __calc_borders (_RAIter __elements, _DifferenceTp __length, _DifferenceTp *__off)
 Precalculate __advances for Knuth-Morris-Pratt algorithm. More...
 
template<typename __RAIter1 , typename __RAIter2 , typename _Pred >
__RAIter1 __search_template (__RAIter1 __begin1, __RAIter1 __end1, __RAIter2 __begin2, __RAIter2 __end2, _Pred __pred)
 Parallel std::search. More...
 
template<typename _IIter , typename _OutputIterator >
_OutputIterator __copy_tail (std::pair< _IIter, _IIter > __b, std::pair< _IIter, _IIter > __e, _OutputIterator __r)
 
template<typename _IIter , typename _OutputIterator , typename _Operation >
_OutputIterator __parallel_set_operation (_IIter __begin1, _IIter __end1, _IIter __begin2, _IIter __end2, _OutputIterator __result, _Operation __op)
 
template<typename _IIter , typename _OutputIterator , typename _Compare >
_OutputIterator __parallel_set_union (_IIter __begin1, _IIter __end1, _IIter __begin2, _IIter __end2, _OutputIterator __result, _Compare __comp)
 
template<typename _IIter , typename _OutputIterator , typename _Compare >
_OutputIterator __parallel_set_intersection (_IIter __begin1, _IIter __end1, _IIter __begin2, _IIter __end2, _OutputIterator __result, _Compare __comp)
 
template<typename _IIter , typename _OutputIterator , typename _Compare >
_OutputIterator __parallel_set_difference (_IIter __begin1, _IIter __end1, _IIter __begin2, _IIter __end2, _OutputIterator __result, _Compare __comp)
 
template<typename _IIter , typename _OutputIterator , typename _Compare >
_OutputIterator __parallel_set_symmetric_difference (_IIter __begin1, _IIter __end1, _IIter __begin2, _IIter __end2, _OutputIterator __result, _Compare __comp)
 
template<bool __stable, typename _RAIter , typename _Compare , typename _Parallelism >
void __parallel_sort (_RAIter __begin, _RAIter __end, _Compare __comp, _Parallelism __parallelism)
 
template<bool __stable, typename _RAIter , typename _Compare >
void __parallel_sort (_RAIter __begin, _RAIter __end, _Compare __comp, multiway_mergesort_tag __parallelism)
 Choose multiway mergesort, splitting variant at run-time, for parallel sorting. More...
 
template<bool __stable, typename _RAIter , typename _Compare >
void __parallel_sort (_RAIter __begin, _RAIter __end, _Compare __comp, multiway_mergesort_exact_tag __parallelism)
 Choose multiway mergesort with exact splitting, for parallel sorting. More...
 
template<bool __stable, typename _RAIter , typename _Compare >
void __parallel_sort (_RAIter __begin, _RAIter __end, _Compare __comp, multiway_mergesort_sampling_tag __parallelism)
 Choose multiway mergesort with splitting by sampling, for parallel sorting. More...
 
template<bool __stable, typename _RAIter , typename _Compare >
void __parallel_sort (_RAIter __begin, _RAIter __end, _Compare __comp, quicksort_tag __parallelism)
 Choose quicksort for parallel sorting. More...
 
template<bool __stable, typename _RAIter , typename _Compare >
void __parallel_sort (_RAIter __begin, _RAIter __end, _Compare __comp, balanced_quicksort_tag __parallelism)
 Choose balanced quicksort for parallel sorting. More...
 
template<bool __stable, typename _RAIter , typename _Compare >
void __parallel_sort (_RAIter __begin, _RAIter __end, _Compare __comp, default_parallel_tag __parallelism)
 Choose multiway mergesort with exact splitting, for parallel sorting. More...
 
template<bool __stable, typename _RAIter , typename _Compare >
void __parallel_sort (_RAIter __begin, _RAIter __end, _Compare __comp, parallel_tag __parallelism)
 Choose a parallel sorting algorithm. More...
 
template<typename _IIter , class _OutputIterator , class _BinaryPredicate >
_OutputIterator __parallel_unique_copy (_IIter __first, _IIter __last, _OutputIterator __result, _BinaryPredicate __binary_pred)
 Parallel std::unique_copy(), w/__o explicit equality predicate. More...
 
template<typename _IIter , class _OutputIterator >
_OutputIterator __parallel_unique_copy (_IIter __first, _IIter __last, _OutputIterator __result)
 Parallel std::unique_copy(), without explicit equality predicate. More...
 
template<typename _RAIter , typename _Op , typename _Fu , typename _Red , typename _Result >
_Op __for_each_template_random_access_workstealing (_RAIter __begin, _RAIter __end, _Op __op, _Fu &__f, _Red __r, _Result __base, _Result &__output, typename std::iterator_traits< _RAIter >::difference_type __bound)
 Work stealing algorithm for random access iterators. More...
 

Variables

static const int _CASable_bits = std::numeric_limits<_CASable>::digits
 Number of bits of _CASable. More...
 
static const _CASable _CASable_mask
 _CASable with the right half of bits set to 1. More...
 

Detailed Description

GNU parallel code for public use.

Typedef Documentation

typedef unsigned short __gnu_parallel::_BinIndex

Type to hold the index of a bin.

Since many variables of this type are allocated, it should be chosen as small as possible.

typedef int64_t __gnu_parallel::_CASable

Longest compare-and-swappable integer type on this platform.

Unsigned integer to index __elements. The total number of elements for each algorithm must fit into this type.

typedef uint16_t __gnu_parallel::_ThreadIndex

Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit into this type.

Enumeration Type Documentation

Strategies for run-time algorithm selection:

Enumerator
heuristic 
force_sequential 
force_parallel 
68  {
69  heuristic,
72  };
Definition: types.h:71
Definition: types.h:69
Definition: types.h:70

Find algorithms:

Enumerator
GROWING_BLOCKS 
CONSTANT_SIZE_BLOCKS 
EQUAL_SPLIT 
107  {
110  EQUAL_SPLIT
111  };
Definition: types.h:110
Definition: types.h:108

Merging algorithms:

Enumerator
LOSER_TREE 
86  {
88  };
Definition: types.h:87

Run-time equivalents for the compile-time tags.

Enumerator
sequential 

Not parallel.

parallel_unbalanced 

Parallel unbalanced (equal-sized chunks).

parallel_balanced 

Parallel balanced (work-stealing).

parallel_omp_loop 

Parallel with OpenMP dynamic load-balancing.

parallel_omp_loop_static 

Parallel with OpenMP static load-balancing.

parallel_taskqueue 

Parallel with OpenMP taskqueue construct.

45  {
47  sequential,
48 
51 
54 
57 
60 
63  };
Parallel with OpenMP dynamic load-balancing.
Definition: types.h:56
Parallel with OpenMP static load-balancing.
Definition: types.h:59
Parallel balanced (work-stealing).
Definition: types.h:53
Not parallel.
Definition: types.h:47
Parallel with OpenMP taskqueue construct.
Definition: types.h:62
Parallel unbalanced (equal-sized chunks).
Definition: types.h:50

Partial sum algorithms: recursive, linear.

Enumerator
RECURSIVE 
LINEAR 
92  {
93  RECURSIVE,
94  LINEAR
95  };
Definition: types.h:93
Definition: types.h:94

Sorting algorithms:

Enumerator
MWMS 
QS 
QS_BALANCED 
77  {
78  MWMS,
79  QS,
81  };
Definition: types.h:79
Definition: types.h:78
Definition: types.h:80

Sorting/merging algorithms: sampling, __exact.

Enumerator
SAMPLING 
EXACT 
99  {
100  SAMPLING,
101  EXACT
102  };
Definition: types.h:100
Definition: types.h:101

Function Documentation

template<typename _Tp >
_Tp __gnu_parallel::__add_omp ( volatile _Tp *  __ptr,
_Tp  __addend 
)
inline
57  {
58  int64_t __res;
59 #pragma omp critical
60  {
61  __res = *__ptr;
62  *(__ptr) += __addend;
63  }
64  return __res;
65  }
template<typename _RAIter , typename _DifferenceTp >
void __gnu_parallel::__calc_borders ( _RAIter  __elements,
_DifferenceTp  __length,
_DifferenceTp *  __off 
)

Precalculate __advances for Knuth-Morris-Pratt algorithm.

Parameters
__elementsBegin iterator of sequence to search for.
__lengthLength of sequence to search for.
__offReturned __offsets.
53  {
54  typedef _DifferenceTp _DifferenceType;
55 
56  __off[0] = -1;
57  if (__length > 1)
58  __off[1] = 0;
59  _DifferenceType __k = 0;
60  for (_DifferenceType __j = 2; __j <= __length; __j++)
61  {
62  while ((__k >= 0) && !(__elements[__k] == __elements[__j-1]))
63  __k = __off[__k];
64  __off[__j] = ++__k;
65  }
66  }
template<typename _Tp >
bool __gnu_parallel::__cas_omp ( volatile _Tp *  __ptr,
_Tp  __comparand,
_Tp  __replacement 
)
inline
84  {
85  bool __res = false;
86 #pragma omp critical
87  {
88  if (*__ptr == __comparand)
89  {
90  *__ptr = __replacement;
91  __res = true;
92  }
93  }
94  return __res;
95  }
template<typename _Tp >
bool __gnu_parallel::__compare_and_swap ( volatile _Tp *  __ptr,
_Tp  __comparand,
_Tp  __replacement 
)
inline

Compare-and-swap.

Compare *__ptr and __comparand. If equal, let *__ptr=__replacement and return true, return false otherwise.

Parameters
__ptrPointer to signed integer.
__comparandCompare value.
__replacementReplacement value.
109  {
110  if (__atomic_always_lock_free(sizeof(_Tp), __ptr))
111  return __atomic_compare_exchange_n(__ptr, &__comparand, __replacement,
112  false, __ATOMIC_ACQ_REL,
113  __ATOMIC_RELAXED);
114  return __cas_omp(__ptr, __comparand, __replacement);
115  }
bool __cas_omp(volatile _Tp *__ptr, _Tp __comparand, _Tp __replacement)
Definition: compatibility.h:83
template<typename _IIter , typename _OutputIterator >
_OutputIterator __gnu_parallel::__copy_tail ( std::pair< _IIter, _IIter >  __b,
std::pair< _IIter, _IIter >  __e,
_OutputIterator  __r 
)
48  {
49  if (__b.first != __e.first)
50  {
51  do
52  {
53  *__r++ = *__b.first++;
54  }
55  while (__b.first != __e.first);
56  }
57  else
58  {
59  while (__b.second != __e.second)
60  *__r++ = *__b.second++;
61  }
62  return __r;
63  }
void __gnu_parallel::__decode2 ( _CASable  __x,
int &  __a,
int &  __b 
)
inline

Decode two integers from one gnu_parallel::_CASable.

Parameters
__x__gnu_parallel::_CASable to decode integers from.
__aFirst integer, to be decoded from the most-significant _CASable_bits/2 bits of __x.
__bSecond integer, to be encoded in the least-significant _CASable_bits/2 bits of __x.
See Also
__encode2
134  {
135  __a = (int)((__x >> (_CASable_bits / 2)) & _CASable_mask);
136  __b = (int)((__x >> 0 ) & _CASable_mask);
137  }
static const _CASable _CASable_mask
_CASable with the right half of bits set to 1.
Definition: types.h:133
static const int _CASable_bits
Number of bits of _CASable.
Definition: types.h:130
template<typename _RAIter , typename _DifferenceTp >
void __gnu_parallel::__determine_samples ( _PMWMSSortingData< _RAIter > *  __sd,
_DifferenceTp  __num_samples 
)

Select _M_samples from a sequence.

Parameters
__sdPointer to algorithm data. _Result will be placed in __sd->_M_samples.
__num_samplesNumber of _M_samples to select.
99  {
100  typedef std::iterator_traits<_RAIter> _TraitsType;
101  typedef typename _TraitsType::value_type _ValueType;
102  typedef _DifferenceTp _DifferenceType;
103 
105 
106  _DifferenceType* __es = new _DifferenceType[__num_samples + 2];
107 
108  __equally_split(__sd->_M_starts[__iam + 1] - __sd->_M_starts[__iam],
109  __num_samples + 1, __es);
110 
111  for (_DifferenceType __i = 0; __i < __num_samples; ++__i)
112  ::new(&(__sd->_M_samples[__iam * __num_samples + __i]))
113  _ValueType(__sd->_M_source[__sd->_M_starts[__iam]
114  + __es[__i + 1]]);
115 
116  delete[] __es;
117  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
_OutputIterator __equally_split(_DifferenceType __n, _ThreadIndex __num_threads, _OutputIterator __s)
function to split a sequence into parts of almost equal size.
Definition: equally_split.h:48
int omp_get_thread_num(void) __GOMP_NOTHROW
_CASable __gnu_parallel::__encode2 ( int  __a,
int  __b 
)
inline

Encode two integers into one gnu_parallel::_CASable.

Parameters
__aFirst integer, to be encoded in the most-significant _CASable_bits/2 bits.
__bSecond integer, to be encoded in the least-significant _CASable_bits/2 bits.
Returns
value encoding __a and __b.
See Also
__decode2
120  {
121  return (((_CASable)__a) << (_CASable_bits / 2)) | (((_CASable)__b) << 0);
122  }
static const int _CASable_bits
Number of bits of _CASable.
Definition: types.h:130
int64_t _CASable
Longest compare-and-swappable integer type on this platform.
Definition: types.h:127
template<typename _DifferenceType , typename _OutputIterator >
_OutputIterator __gnu_parallel::__equally_split ( _DifferenceType  __n,
_ThreadIndex  __num_threads,
_OutputIterator  __s 
)

function to split a sequence into parts of almost equal size.

The resulting sequence __s of length __num_threads+1 contains the splitting positions when splitting the range [0,__n) into parts of almost equal size (plus minus 1). The first entry is 0, the last one n. There may result empty parts.

Parameters
__nNumber of elements
__num_threadsNumber of parts
__sSplitters
Returns
End of __splitter sequence, i.e. __s+__num_threads+1
50  {
51  _DifferenceType __chunk_length = __n / __num_threads;
52  _DifferenceType __num_longer_chunks = __n % __num_threads;
53  _DifferenceType __pos = 0;
54  for (_ThreadIndex __i = 0; __i < __num_threads; ++__i)
55  {
56  *__s++ = __pos;
57  __pos += ((__i < __num_longer_chunks)
58  ? (__chunk_length + 1) : __chunk_length);
59  }
60  *__s++ = __n;
61  return __s;
62  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
template<typename _DifferenceType >
_DifferenceType __gnu_parallel::__equally_split_point ( _DifferenceType  __n,
_ThreadIndex  __num_threads,
_ThreadIndex  __thread_no 
)

function to split a sequence into parts of almost equal size.

Returns the position of the splitting point between thread number __thread_no (included) and thread number __thread_no+1 (excluded).

Parameters
__nNumber of elements
__num_threadsNumber of parts
__thread_noNumber of threads
Returns
splitting point
78  {
79  _DifferenceType __chunk_length = __n / __num_threads;
80  _DifferenceType __num_longer_chunks = __n % __num_threads;
81  if (__thread_no < __num_longer_chunks)
82  return __thread_no * (__chunk_length + 1);
83  else
84  return __num_longer_chunks * (__chunk_length + 1)
85  + (__thread_no - __num_longer_chunks) * __chunk_length;
86  }
template<typename _Tp >
_Tp __gnu_parallel::__fetch_and_add ( volatile _Tp *  __ptr,
_Tp  __addend 
)
inline

Add a value to a variable, atomically.

Parameters
__ptrPointer to a signed integer.
__addendValue to add.
75  {
76  if (__atomic_always_lock_free(sizeof(_Tp), __ptr))
77  return __atomic_fetch_add(__ptr, __addend, __ATOMIC_ACQ_REL);
78  return __add_omp(__ptr, __addend);
79  }
_Tp __add_omp(volatile _Tp *__ptr, _Tp __addend)
Definition: compatibility.h:56
template<typename _RAIter1 , typename _RAIter2 , typename _Pred , typename _Selector >
std::pair<_RAIter1, _RAIter2> __gnu_parallel::__find_template ( _RAIter1  __begin1,
_RAIter1  __end1,
_RAIter2  __begin2,
_Pred  __pred,
_Selector  __selector 
)
inline

Parallel std::find, switch for different algorithms.

Parameters
__begin1Begin iterator of first sequence.
__end1End iterator of first sequence.
__begin2Begin iterator of second sequence. Must have same length as first sequence.
__predFind predicate.
__selector_Functionality (e. g. std::find_if(), std::equal(),...)
Returns
Place of finding in both sequences.
62  {
63  switch (_Settings::get().find_algorithm)
64  {
65  case GROWING_BLOCKS:
66  return __find_template(__begin1, __end1, __begin2, __pred,
67  __selector, growing_blocks_tag());
69  return __find_template(__begin1, __end1, __begin2, __pred,
70  __selector, constant_size_blocks_tag());
71  case EQUAL_SPLIT:
72  return __find_template(__begin1, __end1, __begin2, __pred,
73  __selector, equal_split_tag());
74  default:
76  return std::make_pair(__begin1, __begin2);
77  }
78  }
std::pair< _RAIter1, _RAIter2 > __find_template(_RAIter1 __begin1, _RAIter1 __end1, _RAIter2 __begin2, _Pred __pred, _Selector __selector)
Parallel std::find, switch for different algorithms.
Definition: find.h:60
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
Definition: types.h:110
Definition: types.h:108
template<typename _IIter , typename _UserOp , typename _Functionality , typename _Red , typename _Result >
_UserOp __gnu_parallel::__for_each_template_random_access ( _IIter  __begin,
_IIter  __end,
_UserOp  __user_op,
_Functionality &  __functionality,
_Red  __reduction,
_Result  __reduction_start,
_Result &  __output,
typename std::iterator_traits< _IIter >::difference_type  __bound,
_Parallelism  __parallelism_tag 
)

Chose the desired algorithm by evaluating __parallelism_tag.

Parameters
__beginBegin iterator of input sequence.
__endEnd iterator of input sequence.
__user_opA user-specified functor (comparator, predicate, associative operator,...)
__functionalityfunctor to process an element with __user_op (depends on desired functionality, e. g. accumulate, for_each,...
__reductionReduction functor.
__reduction_startInitial value for reduction.
__outputOutput iterator.
__boundMaximum number of elements processed.
__parallelism_tagParallelization method
70  {
71  if (__parallelism_tag == parallel_unbalanced)
73  (__begin, __end, __user_op, __functionality, __reduction,
74  __reduction_start, __output, __bound);
75  else if (__parallelism_tag == parallel_omp_loop)
77  (__begin, __end, __user_op, __functionality, __reduction,
78  __reduction_start, __output, __bound);
79  else if (__parallelism_tag == parallel_omp_loop_static)
81  (__begin, __end, __user_op, __functionality, __reduction,
82  __reduction_start, __output, __bound);
83  else //e. g. parallel_balanced
85  (__begin, __end, __user_op, __functionality, __reduction,
86  __reduction_start, __output, __bound);
87  }
Parallel with OpenMP dynamic load-balancing.
Definition: types.h:56
Parallel with OpenMP static load-balancing.
Definition: types.h:59
_Op __for_each_template_random_access_omp_loop(_RAIter __begin, _RAIter __end, _Op __o, _Fu &__f, _Red __r, _Result __base, _Result &__output, typename std::iterator_traits< _RAIter >::difference_type __bound)
Embarrassingly parallel algorithm for random access iterators, using an OpenMP for loop...
Definition: omp_loop.h:67
_Op __for_each_template_random_access_ed(_RAIter __begin, _RAIter __end, _Op __o, _Fu &__f, _Red __r, _Result __base, _Result &__output, typename std::iterator_traits< _RAIter >::difference_type __bound)
Embarrassingly parallel algorithm for random access iterators, using hand-crafted parallelization by ...
Definition: par_loop.h:67
_Op __for_each_template_random_access_workstealing(_RAIter __begin, _RAIter __end, _Op __op, _Fu &__f, _Red __r, _Result __base, _Result &__output, typename std::iterator_traits< _RAIter >::difference_type __bound)
Work stealing algorithm for random access iterators.
Definition: workstealing.h:99
Parallel unbalanced (equal-sized chunks).
Definition: types.h:50
template<typename _RAIter , typename _Op , typename _Fu , typename _Red , typename _Result >
_Op __gnu_parallel::__for_each_template_random_access_ed ( _RAIter  __begin,
_RAIter  __end,
_Op  __o,
_Fu &  __f,
_Red  __r,
_Result  __base,
_Result &  __output,
typename std::iterator_traits< _RAIter >::difference_type  __bound 
)

Embarrassingly parallel algorithm for random access iterators, using hand-crafted parallelization by equal splitting the work.

Parameters
__beginBegin iterator of element sequence.
__endEnd iterator of element sequence.
__oUser-supplied functor (comparator, predicate, adding functor, ...)
__fFunctor to "process" an element with __op (depends on desired functionality, e. g. for std::for_each(), ...).
__rFunctor to "add" a single __result to the already processed elements (depends on functionality).
__baseBase value for reduction.
__outputPointer to position where final result is written to
__boundMaximum number of elements processed (e. g. for std::count_n()).
Returns
User-supplied functor (that may contain a part of the result).
71  {
72  typedef std::iterator_traits<_RAIter> _TraitsType;
73  typedef typename _TraitsType::difference_type _DifferenceType;
74  const _DifferenceType __length = __end - __begin;
75  _Result *__thread_results;
76  bool* __constructed;
77 
78  _ThreadIndex __num_threads = __gnu_parallel::min<_DifferenceType>
79  (__get_max_threads(), __length);
80 
81 # pragma omp parallel num_threads(__num_threads)
82  {
83 # pragma omp single
84  {
85  __num_threads = omp_get_num_threads();
86  __thread_results = static_cast<_Result*>
87  (::operator new(__num_threads * sizeof(_Result)));
88  __constructed = new bool[__num_threads];
89  }
90 
92 
93  // Neutral element.
94  _Result* __reduct;
95 
96  _DifferenceType
97  __start = __equally_split_point(__length, __num_threads, __iam),
98  __stop = __equally_split_point(__length, __num_threads, __iam + 1);
99 
100  if (__start < __stop)
101  {
102  __reduct = new _Result(__f(__o, __begin + __start));
103  ++__start;
104  __constructed[__iam] = true;
105  }
106  else
107  __constructed[__iam] = false;
108 
109  for (; __start < __stop; ++__start)
110  *__reduct = __r(*__reduct, __f(__o, __begin + __start));
111 
112  if (__constructed[__iam])
113  {
114  ::new(&__thread_results[__iam]) _Result(*__reduct);
115  delete __reduct;
116  }
117  } //parallel
118 
119  for (_ThreadIndex __i = 0; __i < __num_threads; ++__i)
120  if (__constructed[__i])
121  {
122  __output = __r(__output, __thread_results[__i]);
123  __thread_results[__i].~_Result();
124  }
125 
126  // Points to last element processed (needed as return value for
127  // some algorithms like transform).
128  __f._M_finish_iterator = __begin + __length;
129 
130  ::operator delete(__thread_results);
131 
132  delete[] __constructed;
133 
134  return __o;
135  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
int omp_get_num_threads(void) __GOMP_NOTHROW
_DifferenceType __equally_split_point(_DifferenceType __n, _ThreadIndex __num_threads, _ThreadIndex __thread_no)
function to split a sequence into parts of almost equal size.
Definition: equally_split.h:75
int omp_get_thread_num(void) __GOMP_NOTHROW
_ThreadIndex __get_max_threads()
Definition: base.h:85
template<typename _RAIter , typename _Op , typename _Fu , typename _Red , typename _Result >
_Op __gnu_parallel::__for_each_template_random_access_omp_loop ( _RAIter  __begin,
_RAIter  __end,
_Op  __o,
_Fu &  __f,
_Red  __r,
_Result  __base,
_Result &  __output,
typename std::iterator_traits< _RAIter >::difference_type  __bound 
)

Embarrassingly parallel algorithm for random access iterators, using an OpenMP for loop.

Parameters
__beginBegin iterator of element sequence.
__endEnd iterator of element sequence.
__oUser-supplied functor (comparator, predicate, adding functor, etc.).
__fFunctor to process an element with __op (depends on desired functionality, e. g. for std::for_each(), ...).
__rFunctor to add a single __result to the already processed elements (depends on functionality).
__baseBase value for reduction.
__outputPointer to position where final result is written to
__boundMaximum number of elements processed (e. g. for std::count_n()).
Returns
User-supplied functor (that may contain a part of the result).
72  {
73  typedef typename std::iterator_traits<_RAIter>::difference_type
74  _DifferenceType;
75 
76  _DifferenceType __length = __end - __begin;
77  _ThreadIndex __num_threads = __gnu_parallel::min<_DifferenceType>
78  (__get_max_threads(), __length);
79 
80  _Result *__thread_results;
81 
82 # pragma omp parallel num_threads(__num_threads)
83  {
84 # pragma omp single
85  {
86  __num_threads = omp_get_num_threads();
87  __thread_results = new _Result[__num_threads];
88 
89  for (_ThreadIndex __i = 0; __i < __num_threads; ++__i)
90  __thread_results[__i] = _Result();
91  }
92 
94 
95 #pragma omp for schedule(dynamic, _Settings::get().workstealing_chunk_size)
96  for (_DifferenceType __pos = 0; __pos < __length; ++__pos)
97  __thread_results[__iam] = __r(__thread_results[__iam],
98  __f(__o, __begin+__pos));
99  } //parallel
100 
101  for (_ThreadIndex __i = 0; __i < __num_threads; ++__i)
102  __output = __r(__output, __thread_results[__i]);
103 
104  delete [] __thread_results;
105 
106  // Points to last element processed (needed as return value for
107  // some algorithms like transform).
108  __f._M_finish_iterator = __begin + __length;
109 
110  return __o;
111  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
int omp_get_num_threads(void) __GOMP_NOTHROW
int omp_get_thread_num(void) __GOMP_NOTHROW
_ThreadIndex __get_max_threads()
Definition: base.h:85
template<typename _RAIter , typename _Op , typename _Fu , typename _Red , typename _Result >
_Op __gnu_parallel::__for_each_template_random_access_omp_loop_static ( _RAIter  __begin,
_RAIter  __end,
_Op  __o,
_Fu &  __f,
_Red  __r,
_Result  __base,
_Result &  __output,
typename std::iterator_traits< _RAIter >::difference_type  __bound 
)

Embarrassingly parallel algorithm for random access iterators, using an OpenMP for loop with static scheduling.

Parameters
__beginBegin iterator of element sequence.
__endEnd iterator of element sequence.
__oUser-supplied functor (comparator, predicate, adding functor, ...).
__fFunctor to process an element with __op (depends on desired functionality, e. g. for std::for_each(), ...).
__rFunctor to add a single __result to the already processed __elements (depends on functionality).
__baseBase value for reduction.
__outputPointer to position where final result is written to
__boundMaximum number of elements processed (e. g. for std::count_n()).
Returns
User-supplied functor (that may contain a part of the result).
72  {
73  typedef typename std::iterator_traits<_RAIter>::difference_type
74  _DifferenceType;
75 
76  _DifferenceType __length = __end - __begin;
77  _ThreadIndex __num_threads = std::min<_DifferenceType>
78  (__get_max_threads(), __length);
79 
80  _Result *__thread_results;
81 
82 # pragma omp parallel num_threads(__num_threads)
83  {
84 # pragma omp single
85  {
86  __num_threads = omp_get_num_threads();
87  __thread_results = new _Result[__num_threads];
88 
89  for (_ThreadIndex __i = 0; __i < __num_threads; ++__i)
90  __thread_results[__i] = _Result();
91  }
92 
94 
95 #pragma omp for schedule(static, _Settings::get().workstealing_chunk_size)
96  for (_DifferenceType __pos = 0; __pos < __length; ++__pos)
97  __thread_results[__iam] = __r(__thread_results[__iam],
98  __f(__o, __begin+__pos));
99  } //parallel
100 
101  for (_ThreadIndex __i = 0; __i < __num_threads; ++__i)
102  __output = __r(__output, __thread_results[__i]);
103 
104  delete [] __thread_results;
105 
106  // Points to last element processed (needed as return value for
107  // some algorithms like transform).
108  __f.finish_iterator = __begin + __length;
109 
110  return __o;
111  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
int omp_get_num_threads(void) __GOMP_NOTHROW
int omp_get_thread_num(void) __GOMP_NOTHROW
_ThreadIndex __get_max_threads()
Definition: base.h:85
template<typename _RAIter , typename _Op , typename _Fu , typename _Red , typename _Result >
_Op __gnu_parallel::__for_each_template_random_access_workstealing ( _RAIter  __begin,
_RAIter  __end,
_Op  __op,
_Fu &  __f,
_Red  __r,
_Result  __base,
_Result &  __output,
typename std::iterator_traits< _RAIter >::difference_type  __bound 
)

Work stealing algorithm for random access iterators.

Uses O(1) additional memory. Synchronization at job lists is done with atomic operations.

Parameters
__beginBegin iterator of element sequence.
__endEnd iterator of element sequence.
__opUser-supplied functor (comparator, predicate, adding functor, ...).
__fFunctor to process an element with __op (depends on desired functionality, e. g. for std::for_each(), ...).
__rFunctor to add a single __result to the already processed elements (depends on functionality).
__baseBase value for reduction.
__outputPointer to position where final result is written to
__boundMaximum number of elements processed (e. g. for std::count_n()).
Returns
User-supplied functor (that may contain a part of the result).
105  {
106  _GLIBCXX_CALL(__end - __begin)
107 
108  typedef std::iterator_traits<_RAIter> _TraitsType;
109  typedef typename _TraitsType::difference_type _DifferenceType;
110 
111  const _Settings& __s = _Settings::get();
112 
113  _DifferenceType __chunk_size =
114  static_cast<_DifferenceType>(__s.workstealing_chunk_size);
115 
116  // How many jobs?
117  _DifferenceType __length = (__bound < 0) ? (__end - __begin) : __bound;
118 
119  // To avoid false sharing in a cache line.
120  const int __stride = (__s.cache_line_size * 10
121  / sizeof(_Job<_DifferenceType>) + 1);
122 
123  // Total number of threads currently working.
124  _ThreadIndex __busy = 0;
125 
126  _Job<_DifferenceType> *__job;
127 
128  omp_lock_t __output_lock;
129  omp_init_lock(&__output_lock);
130 
131  // Write base value to output.
132  __output = __base;
133 
134  // No more threads than jobs, at least one thread.
135  _ThreadIndex __num_threads = __gnu_parallel::max<_ThreadIndex>
136  (1, __gnu_parallel::min<_DifferenceType>(__length,
137  __get_max_threads()));
138 
139 # pragma omp parallel shared(__busy) num_threads(__num_threads)
140  {
141 # pragma omp single
142  {
143  __num_threads = omp_get_num_threads();
144 
145  // Create job description array.
146  __job = new _Job<_DifferenceType>[__num_threads * __stride];
147  }
148 
149  // Initialization phase.
150 
151  // Flags for every thread if it is doing productive work.
152  bool __iam_working = false;
153 
154  // Thread id.
155  _ThreadIndex __iam = omp_get_thread_num();
156 
157  // This job.
158  _Job<_DifferenceType>& __my_job = __job[__iam * __stride];
159 
160  // Random number (for work stealing).
161  _ThreadIndex __victim;
162 
163  // Local value for reduction.
164  _Result __result = _Result();
165 
166  // Number of elements to steal in one attempt.
167  _DifferenceType __steal;
168 
169  // Every thread has its own random number generator
170  // (modulo __num_threads).
171  _RandomNumber __rand_gen(__iam, __num_threads);
172 
173  // This thread is currently working.
174 # pragma omp atomic
175  ++__busy;
176 
177  __iam_working = true;
178 
179  // How many jobs per thread? last thread gets the rest.
180  __my_job._M_first = static_cast<_DifferenceType>
181  (__iam * (__length / __num_threads));
182 
183  __my_job._M_last = (__iam == (__num_threads - 1)
184  ? (__length - 1)
185  : ((__iam + 1) * (__length / __num_threads) - 1));
186  __my_job._M_load = __my_job._M_last - __my_job._M_first + 1;
187 
188  // Init result with _M_first value (to have a base value for reduction)
189  if (__my_job._M_first <= __my_job._M_last)
190  {
191  // Cannot use volatile variable directly.
192  _DifferenceType __my_first = __my_job._M_first;
193  __result = __f(__op, __begin + __my_first);
194  ++__my_job._M_first;
195  --__my_job._M_load;
196  }
197 
198  _RAIter __current;
199 
200 # pragma omp barrier
201 
202  // Actual work phase
203  // Work on own or stolen current start
204  while (__busy > 0)
205  {
206  // Work until no productive thread left.
207 # pragma omp flush(__busy)
208 
209  // Thread has own work to do
210  while (__my_job._M_first <= __my_job._M_last)
211  {
212  // fetch-and-add call
213  // Reserve current job block (size __chunk_size) in my queue.
214  _DifferenceType __current_job =
215  __fetch_and_add<_DifferenceType>(&(__my_job._M_first),
216  __chunk_size);
217 
218  // Update _M_load, to make the three values consistent,
219  // _M_first might have been changed in the meantime
220  __my_job._M_load = __my_job._M_last - __my_job._M_first + 1;
221  for (_DifferenceType __job_counter = 0;
222  __job_counter < __chunk_size
223  && __current_job <= __my_job._M_last;
224  ++__job_counter)
225  {
226  // Yes: process it!
227  __current = __begin + __current_job;
228  ++__current_job;
229 
230  // Do actual work.
231  __result = __r(__result, __f(__op, __current));
232  }
233 
234 # pragma omp flush(__busy)
235  }
236 
237  // After reaching this point, a thread's __job list is empty.
238  if (__iam_working)
239  {
240  // This thread no longer has work.
241 # pragma omp atomic
242  --__busy;
243 
244  __iam_working = false;
245  }
246 
247  _DifferenceType __supposed_first, __supposed_last,
248  __supposed_load;
249  do
250  {
251  // Find random nonempty deque (not own), do consistency check.
252  __yield();
253 # pragma omp flush(__busy)
254  __victim = __rand_gen();
255  __supposed_first = __job[__victim * __stride]._M_first;
256  __supposed_last = __job[__victim * __stride]._M_last;
257  __supposed_load = __job[__victim * __stride]._M_load;
258  }
259  while (__busy > 0
260  && ((__supposed_load <= 0)
261  || ((__supposed_first + __supposed_load - 1)
262  != __supposed_last)));
263 
264  if (__busy == 0)
265  break;
266 
267  if (__supposed_load > 0)
268  {
269  // Has work and work to do.
270  // Number of elements to steal (at least one).
271  __steal = (__supposed_load < 2) ? 1 : __supposed_load / 2;
272 
273  // Push __victim's current start forward.
274  _DifferenceType __stolen_first =
275  __fetch_and_add<_DifferenceType>
276  (&(__job[__victim * __stride]._M_first), __steal);
277  _DifferenceType __stolen_try = (__stolen_first + __steal
278  - _DifferenceType(1));
279 
280  __my_job._M_first = __stolen_first;
281  __my_job._M_last = __gnu_parallel::min(__stolen_try,
282  __supposed_last);
283  __my_job._M_load = __my_job._M_last - __my_job._M_first + 1;
284 
285  // Has potential work again.
286 # pragma omp atomic
287  ++__busy;
288  __iam_working = true;
289 
290 # pragma omp flush(__busy)
291  }
292 # pragma omp flush(__busy)
293  } // end while __busy > 0
294  // Add accumulated result to output.
295  omp_set_lock(&__output_lock);
296  __output = __r(__output, __result);
297  omp_unset_lock(&__output_lock);
298  }
299 
300  delete[] __job;
301 
302  // Points to last element processed (needed as return value for
303  // some algorithms like transform)
304  __f._M_finish_iterator = __begin + __length;
305 
306  omp_destroy_lock(&__output_lock);
307 
308  return __op;
309  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
_Siter_base< _Iterator >::iterator_type __base(_Iterator __it)
Definition: functions.h:446
void __yield()
Yield control to another thread, without waiting for the end of the time slice.
Definition: compatibility.h:121
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
int omp_get_num_threads(void) __GOMP_NOTHROW
const _Tp & min(const _Tp &__a, const _Tp &__b)
Equivalent to std::min.
Definition: base.h:144
void omp_destroy_lock(omp_lock_t *) __GOMP_NOTHROW
void omp_unset_lock(omp_lock_t *) __GOMP_NOTHROW
const _Tp & max(const _Tp &__a, const _Tp &__b)
Equivalent to std::max.
Definition: base.h:150
void omp_set_lock(omp_lock_t *) __GOMP_NOTHROW
int omp_get_thread_num(void) __GOMP_NOTHROW
void omp_init_lock(omp_lock_t *) __GOMP_NOTHROW
_ThreadIndex __get_max_threads()
Definition: base.h:85
Definition: omp.h:34
_ThreadIndex __gnu_parallel::__get_max_threads ( )
inline
86  {
88  return __i > 1 ? __i : 1;
89  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
int omp_get_max_threads(void) __GOMP_NOTHROW
bool __gnu_parallel::__is_parallel ( const _Parallelism  __p)
inline
93 { return __p != sequential; }
Not parallel.
Definition: types.h:47
template<typename _IIter , typename _Compare >
bool __gnu_parallel::__is_sorted ( _IIter  __begin,
_IIter  __end,
_Compare  __comp 
)

Check whether [__begin, __end) is sorted according to __comp.

Parameters
__beginBegin iterator of sequence.
__endEnd iterator of sequence.
__compComparator.
Returns
true if sorted, false otherwise.
52  {
53  if (__begin == __end)
54  return true;
55 
56  _IIter __current(__begin), __recent(__begin);
57 
58  unsigned long long __position = 1;
59  for (__current++; __current != __end; __current++)
60  {
61  if (__comp(*__current, *__recent))
62  {
63  return false;
64  }
65  __recent = __current;
66  __position++;
67  }
68 
69  return true;
70  }
template<typename _RAIter , typename _Compare >
_RAIter __gnu_parallel::__median_of_three_iterators ( _RAIter  __a,
_RAIter  __b,
_RAIter  __c,
_Compare  __comp 
)

Compute the median of three referenced elements, according to __comp.

Parameters
__aFirst iterator.
__bSecond iterator.
__cThird iterator.
__compComparator.
400  {
401  if (__comp(*__a, *__b))
402  if (__comp(*__b, *__c))
403  return __b;
404  else
405  if (__comp(*__a, *__c))
406  return __c;
407  else
408  return __a;
409  else
410  {
411  // Just swap __a and __b.
412  if (__comp(*__a, *__c))
413  return __a;
414  else
415  if (__comp(*__b, *__c))
416  return __c;
417  else
418  return __b;
419  }
420  }
template<typename _RAIter1 , typename _RAIter2 , typename _OutputIterator , typename _DifferenceTp , typename _Compare >
_OutputIterator __gnu_parallel::__merge_advance ( _RAIter1 &  __begin1,
_RAIter1  __end1,
_RAIter2 &  __begin2,
_RAIter2  __end2,
_OutputIterator  __target,
_DifferenceTp  __max_length,
_Compare  __comp 
)
inline

Merge routine being able to merge only the __max_length smallest elements.

The __begin iterators are advanced accordingly, they might not reach __end, in contrast to the usual variant. Static switch on whether to use the conditional-move variant.

Parameters
__begin1Begin iterator of first sequence.
__end1End iterator of first sequence.
__begin2Begin iterator of second sequence.
__end2End iterator of second sequence.
__targetTarget begin iterator.
__max_lengthMaximum number of elements to merge.
__compComparator.
Returns
Output end iterator.
175  {
176  _GLIBCXX_CALL(__max_length)
177 
178  return __merge_advance_movc(__begin1, __end1, __begin2, __end2,
179  __target, __max_length, __comp);
180  }
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
_OutputIterator __merge_advance_movc(_RAIter1 &__begin1, _RAIter1 __end1, _RAIter2 &__begin2, _RAIter2 __end2, _OutputIterator __target, _DifferenceTp __max_length, _Compare __comp)
Merge routine being able to merge only the __max_length smallest elements.
Definition: merge.h:105
return(unsigned int) __res
template<typename _RAIter1 , typename _RAIter2 , typename _OutputIterator , typename _DifferenceTp , typename _Compare >
_OutputIterator __gnu_parallel::__merge_advance_movc ( _RAIter1 &  __begin1,
_RAIter1  __end1,
_RAIter2 &  __begin2,
_RAIter2  __end2,
_OutputIterator  __target,
_DifferenceTp  __max_length,
_Compare  __comp 
)

Merge routine being able to merge only the __max_length smallest elements.

The __begin iterators are advanced accordingly, they might not reach __end, in contrast to the usual variant. Specially designed code should allow the compiler to generate conditional moves instead of branches.

Parameters
__begin1Begin iterator of first sequence.
__end1End iterator of first sequence.
__begin2Begin iterator of second sequence.
__end2End iterator of second sequence.
__targetTarget begin iterator.
__max_lengthMaximum number of elements to merge.
__compComparator.
Returns
Output end iterator.
109  {
110  typedef _DifferenceTp _DifferenceType;
111  typedef typename std::iterator_traits<_RAIter1>::value_type
112  _ValueType1;
113  typedef typename std::iterator_traits<_RAIter2>::value_type
114  _ValueType2;
115 
116 #if _GLIBCXX_ASSERTIONS
117  _GLIBCXX_PARALLEL_ASSERT(__max_length >= 0);
118 #endif
119 
120  while (__begin1 != __end1 && __begin2 != __end2 && __max_length > 0)
121  {
122  _RAIter1 __next1 = __begin1 + 1;
123  _RAIter2 __next2 = __begin2 + 1;
124  _ValueType1 __element1 = *__begin1;
125  _ValueType2 __element2 = *__begin2;
126 
127  if (__comp(__element2, __element1))
128  {
129  __element1 = __element2;
130  __begin2 = __next2;
131  }
132  else
133  __begin1 = __next1;
134 
135  *__target = __element1;
136 
137  ++__target;
138  --__max_length;
139  }
140  if (__begin1 != __end1)
141  {
142  __target = std::copy(__begin1, __begin1 + __max_length, __target);
143  __begin1 += __max_length;
144  }
145  else
146  {
147  __target = std::copy(__begin2, __begin2 + __max_length, __target);
148  __begin2 += __max_length;
149  }
150  return __target;
151  }
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
template<typename _RAIter1 , typename _RAIter2 , typename _OutputIterator , typename _DifferenceTp , typename _Compare >
_OutputIterator __gnu_parallel::__merge_advance_usual ( _RAIter1 &  __begin1,
_RAIter1  __end1,
_RAIter2 &  __begin2,
_RAIter2  __end2,
_OutputIterator  __target,
_DifferenceTp  __max_length,
_Compare  __comp 
)

Merge routine being able to merge only the __max_length smallest elements.

The __begin iterators are advanced accordingly, they might not reach __end, in contrast to the usual variant.

Parameters
__begin1Begin iterator of first sequence.
__end1End iterator of first sequence.
__begin2Begin iterator of second sequence.
__end2End iterator of second sequence.
__targetTarget begin iterator.
__max_lengthMaximum number of elements to merge.
__compComparator.
Returns
Output end iterator.
61  {
62  typedef _DifferenceTp _DifferenceType;
63  while (__begin1 != __end1 && __begin2 != __end2 && __max_length > 0)
64  {
65  // array1[__i1] < array0[i0]
66  if (__comp(*__begin2, *__begin1))
67  *__target++ = *__begin2++;
68  else
69  *__target++ = *__begin1++;
70  --__max_length;
71  }
72 
73  if (__begin1 != __end1)
74  {
75  __target = std::copy(__begin1, __begin1 + __max_length, __target);
76  __begin1 += __max_length;
77  }
78  else
79  {
80  __target = std::copy(__begin2, __begin2 + __max_length, __target);
81  __begin2 += __max_length;
82  }
83  return __target;
84  }
template<typename _RAIter1 , typename _RAIter2 , typename _RAIter3 , typename _Compare >
_RAIter3 __gnu_parallel::__parallel_merge_advance ( _RAIter1 &  __begin1,
_RAIter1  __end1,
_RAIter2 &  __begin2,
_RAIter2  __end2,
_RAIter3  __target,
typename std::iterator_traits< _RAIter1 >::difference_type  __max_length,
_Compare  __comp 
)
inline

Merge routine fallback to sequential in case the iterators of the two input sequences are of different type.

Parameters
__begin1Begin iterator of first sequence.
__end1End iterator of first sequence.
__begin2Begin iterator of second sequence.
__end2End iterator of second sequence.
__targetTarget begin iterator.
__max_lengthMaximum number of elements to merge.
__compComparator.
Returns
Output end iterator.
202  { return __merge_advance(__begin1, __end1, __begin2, __end2, __target,
203  __max_length, __comp); }
_OutputIterator __merge_advance(_RAIter1 &__begin1, _RAIter1 __end1, _RAIter2 &__begin2, _RAIter2 __end2, _OutputIterator __target, _DifferenceTp __max_length, _Compare __comp)
Merge routine being able to merge only the __max_length smallest elements.
Definition: merge.h:171
template<typename _RAIter1 , typename _RAIter3 , typename _Compare >
_RAIter3 __gnu_parallel::__parallel_merge_advance ( _RAIter1 &  __begin1,
_RAIter1  __end1,
_RAIter1 &  __begin2,
_RAIter1  __end2,
_RAIter3  __target,
typename std::iterator_traits< _RAIter1 >::difference_type  __max_length,
_Compare  __comp 
)
inline

Parallel merge routine being able to merge only the __max_length smallest elements.

The __begin iterators are advanced accordingly, they might not reach __end, in contrast to the usual variant. The functionality is projected onto parallel_multiway_merge.

Parameters
__begin1Begin iterator of first sequence.
__end1End iterator of first sequence.
__begin2Begin iterator of second sequence.
__end2End iterator of second sequence.
__targetTarget begin iterator.
__max_lengthMaximum number of elements to merge.
__compComparator.
Returns
Output end iterator.
228  {
229  typedef typename
230  std::iterator_traits<_RAIter1>::value_type _ValueType;
231  typedef typename std::iterator_traits<_RAIter1>::
232  difference_type _DifferenceType1 /* == difference_type2 */;
233  typedef typename std::iterator_traits<_RAIter3>::
234  difference_type _DifferenceType3;
235  typedef typename std::pair<_RAIter1, _RAIter1>
236  _IteratorPair;
237 
238  _IteratorPair __seqs[2] = { std::make_pair(__begin1, __end1),
239  std::make_pair(__begin2, __end2) };
240  _RAIter3 __target_end = parallel_multiway_merge
241  < /* __stable = */ true, /* __sentinels = */ false>
242  (__seqs, __seqs + 2, __target, multiway_merge_exact_splitting
243  < /* __stable = */ true, _IteratorPair*,
244  _Compare, _DifferenceType1>, __max_length, __comp,
246 
247  return __target_end;
248  }
_RAIter3 parallel_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, _Splitter __splitter, _DifferenceTp __length, _Compare __comp, _ThreadIndex __num_threads)
Parallel multi-way merge routine.
Definition: multiway_merge.h:1225
int omp_get_max_threads(void) __GOMP_NOTHROW
void multiway_merge_exact_splitting(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _DifferenceType __length, _DifferenceType __total_length, _Compare __comp, std::vector< std::pair< _DifferenceType, _DifferenceType > > *__pieces)
Exact splitting for parallel multiway-merge routine.
Definition: multiway_merge.h:1120
template<typename _RAIter , typename _Compare >
void __gnu_parallel::__parallel_nth_element ( _RAIter  __begin,
_RAIter  __nth,
_RAIter  __end,
_Compare  __comp 
)

Parallel implementation of std::nth_element().

Parameters
__beginBegin iterator of input sequence.
__nth_Iterator of element that must be in position afterwards.
__endEnd iterator of input sequence.
__compComparator.
334  {
335  typedef std::iterator_traits<_RAIter> _TraitsType;
336  typedef typename _TraitsType::value_type _ValueType;
337  typedef typename _TraitsType::difference_type _DifferenceType;
338 
339  _GLIBCXX_CALL(__end - __begin)
340 
341  _RAIter __split;
342  _RandomNumber __rng;
343 
344  const _Settings& __s = _Settings::get();
345  _DifferenceType __minimum_length = std::max<_DifferenceType>(2,
346  std::max(__s.nth_element_minimal_n, __s.partition_minimal_n));
347 
348  // Break if input range to small.
349  while (static_cast<_SequenceIndex>(__end - __begin) >= __minimum_length)
350  {
351  _DifferenceType __n = __end - __begin;
352 
353  _RAIter __pivot_pos = __begin + __rng(__n);
354 
355  // Swap __pivot_pos value to end.
356  if (__pivot_pos != (__end - 1))
357  std::iter_swap(__pivot_pos, __end - 1);
358  __pivot_pos = __end - 1;
359 
360  // _Compare must have first_value_type, second_value_type,
361  // result_type
362  // _Compare ==
363  // __gnu_parallel::_Lexicographic<S, int,
364  // __gnu_parallel::_Less<S, S> >
365  // __pivot_pos == std::pair<S, int>*
367  __pred(__comp, *__pivot_pos);
368 
369  // Divide, leave pivot unchanged in last place.
370  _RAIter __split_pos1, __split_pos2;
371  __split_pos1 = __begin + __parallel_partition(__begin, __end - 1,
372  __pred,
374 
375  // Left side: < __pivot_pos; __right side: >= __pivot_pos
376 
377  // Swap pivot back to middle.
378  if (__split_pos1 != __pivot_pos)
379  std::iter_swap(__split_pos1, __pivot_pos);
380  __pivot_pos = __split_pos1;
381 
382  // In case all elements are equal, __split_pos1 == 0
383  if ((__split_pos1 + 1 - __begin) < (__n >> 7)
384  || (__end - __split_pos1) < (__n >> 7))
385  {
386  // Very unequal split, one part smaller than one 128th
387  // elements not strictly larger than the pivot.
389  __binder1st<_Compare, _ValueType,
390  _ValueType, bool>, _ValueType>
391  __pred(__gnu_parallel::__binder1st<_Compare, _ValueType,
392  _ValueType, bool>(__comp, *__pivot_pos));
393 
394  // Find other end of pivot-equal range.
395  __split_pos2 = __gnu_sequential::partition(__split_pos1 + 1,
396  __end, __pred);
397  }
398  else
399  // Only skip the pivot.
400  __split_pos2 = __split_pos1 + 1;
401 
402  // Compare iterators.
403  if (__split_pos2 <= __nth)
404  __begin = __split_pos2;
405  else if (__nth < __split_pos1)
406  __end = __split_pos1;
407  else
408  break;
409  }
410 
411  // Only at most _Settings::partition_minimal_n __elements __left.
412  __gnu_sequential::nth_element(__begin, __nth, __end, __comp);
413  }
Similar to std::unary_negate, but giving the argument types explicitly.
Definition: base.h:173
std::iterator_traits< _RAIter >::difference_type __parallel_partition(_RAIter __begin, _RAIter __end, _Predicate __pred, _ThreadIndex __num_threads)
Parallel implementation of std::partition.
Definition: partition.h:56
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
Similar to std::binder2nd, but giving the argument types explicitly.
Definition: base.h:220
const _Tp & max(const _Tp &__a, const _Tp &__b)
Equivalent to std::max.
Definition: base.h:150
uint64_t _SequenceIndex
Unsigned integer to index __elements. The total number of elements for each algorithm must fit into t...
Definition: types.h:117
_ThreadIndex __get_max_threads()
Definition: base.h:85
Similar to std::binder1st, but giving the argument types explicitly.
Definition: base.h:192
template<typename _RAIter , typename _Compare >
void __gnu_parallel::__parallel_partial_sort ( _RAIter  __begin,
_RAIter  __middle,
_RAIter  __end,
_Compare  __comp 
)

Parallel implementation of std::partial_sort().

Parameters
__beginBegin iterator of input sequence.
__middleSort until this position.
__endEnd iterator of input sequence.
__compComparator.
425  {
426  __parallel_nth_element(__begin, __middle, __end, __comp);
427  std::sort(__begin, __middle, __comp);
428  }
void __parallel_nth_element(_RAIter __begin, _RAIter __nth, _RAIter __end, _Compare __comp)
Parallel implementation of std::nth_element().
Definition: partition.h:332
template<typename _IIter , typename _OutputIterator , typename _BinaryOperation >
_OutputIterator __gnu_parallel::__parallel_partial_sum ( _IIter  __begin,
_IIter  __end,
_OutputIterator  __result,
_BinaryOperation  __bin_op 
)

Parallel partial sum front-__end.

Parameters
__beginBegin iterator of input sequence.
__endEnd iterator of input sequence.
__resultBegin iterator of output sequence.
__bin_opAssociative binary function.
Returns
End iterator of output sequence.
207  {
208  _GLIBCXX_CALL(__begin - __end)
209 
210  typedef std::iterator_traits<_IIter> _TraitsType;
211  typedef typename _TraitsType::value_type _ValueType;
212  typedef typename _TraitsType::difference_type _DifferenceType;
213 
214  _DifferenceType __n = __end - __begin;
215 
216  switch (_Settings::get().partial_sum_algorithm)
217  {
218  case LINEAR:
219  // Need an initial offset.
220  return __parallel_partial_sum_linear(__begin, __end, __result,
221  __bin_op, __n);
222  default:
223  // Partial_sum algorithm not implemented.
225  return __result + __n;
226  }
227  }
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
_OutputIterator __parallel_partial_sum_linear(_IIter __begin, _IIter __end, _OutputIterator __result, _BinaryOperation __bin_op, typename std::iterator_traits< _IIter >::difference_type __n)
Parallel partial sum implementation, two-phase approach, no recursion.
Definition: partial_sum.h:89
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
Definition: types.h:94
template<typename _IIter , typename _OutputIterator , typename _BinaryOperation >
_OutputIterator __gnu_parallel::__parallel_partial_sum_basecase ( _IIter  __begin,
_IIter  __end,
_OutputIterator  __result,
_BinaryOperation  __bin_op,
typename std::iterator_traits< _IIter >::value_type  __value 
)

Base case prefix sum routine.

Parameters
__beginBegin iterator of input sequence.
__endEnd iterator of input sequence.
__resultBegin iterator of output sequence.
__bin_opAssociative binary function.
__valueStart value. Must be passed since the neutral element is unknown in general.
Returns
End iterator of output sequence.
62  {
63  if (__begin == __end)
64  return __result;
65 
66  while (__begin != __end)
67  {
68  __value = __bin_op(__value, *__begin);
69  *__result = __value;
70  ++__result;
71  ++__begin;
72  }
73  return __result;
74  }
template<typename _IIter , typename _OutputIterator , typename _BinaryOperation >
_OutputIterator __gnu_parallel::__parallel_partial_sum_linear ( _IIter  __begin,
_IIter  __end,
_OutputIterator  __result,
_BinaryOperation  __bin_op,
typename std::iterator_traits< _IIter >::difference_type  __n 
)

Parallel partial sum implementation, two-phase approach, no recursion.

Parameters
__beginBegin iterator of input sequence.
__endEnd iterator of input sequence.
__resultBegin iterator of output sequence.
__bin_opAssociative binary function.
__nLength of sequence.
Returns
End iterator of output sequence.
93  {
94  typedef std::iterator_traits<_IIter> _TraitsType;
95  typedef typename _TraitsType::value_type _ValueType;
96  typedef typename _TraitsType::difference_type _DifferenceType;
97 
98  if (__begin == __end)
99  return __result;
100 
101  _ThreadIndex __num_threads =
102  std::min<_DifferenceType>(__get_max_threads(), __n - 1);
103 
104  if (__num_threads < 2)
105  {
106  *__result = *__begin;
107  return __parallel_partial_sum_basecase(__begin + 1, __end,
108  __result + 1, __bin_op,
109  *__begin);
110  }
111 
112  _DifferenceType* __borders;
113  _ValueType* __sums;
114 
115  const _Settings& __s = _Settings::get();
116 
117 # pragma omp parallel num_threads(__num_threads)
118  {
119 # pragma omp single
120  {
121  __num_threads = omp_get_num_threads();
122 
123  __borders = new _DifferenceType[__num_threads + 2];
124 
125  if (__s.partial_sum_dilation == 1.0f)
126  __equally_split(__n, __num_threads + 1, __borders);
127  else
128  {
129  _DifferenceType __first_part_length =
130  std::max<_DifferenceType>(1,
131  __n / (1.0f + __s.partial_sum_dilation * __num_threads));
132  _DifferenceType __chunk_length =
133  (__n - __first_part_length) / __num_threads;
134  _DifferenceType __borderstart =
135  __n - __num_threads * __chunk_length;
136  __borders[0] = 0;
137  for (_ThreadIndex __i = 1; __i < (__num_threads + 1); ++__i)
138  {
139  __borders[__i] = __borderstart;
140  __borderstart += __chunk_length;
141  }
142  __borders[__num_threads + 1] = __n;
143  }
144 
145  __sums = static_cast<_ValueType*>(::operator new(sizeof(_ValueType)
146  * __num_threads));
147  _OutputIterator __target_end;
148  } //single
149 
151  if (__iam == 0)
152  {
153  *__result = *__begin;
155  __begin + __borders[1],
156  __result + 1,
157  __bin_op, *__begin);
158  ::new(&(__sums[__iam])) _ValueType(*(__result + __borders[1] - 1));
159  }
160  else
161  {
162  ::new(&(__sums[__iam]))
163  _ValueType(__gnu_parallel::accumulate(
164  __begin + __borders[__iam] + 1,
165  __begin + __borders[__iam + 1],
166  *(__begin + __borders[__iam]),
167  __bin_op,
169  }
170 
171 # pragma omp barrier
172 
173 # pragma omp single
174  __parallel_partial_sum_basecase(__sums + 1, __sums + __num_threads,
175  __sums + 1, __bin_op, __sums[0]);
176 
177 # pragma omp barrier
178 
179  // Still same team.
180  __parallel_partial_sum_basecase(__begin + __borders[__iam + 1],
181  __begin + __borders[__iam + 2],
182  __result + __borders[__iam + 1],
183  __bin_op, __sums[__iam]);
184  } //parallel
185 
186  for (_ThreadIndex __i = 0; __i < __num_threads; ++__i)
187  __sums[__i].~_ValueType();
188  ::operator delete(__sums);
189 
190  delete[] __borders;
191 
192  return __result + __n;
193  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
int omp_get_num_threads(void) __GOMP_NOTHROW
_OutputIterator __parallel_partial_sum_basecase(_IIter __begin, _IIter __end, _OutputIterator __result, _BinaryOperation __bin_op, typename std::iterator_traits< _IIter >::value_type __value)
Base case prefix sum routine.
Definition: partial_sum.h:58
_OutputIterator __equally_split(_DifferenceType __n, _ThreadIndex __num_threads, _OutputIterator __s)
function to split a sequence into parts of almost equal size.
Definition: equally_split.h:48
Forces sequential execution at compile time.
Definition: tags.h:42
int omp_get_thread_num(void) __GOMP_NOTHROW
_ThreadIndex __get_max_threads()
Definition: base.h:85
template<typename _RAIter , typename _Predicate >
std::iterator_traits<_RAIter>::difference_type __gnu_parallel::__parallel_partition ( _RAIter  __begin,
_RAIter  __end,
_Predicate  __pred,
_ThreadIndex  __num_threads 
)

Parallel implementation of std::partition.

Parameters
__beginBegin iterator of input sequence to split.
__endEnd iterator of input sequence to split.
__predPartition predicate, possibly including some kind of pivot.
__num_threadsMaximum number of threads to use for this task.
Returns
Number of elements not fulfilling the predicate.
58  {
59  typedef std::iterator_traits<_RAIter> _TraitsType;
60  typedef typename _TraitsType::value_type _ValueType;
61  typedef typename _TraitsType::difference_type _DifferenceType;
62 
63  _DifferenceType __n = __end - __begin;
64 
65  _GLIBCXX_CALL(__n)
66 
67  const _Settings& __s = _Settings::get();
68 
69  // shared
70  _GLIBCXX_VOLATILE _DifferenceType __left = 0, __right = __n - 1,
71  __dist = __n,
72  __leftover_left, __leftover_right,
73  __leftnew, __rightnew;
74 
75  // just 0 or 1, but int to allow atomic operations
76  int* __reserved_left = 0, * __reserved_right = 0;
77 
78  _DifferenceType __chunk_size = __s.partition_chunk_size;
79 
80  //at least two chunks per thread
81  if (__dist >= 2 * __num_threads * __chunk_size)
82 # pragma omp parallel num_threads(__num_threads)
83  {
84 # pragma omp single
85  {
86  __num_threads = omp_get_num_threads();
87  __reserved_left = new int[__num_threads];
88  __reserved_right = new int[__num_threads];
89 
90  if (__s.partition_chunk_share > 0.0)
91  __chunk_size = std::max<_DifferenceType>
92  (__s.partition_chunk_size, (double)__n
93  * __s.partition_chunk_share / (double)__num_threads);
94  else
95  __chunk_size = __s.partition_chunk_size;
96  }
97 
98  while (__dist >= 2 * __num_threads * __chunk_size)
99  {
100 # pragma omp single
101  {
102  _DifferenceType __num_chunks = __dist / __chunk_size;
103 
104  for (_ThreadIndex __r = 0; __r < __num_threads; ++__r)
105  {
106  __reserved_left [__r] = 0; // false
107  __reserved_right[__r] = 0; // false
108  }
109  __leftover_left = 0;
110  __leftover_right = 0;
111  } //implicit barrier
112 
113  // Private.
114  _DifferenceType __thread_left, __thread_left_border,
115  __thread_right, __thread_right_border;
116 
117  __thread_left = __left + 1;
118  // Just to satisfy the condition below.
119  __thread_left_border = __thread_left - 1;
120 
121  __thread_right = __n - 1;
122  // Just to satisfy the condition below.
123  __thread_right_border = __thread_right + 1;
124 
125  bool __iam_finished = false;
126  while (!__iam_finished)
127  {
128  if (__thread_left > __thread_left_border)
129  {
130  _DifferenceType __former_dist =
131  __fetch_and_add(&__dist, -__chunk_size);
132  if (__former_dist < __chunk_size)
133  {
134  __fetch_and_add(&__dist, __chunk_size);
135  __iam_finished = true;
136  break;
137  }
138  else
139  {
140  __thread_left =
141  __fetch_and_add(&__left, __chunk_size);
142  __thread_left_border =
143  __thread_left + (__chunk_size - 1);
144  }
145  }
146 
147  if (__thread_right < __thread_right_border)
148  {
149  _DifferenceType __former_dist =
150  __fetch_and_add(&__dist, -__chunk_size);
151  if (__former_dist < __chunk_size)
152  {
153  __fetch_and_add(&__dist, __chunk_size);
154  __iam_finished = true;
155  break;
156  }
157  else
158  {
159  __thread_right =
160  __fetch_and_add(&__right, -__chunk_size);
161  __thread_right_border =
162  __thread_right - (__chunk_size - 1);
163  }
164  }
165 
166  // Swap as usual.
167  while (__thread_left < __thread_right)
168  {
169  while (__pred(__begin[__thread_left])
170  && __thread_left <= __thread_left_border)
171  ++__thread_left;
172  while (!__pred(__begin[__thread_right])
173  && __thread_right >= __thread_right_border)
174  --__thread_right;
175 
176  if (__thread_left > __thread_left_border
177  || __thread_right < __thread_right_border)
178  // Fetch new chunk(__s).
179  break;
180 
181  std::iter_swap(__begin + __thread_left,
182  __begin + __thread_right);
183  ++__thread_left;
184  --__thread_right;
185  }
186  }
187 
188  // Now swap the leftover chunks to the right places.
189  if (__thread_left <= __thread_left_border)
190 # pragma omp atomic
191  ++__leftover_left;
192  if (__thread_right >= __thread_right_border)
193 # pragma omp atomic
194  ++__leftover_right;
195 
196 # pragma omp barrier
197 
198  _DifferenceType
199  __leftold = __left,
200  __leftnew = __left - __leftover_left * __chunk_size,
201  __rightold = __right,
202  __rightnew = __right + __leftover_right * __chunk_size;
203 
204  // <=> __thread_left_border + (__chunk_size - 1) >= __leftnew
205  if (__thread_left <= __thread_left_border
206  && __thread_left_border >= __leftnew)
207  {
208  // Chunk already in place, reserve spot.
209  __reserved_left[(__left - (__thread_left_border + 1))
210  / __chunk_size] = 1;
211  }
212 
213  // <=> __thread_right_border - (__chunk_size - 1) <= __rightnew
214  if (__thread_right >= __thread_right_border
215  && __thread_right_border <= __rightnew)
216  {
217  // Chunk already in place, reserve spot.
218  __reserved_right[((__thread_right_border - 1) - __right)
219  / __chunk_size] = 1;
220  }
221 
222 # pragma omp barrier
223 
224  if (__thread_left <= __thread_left_border
225  && __thread_left_border < __leftnew)
226  {
227  // Find spot and swap.
228  _DifferenceType __swapstart = -1;
229  for (int __r = 0; __r < __leftover_left; ++__r)
230  if (__reserved_left[__r] == 0
231  && __compare_and_swap(&(__reserved_left[__r]), 0, 1))
232  {
233  __swapstart = __leftold - (__r + 1) * __chunk_size;
234  break;
235  }
236 
237 #if _GLIBCXX_ASSERTIONS
238  _GLIBCXX_PARALLEL_ASSERT(__swapstart != -1);
239 #endif
240 
241  std::swap_ranges(__begin + __thread_left_border
242  - (__chunk_size - 1),
243  __begin + __thread_left_border + 1,
244  __begin + __swapstart);
245  }
246 
247  if (__thread_right >= __thread_right_border
248  && __thread_right_border > __rightnew)
249  {
250  // Find spot and swap
251  _DifferenceType __swapstart = -1;
252  for (int __r = 0; __r < __leftover_right; ++__r)
253  if (__reserved_right[__r] == 0
254  && __compare_and_swap(&(__reserved_right[__r]), 0, 1))
255  {
256  __swapstart = __rightold + __r * __chunk_size + 1;
257  break;
258  }
259 
260 #if _GLIBCXX_ASSERTIONS
261  _GLIBCXX_PARALLEL_ASSERT(__swapstart != -1);
262 #endif
263 
264  std::swap_ranges(__begin + __thread_right_border,
265  __begin + __thread_right_border
266  + __chunk_size, __begin + __swapstart);
267  }
268 #if _GLIBCXX_ASSERTIONS
269 # pragma omp barrier
270 
271 # pragma omp single
272  {
273  for (_DifferenceType __r = 0; __r < __leftover_left; ++__r)
274  _GLIBCXX_PARALLEL_ASSERT(__reserved_left[__r] == 1);
275  for (_DifferenceType __r = 0; __r < __leftover_right; ++__r)
276  _GLIBCXX_PARALLEL_ASSERT(__reserved_right[__r] == 1);
277  }
278 #endif
279 
280  __left = __leftnew;
281  __right = __rightnew;
282  __dist = __right - __left + 1;
283  }
284 
285 # pragma omp flush(__left, __right)
286  } // end "recursion" //parallel
287 
288  _DifferenceType __final_left = __left, __final_right = __right;
289 
290  while (__final_left < __final_right)
291  {
292  // Go right until key is geq than pivot.
293  while (__pred(__begin[__final_left])
294  && __final_left < __final_right)
295  ++__final_left;
296 
297  // Go left until key is less than pivot.
298  while (!__pred(__begin[__final_right])
299  && __final_left < __final_right)
300  --__final_right;
301 
302  if (__final_left == __final_right)
303  break;
304  std::iter_swap(__begin + __final_left, __begin + __final_right);
305  ++__final_left;
306  --__final_right;
307  }
308 
309  // All elements on the left side are < piv, all elements on the
310  // right are >= piv
311  delete[] __reserved_left;
312  delete[] __reserved_right;
313 
314  // Element "between" __final_left and __final_right might not have
315  // been regarded yet
316  if (__final_left < __n && !__pred(__begin[__final_left]))
317  // Really swapped.
318  return __final_left;
319  else
320  return __final_left + 1;
321  }
_Tp __fetch_and_add(volatile _Tp *__ptr, _Tp __addend)
Add a value to a variable, atomically.
Definition: compatibility.h:74
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define _GLIBCXX_VOLATILE
Decide whether to declare certain variables volatile.
Definition: partition.h:43
int omp_get_num_threads(void) __GOMP_NOTHROW
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
bool __compare_and_swap(volatile _Tp *__ptr, _Tp __comparand, _Tp __replacement)
Compare-and-swap.
Definition: compatibility.h:108
template<typename _RAIter , typename _RandomNumberGenerator >
void __gnu_parallel::__parallel_random_shuffle ( _RAIter  __begin,
_RAIter  __end,
_RandomNumberGenerator  __rng = _RandomNumber() 
)
inline

Parallel random public call.

Parameters
__beginBegin iterator of sequence.
__endEnd iterator of sequence.
__rngRandom number generator to use.
524  {
525  typedef std::iterator_traits<_RAIter> _TraitsType;
526  typedef typename _TraitsType::difference_type _DifferenceType;
527  _DifferenceType __n = __end - __begin;
528  __parallel_random_shuffle_drs(__begin, __end, __n,
529  __get_max_threads(), __rng);
530  }
void __parallel_random_shuffle_drs(_RAIter __begin, _RAIter __end, typename std::iterator_traits< _RAIter >::difference_type __n, _ThreadIndex __num_threads, _RandomNumberGenerator &__rng)
Main parallel random shuffle step.
Definition: random_shuffle.h:265
_ThreadIndex __get_max_threads()
Definition: base.h:85
template<typename _RAIter , typename _RandomNumberGenerator >
void __gnu_parallel::__parallel_random_shuffle_drs ( _RAIter  __begin,
_RAIter  __end,
typename std::iterator_traits< _RAIter >::difference_type  __n,
_ThreadIndex  __num_threads,
_RandomNumberGenerator &  __rng 
)

Main parallel random shuffle step.

Parameters
__beginBegin iterator of sequence.
__endEnd iterator of sequence.
__nLength of sequence.
__num_threadsNumber of threads to use.
__rngRandom number generator to use.
270  {
271  typedef std::iterator_traits<_RAIter> _TraitsType;
272  typedef typename _TraitsType::value_type _ValueType;
273  typedef typename _TraitsType::difference_type _DifferenceType;
274 
275  _GLIBCXX_CALL(__n)
276 
277  const _Settings& __s = _Settings::get();
278 
279  if (__num_threads > __n)
280  __num_threads = static_cast<_ThreadIndex>(__n);
281 
282  _BinIndex __num_bins, __num_bins_cache;
283 
284 #if _GLIBCXX_RANDOM_SHUFFLE_CONSIDER_L1
285  // Try the L1 cache first.
286 
287  // Must fit into L1.
288  __num_bins_cache =
289  std::max<_DifferenceType>(1, __n / (__s.L1_cache_size_lb
290  / sizeof(_ValueType)));
291  __num_bins_cache = __round_up_to_pow2(__num_bins_cache);
292 
293  // No more buckets than TLB entries, power of 2
294  // Power of 2 and at least one element per bin, at most the TLB size.
295  __num_bins = std::min<_DifferenceType>(__n, __num_bins_cache);
296 
297 #if _GLIBCXX_RANDOM_SHUFFLE_CONSIDER_TLB
298  // 2 TLB entries needed per bin.
299  __num_bins = std::min<_DifferenceType>(__s.TLB_size / 2, __num_bins);
300 #endif
301  __num_bins = __round_up_to_pow2(__num_bins);
302 
303  if (__num_bins < __num_bins_cache)
304  {
305 #endif
306  // Now try the L2 cache
307  // Must fit into L2
308  __num_bins_cache = static_cast<_BinIndex>
309  (std::max<_DifferenceType>(1, __n / (__s.L2_cache_size
310  / sizeof(_ValueType))));
311  __num_bins_cache = __round_up_to_pow2(__num_bins_cache);
312 
313  // No more buckets than TLB entries, power of 2.
314  __num_bins = static_cast<_BinIndex>
315  (std::min(__n, static_cast<_DifferenceType>(__num_bins_cache)));
316  // Power of 2 and at least one element per bin, at most the TLB size.
317 #if _GLIBCXX_RANDOM_SHUFFLE_CONSIDER_TLB
318  // 2 TLB entries needed per bin.
319  __num_bins = std::min(static_cast<_DifferenceType>(__s.TLB_size / 2),
320  __num_bins);
321 #endif
322  __num_bins = __round_up_to_pow2(__num_bins);
323 #if _GLIBCXX_RANDOM_SHUFFLE_CONSIDER_L1
324  }
325 #endif
326 
327  __num_bins = __round_up_to_pow2(
328  std::max<_BinIndex>(__num_threads, __num_bins));
329 
330  if (__num_threads <= 1)
331  {
332  _RandomNumber __derived_rng(
334  __sequential_random_shuffle(__begin, __end, __derived_rng);
335  return;
336  }
337 
338  _DRandomShufflingGlobalData<_RAIter> __sd(__begin);
339  _DRSSorterPU<_RAIter, _RandomNumber >* __pus;
340  _DifferenceType* __starts;
341 
342 # pragma omp parallel num_threads(__num_threads)
343  {
344  _ThreadIndex __num_threads = omp_get_num_threads();
345 # pragma omp single
346  {
347  __pus = new _DRSSorterPU<_RAIter, _RandomNumber>[__num_threads];
348 
349  __sd._M_temporaries = new _ValueType*[__num_threads];
350  __sd._M_dist = new _DifferenceType*[__num_bins + 1];
351  __sd._M_bin_proc = new _ThreadIndex[__num_bins];
352  for (_BinIndex __b = 0; __b < __num_bins + 1; ++__b)
353  __sd._M_dist[__b] = new _DifferenceType[__num_threads + 1];
354  for (_BinIndex __b = 0; __b < (__num_bins + 1); ++__b)
355  {
356  __sd._M_dist[0][0] = 0;
357  __sd._M_dist[__b][0] = 0;
358  }
359  __starts = __sd._M_starts = new _DifferenceType[__num_threads + 1];
360  int __bin_cursor = 0;
361  __sd._M_num_bins = __num_bins;
362  __sd._M_num_bits = __rd_log2(__num_bins);
363 
364  _DifferenceType __chunk_length = __n / __num_threads,
365  __split = __n % __num_threads,
366  __start = 0;
367  _DifferenceType __bin_chunk_length = __num_bins / __num_threads,
368  __bin_split = __num_bins % __num_threads;
369  for (_ThreadIndex __i = 0; __i < __num_threads; ++__i)
370  {
371  __starts[__i] = __start;
372  __start += (__i < __split
373  ? (__chunk_length + 1) : __chunk_length);
374  int __j = __pus[__i]._M_bins_begin = __bin_cursor;
375 
376  // Range of bins for this processor.
377  __bin_cursor += (__i < __bin_split
378  ? (__bin_chunk_length + 1)
379  : __bin_chunk_length);
380  __pus[__i].__bins_end = __bin_cursor;
381  for (; __j < __bin_cursor; ++__j)
382  __sd._M_bin_proc[__j] = __i;
383  __pus[__i]._M_num_threads = __num_threads;
384  __pus[__i]._M_seed = __rng(std::numeric_limits<uint32_t>::max());
385  __pus[__i]._M_sd = &__sd;
386  }
387  __starts[__num_threads] = __start;
388  } //single
389  // Now shuffle in parallel.
391  } // parallel
392 
393  delete[] __starts;
394  delete[] __sd._M_bin_proc;
395  for (int __s = 0; __s < (__num_bins + 1); ++__s)
396  delete[] __sd._M_dist[__s];
397  delete[] __sd._M_dist;
398  delete[] __sd._M_temporaries;
399 
400  delete[] __pus;
401  }
void __sequential_random_shuffle(_RAIter __begin, _RAIter __end, _RandomNumberGenerator &__rng)
Sequential cache-efficient random shuffle.
Definition: random_shuffle.h:410
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
unsigned short _BinIndex
Type to hold the index of a bin.
Definition: random_shuffle.h:47
void __parallel_random_shuffle_drs_pu(_DRSSorterPU< _RAIter, _RandomNumberGenerator > *__pus)
Random shuffle code executed by each thread.
Definition: random_shuffle.h:122
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
int omp_get_num_threads(void) __GOMP_NOTHROW
const _Tp & min(const _Tp &__a, const _Tp &__b)
Equivalent to std::min.
Definition: base.h:144
_Size __rd_log2(_Size __n)
Calculates the rounded-down logarithm of __n for base 2.
Definition: base.h:102
_Tp __round_up_to_pow2(_Tp __x)
Round up to the next greater power of 2.
Definition: random_shuffle.h:248
const _Tp & max(const _Tp &__a, const _Tp &__b)
Equivalent to std::max.
Definition: base.h:150
template<typename _RAIter , typename _RandomNumberGenerator >
void __gnu_parallel::__parallel_random_shuffle_drs_pu ( _DRSSorterPU< _RAIter, _RandomNumberGenerator > *  __pus)

Random shuffle code executed by each thread.

Parameters
__pusArray of thread-local data records.
124  {
125  typedef std::iterator_traits<_RAIter> _TraitsType;
126  typedef typename _TraitsType::value_type _ValueType;
127  typedef typename _TraitsType::difference_type _DifferenceType;
128 
130  _DRSSorterPU<_RAIter, _RandomNumberGenerator>* __d = &__pus[__iam];
131  _DRandomShufflingGlobalData<_RAIter>* __sd = __d->_M_sd;
132 
133  // Indexing: _M_dist[bin][processor]
134  _DifferenceType __length = (__sd->_M_starts[__iam + 1]
135  - __sd->_M_starts[__iam]);
136  _BinIndex* __oracles = new _BinIndex[__length];
137  _DifferenceType* __dist = new _DifferenceType[__sd->_M_num_bins + 1];
138  _BinIndex* __bin_proc = new _BinIndex[__sd->_M_num_bins];
139  _ValueType** __temporaries = new _ValueType*[__d->_M_num_threads];
140 
141  // Compute oracles and count appearances.
142  for (_BinIndex __b = 0; __b < __sd->_M_num_bins + 1; ++__b)
143  __dist[__b] = 0;
144  int __num_bits = __sd->_M_num_bits;
145 
146  _RandomNumber __rng(__d->_M_seed);
147 
148  // First main loop.
149  for (_DifferenceType __i = 0; __i < __length; ++__i)
150  {
151  _BinIndex __oracle = __random_number_pow2(__num_bits, __rng);
152  __oracles[__i] = __oracle;
153 
154  // To allow prefix (partial) sum.
155  ++(__dist[__oracle + 1]);
156  }
157 
158  for (_BinIndex __b = 0; __b < __sd->_M_num_bins + 1; ++__b)
159  __sd->_M_dist[__b][__iam + 1] = __dist[__b];
160 
161 # pragma omp barrier
162 
163 # pragma omp single
164  {
165  // Sum up bins, __sd->_M_dist[__s + 1][__d->_M_num_threads] now
166  // contains the total number of items in bin __s
167  for (_BinIndex __s = 0; __s < __sd->_M_num_bins; ++__s)
168  __gnu_sequential::partial_sum(__sd->_M_dist[__s + 1],
169  __sd->_M_dist[__s + 1]
170  + __d->_M_num_threads + 1,
171  __sd->_M_dist[__s + 1]);
172  }
173 
174 # pragma omp barrier
175 
176  _SequenceIndex __offset = 0, __global_offset = 0;
177  for (_BinIndex __s = 0; __s < __d->_M_bins_begin; ++__s)
178  __global_offset += __sd->_M_dist[__s + 1][__d->_M_num_threads];
179 
180 # pragma omp barrier
181 
182  for (_BinIndex __s = __d->_M_bins_begin; __s < __d->__bins_end; ++__s)
183  {
184  for (int __t = 0; __t < __d->_M_num_threads + 1; ++__t)
185  __sd->_M_dist[__s + 1][__t] += __offset;
186  __offset = __sd->_M_dist[__s + 1][__d->_M_num_threads];
187  }
188 
189  __sd->_M_temporaries[__iam] = static_cast<_ValueType*>
190  (::operator new(sizeof(_ValueType) * __offset));
191 
192 # pragma omp barrier
193 
194  // Draw local copies to avoid false sharing.
195  for (_BinIndex __b = 0; __b < __sd->_M_num_bins + 1; ++__b)
196  __dist[__b] = __sd->_M_dist[__b][__iam];
197  for (_BinIndex __b = 0; __b < __sd->_M_num_bins; ++__b)
198  __bin_proc[__b] = __sd->_M_bin_proc[__b];
199  for (_ThreadIndex __t = 0; __t < __d->_M_num_threads; ++__t)
200  __temporaries[__t] = __sd->_M_temporaries[__t];
201 
202  _RAIter __source = __sd->_M_source;
203  _DifferenceType __start = __sd->_M_starts[__iam];
204 
205  // Distribute according to oracles, second main loop.
206  for (_DifferenceType __i = 0; __i < __length; ++__i)
207  {
208  _BinIndex __target_bin = __oracles[__i];
209  _ThreadIndex __target_p = __bin_proc[__target_bin];
210 
211  // Last column [__d->_M_num_threads] stays unchanged.
212  ::new(&(__temporaries[__target_p][__dist[__target_bin + 1]++]))
213  _ValueType(*(__source + __i + __start));
214  }
215 
216  delete[] __oracles;
217  delete[] __dist;
218  delete[] __bin_proc;
219  delete[] __temporaries;
220 
221 # pragma omp barrier
222 
223  // Shuffle bins internally.
224  for (_BinIndex __b = __d->_M_bins_begin; __b < __d->__bins_end; ++__b)
225  {
226  _ValueType* __begin =
227  (__sd->_M_temporaries[__iam]
228  + (__b == __d->_M_bins_begin
229  ? 0 : __sd->_M_dist[__b][__d->_M_num_threads])),
230  *__end = (__sd->_M_temporaries[__iam]
231  + __sd->_M_dist[__b + 1][__d->_M_num_threads]);
232 
233  __sequential_random_shuffle(__begin, __end, __rng);
234  std::copy(__begin, __end, __sd->_M_source + __global_offset
235  + (__b == __d->_M_bins_begin
236  ? 0 : __sd->_M_dist[__b][__d->_M_num_threads]));
237  }
238 
239  for (_SequenceIndex __i = 0; __i < __offset; ++__i)
240  __sd->_M_temporaries[__iam][__i].~_ValueType();
241  ::operator delete(__sd->_M_temporaries[__iam]);
242  }
void __sequential_random_shuffle(_RAIter __begin, _RAIter __end, _RandomNumberGenerator &__rng)
Sequential cache-efficient random shuffle.
Definition: random_shuffle.h:410
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
unsigned short _BinIndex
Type to hold the index of a bin.
Definition: random_shuffle.h:47
int __random_number_pow2(int __logp, _RandomNumberGenerator &__rng)
Generate a random number in [0,2^__logp).
Definition: random_shuffle.h:115
uint64_t _SequenceIndex
Unsigned integer to index __elements. The total number of elements for each algorithm must fit into t...
Definition: types.h:117
int omp_get_thread_num(void) __GOMP_NOTHROW
template<typename _IIter , typename _OutputIterator , typename _Compare >
_OutputIterator __gnu_parallel::__parallel_set_difference ( _IIter  __begin1,
_IIter  __end1,
_IIter  __begin2,
_IIter  __end2,
_OutputIterator  __result,
_Compare  __comp 
)
inline
506  {
507  return __parallel_set_operation(__begin1, __end1, __begin2, __end2,
508  __result,
509  __difference_func<_IIter,
510  _OutputIterator, _Compare>(__comp));
511  }
_OutputIterator __parallel_set_operation(_IIter __begin1, _IIter __end1, _IIter __begin2, _IIter __end2, _OutputIterator __result, _Operation __op)
Definition: set_operations.h:350
template<typename _IIter , typename _OutputIterator , typename _Compare >
_OutputIterator __gnu_parallel::__parallel_set_intersection ( _IIter  __begin1,
_IIter  __end1,
_IIter  __begin2,
_IIter  __end2,
_OutputIterator  __result,
_Compare  __comp 
)
inline
492  {
493  return __parallel_set_operation(__begin1, __end1, __begin2, __end2,
494  __result,
495  __intersection_func<_IIter,
496  _OutputIterator, _Compare>(__comp));
497  }
_OutputIterator __parallel_set_operation(_IIter __begin1, _IIter __end1, _IIter __begin2, _IIter __end2, _OutputIterator __result, _Operation __op)
Definition: set_operations.h:350
template<typename _IIter , typename _OutputIterator , typename _Operation >
_OutputIterator __gnu_parallel::__parallel_set_operation ( _IIter  __begin1,
_IIter  __end1,
_IIter  __begin2,
_IIter  __end2,
_OutputIterator  __result,
_Operation  __op 
)
353  {
354  _GLIBCXX_CALL((__end1 - __begin1) + (__end2 - __begin2))
355 
356  typedef std::iterator_traits<_IIter> _TraitsType;
357  typedef typename _TraitsType::difference_type _DifferenceType;
358  typedef typename std::pair<_IIter, _IIter> _IteratorPair;
359 
360  if (__begin1 == __end1)
361  return __op.__first_empty(__begin2, __end2, __result);
362 
363  if (__begin2 == __end2)
364  return __op.__second_empty(__begin1, __end1, __result);
365 
366  const _DifferenceType __size = (__end1 - __begin1) + (__end2 - __begin2);
367 
368  const _IteratorPair __sequence[2] = { std::make_pair(__begin1, __end1),
369  std::make_pair(__begin2, __end2) };
370  _OutputIterator __return_value = __result;
371  _DifferenceType *__borders;
372  _IteratorPair *__block_begins;
373  _DifferenceType* __lengths;
374 
375  _ThreadIndex __num_threads =
376  std::min<_DifferenceType>(__get_max_threads(),
377  std::min(__end1 - __begin1, __end2 - __begin2));
378 
379 # pragma omp parallel num_threads(__num_threads)
380  {
381 # pragma omp single
382  {
383  __num_threads = omp_get_num_threads();
384 
385  __borders = new _DifferenceType[__num_threads + 2];
386  __equally_split(__size, __num_threads + 1, __borders);
387  __block_begins = new _IteratorPair[__num_threads + 1];
388  // Very __start.
389  __block_begins[0] = std::make_pair(__begin1, __begin2);
390  __lengths = new _DifferenceType[__num_threads];
391  } //single
392 
394 
395  // _Result from multiseq_partition.
396  _IIter __offset[2];
397  const _DifferenceType __rank = __borders[__iam + 1];
398 
399  multiseq_partition(__sequence, __sequence + 2,
400  __rank, __offset, __op._M_comp);
401 
402  // allowed to read?
403  // together
404  // *(__offset[ 0 ] - 1) == *__offset[ 1 ]
405  if (__offset[ 0 ] != __begin1 && __offset[1] != __end2
406  && !__op._M_comp(*(__offset[0] - 1), *__offset[1])
407  && !__op._M_comp(*__offset[1], *(__offset[0] - 1)))
408  {
409  // Avoid split between globally equal elements: move one to
410  // front in first sequence.
411  --__offset[0];
412  }
413 
414  _IteratorPair __block_end = __block_begins[__iam + 1] =
415  _IteratorPair(__offset[0], __offset[1]);
416 
417  // Make sure all threads have their block_begin result written out.
418 # pragma omp barrier
419 
420  _IteratorPair __block_begin = __block_begins[__iam];
421 
422  // Begin working for the first block, while the others except
423  // the last start to count.
424  if (__iam == 0)
425  {
426  // The first thread can copy already.
427  __lengths[ __iam ] =
428  __op._M_invoke(__block_begin.first, __block_end.first,
429  __block_begin.second, __block_end.second,
430  __result) - __result;
431  }
432  else
433  {
434  __lengths[ __iam ] =
435  __op.__count(__block_begin.first, __block_end.first,
436  __block_begin.second, __block_end.second);
437  }
438 
439  // Make sure everyone wrote their lengths.
440 # pragma omp barrier
441 
442  _OutputIterator __r = __result;
443 
444  if (__iam == 0)
445  {
446  // Do the last block.
447  for (_ThreadIndex __i = 0; __i < __num_threads; ++__i)
448  __r += __lengths[__i];
449 
450  __block_begin = __block_begins[__num_threads];
451 
452  // Return the result iterator of the last block.
453  __return_value =
454  __op._M_invoke(__block_begin.first, __end1,
455  __block_begin.second, __end2, __r);
456 
457  }
458  else
459  {
460  for (_ThreadIndex __i = 0; __i < __iam; ++__i)
461  __r += __lengths[ __i ];
462 
463  // Reset begins for copy pass.
464  __op._M_invoke(__block_begin.first, __block_end.first,
465  __block_begin.second, __block_end.second, __r);
466  }
467  }
468  return __return_value;
469  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
std::size_t __size(__stack_t __stack)
Definition: profiler_node.h:68
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
int omp_get_num_threads(void) __GOMP_NOTHROW
void multiseq_partition(_RanSeqs __begin_seqs, _RanSeqs __end_seqs, _RankType __rank, _RankIterator __begin_offsets, _Compare __comp=std::less< typename std::iterator_traits< typename std::iterator_traits< _RanSeqs >::value_type::first_type >::value_type >())
Splits several sorted sequences at a certain global __rank, resulting in a splitting point for each s...
Definition: multiseq_selection.h:122
const _Tp & min(const _Tp &__a, const _Tp &__b)
Equivalent to std::min.
Definition: base.h:144
return(unsigned int) __res
_OutputIterator __equally_split(_DifferenceType __n, _ThreadIndex __num_threads, _OutputIterator __s)
function to split a sequence into parts of almost equal size.
Definition: equally_split.h:48
int omp_get_thread_num(void) __GOMP_NOTHROW
_ThreadIndex __get_max_threads()
Definition: base.h:85
template<typename _IIter , typename _OutputIterator , typename _Compare >
_OutputIterator __gnu_parallel::__parallel_set_symmetric_difference ( _IIter  __begin1,
_IIter  __end1,
_IIter  __begin2,
_IIter  __end2,
_OutputIterator  __result,
_Compare  __comp 
)
inline
521  {
522  return __parallel_set_operation(__begin1, __end1, __begin2, __end2,
523  __result,
524  __symmetric_difference_func<_IIter,
525  _OutputIterator, _Compare>(__comp));
526  }
_OutputIterator __parallel_set_operation(_IIter __begin1, _IIter __end1, _IIter __begin2, _IIter __end2, _OutputIterator __result, _Operation __op)
Definition: set_operations.h:350
template<typename _IIter , typename _OutputIterator , typename _Compare >
_OutputIterator __gnu_parallel::__parallel_set_union ( _IIter  __begin1,
_IIter  __end1,
_IIter  __begin2,
_IIter  __end2,
_OutputIterator  __result,
_Compare  __comp 
)
inline
478  {
479  return __parallel_set_operation(__begin1, __end1, __begin2, __end2,
480  __result,
481  __union_func< _IIter, _OutputIterator,
482  _Compare>(__comp));
483  }
_OutputIterator __parallel_set_operation(_IIter __begin1, _IIter __end1, _IIter __begin2, _IIter __end2, _OutputIterator __result, _Operation __op)
Definition: set_operations.h:350
template<bool __stable, typename _RAIter , typename _Compare , typename _Parallelism >
void __gnu_parallel::__parallel_sort ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
_Parallelism  __parallelism 
)
template<bool __stable, typename _RAIter , typename _Compare >
void __gnu_parallel::__parallel_sort ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
multiway_mergesort_tag  __parallelism 
)
inline

Choose multiway mergesort, splitting variant at run-time, for parallel sorting.

Parameters
__beginBegin iterator of input sequence.
__endEnd iterator of input sequence.
__compComparator.
Template Parameters
__stableSort stable.
77  {
78  _GLIBCXX_CALL(__end - __begin)
79 
80  if(_Settings::get().sort_splitting == EXACT)
81  parallel_sort_mwms<__stable, true>
82  (__begin, __end, __comp, __parallelism.__get_num_threads());
83  else
84  parallel_sort_mwms<__stable, false>
85  (__begin, __end, __comp, __parallelism.__get_num_threads());
86  }
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define true
Definition: stdbool.h:34
Definition: types.h:101
void parallel_sort_mwms(_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
PMWMS main call.
Definition: multiway_mergesort.h:395
template<bool __stable, typename _RAIter , typename _Compare >
void __gnu_parallel::__parallel_sort ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
multiway_mergesort_exact_tag  __parallelism 
)
inline

Choose multiway mergesort with exact splitting, for parallel sorting.

Parameters
__beginBegin iterator of input sequence.
__endEnd iterator of input sequence.
__compComparator.
Template Parameters
__stableSort stable.
102  {
103  _GLIBCXX_CALL(__end - __begin)
104 
105  parallel_sort_mwms<__stable, true>
106  (__begin, __end, __comp, __parallelism.__get_num_threads());
107  }
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define true
Definition: stdbool.h:34
void parallel_sort_mwms(_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
PMWMS main call.
Definition: multiway_mergesort.h:395
template<bool __stable, typename _RAIter , typename _Compare >
void __gnu_parallel::__parallel_sort ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
multiway_mergesort_sampling_tag  __parallelism 
)
inline

Choose multiway mergesort with splitting by sampling, for parallel sorting.

Parameters
__beginBegin iterator of input sequence.
__endEnd iterator of input sequence.
__compComparator.
Template Parameters
__stableSort stable.
123  {
124  _GLIBCXX_CALL(__end - __begin)
125 
126  parallel_sort_mwms<__stable, false>
127  (__begin, __end, __comp, __parallelism.__get_num_threads());
128  }
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
void parallel_sort_mwms(_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
PMWMS main call.
Definition: multiway_mergesort.h:395
template<bool __stable, typename _RAIter , typename _Compare >
void __gnu_parallel::__parallel_sort ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
quicksort_tag  __parallelism 
)
inline

Choose quicksort for parallel sorting.

Parameters
__beginBegin iterator of input sequence.
__endEnd iterator of input sequence.
__compComparator.
Template Parameters
__stableSort stable.
142  {
143  _GLIBCXX_CALL(__end - __begin)
144 
145  _GLIBCXX_PARALLEL_ASSERT(__stable == false);
146 
147  __parallel_sort_qs(__begin, __end, __comp,
148  __parallelism.__get_num_threads());
149  }
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
void __parallel_sort_qs(_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
Unbalanced quicksort main call.
Definition: quicksort.h:154
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
template<bool __stable, typename _RAIter , typename _Compare >
void __gnu_parallel::__parallel_sort ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
balanced_quicksort_tag  __parallelism 
)
inline

Choose balanced quicksort for parallel sorting.

Parameters
__beginBegin iterator of input sequence.
__endEnd iterator of input sequence.
__compComparator.
Template Parameters
__stableSort stable.
163  {
164  _GLIBCXX_CALL(__end - __begin)
165 
166  _GLIBCXX_PARALLEL_ASSERT(__stable == false);
167 
168  __parallel_sort_qsb(__begin, __end, __comp,
169  __parallelism.__get_num_threads());
170  }
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
void __parallel_sort_qsb(_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
Top-level quicksort routine.
Definition: balanced_quicksort.h:430
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
template<bool __stable, typename _RAIter , typename _Compare >
void __gnu_parallel::__parallel_sort ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
default_parallel_tag  __parallelism 
)
inline

Choose multiway mergesort with exact splitting, for parallel sorting.

Parameters
__beginBegin iterator of input sequence.
__endEnd iterator of input sequence.
__compComparator.
Template Parameters
__stableSort stable.
185  {
186  _GLIBCXX_CALL(__end - __begin)
187 
188  __parallel_sort<__stable>
189  (__begin, __end, __comp,
190  multiway_mergesort_exact_tag(__parallelism.__get_num_threads()));
191  }
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
void __parallel_sort(_RAIter __begin, _RAIter __end, _Compare __comp, parallel_tag __parallelism)
Choose a parallel sorting algorithm.
Definition: sort.h:203
template<bool __stable, typename _RAIter , typename _Compare >
void __gnu_parallel::__parallel_sort ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
parallel_tag  __parallelism 
)
inline

Choose a parallel sorting algorithm.

Parameters
__beginBegin iterator of input sequence.
__endEnd iterator of input sequence.
__compComparator.
Template Parameters
__stableSort stable.
205  {
206  _GLIBCXX_CALL(__end - __begin)
207  typedef std::iterator_traits<_RAIter> _TraitsType;
208  typedef typename _TraitsType::value_type _ValueType;
209  typedef typename _TraitsType::difference_type _DifferenceType;
210 
211  if (false) ;
212 #if _GLIBCXX_MERGESORT
213  else if (__stable || _Settings::get().sort_algorithm == MWMS)
214  {
215  if(_Settings::get().sort_splitting == EXACT)
216  parallel_sort_mwms<__stable, true>
217  (__begin, __end, __comp, __parallelism.__get_num_threads());
218  else
219  parallel_sort_mwms<false, false>
220  (__begin, __end, __comp, __parallelism.__get_num_threads());
221  }
222 #endif
223 #if _GLIBCXX_QUICKSORT
224  else if (_Settings::get().sort_algorithm == QS)
225  __parallel_sort_qs(__begin, __end, __comp,
226  __parallelism.__get_num_threads());
227 #endif
228 #if _GLIBCXX_BAL_QUICKSORT
229  else if (_Settings::get().sort_algorithm == QS_BALANCED)
230  __parallel_sort_qsb(__begin, __end, __comp,
231  __parallelism.__get_num_threads());
232 #endif
233  else
234  __gnu_sequential::sort(__begin, __end, __comp);
235  }
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
void __parallel_sort_qsb(_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
Top-level quicksort routine.
Definition: balanced_quicksort.h:430
Definition: types.h:79
void __parallel_sort_qs(_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
Unbalanced quicksort main call.
Definition: quicksort.h:154
Definition: types.h:78
Definition: types.h:80
Definition: types.h:101
template<typename _RAIter , typename _Compare >
void __gnu_parallel::__parallel_sort_qs ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
_ThreadIndex  __num_threads 
)

Unbalanced quicksort main call.

Parameters
__beginBegin iterator of input sequence.
__endEnd iterator input sequence, ignored.
__compComparator.
__num_threadsNumber of threads that are allowed to work on this part.
157  {
158  _GLIBCXX_CALL(__n)
159 
160  typedef std::iterator_traits<_RAIter> _TraitsType;
161  typedef typename _TraitsType::value_type _ValueType;
162  typedef typename _TraitsType::difference_type _DifferenceType;
163 
164  _DifferenceType __n = __end - __begin;
165 
166  // At least one element per processor.
167  if (__num_threads > __n)
168  __num_threads = static_cast<_ThreadIndex>(__n);
169 
171  __begin, __begin + __n, __comp, __num_threads);
172  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
void __parallel_sort_qs_conquer(_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
Unbalanced quicksort conquer step.
Definition: quicksort.h:101
template<typename _RAIter , typename _Compare >
void __gnu_parallel::__parallel_sort_qs_conquer ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
_ThreadIndex  __num_threads 
)

Unbalanced quicksort conquer step.

Parameters
__beginBegin iterator of subsequence.
__endEnd iterator of subsequence.
__compComparator.
__num_threadsNumber of threads that are allowed to work on this part.
104  {
105  typedef std::iterator_traits<_RAIter> _TraitsType;
106  typedef typename _TraitsType::value_type _ValueType;
107  typedef typename _TraitsType::difference_type _DifferenceType;
108 
109  if (__num_threads <= 1)
110  {
111  __gnu_sequential::sort(__begin, __end, __comp);
112  return;
113  }
114 
115  _DifferenceType __n = __end - __begin, __pivot_rank;
116 
117  if (__n <= 1)
118  return;
119 
120  _ThreadIndex __num_threads_left;
121 
122  if ((__num_threads % 2) == 1)
123  __num_threads_left = __num_threads / 2 + 1;
124  else
125  __num_threads_left = __num_threads / 2;
126 
127  __pivot_rank = __n * __num_threads_left / __num_threads;
128 
129  _DifferenceType __split = __parallel_sort_qs_divide
130  (__begin, __end, __comp, __pivot_rank,
131  _Settings::get().sort_qs_num_samples_preset, __num_threads);
132 
133 #pragma omp parallel sections num_threads(2)
134  {
135 #pragma omp section
136  __parallel_sort_qs_conquer(__begin, __begin + __split,
137  __comp, __num_threads_left);
138 #pragma omp section
139  __parallel_sort_qs_conquer(__begin + __split, __end,
140  __comp, __num_threads - __num_threads_left);
141  }
142  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
std::iterator_traits< _RAIter >::difference_type __parallel_sort_qs_divide(_RAIter __begin, _RAIter __end, _Compare __comp, typename std::iterator_traits< _RAIter >::difference_type __pivot_rank, typename std::iterator_traits< _RAIter >::difference_type __num_samples, _ThreadIndex __num_threads)
Unbalanced quicksort divide step.
Definition: quicksort.h:51
void __parallel_sort_qs_conquer(_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
Unbalanced quicksort conquer step.
Definition: quicksort.h:101
template<typename _RAIter , typename _Compare >
std::iterator_traits<_RAIter>::difference_type __gnu_parallel::__parallel_sort_qs_divide ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
typename std::iterator_traits< _RAIter >::difference_type  __pivot_rank,
typename std::iterator_traits< _RAIter >::difference_type  __num_samples,
_ThreadIndex  __num_threads 
)

Unbalanced quicksort divide step.

Parameters
__beginBegin iterator of subsequence.
__endEnd iterator of subsequence.
__compComparator.
__pivot_rankDesired __rank of the pivot.
__num_samplesChoose pivot from that many samples.
__num_threadsNumber of threads that are allowed to work on this part.
57  {
58  typedef std::iterator_traits<_RAIter> _TraitsType;
59  typedef typename _TraitsType::value_type _ValueType;
60  typedef typename _TraitsType::difference_type _DifferenceType;
61 
62  _DifferenceType __n = __end - __begin;
63  __num_samples = std::min(__num_samples, __n);
64 
65  // Allocate uninitialized, to avoid default constructor.
66  _ValueType* __samples = static_cast<_ValueType*>
67  (::operator new(__num_samples * sizeof(_ValueType)));
68 
69  for (_DifferenceType __s = 0; __s < __num_samples; ++__s)
70  {
71  const unsigned long long __index = static_cast<unsigned long long>
72  (__s) * __n / __num_samples;
73  ::new(&(__samples[__s])) _ValueType(__begin[__index]);
74  }
75 
76  __gnu_sequential::sort(__samples, __samples + __num_samples, __comp);
77 
78  _ValueType& __pivot = __samples[__pivot_rank * __num_samples / __n];
79 
81  __pred(__comp, __pivot);
82  _DifferenceType __split = __parallel_partition(__begin, __end,
83  __pred, __num_threads);
84 
85  for (_DifferenceType __s = 0; __s < __num_samples; ++__s)
86  __samples[__s].~_ValueType();
87  ::operator delete(__samples);
88 
89  return __split;
90  }
std::iterator_traits< _RAIter >::difference_type __parallel_partition(_RAIter __begin, _RAIter __end, _Predicate __pred, _ThreadIndex __num_threads)
Parallel implementation of std::partition.
Definition: partition.h:56
const _Tp & min(const _Tp &__a, const _Tp &__b)
Equivalent to std::min.
Definition: base.h:144
Similar to std::binder2nd, but giving the argument types explicitly.
Definition: base.h:220
template<typename _RAIter , typename _Compare >
void __gnu_parallel::__parallel_sort_qsb ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
_ThreadIndex  __num_threads 
)

Top-level quicksort routine.

Parameters
__beginBegin iterator of sequence.
__endEnd iterator of sequence.
__compComparator.
__num_threadsNumber of threads that are allowed to work on this part.
432  {
433  _GLIBCXX_CALL(__end - __begin)
434 
435  typedef std::iterator_traits<_RAIter> _TraitsType;
436  typedef typename _TraitsType::value_type _ValueType;
437  typedef typename _TraitsType::difference_type _DifferenceType;
438  typedef std::pair<_RAIter, _RAIter> _Piece;
439 
440  typedef _QSBThreadLocal<_RAIter> _TLSType;
441 
442  _DifferenceType __n = __end - __begin;
443 
444  if (__n <= 1)
445  return;
446 
447  // At least one element per processor.
448  if (__num_threads > __n)
449  __num_threads = static_cast<_ThreadIndex>(__n);
450 
451  // Initialize thread local storage
452  _TLSType** __tls = new _TLSType*[__num_threads];
453  _DifferenceType __queue_size = (__num_threads
454  * (_ThreadIndex)(__rd_log2(__n) + 1));
455  for (_ThreadIndex __t = 0; __t < __num_threads; ++__t)
456  __tls[__t] = new _QSBThreadLocal<_RAIter>(__queue_size);
457 
458  // There can never be more than ceil(__rd_log2(__n)) ranges on the
459  // stack, because
460  // 1. Only one processor pushes onto the stack
461  // 2. The largest range has at most length __n
462  // 3. Each range is larger than half of the range remaining
463  volatile _DifferenceType __elements_leftover = __n;
464  for (_ThreadIndex __i = 0; __i < __num_threads; ++__i)
465  {
466  __tls[__i]->_M_elements_leftover = &__elements_leftover;
467  __tls[__i]->_M_num_threads = __num_threads;
468  __tls[__i]->_M_global = std::make_pair(__begin, __end);
469 
470  // Just in case nothing is left to assign.
471  __tls[__i]->_M_initial = std::make_pair(__end, __end);
472  }
473 
474  // Main recursion call.
475  __qsb_conquer(__tls, __begin, __begin + __n, __comp, 0,
476  __num_threads, true);
477 
478 #if _GLIBCXX_ASSERTIONS
479  // All stack must be empty.
480  _Piece __dummy;
481  for (_ThreadIndex __i = 1; __i < __num_threads; ++__i)
483  !__tls[__i]->_M_leftover_parts.pop_back(__dummy));
484 #endif
485 
486  for (_ThreadIndex __i = 0; __i < __num_threads; ++__i)
487  delete __tls[__i];
488  delete[] __tls;
489  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
return(unsigned int) __res
_Size __rd_log2(_Size __n)
Calculates the rounded-down logarithm of __n for base 2.
Definition: base.h:102
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
void __qsb_conquer(_QSBThreadLocal< _RAIter > **__tls, _RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __iam, _ThreadIndex __num_threads, bool __parent_wait)
Quicksort conquer step.
Definition: balanced_quicksort.h:171
template<typename _IIter , class _OutputIterator , class _BinaryPredicate >
_OutputIterator __gnu_parallel::__parallel_unique_copy ( _IIter  __first,
_IIter  __last,
_OutputIterator  __result,
_BinaryPredicate  __binary_pred 
)

Parallel std::unique_copy(), w/__o explicit equality predicate.

Parameters
__firstBegin iterator of input sequence.
__lastEnd iterator of input sequence.
__resultBegin iterator of result __sequence.
__binary_predEquality predicate.
Returns
End iterator of result __sequence.
53  {
54  _GLIBCXX_CALL(__last - __first)
55 
56  typedef std::iterator_traits<_IIter> _TraitsType;
57  typedef typename _TraitsType::value_type _ValueType;
58  typedef typename _TraitsType::difference_type _DifferenceType;
59 
60  _DifferenceType __size = __last - __first;
61 
62  if (__size == 0)
63  return __result;
64 
65  // Let the first thread process two parts.
66  _DifferenceType *__counter;
67  _DifferenceType *__borders;
68 
69  _ThreadIndex __num_threads = __get_max_threads();
70  // First part contains at least one element.
71 # pragma omp parallel num_threads(__num_threads)
72  {
73 # pragma omp single
74  {
75  __num_threads = omp_get_num_threads();
76  __borders = new _DifferenceType[__num_threads + 2];
77  __equally_split(__size, __num_threads + 1, __borders);
78  __counter = new _DifferenceType[__num_threads + 1];
79  }
80 
82 
83  _DifferenceType __begin, __end;
84 
85  // Check for length without duplicates
86  // Needed for position in output
87  _DifferenceType __i = 0;
88  _OutputIterator __out = __result;
89 
90  if (__iam == 0)
91  {
92  __begin = __borders[0] + 1; // == 1
93  __end = __borders[__iam + 1];
94 
95  ++__i;
96  *__out++ = *__first;
97 
98  for (_IIter __iter = __first + __begin; __iter < __first + __end;
99  ++__iter)
100  {
101  if (!__binary_pred(*__iter, *(__iter - 1)))
102  {
103  ++__i;
104  *__out++ = *__iter;
105  }
106  }
107  }
108  else
109  {
110  __begin = __borders[__iam]; //one part
111  __end = __borders[__iam + 1];
112 
113  for (_IIter __iter = __first + __begin; __iter < __first + __end;
114  ++__iter)
115  {
116  if (!__binary_pred(*__iter, *(__iter - 1)))
117  ++__i;
118  }
119  }
120  __counter[__iam] = __i;
121 
122  // Last part still untouched.
123  _DifferenceType __begin_output;
124 
125 # pragma omp barrier
126 
127  // Store result in output on calculated positions.
128  __begin_output = 0;
129 
130  if (__iam == 0)
131  {
132  for (_ThreadIndex __t = 0; __t < __num_threads; ++__t)
133  __begin_output += __counter[__t];
134 
135  __i = 0;
136 
137  _OutputIterator __iter_out = __result + __begin_output;
138 
139  __begin = __borders[__num_threads];
140  __end = __size;
141 
142  for (_IIter __iter = __first + __begin; __iter < __first + __end;
143  ++__iter)
144  {
145  if (__iter == __first
146  || !__binary_pred(*__iter, *(__iter - 1)))
147  {
148  ++__i;
149  *__iter_out++ = *__iter;
150  }
151  }
152 
153  __counter[__num_threads] = __i;
154  }
155  else
156  {
157  for (_ThreadIndex __t = 0; __t < __iam; __t++)
158  __begin_output += __counter[__t];
159 
160  _OutputIterator __iter_out = __result + __begin_output;
161  for (_IIter __iter = __first + __begin; __iter < __first + __end;
162  ++__iter)
163  {
164  if (!__binary_pred(*__iter, *(__iter - 1)))
165  *__iter_out++ = *__iter;
166  }
167  }
168  }
169 
170  _DifferenceType __end_output = 0;
171  for (_ThreadIndex __t = 0; __t < __num_threads + 1; __t++)
172  __end_output += __counter[__t];
173 
174  delete[] __borders;
175 
176  return __result + __end_output;
177  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
std::size_t __size(__stack_t __stack)
Definition: profiler_node.h:68
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
int omp_get_num_threads(void) __GOMP_NOTHROW
return(unsigned int) __res
_OutputIterator __equally_split(_DifferenceType __n, _ThreadIndex __num_threads, _OutputIterator __s)
function to split a sequence into parts of almost equal size.
Definition: equally_split.h:48
int omp_get_thread_num(void) __GOMP_NOTHROW
_ThreadIndex __get_max_threads()
Definition: base.h:85
template<typename _IIter , class _OutputIterator >
_OutputIterator __gnu_parallel::__parallel_unique_copy ( _IIter  __first,
_IIter  __last,
_OutputIterator  __result 
)
inline

Parallel std::unique_copy(), without explicit equality predicate.

Parameters
__firstBegin iterator of input sequence.
__lastEnd iterator of input sequence.
__resultBegin iterator of result __sequence.
Returns
End iterator of result __sequence.
188  {
189  typedef typename std::iterator_traits<_IIter>::value_type
190  _ValueType;
191  return __parallel_unique_copy(__first, __last, __result,
192  std::equal_to<_ValueType>());
193  }
_OutputIterator __parallel_unique_copy(_IIter __first, _IIter __last, _OutputIterator __result)
Parallel std::unique_copy(), without explicit equality predicate.
Definition: unique_copy.h:186
template<typename _RAIter , typename _Compare >
void __gnu_parallel::__qsb_conquer ( _QSBThreadLocal< _RAIter > **  __tls,
_RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
_ThreadIndex  __iam,
_ThreadIndex  __num_threads,
bool  __parent_wait 
)

Quicksort conquer step.

Parameters
__tlsArray of thread-local storages.
__beginBegin iterator of subsequence.
__endEnd iterator of subsequence.
__compComparator.
__iamNumber of the thread processing this function.
__num_threadsNumber of threads that are allowed to work on this part.
176  {
177  typedef std::iterator_traits<_RAIter> _TraitsType;
178  typedef typename _TraitsType::value_type _ValueType;
179  typedef typename _TraitsType::difference_type _DifferenceType;
180 
181  _DifferenceType __n = __end - __begin;
182 
183  if (__num_threads <= 1 || __n <= 1)
184  {
185  __tls[__iam]->_M_initial.first = __begin;
186  __tls[__iam]->_M_initial.second = __end;
187 
188  __qsb_local_sort_with_helping(__tls, __comp, __iam, __parent_wait);
189 
190  return;
191  }
192 
193  // Divide step.
194  _DifferenceType __split_pos =
195  __qsb_divide(__begin, __end, __comp, __num_threads);
196 
197 #if _GLIBCXX_ASSERTIONS
198  _GLIBCXX_PARALLEL_ASSERT(0 <= __split_pos &&
199  __split_pos < (__end - __begin));
200 #endif
201 
203  __num_threads_leftside = std::max<_ThreadIndex>
204  (1, std::min<_ThreadIndex>(__num_threads - 1, __split_pos
205  * __num_threads / __n));
206 
207 # pragma omp atomic
208  *__tls[__iam]->_M_elements_leftover -= (_DifferenceType)1;
209 
210  // Conquer step.
211 # pragma omp parallel num_threads(2)
212  {
213  bool __wait;
214  if(omp_get_num_threads() < 2)
215  __wait = false;
216  else
217  __wait = __parent_wait;
218 
219 # pragma omp sections
220  {
221 # pragma omp section
222  {
223  __qsb_conquer(__tls, __begin, __begin + __split_pos, __comp,
224  __iam, __num_threads_leftside, __wait);
225  __wait = __parent_wait;
226  }
227  // The pivot_pos is left in place, to ensure termination.
228 # pragma omp section
229  {
230  __qsb_conquer(__tls, __begin + __split_pos + 1, __end, __comp,
231  __iam + __num_threads_leftside,
232  __num_threads - __num_threads_leftside, __wait);
233  __wait = __parent_wait;
234  }
235  }
236  }
237  }
std::iterator_traits< _RAIter >::difference_type __qsb_divide(_RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __num_threads)
Balanced quicksort divide step.
Definition: balanced_quicksort.h:100
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
int omp_get_num_threads(void) __GOMP_NOTHROW
void __qsb_local_sort_with_helping(_QSBThreadLocal< _RAIter > **__tls, _Compare &__comp, _ThreadIndex __iam, bool __wait)
Quicksort step doing load-balanced local sort.
Definition: balanced_quicksort.h:247
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
void __qsb_conquer(_QSBThreadLocal< _RAIter > **__tls, _RAIter __begin, _RAIter __end, _Compare __comp, _ThreadIndex __iam, _ThreadIndex __num_threads, bool __parent_wait)
Quicksort conquer step.
Definition: balanced_quicksort.h:171
template<typename _RAIter , typename _Compare >
std::iterator_traits<_RAIter>::difference_type __gnu_parallel::__qsb_divide ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
_ThreadIndex  __num_threads 
)

Balanced quicksort divide step.

Parameters
__beginBegin iterator of subsequence.
__endEnd iterator of subsequence.
__compComparator.
__num_threadsNumber of threads that are allowed to work on this part.
Precondition
(__end-__begin)>=1
102  {
103  _GLIBCXX_PARALLEL_ASSERT(__num_threads > 0);
104 
105  typedef std::iterator_traits<_RAIter> _TraitsType;
106  typedef typename _TraitsType::value_type _ValueType;
107  typedef typename _TraitsType::difference_type _DifferenceType;
108 
109  _RAIter __pivot_pos =
110  __median_of_three_iterators(__begin, __begin + (__end - __begin) / 2,
111  __end - 1, __comp);
112 
113 #if defined(_GLIBCXX_ASSERTIONS)
114  // Must be in between somewhere.
115  _DifferenceType __n = __end - __begin;
116 
117  _GLIBCXX_PARALLEL_ASSERT((!__comp(*__pivot_pos, *__begin)
118  && !__comp(*(__begin + __n / 2),
119  *__pivot_pos))
120  || (!__comp(*__pivot_pos, *__begin)
121  && !__comp(*(__end - 1), *__pivot_pos))
122  || (!__comp(*__pivot_pos, *(__begin + __n / 2))
123  && !__comp(*__begin, *__pivot_pos))
124  || (!__comp(*__pivot_pos, *(__begin + __n / 2))
125  && !__comp(*(__end - 1), *__pivot_pos))
126  || (!__comp(*__pivot_pos, *(__end - 1))
127  && !__comp(*__begin, *__pivot_pos))
128  || (!__comp(*__pivot_pos, *(__end - 1))
129  && !__comp(*(__begin + __n / 2),
130  *__pivot_pos)));
131 #endif
132 
133  // Swap pivot value to end.
134  if (__pivot_pos != (__end - 1))
135  std::iter_swap(__pivot_pos, __end - 1);
136  __pivot_pos = __end - 1;
137 
139  __pred(__comp, *__pivot_pos);
140 
141  // Divide, returning __end - __begin - 1 in the worst case.
142  _DifferenceType __split_pos = __parallel_partition(__begin, __end - 1,
143  __pred,
144  __num_threads);
145 
146  // Swap back pivot to middle.
147  std::iter_swap(__begin + __split_pos, __pivot_pos);
148  __pivot_pos = __begin + __split_pos;
149 
150 #if _GLIBCXX_ASSERTIONS
151  _RAIter __r;
152  for (__r = __begin; __r != __pivot_pos; ++__r)
153  _GLIBCXX_PARALLEL_ASSERT(__comp(*__r, *__pivot_pos));
154  for (; __r != __end; ++__r)
155  _GLIBCXX_PARALLEL_ASSERT(!__comp(*__r, *__pivot_pos));
156 #endif
157 
158  return __split_pos;
159  }
std::iterator_traits< _RAIter >::difference_type __parallel_partition(_RAIter __begin, _RAIter __end, _Predicate __pred, _ThreadIndex __num_threads)
Parallel implementation of std::partition.
Definition: partition.h:56
_RAIter __median_of_three_iterators(_RAIter __a, _RAIter __b, _RAIter __c, _Compare __comp)
Compute the median of three referenced elements, according to __comp.
Definition: base.h:398
Similar to std::binder2nd, but giving the argument types explicitly.
Definition: base.h:220
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
template<typename _RAIter , typename _Compare >
void __gnu_parallel::__qsb_local_sort_with_helping ( _QSBThreadLocal< _RAIter > **  __tls,
_Compare &  __comp,
_ThreadIndex  __iam,
bool  __wait 
)

Quicksort step doing load-balanced local sort.

Parameters
__tlsArray of thread-local storages.
__compComparator.
__iamNumber of the thread processing this function.
250  {
251  typedef std::iterator_traits<_RAIter> _TraitsType;
252  typedef typename _TraitsType::value_type _ValueType;
253  typedef typename _TraitsType::difference_type _DifferenceType;
254  typedef std::pair<_RAIter, _RAIter> _Piece;
255 
256  _QSBThreadLocal<_RAIter>& __tl = *__tls[__iam];
257 
258  _DifferenceType
259  __base_case_n = _Settings::get().sort_qsb_base_case_maximal_n;
260  if (__base_case_n < 2)
261  __base_case_n = 2;
262  _ThreadIndex __num_threads = __tl._M_num_threads;
263 
264  // Every thread has its own random number generator.
265  _RandomNumber __rng(__iam + 1);
266 
267  _Piece __current = __tl._M_initial;
268 
269  _DifferenceType __elements_done = 0;
270 #if _GLIBCXX_ASSERTIONS
271  _DifferenceType __total_elements_done = 0;
272 #endif
273 
274  for (;;)
275  {
276  // Invariant: __current must be a valid (maybe empty) range.
277  _RAIter __begin = __current.first, __end = __current.second;
278  _DifferenceType __n = __end - __begin;
279 
280  if (__n > __base_case_n)
281  {
282  // Divide.
283  _RAIter __pivot_pos = __begin + __rng(__n);
284 
285  // Swap __pivot_pos value to end.
286  if (__pivot_pos != (__end - 1))
287  std::iter_swap(__pivot_pos, __end - 1);
288  __pivot_pos = __end - 1;
289 
291  <_Compare, _ValueType, _ValueType, bool>
292  __pred(__comp, *__pivot_pos);
293 
294  // Divide, leave pivot unchanged in last place.
295  _RAIter __split_pos1, __split_pos2;
296  __split_pos1 = __gnu_sequential::partition(__begin, __end - 1,
297  __pred);
298 
299  // Left side: < __pivot_pos; __right side: >= __pivot_pos.
300 #if _GLIBCXX_ASSERTIONS
301  _GLIBCXX_PARALLEL_ASSERT(__begin <= __split_pos1
302  && __split_pos1 < __end);
303 #endif
304  // Swap pivot back to middle.
305  if (__split_pos1 != __pivot_pos)
306  std::iter_swap(__split_pos1, __pivot_pos);
307  __pivot_pos = __split_pos1;
308 
309  // In case all elements are equal, __split_pos1 == 0.
310  if ((__split_pos1 + 1 - __begin) < (__n >> 7)
311  || (__end - __split_pos1) < (__n >> 7))
312  {
313  // Very unequal split, one part smaller than one 128th
314  // elements not strictly larger than the pivot.
316  <_Compare, _ValueType, _ValueType, bool>, _ValueType>
318  <_Compare, _ValueType, _ValueType, bool>
319  (__comp, *__pivot_pos));
320 
321  // Find other end of pivot-equal range.
322  __split_pos2 = __gnu_sequential::partition(__split_pos1 + 1,
323  __end, __pred);
324  }
325  else
326  // Only skip the pivot.
327  __split_pos2 = __split_pos1 + 1;
328 
329  // Elements equal to pivot are done.
330  __elements_done += (__split_pos2 - __split_pos1);
331 #if _GLIBCXX_ASSERTIONS
332  __total_elements_done += (__split_pos2 - __split_pos1);
333 #endif
334  // Always push larger part onto stack.
335  if (((__split_pos1 + 1) - __begin) < (__end - (__split_pos2)))
336  {
337  // Right side larger.
338  if ((__split_pos2) != __end)
339  __tl._M_leftover_parts.push_front
340  (std::make_pair(__split_pos2, __end));
341 
342  //__current.first = __begin; //already set anyway
343  __current.second = __split_pos1;
344  continue;
345  }
346  else
347  {
348  // Left side larger.
349  if (__begin != __split_pos1)
350  __tl._M_leftover_parts.push_front(std::make_pair
351  (__begin, __split_pos1));
352 
353  __current.first = __split_pos2;
354  //__current.second = __end; //already set anyway
355  continue;
356  }
357  }
358  else
359  {
360  __gnu_sequential::sort(__begin, __end, __comp);
361  __elements_done += __n;
362 #if _GLIBCXX_ASSERTIONS
363  __total_elements_done += __n;
364 #endif
365 
366  // Prefer own stack, small pieces.
367  if (__tl._M_leftover_parts.pop_front(__current))
368  continue;
369 
370 # pragma omp atomic
371  *__tl._M_elements_leftover -= __elements_done;
372 
373  __elements_done = 0;
374 
375 #if _GLIBCXX_ASSERTIONS
376  double __search_start = omp_get_wtime();
377 #endif
378 
379  // Look for new work.
380  bool __successfully_stolen = false;
381  while (__wait && *__tl._M_elements_leftover > 0
382  && !__successfully_stolen
384  // Possible dead-lock.
385  && (omp_get_wtime() < (__search_start + 1.0))
386 #endif
387  )
388  {
389  _ThreadIndex __victim;
390  __victim = __rng(__num_threads);
391 
392  // Large pieces.
393  __successfully_stolen = (__victim != __iam)
394  && __tls[__victim]->_M_leftover_parts.pop_back(__current);
395  if (!__successfully_stolen)
396  __yield();
397 #if !defined(__ICC) && !defined(__ECC)
398 # pragma omp flush
399 #endif
400  }
401 
402 #if _GLIBCXX_ASSERTIONS
403  if (omp_get_wtime() >= (__search_start + 1.0))
404  {
405  sleep(1);
407  < (__search_start + 1.0));
408  }
409 #endif
410  if (!__successfully_stolen)
411  {
412 #if _GLIBCXX_ASSERTIONS
413  _GLIBCXX_PARALLEL_ASSERT(*__tl._M_elements_leftover == 0);
414 #endif
415  return;
416  }
417  }
418  }
419  }
#define _GLIBCXX_ASSERTIONS
Switch on many _GLIBCXX_PARALLEL_ASSERTions in parallel code. Should be switched on only locally...
Definition: compiletime_settings.h:61
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
Similar to std::unary_negate, but giving the argument types explicitly.
Definition: base.h:173
void __yield()
Yield control to another thread, without waiting for the end of the time slice.
Definition: compatibility.h:121
Similar to std::binder2nd, but giving the argument types explicitly.
Definition: base.h:220
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
double omp_get_wtime(void) __GOMP_NOTHROW
Similar to std::binder1st, but giving the argument types explicitly.
Definition: base.h:192
template<typename _RandomNumberGenerator >
int __gnu_parallel::__random_number_pow2 ( int  __logp,
_RandomNumberGenerator &  __rng 
)
inline

Generate a random number in [0,2^__logp).

Parameters
__logpLogarithm (basis 2) of the upper range __bound.
__rngRandom number generator to use.
116  { return __rng.__genrand_bits(__logp); }
template<typename _Size >
_Size __gnu_parallel::__rd_log2 ( _Size  __n)
inline

Calculates the rounded-down logarithm of __n for base 2.

Parameters
__nArgument.
Returns
Returns 0 for any argument <1.
103  {
104  _Size __k;
105  for (__k = 0; __n > 1; __n >>= 1)
106  ++__k;
107  return __k;
108  }
template<typename _Tp >
_Tp __gnu_parallel::__round_up_to_pow2 ( _Tp  __x)

Round up to the next greater power of 2.

Parameters
__x_Integer to round up
249  {
250  if (__x <= 1)
251  return 1;
252  else
253  return (_Tp)1 << (__rd_log2(__x - 1) + 1);
254  }
_Size __rd_log2(_Size __n)
Calculates the rounded-down logarithm of __n for base 2.
Definition: base.h:102
template<typename __RAIter1 , typename __RAIter2 , typename _Pred >
__RAIter1 __gnu_parallel::__search_template ( __RAIter1  __begin1,
__RAIter1  __end1,
__RAIter2  __begin2,
__RAIter2  __end2,
_Pred  __pred 
)

Parallel std::search.

Parameters
__begin1Begin iterator of first sequence.
__end1End iterator of first sequence.
__begin2Begin iterator of second sequence.
__end2End iterator of second sequence.
__predFind predicate.
Returns
Place of finding in first sequences.
84  {
85  typedef std::iterator_traits<__RAIter1> _TraitsType;
86  typedef typename _TraitsType::difference_type _DifferenceType;
87 
88  _GLIBCXX_CALL((__end1 - __begin1) + (__end2 - __begin2));
89 
90  _DifferenceType __pattern_length = __end2 - __begin2;
91 
92  // Pattern too short.
93  if(__pattern_length <= 0)
94  return __end1;
95 
96  // Last point to start search.
97  _DifferenceType __input_length = (__end1 - __begin1) - __pattern_length;
98 
99  // Where is first occurrence of pattern? defaults to end.
100  _DifferenceType __result = (__end1 - __begin1);
101  _DifferenceType *__splitters;
102 
103  // Pattern too long.
104  if (__input_length < 0)
105  return __end1;
106 
107  omp_lock_t __result_lock;
108  omp_init_lock(&__result_lock);
109 
110  _ThreadIndex __num_threads = std::max<_DifferenceType>
111  (1, std::min<_DifferenceType>(__input_length,
112  __get_max_threads()));
113 
114  _DifferenceType __advances[__pattern_length];
115  __calc_borders(__begin2, __pattern_length, __advances);
116 
117 # pragma omp parallel num_threads(__num_threads)
118  {
119 # pragma omp single
120  {
121  __num_threads = omp_get_num_threads();
122  __splitters = new _DifferenceType[__num_threads + 1];
123  __equally_split(__input_length, __num_threads, __splitters);
124  }
125 
127 
128  _DifferenceType __start = __splitters[__iam],
129  __stop = __splitters[__iam + 1];
130 
131  _DifferenceType __pos_in_pattern = 0;
132  bool __found_pattern = false;
133 
134  while (__start <= __stop && !__found_pattern)
135  {
136  // Get new value of result.
137 #pragma omp flush(__result)
138  // No chance for this thread to find first occurrence.
139  if (__result < __start)
140  break;
141  while (__pred(__begin1[__start + __pos_in_pattern],
142  __begin2[__pos_in_pattern]))
143  {
144  ++__pos_in_pattern;
145  if (__pos_in_pattern == __pattern_length)
146  {
147  // Found new candidate for result.
148  omp_set_lock(&__result_lock);
149  __result = std::min(__result, __start);
150  omp_unset_lock(&__result_lock);
151 
152  __found_pattern = true;
153  break;
154  }
155  }
156  // Make safe jump.
157  __start += (__pos_in_pattern - __advances[__pos_in_pattern]);
158  __pos_in_pattern = (__advances[__pos_in_pattern] < 0
159  ? 0 : __advances[__pos_in_pattern]);
160  }
161  } //parallel
162 
163  omp_destroy_lock(&__result_lock);
164 
165  delete[] __splitters;
166 
167  // Return iterator on found element.
168  return (__begin1 + __result);
169  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
int omp_get_num_threads(void) __GOMP_NOTHROW
const _Tp & min(const _Tp &__a, const _Tp &__b)
Equivalent to std::min.
Definition: base.h:144
void omp_destroy_lock(omp_lock_t *) __GOMP_NOTHROW
_OutputIterator __equally_split(_DifferenceType __n, _ThreadIndex __num_threads, _OutputIterator __s)
function to split a sequence into parts of almost equal size.
Definition: equally_split.h:48
void __calc_borders(_RAIter __elements, _DifferenceTp __length, _DifferenceTp *__off)
Precalculate __advances for Knuth-Morris-Pratt algorithm.
Definition: search.h:51
void omp_unset_lock(omp_lock_t *) __GOMP_NOTHROW
void omp_set_lock(omp_lock_t *) __GOMP_NOTHROW
int omp_get_thread_num(void) __GOMP_NOTHROW
void omp_init_lock(omp_lock_t *) __GOMP_NOTHROW
_ThreadIndex __get_max_threads()
Definition: base.h:85
Definition: omp.h:34
template<bool __stable, bool __sentinels, typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Compare >
_RAIter3 __gnu_parallel::__sequential_multiway_merge ( _RAIterIterator  __seqs_begin,
_RAIterIterator  __seqs_end,
_RAIter3  __target,
const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &  __sentinel,
_DifferenceTp  __length,
_Compare  __comp 
)

Sequential multi-way merging switch.

The _GLIBCXX_PARALLEL_DECISION is based on the branching factor and runtime settings.

Parameters
__seqs_beginBegin iterator of iterator pair input sequence.
__seqs_endEnd iterator of iterator pair input sequence.
__targetBegin iterator of output sequence.
__compComparator.
__lengthMaximum length to merge, possibly larger than the number of elements available.
__sentinelThe sequences have __a __sentinel element.
Returns
End iterator of output sequence.
927  {
928  _GLIBCXX_CALL(__length)
929 
930  typedef _DifferenceTp _DifferenceType;
931  typedef typename std::iterator_traits<_RAIterIterator>
932  ::difference_type _SeqNumber;
933  typedef typename std::iterator_traits<_RAIterIterator>
934  ::value_type::first_type
935  _RAIter1;
936  typedef typename std::iterator_traits<_RAIter1>::value_type
937  _ValueType;
938 
939 #if _GLIBCXX_ASSERTIONS
940  for (_RAIterIterator __s = __seqs_begin; __s != __seqs_end; ++__s)
941  {
943  (*__s).second, __comp));
944  }
945 #endif
946 
947  _DifferenceTp __total_length = 0;
948  for (_RAIterIterator __s = __seqs_begin; __s != __seqs_end; ++__s)
949  __total_length += _GLIBCXX_PARALLEL_LENGTH(*__s);
950 
951  __length = std::min<_DifferenceTp>(__length, __total_length);
952 
953  if(__length == 0)
954  return __target;
955 
956  _RAIter3 __return_target = __target;
957  _SeqNumber __k = static_cast<_SeqNumber>(__seqs_end - __seqs_begin);
958 
959  switch (__k)
960  {
961  case 0:
962  break;
963  case 1:
964  __return_target = std::copy(__seqs_begin[0].first,
965  __seqs_begin[0].first + __length,
966  __target);
967  __seqs_begin[0].first += __length;
968  break;
969  case 2:
970  __return_target = __merge_advance(__seqs_begin[0].first,
971  __seqs_begin[0].second,
972  __seqs_begin[1].first,
973  __seqs_begin[1].second,
974  __target, __length, __comp);
975  break;
976  case 3:
977  __return_target = __multiway_merge_3_variant_sentinel_switch
978  <__sentinels, _RAIterIterator, _RAIter3, _DifferenceTp, _Compare>()
979  (__seqs_begin, __seqs_end, __target, __length, __comp);
980  break;
981  case 4:
982  __return_target = __multiway_merge_4_variant_sentinel_switch
983  <__sentinels, _RAIterIterator, _RAIter3, _DifferenceTp, _Compare>()
984  (__seqs_begin, __seqs_end, __target, __length, __comp);
985  break;
986  default:
987  __return_target = __multiway_merge_k_variant_sentinel_switch
988  <__sentinels, __stable, _RAIterIterator, _RAIter3, _DifferenceTp,
989  _Compare>()
990  (__seqs_begin, __seqs_end, __target, __sentinel, __length, __comp);
991  break;
992  }
993 #if _GLIBCXX_ASSERTIONS
995  __is_sorted(__target, __target + __length, __comp));
996 #endif
997 
998  return __return_target;
999  }
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define _GLIBCXX_PARALLEL_LENGTH(__s)
Length of a sequence described by a pair of iterators.
Definition: multiway_merge.h:54
bool __is_sorted(_IIter __begin, _IIter __end, _Compare __comp)
Check whether [__begin, __end) is sorted according to __comp.
Definition: checkers.h:51
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
_OutputIterator __merge_advance(_RAIter1 &__begin1, _RAIter1 __end1, _RAIter2 &__begin2, _RAIter2 __end2, _OutputIterator __target, _DifferenceTp __max_length, _Compare __comp)
Merge routine being able to merge only the __max_length smallest elements.
Definition: merge.h:171
template<typename _RAIter , typename _RandomNumberGenerator >
void __gnu_parallel::__sequential_random_shuffle ( _RAIter  __begin,
_RAIter  __end,
_RandomNumberGenerator &  __rng 
)

Sequential cache-efficient random shuffle.

Parameters
__beginBegin iterator of sequence.
__endEnd iterator of sequence.
__rngRandom number generator to use.
412  {
413  typedef std::iterator_traits<_RAIter> _TraitsType;
414  typedef typename _TraitsType::value_type _ValueType;
415  typedef typename _TraitsType::difference_type _DifferenceType;
416 
417  _DifferenceType __n = __end - __begin;
418  const _Settings& __s = _Settings::get();
419 
420  _BinIndex __num_bins, __num_bins_cache;
421 
422 #if _GLIBCXX_RANDOM_SHUFFLE_CONSIDER_L1
423  // Try the L1 cache first, must fit into L1.
424  __num_bins_cache = std::max<_DifferenceType>
425  (1, __n / (__s.L1_cache_size_lb / sizeof(_ValueType)));
426  __num_bins_cache = __round_up_to_pow2(__num_bins_cache);
427 
428  // No more buckets than TLB entries, power of 2
429  // Power of 2 and at least one element per bin, at most the TLB size
430  __num_bins = std::min(__n, (_DifferenceType)__num_bins_cache);
431 #if _GLIBCXX_RANDOM_SHUFFLE_CONSIDER_TLB
432  // 2 TLB entries needed per bin
433  __num_bins = std::min((_DifferenceType)__s.TLB_size / 2, __num_bins);
434 #endif
435  __num_bins = __round_up_to_pow2(__num_bins);
436 
437  if (__num_bins < __num_bins_cache)
438  {
439 #endif
440  // Now try the L2 cache, must fit into L2.
441  __num_bins_cache = static_cast<_BinIndex>
442  (std::max<_DifferenceType>(1, __n / (__s.L2_cache_size
443  / sizeof(_ValueType))));
444  __num_bins_cache = __round_up_to_pow2(__num_bins_cache);
445 
446  // No more buckets than TLB entries, power of 2
447  // Power of 2 and at least one element per bin, at most the TLB size.
448  __num_bins = static_cast<_BinIndex>
449  (std::min(__n, static_cast<_DifferenceType>(__num_bins_cache)));
450 
451 #if _GLIBCXX_RANDOM_SHUFFLE_CONSIDER_TLB
452  // 2 TLB entries needed per bin
453  __num_bins = std::min<_DifferenceType>(__s.TLB_size / 2, __num_bins);
454 #endif
455  __num_bins = __round_up_to_pow2(__num_bins);
456 #if _GLIBCXX_RANDOM_SHUFFLE_CONSIDER_L1
457  }
458 #endif
459 
460  int __num_bits = __rd_log2(__num_bins);
461 
462  if (__num_bins > 1)
463  {
464  _ValueType* __target =
465  static_cast<_ValueType*>(::operator new(sizeof(_ValueType) * __n));
466  _BinIndex* __oracles = new _BinIndex[__n];
467  _DifferenceType* __dist0 = new _DifferenceType[__num_bins + 1],
468  * __dist1 = new _DifferenceType[__num_bins + 1];
469 
470  for (int __b = 0; __b < __num_bins + 1; ++__b)
471  __dist0[__b] = 0;
472 
473  _RandomNumber __bitrng(__rng(0xFFFFFFFF));
474 
475  for (_DifferenceType __i = 0; __i < __n; ++__i)
476  {
477  _BinIndex __oracle = __random_number_pow2(__num_bits, __bitrng);
478  __oracles[__i] = __oracle;
479 
480  // To allow prefix (partial) sum.
481  ++(__dist0[__oracle + 1]);
482  }
483 
484  // Sum up bins.
485  __gnu_sequential::partial_sum(__dist0, __dist0 + __num_bins + 1,
486  __dist0);
487 
488  for (int __b = 0; __b < __num_bins + 1; ++__b)
489  __dist1[__b] = __dist0[__b];
490 
491  // Distribute according to oracles.
492  for (_DifferenceType __i = 0; __i < __n; ++__i)
493  ::new(&(__target[(__dist0[__oracles[__i]])++]))
494  _ValueType(*(__begin + __i));
495 
496  for (int __b = 0; __b < __num_bins; ++__b)
497  __sequential_random_shuffle(__target + __dist1[__b],
498  __target + __dist1[__b + 1], __rng);
499 
500  // Copy elements back.
501  std::copy(__target, __target + __n, __begin);
502 
503  delete[] __dist0;
504  delete[] __dist1;
505  delete[] __oracles;
506 
507  for (_DifferenceType __i = 0; __i < __n; ++__i)
508  __target[__i].~_ValueType();
509  ::operator delete(__target);
510  }
511  else
512  __gnu_sequential::random_shuffle(__begin, __end, __rng);
513  }
void __sequential_random_shuffle(_RAIter __begin, _RAIter __end, _RandomNumberGenerator &__rng)
Sequential cache-efficient random shuffle.
Definition: random_shuffle.h:410
unsigned short _BinIndex
Type to hold the index of a bin.
Definition: random_shuffle.h:47
int __random_number_pow2(int __logp, _RandomNumberGenerator &__rng)
Generate a random number in [0,2^__logp).
Definition: random_shuffle.h:115
const _Tp & min(const _Tp &__a, const _Tp &__b)
Equivalent to std::min.
Definition: base.h:144
_Size __rd_log2(_Size __n)
Calculates the rounded-down logarithm of __n for base 2.
Definition: base.h:102
_Tp __round_up_to_pow2(_Tp __x)
Round up to the next greater power of 2.
Definition: random_shuffle.h:248
template<typename _IIter >
void __gnu_parallel::__shrink ( std::vector< _IIter > &  __os_starts,
size_t __count_to_two,
size_t __range_length 
)

Combines two ranges into one and thus halves the number of ranges.

Parameters
__os_startsStart positions worked on (oversampled).
__count_to_twoCounts up to 2.
__range_lengthCurrent length of a chunk.
72  {
73  for (typename std::vector<_IIter>::size_type __i = 0;
74  __i <= (__os_starts.size() / 2); ++__i)
75  __os_starts[__i] = __os_starts[__i * 2];
76  __range_length *= 2;
77  }
template<typename _IIter >
void __gnu_parallel::__shrink_and_double ( std::vector< _IIter > &  __os_starts,
size_t __count_to_two,
size_t __range_length,
const bool  __make_twice 
)

Shrinks and doubles the ranges.

Parameters
__os_startsStart positions worked on (oversampled).
__count_to_twoCounts up to 2.
__range_lengthCurrent length of a chunk.
__make_twiceWhether the __os_starts is allowed to be grown or not
53  {
54  ++__count_to_two;
55  if (!__make_twice || __count_to_two < 2)
56  __shrink(__os_starts, __count_to_two, __range_length);
57  else
58  {
59  __os_starts.resize((__os_starts.size() - 1) * 2 + 1);
60  __count_to_two = 0;
61  }
62  }
void __shrink(std::vector< _IIter > &__os_starts, size_t &__count_to_two, size_t &__range_length)
Combines two ranges into one and thus halves the number of ranges.
Definition: list_partition.h:70
void __gnu_parallel::__yield ( )
inline

Yield control to another thread, without waiting for the end of the time slice.

122  {
123 #if defined (_WIN32) && !defined (__CYGWIN__)
124  Sleep(0);
125 #else
126  sched_yield();
127 #endif
128  }
template<typename _IIter , typename _FunctorType >
size_t __gnu_parallel::list_partition ( const _IIter  __begin,
const _IIter  __end,
_IIter *  __starts,
size_t __lengths,
const int  __num_parts,
_FunctorType &  __f,
int  __oversampling = 0 
)

Splits a sequence given by input iterators into parts of almost equal size.

The function needs only one pass over the sequence.

Parameters
__beginBegin iterator of input sequence.
__endEnd iterator of input sequence.
__startsStart iterators for the resulting parts, dimension __num_parts+1. For convenience, __starts [__num_parts] contains the end iterator of the sequence.
__lengthsLength of the resulting parts.
__num_partsNumber of parts to split the sequence into.
__fFunctor to be applied to each element by traversing __it
__oversamplingOversampling factor. If 0, then the partitions will differ in at most {{__end} - {__begin}} __elements. Otherwise, the ratio between the longest and the shortest part is bounded by 1/({__oversampling} {num_parts})
Returns
Length of the whole sequence.
104  {
105  bool __make_twice = false;
106 
107  // The resizing algorithm is chosen according to the oversampling factor.
108  if (__oversampling == 0)
109  {
110  __make_twice = true;
111  __oversampling = 1;
112  }
113 
114  std::vector<_IIter> __os_starts(2 * __oversampling * __num_parts + 1);
115 
116  __os_starts[0] = __begin;
117  _IIter __prev = __begin, __it = __begin;
118  size_t __dist_limit = 0, __dist = 0;
119  size_t __cur = 1, __next = 1;
120  size_t __range_length = 1;
121  size_t __count_to_two = 0;
122  while (__it != __end)
123  {
124  __cur = __next;
125  for (; __cur < __os_starts.size() and __it != __end; ++__cur)
126  {
127  for (__dist_limit += __range_length;
128  __dist < __dist_limit and __it != __end; ++__dist)
129  {
130  __f(__it);
131  ++__it;
132  }
133  __os_starts[__cur] = __it;
134  }
135 
136  // Must compare for end and not __cur < __os_starts.size() , because
137  // __cur could be == __os_starts.size() as well
138  if (__it == __end)
139  break;
140 
141  __shrink_and_double(__os_starts, __count_to_two, __range_length,
142  __make_twice);
143  __next = __os_starts.size() / 2 + 1;
144  }
145 
146  // Calculation of the parts (one must be extracted from __current
147  // because the partition beginning at end, consists only of
148  // itself).
149  size_t __size_part = (__cur - 1) / __num_parts;
150  int __size_greater = static_cast<int>((__cur - 1) % __num_parts);
151  __starts[0] = __os_starts[0];
152 
153  size_t __index = 0;
154 
155  // Smallest partitions.
156  for (int __i = 1; __i < (__num_parts + 1 - __size_greater); ++__i)
157  {
158  __lengths[__i - 1] = __size_part * __range_length;
159  __index += __size_part;
160  __starts[__i] = __os_starts[__index];
161  }
162 
163  // Biggest partitions.
164  for (int __i = __num_parts + 1 - __size_greater; __i <= __num_parts;
165  ++__i)
166  {
167  __lengths[__i - 1] = (__size_part+1) * __range_length;
168  __index += (__size_part+1);
169  __starts[__i] = __os_starts[__index];
170  }
171 
172  // Correction of the end size (the end iteration has not finished).
173  __lengths[__num_parts - 1] -= (__dist_limit - __dist);
174 
175  return __dist;
176  }
void __shrink_and_double(std::vector< _IIter > &__os_starts, size_t &__count_to_two, size_t &__range_length, const bool __make_twice)
Shrinks and doubles the ranges.
Definition: list_partition.h:50
#define and
Definition: iso646.h:32
template<typename _Tp >
const _Tp& __gnu_parallel::max ( const _Tp &  __a,
const _Tp &  __b 
)
inline

Equivalent to std::max.

151  { return (__a > __b) ? __a : __b; }
template<typename _Tp >
const _Tp& __gnu_parallel::min ( const _Tp &  __a,
const _Tp &  __b 
)
inline

Equivalent to std::min.

145  { return (__a < __b) ? __a : __b; }
template<typename _RanSeqs , typename _RankType , typename _RankIterator , typename _Compare >
void __gnu_parallel::multiseq_partition ( _RanSeqs  __begin_seqs,
_RanSeqs  __end_seqs,
_RankType  __rank,
_RankIterator  __begin_offsets,
_Compare  __comp = std::less< typename std::iterator_traits<typename std::iterator_traits<_RanSeqs>::value_type:: first_type>::value_type>() 
)

Splits several sorted sequences at a certain global __rank, resulting in a splitting point for each sequence. The sequences are passed via a sequence of random-access iterator pairs, none of the sequences may be empty. If there are several equal elements across the split, the ones on the __left side will be chosen from sequences with smaller number.

Parameters
__begin_seqsBegin of the sequence of iterator pairs.
__end_seqsEnd of the sequence of iterator pairs.
__rankThe global rank to partition at.
__begin_offsetsA random-access __sequence __begin where the __result will be stored in. Each element of the sequence is an iterator that points to the first element on the greater part of the respective __sequence.
__compThe ordering functor, defaults to std::less<_Tp>.
129  {
130  _GLIBCXX_CALL(__end_seqs - __begin_seqs)
131 
132  typedef typename std::iterator_traits<_RanSeqs>::value_type::first_type
133  _It;
134  typedef typename std::iterator_traits<_RanSeqs>::difference_type
135  _SeqNumber;
136  typedef typename std::iterator_traits<_It>::difference_type
137  _DifferenceType;
138  typedef typename std::iterator_traits<_It>::value_type _ValueType;
139 
140  _Lexicographic<_ValueType, _SeqNumber, _Compare> __lcomp(__comp);
141  _LexicographicReverse<_ValueType, _SeqNumber, _Compare> __lrcomp(__comp);
142 
143  // Number of sequences, number of elements in total (possibly
144  // including padding).
145  _DifferenceType __m = std::distance(__begin_seqs, __end_seqs), __nn = 0,
146  __nmax, __n, __r;
147 
148  for (_SeqNumber __i = 0; __i < __m; __i++)
149  {
150  __nn += std::distance(__begin_seqs[__i].first,
151  __begin_seqs[__i].second);
153  std::distance(__begin_seqs[__i].first,
154  __begin_seqs[__i].second) > 0);
155  }
156 
157  if (__rank == __nn)
158  {
159  for (_SeqNumber __i = 0; __i < __m; __i++)
160  __begin_offsets[__i] = __begin_seqs[__i].second; // Very end.
161  // Return __m - 1;
162  return;
163  }
164 
165  _GLIBCXX_PARALLEL_ASSERT(__m != 0);
166  _GLIBCXX_PARALLEL_ASSERT(__nn != 0);
167  _GLIBCXX_PARALLEL_ASSERT(__rank >= 0);
168  _GLIBCXX_PARALLEL_ASSERT(__rank < __nn);
169 
170  _DifferenceType* __ns = new _DifferenceType[__m];
171  _DifferenceType* __a = new _DifferenceType[__m];
172  _DifferenceType* __b = new _DifferenceType[__m];
173  _DifferenceType __l;
174 
175  __ns[0] = std::distance(__begin_seqs[0].first, __begin_seqs[0].second);
176  __nmax = __ns[0];
177  for (_SeqNumber __i = 0; __i < __m; __i++)
178  {
179  __ns[__i] = std::distance(__begin_seqs[__i].first,
180  __begin_seqs[__i].second);
181  __nmax = std::max(__nmax, __ns[__i]);
182  }
183 
184  __r = __rd_log2(__nmax) + 1;
185 
186  // Pad all lists to this length, at least as long as any ns[__i],
187  // equality iff __nmax = 2^__k - 1.
188  __l = (1ULL << __r) - 1;
189 
190  for (_SeqNumber __i = 0; __i < __m; __i++)
191  {
192  __a[__i] = 0;
193  __b[__i] = __l;
194  }
195  __n = __l / 2;
196 
197  // Invariants:
198  // 0 <= __a[__i] <= __ns[__i], 0 <= __b[__i] <= __l
199 
200 #define __S(__i) (__begin_seqs[__i].first)
201 
202  // Initial partition.
203  std::vector<std::pair<_ValueType, _SeqNumber> > __sample;
204 
205  for (_SeqNumber __i = 0; __i < __m; __i++)
206  if (__n < __ns[__i]) //__sequence long enough
207  __sample.push_back(std::make_pair(__S(__i)[__n], __i));
208  __gnu_sequential::sort(__sample.begin(), __sample.end(), __lcomp);
209 
210  for (_SeqNumber __i = 0; __i < __m; __i++) //conceptual infinity
211  if (__n >= __ns[__i]) //__sequence too short, conceptual infinity
212  __sample.push_back(
213  std::make_pair(__S(__i)[0] /*__dummy element*/, __i));
214 
215  _DifferenceType __localrank = __rank / __l;
216 
217  _SeqNumber __j;
218  for (__j = 0;
219  __j < __localrank && ((__n + 1) <= __ns[__sample[__j].second]);
220  ++__j)
221  __a[__sample[__j].second] += __n + 1;
222  for (; __j < __m; __j++)
223  __b[__sample[__j].second] -= __n + 1;
224 
225  // Further refinement.
226  while (__n > 0)
227  {
228  __n /= 2;
229 
230  _SeqNumber __lmax_seq = -1; // to avoid warning
231  const _ValueType* __lmax = 0; // impossible to avoid the warning?
232  for (_SeqNumber __i = 0; __i < __m; __i++)
233  {
234  if (__a[__i] > 0)
235  {
236  if (!__lmax)
237  {
238  __lmax = &(__S(__i)[__a[__i] - 1]);
239  __lmax_seq = __i;
240  }
241  else
242  {
243  // Max, favor rear sequences.
244  if (!__comp(__S(__i)[__a[__i] - 1], *__lmax))
245  {
246  __lmax = &(__S(__i)[__a[__i] - 1]);
247  __lmax_seq = __i;
248  }
249  }
250  }
251  }
252 
253  _SeqNumber __i;
254  for (__i = 0; __i < __m; __i++)
255  {
256  _DifferenceType __middle = (__b[__i] + __a[__i]) / 2;
257  if (__lmax && __middle < __ns[__i] &&
258  __lcomp(std::make_pair(__S(__i)[__middle], __i),
259  std::make_pair(*__lmax, __lmax_seq)))
260  __a[__i] = std::min(__a[__i] + __n + 1, __ns[__i]);
261  else
262  __b[__i] -= __n + 1;
263  }
264 
265  _DifferenceType __leftsize = 0;
266  for (_SeqNumber __i = 0; __i < __m; __i++)
267  __leftsize += __a[__i] / (__n + 1);
268 
269  _DifferenceType __skew = __rank / (__n + 1) - __leftsize;
270 
271  if (__skew > 0)
272  {
273  // Move to the left, find smallest.
274  std::priority_queue<std::pair<_ValueType, _SeqNumber>,
275  std::vector<std::pair<_ValueType, _SeqNumber> >,
276  _LexicographicReverse<_ValueType, _SeqNumber, _Compare> >
277  __pq(__lrcomp);
278 
279  for (_SeqNumber __i = 0; __i < __m; __i++)
280  if (__b[__i] < __ns[__i])
281  __pq.push(std::make_pair(__S(__i)[__b[__i]], __i));
282 
283  for (; __skew != 0 && !__pq.empty(); --__skew)
284  {
285  _SeqNumber __source = __pq.top().second;
286  __pq.pop();
287 
288  __a[__source]
289  = std::min(__a[__source] + __n + 1, __ns[__source]);
290  __b[__source] += __n + 1;
291 
292  if (__b[__source] < __ns[__source])
293  __pq.push(
294  std::make_pair(__S(__source)[__b[__source]], __source));
295  }
296  }
297  else if (__skew < 0)
298  {
299  // Move to the right, find greatest.
300  std::priority_queue<std::pair<_ValueType, _SeqNumber>,
301  std::vector<std::pair<_ValueType, _SeqNumber> >,
302  _Lexicographic<_ValueType, _SeqNumber, _Compare> >
303  __pq(__lcomp);
304 
305  for (_SeqNumber __i = 0; __i < __m; __i++)
306  if (__a[__i] > 0)
307  __pq.push(std::make_pair(__S(__i)[__a[__i] - 1], __i));
308 
309  for (; __skew != 0; ++__skew)
310  {
311  _SeqNumber __source = __pq.top().second;
312  __pq.pop();
313 
314  __a[__source] -= __n + 1;
315  __b[__source] -= __n + 1;
316 
317  if (__a[__source] > 0)
318  __pq.push(std::make_pair(
319  __S(__source)[__a[__source] - 1], __source));
320  }
321  }
322  }
323 
324  // Postconditions:
325  // __a[__i] == __b[__i] in most cases, except when __a[__i] has been
326  // clamped because of having reached the boundary
327 
328  // Now return the result, calculate the offset.
329 
330  // Compare the keys on both edges of the border.
331 
332  // Maximum of left edge, minimum of right edge.
333  _ValueType* __maxleft = 0;
334  _ValueType* __minright = 0;
335  for (_SeqNumber __i = 0; __i < __m; __i++)
336  {
337  if (__a[__i] > 0)
338  {
339  if (!__maxleft)
340  __maxleft = &(__S(__i)[__a[__i] - 1]);
341  else
342  {
343  // Max, favor rear sequences.
344  if (!__comp(__S(__i)[__a[__i] - 1], *__maxleft))
345  __maxleft = &(__S(__i)[__a[__i] - 1]);
346  }
347  }
348  if (__b[__i] < __ns[__i])
349  {
350  if (!__minright)
351  __minright = &(__S(__i)[__b[__i]]);
352  else
353  {
354  // Min, favor fore sequences.
355  if (__comp(__S(__i)[__b[__i]], *__minright))
356  __minright = &(__S(__i)[__b[__i]]);
357  }
358  }
359  }
360 
361  _SeqNumber __seq = 0;
362  for (_SeqNumber __i = 0; __i < __m; __i++)
363  __begin_offsets[__i] = __S(__i) + __a[__i];
364 
365  delete[] __ns;
366  delete[] __a;
367  delete[] __b;
368  }
#define __S(__i)
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
const _Tp & min(const _Tp &__a, const _Tp &__b)
Equivalent to std::min.
Definition: base.h:144
_Size __rd_log2(_Size __n)
Calculates the rounded-down logarithm of __n for base 2.
Definition: base.h:102
const _Tp & max(const _Tp &__a, const _Tp &__b)
Equivalent to std::max.
Definition: base.h:150
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
template<typename _Tp , typename _RanSeqs , typename _RankType , typename _Compare >
_Tp __gnu_parallel::multiseq_selection ( _RanSeqs  __begin_seqs,
_RanSeqs  __end_seqs,
_RankType  __rank,
_RankType &  __offset,
_Compare  __comp = std::less<_Tp>() 
)

Selects the element at a certain global __rank from several sorted sequences.

The sequences are passed via a sequence of random-access iterator pairs, none of the sequences may be empty.

Parameters
__begin_seqsBegin of the sequence of iterator pairs.
__end_seqsEnd of the sequence of iterator pairs.
__rankThe global rank to partition at.
__offsetThe rank of the selected element in the global subsequence of elements equal to the selected element. If the selected element is unique, this number is 0.
__compThe ordering functor, defaults to std::less.
391  {
392  _GLIBCXX_CALL(__end_seqs - __begin_seqs)
393 
394  typedef typename std::iterator_traits<_RanSeqs>::value_type::first_type
395  _It;
396  typedef typename std::iterator_traits<_RanSeqs>::difference_type
397  _SeqNumber;
398  typedef typename std::iterator_traits<_It>::difference_type
399  _DifferenceType;
400 
401  _Lexicographic<_Tp, _SeqNumber, _Compare> __lcomp(__comp);
402  _LexicographicReverse<_Tp, _SeqNumber, _Compare> __lrcomp(__comp);
403 
404  // Number of sequences, number of elements in total (possibly
405  // including padding).
406  _DifferenceType __m = std::distance(__begin_seqs, __end_seqs);
407  _DifferenceType __nn = 0;
408  _DifferenceType __nmax, __n, __r;
409 
410  for (_SeqNumber __i = 0; __i < __m; __i++)
411  __nn += std::distance(__begin_seqs[__i].first,
412  __begin_seqs[__i].second);
413 
414  if (__m == 0 || __nn == 0 || __rank < 0 || __rank >= __nn)
415  {
416  // result undefined if there is no data or __rank is outside bounds
417  throw std::exception();
418  }
419 
420 
421  _DifferenceType* __ns = new _DifferenceType[__m];
422  _DifferenceType* __a = new _DifferenceType[__m];
423  _DifferenceType* __b = new _DifferenceType[__m];
424  _DifferenceType __l;
425 
426  __ns[0] = std::distance(__begin_seqs[0].first, __begin_seqs[0].second);
427  __nmax = __ns[0];
428  for (_SeqNumber __i = 0; __i < __m; ++__i)
429  {
430  __ns[__i] = std::distance(__begin_seqs[__i].first,
431  __begin_seqs[__i].second);
432  __nmax = std::max(__nmax, __ns[__i]);
433  }
434 
435  __r = __rd_log2(__nmax) + 1;
436 
437  // Pad all lists to this length, at least as long as any ns[__i],
438  // equality iff __nmax = 2^__k - 1
439  __l = __round_up_to_pow2(__r) - 1;
440 
441  for (_SeqNumber __i = 0; __i < __m; ++__i)
442  {
443  __a[__i] = 0;
444  __b[__i] = __l;
445  }
446  __n = __l / 2;
447 
448  // Invariants:
449  // 0 <= __a[__i] <= __ns[__i], 0 <= __b[__i] <= __l
450 
451 #define __S(__i) (__begin_seqs[__i].first)
452 
453  // Initial partition.
454  std::vector<std::pair<_Tp, _SeqNumber> > __sample;
455 
456  for (_SeqNumber __i = 0; __i < __m; __i++)
457  if (__n < __ns[__i])
458  __sample.push_back(std::make_pair(__S(__i)[__n], __i));
459  __gnu_sequential::sort(__sample.begin(), __sample.end(),
460  __lcomp, sequential_tag());
461 
462  // Conceptual infinity.
463  for (_SeqNumber __i = 0; __i < __m; __i++)
464  if (__n >= __ns[__i])
465  __sample.push_back(
466  std::make_pair(__S(__i)[0] /*__dummy element*/, __i));
467 
468  _DifferenceType __localrank = __rank / __l;
469 
470  _SeqNumber __j;
471  for (__j = 0;
472  __j < __localrank && ((__n + 1) <= __ns[__sample[__j].second]);
473  ++__j)
474  __a[__sample[__j].second] += __n + 1;
475  for (; __j < __m; ++__j)
476  __b[__sample[__j].second] -= __n + 1;
477 
478  // Further refinement.
479  while (__n > 0)
480  {
481  __n /= 2;
482 
483  const _Tp* __lmax = 0;
484  for (_SeqNumber __i = 0; __i < __m; ++__i)
485  {
486  if (__a[__i] > 0)
487  {
488  if (!__lmax)
489  __lmax = &(__S(__i)[__a[__i] - 1]);
490  else
491  {
492  if (__comp(*__lmax, __S(__i)[__a[__i] - 1])) //max
493  __lmax = &(__S(__i)[__a[__i] - 1]);
494  }
495  }
496  }
497 
498  _SeqNumber __i;
499  for (__i = 0; __i < __m; __i++)
500  {
501  _DifferenceType __middle = (__b[__i] + __a[__i]) / 2;
502  if (__lmax && __middle < __ns[__i]
503  && __comp(__S(__i)[__middle], *__lmax))
504  __a[__i] = std::min(__a[__i] + __n + 1, __ns[__i]);
505  else
506  __b[__i] -= __n + 1;
507  }
508 
509  _DifferenceType __leftsize = 0;
510  for (_SeqNumber __i = 0; __i < __m; ++__i)
511  __leftsize += __a[__i] / (__n + 1);
512 
513  _DifferenceType __skew = __rank / (__n + 1) - __leftsize;
514 
515  if (__skew > 0)
516  {
517  // Move to the left, find smallest.
518  std::priority_queue<std::pair<_Tp, _SeqNumber>,
519  std::vector<std::pair<_Tp, _SeqNumber> >,
520  _LexicographicReverse<_Tp, _SeqNumber, _Compare> >
521  __pq(__lrcomp);
522 
523  for (_SeqNumber __i = 0; __i < __m; ++__i)
524  if (__b[__i] < __ns[__i])
525  __pq.push(std::make_pair(__S(__i)[__b[__i]], __i));
526 
527  for (; __skew != 0 && !__pq.empty(); --__skew)
528  {
529  _SeqNumber __source = __pq.top().second;
530  __pq.pop();
531 
532  __a[__source]
533  = std::min(__a[__source] + __n + 1, __ns[__source]);
534  __b[__source] += __n + 1;
535 
536  if (__b[__source] < __ns[__source])
537  __pq.push(
538  std::make_pair(__S(__source)[__b[__source]], __source));
539  }
540  }
541  else if (__skew < 0)
542  {
543  // Move to the right, find greatest.
544  std::priority_queue<std::pair<_Tp, _SeqNumber>,
545  std::vector<std::pair<_Tp, _SeqNumber> >,
546  _Lexicographic<_Tp, _SeqNumber, _Compare> > __pq(__lcomp);
547 
548  for (_SeqNumber __i = 0; __i < __m; ++__i)
549  if (__a[__i] > 0)
550  __pq.push(std::make_pair(__S(__i)[__a[__i] - 1], __i));
551 
552  for (; __skew != 0; ++__skew)
553  {
554  _SeqNumber __source = __pq.top().second;
555  __pq.pop();
556 
557  __a[__source] -= __n + 1;
558  __b[__source] -= __n + 1;
559 
560  if (__a[__source] > 0)
561  __pq.push(std::make_pair(
562  __S(__source)[__a[__source] - 1], __source));
563  }
564  }
565  }
566 
567  // Postconditions:
568  // __a[__i] == __b[__i] in most cases, except when __a[__i] has been
569  // clamped because of having reached the boundary
570 
571  // Now return the result, calculate the offset.
572 
573  // Compare the keys on both edges of the border.
574 
575  // Maximum of left edge, minimum of right edge.
576  bool __maxleftset = false, __minrightset = false;
577 
578  // Impossible to avoid the warning?
579  _Tp __maxleft, __minright;
580  for (_SeqNumber __i = 0; __i < __m; ++__i)
581  {
582  if (__a[__i] > 0)
583  {
584  if (!__maxleftset)
585  {
586  __maxleft = __S(__i)[__a[__i] - 1];
587  __maxleftset = true;
588  }
589  else
590  {
591  // Max.
592  if (__comp(__maxleft, __S(__i)[__a[__i] - 1]))
593  __maxleft = __S(__i)[__a[__i] - 1];
594  }
595  }
596  if (__b[__i] < __ns[__i])
597  {
598  if (!__minrightset)
599  {
600  __minright = __S(__i)[__b[__i]];
601  __minrightset = true;
602  }
603  else
604  {
605  // Min.
606  if (__comp(__S(__i)[__b[__i]], __minright))
607  __minright = __S(__i)[__b[__i]];
608  }
609  }
610  }
611 
612  // Minright is the __splitter, in any case.
613 
614  if (!__maxleftset || __comp(__minright, __maxleft))
615  {
616  // Good luck, everything is split unambiguously.
617  __offset = 0;
618  }
619  else
620  {
621  // We have to calculate an offset.
622  __offset = 0;
623 
624  for (_SeqNumber __i = 0; __i < __m; ++__i)
625  {
626  _DifferenceType lb
627  = std::lower_bound(__S(__i), __S(__i) + __ns[__i],
628  __minright,
629  __comp) - __S(__i);
630  __offset += __a[__i] - lb;
631  }
632  }
633 
634  delete[] __ns;
635  delete[] __a;
636  delete[] __b;
637 
638  return __minright;
639  }
#define __S(__i)
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
const _Tp & min(const _Tp &__a, const _Tp &__b)
Equivalent to std::min.
Definition: base.h:144
_Size __rd_log2(_Size __n)
Calculates the rounded-down logarithm of __n for base 2.
Definition: base.h:102
_Tp __round_up_to_pow2(_Tp __x)
Round up to the next greater power of 2.
Definition: random_shuffle.h:248
const _Tp & max(const _Tp &__a, const _Tp &__b)
Equivalent to std::max.
Definition: base.h:150
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::multiway_merge ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
__gnu_parallel::sequential_tag   
)

Multiway Merge Frontend.

Merge the sequences specified by seqs_begin and __seqs_end into __target. __seqs_begin and __seqs_end must point to a sequence of pairs. These pairs must contain an iterator to the beginning of a sequence in their first entry and an iterator the _M_end of the same sequence in their second entry.

Ties are broken arbitrarily. See stable_multiway_merge for a variant that breaks ties by sequence number but is slower.

The first entries of the pairs (i.e. the begin iterators) will be moved forward.

The output sequence has to provide enough space for all elements that are written to it.

This function will merge the input sequences:

  • not stable
  • parallel, depending on the input size and Settings
  • using sampling for splitting
  • not using sentinels

Example:

  int sequences[10][10];
  for (int __i = 0; __i < 10; ++__i)
    for (int __j = 0; __i < 10; ++__j)
      sequences[__i][__j] = __j;
  int __out[33];
  std::vector<std::pair<int*> > seqs;
  for (int __i = 0; __i < 10; ++__i)
    { seqs.push(std::make_pair<int*>(sequences[__i],
                                     sequences[__i] + 10)) }
  multiway_merge(seqs.begin(), seqs.end(), __target, std::less<int>(), 33);
See Also
stable_multiway_merge
Precondition
All input sequences must be sorted.
Target must provide enough space to merge out length elements or the number of elements in all sequences, whichever is smaller.
Postcondition
[__target, return __value) contains merged __elements from the input sequences.
return __value - __target = min(__length, number of elements in all sequences).
Template Parameters
_RAIterPairIteratoriterator over sequence of pairs of iterators
_RAIterOutiterator over target sequence
_DifferenceTpdifference type for the sequence
_Comparestrict weak ordering type to compare elements in sequences
Parameters
__seqs_begin__begin of sequence __sequence
__seqs_end_M_end of sequence __sequence
__targettarget sequence to merge to.
__compstrict weak ordering to use for element comparison.
__lengthMaximum length to merge, possibly larger than the number of elements available.
Returns
_M_end iterator of output sequence
1423  {
1424  typedef _DifferenceTp _DifferenceType;
1425  _GLIBCXX_CALL(__seqs_end - __seqs_begin)
1426 
1427  // catch special case: no sequences
1428  if (__seqs_begin == __seqs_end)
1429  return __target;
1430 
1431  // Execute multiway merge *sequentially*.
1433  </* __stable = */ false, /* __sentinels = */ false>
1434  (__seqs_begin, __seqs_end, __target,
1435  *(__seqs_begin->second), __length, __comp);
1436  }
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
return(unsigned int) __res
_RAIter3 __sequential_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
Sequential multi-way merging switch.
Definition: multiway_merge.h:920
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::multiway_merge ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
__gnu_parallel::exact_tag  __tag 
)
1449  {
1450  typedef _DifferenceTp _DifferenceType;
1451  _GLIBCXX_CALL(__seqs_end - __seqs_begin)
1452 
1453  // catch special case: no sequences
1454  if (__seqs_begin == __seqs_end)
1455  return __target;
1456 
1457  // Execute merge; maybe parallel, depending on the number of merged
1458  // elements and the number of sequences and global thresholds in
1459  // Settings.
1460  if ((__seqs_end - __seqs_begin > 1)
1462  ((__seqs_end - __seqs_begin) >=
1463  __gnu_parallel::_Settings::get().multiway_merge_minimal_k)
1464  && ((_SequenceIndex)__length >=
1465  __gnu_parallel::_Settings::get().multiway_merge_minimal_n)))
1467  </* __stable = */ false, /* __sentinels = */ false>
1468  (__seqs_begin, __seqs_end, __target,
1469  multiway_merge_exact_splitting</* __stable = */ false,
1470  typename std::iterator_traits<_RAIterPairIterator>
1471  ::value_type*, _Compare, _DifferenceTp>,
1472  static_cast<_DifferenceType>(__length), __comp,
1473  __tag.__get_num_threads());
1474  else
1476  </* __stable = */ false, /* __sentinels = */ false>
1477  (__seqs_begin, __seqs_end, __target,
1478  *(__seqs_begin->second), __length, __comp);
1479  }
#define _GLIBCXX_PARALLEL_CONDITION(__c)
Determine at compile(?)-time if the parallel variant of an algorithm should be called.
Definition: settings.h:95
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
_RAIter3 parallel_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, _Splitter __splitter, _DifferenceTp __length, _Compare __comp, _ThreadIndex __num_threads)
Parallel multi-way merge routine.
Definition: multiway_merge.h:1225
return(unsigned int) __res
void multiway_merge_exact_splitting(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _DifferenceType __length, _DifferenceType __total_length, _Compare __comp, std::vector< std::pair< _DifferenceType, _DifferenceType > > *__pieces)
Exact splitting for parallel multiway-merge routine.
Definition: multiway_merge.h:1120
_RAIter3 __sequential_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
Sequential multi-way merging switch.
Definition: multiway_merge.h:920
uint64_t _SequenceIndex
Unsigned integer to index __elements. The total number of elements for each algorithm must fit into t...
Definition: types.h:117
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::multiway_merge ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
__gnu_parallel::sampling_tag  __tag 
)
1492  {
1493  typedef _DifferenceTp _DifferenceType;
1494  _GLIBCXX_CALL(__seqs_end - __seqs_begin)
1495 
1496  // catch special case: no sequences
1497  if (__seqs_begin == __seqs_end)
1498  return __target;
1499 
1500  // Execute merge; maybe parallel, depending on the number of merged
1501  // elements and the number of sequences and global thresholds in
1502  // Settings.
1503  if ((__seqs_end - __seqs_begin > 1)
1505  ((__seqs_end - __seqs_begin) >=
1506  __gnu_parallel::_Settings::get().multiway_merge_minimal_k)
1507  && ((_SequenceIndex)__length >=
1508  __gnu_parallel::_Settings::get().multiway_merge_minimal_n)))
1510  </* __stable = */ false, /* __sentinels = */ false>
1511  (__seqs_begin, __seqs_end, __target,
1512  multiway_merge_exact_splitting</* __stable = */ false,
1513  typename std::iterator_traits<_RAIterPairIterator>
1514  ::value_type*, _Compare, _DifferenceTp>,
1515  static_cast<_DifferenceType>(__length), __comp,
1516  __tag.__get_num_threads());
1517  else
1519  </* __stable = */ false, /* __sentinels = */ false>
1520  (__seqs_begin, __seqs_end, __target,
1521  *(__seqs_begin->second), __length, __comp);
1522  }
#define _GLIBCXX_PARALLEL_CONDITION(__c)
Determine at compile(?)-time if the parallel variant of an algorithm should be called.
Definition: settings.h:95
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
_RAIter3 parallel_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, _Splitter __splitter, _DifferenceTp __length, _Compare __comp, _ThreadIndex __num_threads)
Parallel multi-way merge routine.
Definition: multiway_merge.h:1225
return(unsigned int) __res
void multiway_merge_exact_splitting(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _DifferenceType __length, _DifferenceType __total_length, _Compare __comp, std::vector< std::pair< _DifferenceType, _DifferenceType > > *__pieces)
Exact splitting for parallel multiway-merge routine.
Definition: multiway_merge.h:1120
_RAIter3 __sequential_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
Sequential multi-way merging switch.
Definition: multiway_merge.h:920
uint64_t _SequenceIndex
Unsigned integer to index __elements. The total number of elements for each algorithm must fit into t...
Definition: types.h:117
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::multiway_merge ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
parallel_tag  __tag = parallel_tag(0) 
)
1535  { return multiway_merge(__seqs_begin, __seqs_end, __target, __length,
1536  __comp, exact_tag(__tag.__get_num_threads())); }
_RAIterOut multiway_merge(_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, default_parallel_tag __tag)
Definition: multiway_merge.h:1544
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::multiway_merge ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
default_parallel_tag  __tag 
)
1549  { return multiway_merge(__seqs_begin, __seqs_end, __target, __length,
1550  __comp, exact_tag(__tag.__get_num_threads())); }
_RAIterOut multiway_merge(_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, default_parallel_tag __tag)
Definition: multiway_merge.h:1544
template<template< typename RAI, typename C > class iterator, typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Compare >
_RAIter3 __gnu_parallel::multiway_merge_3_variant ( _RAIterIterator  __seqs_begin,
_RAIterIterator  __seqs_end,
_RAIter3  __target,
_DifferenceTp  __length,
_Compare  __comp 
)

Highly efficient 3-way merging procedure.

Merging is done with the algorithm implementation described by Peter Sanders. Basically, the idea is to minimize the number of necessary comparison after merging an element. The implementation trick that makes this fast is that the order of the sequences is stored in the instruction pointer (translated into labels in C++).

This works well for merging up to 4 sequences.

Note that making the merging stable does not come at a performance hit.

Whether the merging is done guarded or unguarded is selected by the used iterator class.

Parameters
__seqs_beginBegin iterator of iterator pair input sequence.
__seqs_endEnd iterator of iterator pair input sequence.
__targetBegin iterator of output sequence.
__compComparator.
__lengthMaximum length to merge, less equal than the total number of elements available.
Returns
End iterator of output sequence.
245  {
246  _GLIBCXX_CALL(__length);
247 
248  typedef _DifferenceTp _DifferenceType;
249 
250  typedef typename std::iterator_traits<_RAIterIterator>
251  ::value_type::first_type
252  _RAIter1;
253  typedef typename std::iterator_traits<_RAIter1>::value_type
254  _ValueType;
255 
256  if (__length == 0)
257  return __target;
258 
259 #if _GLIBCXX_ASSERTIONS
260  _DifferenceTp __orig_length = __length;
261 #endif
262 
263  iterator<_RAIter1, _Compare>
264  __seq0(__seqs_begin[0].first, __seqs_begin[0].second, __comp),
265  __seq1(__seqs_begin[1].first, __seqs_begin[1].second, __comp),
266  __seq2(__seqs_begin[2].first, __seqs_begin[2].second, __comp);
267 
268  if (__seq0 <= __seq1)
269  {
270  if (__seq1 <= __seq2)
271  goto __s012;
272  else
273  if (__seq2 < __seq0)
274  goto __s201;
275  else
276  goto __s021;
277  }
278  else
279  {
280  if (__seq1 <= __seq2)
281  {
282  if (__seq0 <= __seq2)
283  goto __s102;
284  else
285  goto __s120;
286  }
287  else
288  goto __s210;
289  }
290 #define _GLIBCXX_PARALLEL_MERGE_3_CASE(__a, __b, __c, __c0, __c1) \
291  __s ## __a ## __b ## __c : \
292  *__target = *__seq ## __a; \
293  ++__target; \
294  --__length; \
295  ++__seq ## __a; \
296  if (__length == 0) goto __finish; \
297  if (__seq ## __a __c0 __seq ## __b) goto __s ## __a ## __b ## __c; \
298  if (__seq ## __a __c1 __seq ## __c) goto __s ## __b ## __a ## __c; \
299  goto __s ## __b ## __c ## __a;
300 
301  _GLIBCXX_PARALLEL_MERGE_3_CASE(0, 1, 2, <=, <=);
302  _GLIBCXX_PARALLEL_MERGE_3_CASE(1, 2, 0, <=, < );
303  _GLIBCXX_PARALLEL_MERGE_3_CASE(2, 0, 1, < , < );
304  _GLIBCXX_PARALLEL_MERGE_3_CASE(1, 0, 2, < , <=);
305  _GLIBCXX_PARALLEL_MERGE_3_CASE(0, 2, 1, <=, <=);
306  _GLIBCXX_PARALLEL_MERGE_3_CASE(2, 1, 0, < , < );
307 
308 #undef _GLIBCXX_PARALLEL_MERGE_3_CASE
309 
310  __finish:
311  ;
312 
313 #if _GLIBCXX_ASSERTIONS
315  ((_RAIter1)__seq0 - __seqs_begin[0].first) +
316  ((_RAIter1)__seq1 - __seqs_begin[1].first) +
317  ((_RAIter1)__seq2 - __seqs_begin[2].first)
318  == __orig_length);
319 #endif
320 
321  __seqs_begin[0].first = __seq0;
322  __seqs_begin[1].first = __seq1;
323  __seqs_begin[2].first = __seq2;
324 
325  return __target;
326  }
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define _GLIBCXX_PARALLEL_MERGE_3_CASE(__a, __b, __c, __c0, __c1)
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
template<template< typename RAI, typename C > class iterator, typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Compare >
_RAIter3 __gnu_parallel::multiway_merge_4_variant ( _RAIterIterator  __seqs_begin,
_RAIterIterator  __seqs_end,
_RAIter3  __target,
_DifferenceTp  __length,
_Compare  __comp 
)

Highly efficient 4-way merging procedure.

Merging is done with the algorithm implementation described by Peter Sanders. Basically, the idea is to minimize the number of necessary comparison after merging an element. The implementation trick that makes this fast is that the order of the sequences is stored in the instruction pointer (translated into goto labels in C++).

This works well for merging up to 4 sequences.

Note that making the merging stable does not come at a performance hit.

Whether the merging is done guarded or unguarded is selected by the used iterator class.

Parameters
__seqs_beginBegin iterator of iterator pair input sequence.
__seqs_endEnd iterator of iterator pair input sequence.
__targetBegin iterator of output sequence.
__compComparator.
__lengthMaximum length to merge, less equal than the total number of elements available.
Returns
End iterator of output sequence.
364  {
365  _GLIBCXX_CALL(__length);
366  typedef _DifferenceTp _DifferenceType;
367 
368  typedef typename std::iterator_traits<_RAIterIterator>
369  ::value_type::first_type
370  _RAIter1;
371  typedef typename std::iterator_traits<_RAIter1>::value_type
372  _ValueType;
373 
374  iterator<_RAIter1, _Compare>
375  __seq0(__seqs_begin[0].first, __seqs_begin[0].second, __comp),
376  __seq1(__seqs_begin[1].first, __seqs_begin[1].second, __comp),
377  __seq2(__seqs_begin[2].first, __seqs_begin[2].second, __comp),
378  __seq3(__seqs_begin[3].first, __seqs_begin[3].second, __comp);
379 
380 #define _GLIBCXX_PARALLEL_DECISION(__a, __b, __c, __d) { \
381  if (__seq ## __d < __seq ## __a) \
382  goto __s ## __d ## __a ## __b ## __c; \
383  if (__seq ## __d < __seq ## __b) \
384  goto __s ## __a ## __d ## __b ## __c; \
385  if (__seq ## __d < __seq ## __c) \
386  goto __s ## __a ## __b ## __d ## __c; \
387  goto __s ## __a ## __b ## __c ## __d; }
388 
389  if (__seq0 <= __seq1)
390  {
391  if (__seq1 <= __seq2)
393  else
394  if (__seq2 < __seq0)
396  else
397  _GLIBCXX_PARALLEL_DECISION(0,2,1,3)
398  }
399  else
400  {
401  if (__seq1 <= __seq2)
402  {
403  if (__seq0 <= __seq2)
405  else
406  _GLIBCXX_PARALLEL_DECISION(1,2,0,3)
407  }
408  else
409  _GLIBCXX_PARALLEL_DECISION(2,1,0,3)
410  }
411 
412 #define _GLIBCXX_PARALLEL_MERGE_4_CASE(__a, __b, __c, __d, \
413  __c0, __c1, __c2) \
414  __s ## __a ## __b ## __c ## __d: \
415  if (__length == 0) goto __finish; \
416  *__target = *__seq ## __a; \
417  ++__target; \
418  --__length; \
419  ++__seq ## __a; \
420  if (__seq ## __a __c0 __seq ## __b) \
421  goto __s ## __a ## __b ## __c ## __d; \
422  if (__seq ## __a __c1 __seq ## __c) \
423  goto __s ## __b ## __a ## __c ## __d; \
424  if (__seq ## __a __c2 __seq ## __d) \
425  goto __s ## __b ## __c ## __a ## __d; \
426  goto __s ## __b ## __c ## __d ## __a;
427 
428  _GLIBCXX_PARALLEL_MERGE_4_CASE(0, 1, 2, 3, <=, <=, <=);
429  _GLIBCXX_PARALLEL_MERGE_4_CASE(0, 1, 3, 2, <=, <=, <=);
430  _GLIBCXX_PARALLEL_MERGE_4_CASE(0, 2, 1, 3, <=, <=, <=);
431  _GLIBCXX_PARALLEL_MERGE_4_CASE(0, 2, 3, 1, <=, <=, <=);
432  _GLIBCXX_PARALLEL_MERGE_4_CASE(0, 3, 1, 2, <=, <=, <=);
433  _GLIBCXX_PARALLEL_MERGE_4_CASE(0, 3, 2, 1, <=, <=, <=);
434  _GLIBCXX_PARALLEL_MERGE_4_CASE(1, 0, 2, 3, < , <=, <=);
435  _GLIBCXX_PARALLEL_MERGE_4_CASE(1, 0, 3, 2, < , <=, <=);
436  _GLIBCXX_PARALLEL_MERGE_4_CASE(1, 2, 0, 3, <=, < , <=);
437  _GLIBCXX_PARALLEL_MERGE_4_CASE(1, 2, 3, 0, <=, <=, < );
438  _GLIBCXX_PARALLEL_MERGE_4_CASE(1, 3, 0, 2, <=, < , <=);
439  _GLIBCXX_PARALLEL_MERGE_4_CASE(1, 3, 2, 0, <=, <=, < );
440  _GLIBCXX_PARALLEL_MERGE_4_CASE(2, 0, 1, 3, < , < , <=);
441  _GLIBCXX_PARALLEL_MERGE_4_CASE(2, 0, 3, 1, < , <=, < );
442  _GLIBCXX_PARALLEL_MERGE_4_CASE(2, 1, 0, 3, < , < , <=);
443  _GLIBCXX_PARALLEL_MERGE_4_CASE(2, 1, 3, 0, < , <=, < );
444  _GLIBCXX_PARALLEL_MERGE_4_CASE(2, 3, 0, 1, <=, < , < );
445  _GLIBCXX_PARALLEL_MERGE_4_CASE(2, 3, 1, 0, <=, < , < );
446  _GLIBCXX_PARALLEL_MERGE_4_CASE(3, 0, 1, 2, < , < , < );
447  _GLIBCXX_PARALLEL_MERGE_4_CASE(3, 0, 2, 1, < , < , < );
448  _GLIBCXX_PARALLEL_MERGE_4_CASE(3, 1, 0, 2, < , < , < );
449  _GLIBCXX_PARALLEL_MERGE_4_CASE(3, 1, 2, 0, < , < , < );
450  _GLIBCXX_PARALLEL_MERGE_4_CASE(3, 2, 0, 1, < , < , < );
451  _GLIBCXX_PARALLEL_MERGE_4_CASE(3, 2, 1, 0, < , < , < );
452 
453 #undef _GLIBCXX_PARALLEL_MERGE_4_CASE
454 #undef _GLIBCXX_PARALLEL_DECISION
455 
456  __finish:
457  ;
458 
459  __seqs_begin[0].first = __seq0;
460  __seqs_begin[1].first = __seq1;
461  __seqs_begin[2].first = __seq2;
462  __seqs_begin[3].first = __seq3;
463 
464  return __target;
465  }
#define _GLIBCXX_PARALLEL_MERGE_4_CASE(__a, __b, __c, __d,__c0, __c1, __c2)
#define _GLIBCXX_PARALLEL_DECISION(__a, __b, __c, __d)
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
template<bool __stable, typename _RAIterIterator , typename _Compare , typename _DifferenceType >
void __gnu_parallel::multiway_merge_exact_splitting ( _RAIterIterator  __seqs_begin,
_RAIterIterator  __seqs_end,
_DifferenceType  __length,
_DifferenceType  __total_length,
_Compare  __comp,
std::vector< std::pair< _DifferenceType, _DifferenceType > > *  __pieces 
)

Exact splitting for parallel multiway-merge routine.

None of the passed sequences may be empty.

1126  {
1127  typedef typename std::iterator_traits<_RAIterIterator>
1128  ::difference_type _SeqNumber;
1129  typedef typename std::iterator_traits<_RAIterIterator>
1130  ::value_type::first_type
1131  _RAIter1;
1132 
1133  const bool __tight = (__total_length == __length);
1134 
1135  // __k sequences.
1136  const _SeqNumber __k = __seqs_end - __seqs_begin;
1137 
1138  const _ThreadIndex __num_threads = omp_get_num_threads();
1139 
1140  // (Settings::multiway_merge_splitting
1141  // == __gnu_parallel::_Settings::EXACT).
1142  std::vector<_RAIter1>* __offsets =
1143  new std::vector<_RAIter1>[__num_threads];
1144  std::vector<std::pair<_RAIter1, _RAIter1> > __se(__k);
1145 
1146  copy(__seqs_begin, __seqs_end, __se.begin());
1147 
1148  _DifferenceType* __borders =
1149  new _DifferenceType[__num_threads + 1];
1150  __equally_split(__length, __num_threads, __borders);
1151 
1152  for (_ThreadIndex __s = 0; __s < (__num_threads - 1); ++__s)
1153  {
1154  __offsets[__s].resize(__k);
1155  multiseq_partition(__se.begin(), __se.end(), __borders[__s + 1],
1156  __offsets[__s].begin(), __comp);
1157 
1158  // Last one also needed and available.
1159  if (!__tight)
1160  {
1161  __offsets[__num_threads - 1].resize(__k);
1162  multiseq_partition(__se.begin(), __se.end(),
1163  _DifferenceType(__length),
1164  __offsets[__num_threads - 1].begin(),
1165  __comp);
1166  }
1167  }
1168  delete[] __borders;
1169 
1170  for (_ThreadIndex __slab = 0; __slab < __num_threads; ++__slab)
1171  {
1172  // For each slab / processor.
1173  for (_SeqNumber __seq = 0; __seq < __k; ++__seq)
1174  {
1175  // For each sequence.
1176  if (__slab == 0)
1177  {
1178  // Absolute beginning.
1179  __pieces[__slab][__seq].first = 0;
1180  }
1181  else
1182  __pieces[__slab][__seq].first =
1183  __pieces[__slab - 1][__seq].second;
1184  if (!__tight || __slab < (__num_threads - 1))
1185  __pieces[__slab][__seq].second =
1186  __offsets[__slab][__seq] - __seqs_begin[__seq].first;
1187  else
1188  {
1189  // __slab == __num_threads - 1
1190  __pieces[__slab][__seq].second =
1191  _GLIBCXX_PARALLEL_LENGTH(__seqs_begin[__seq]);
1192  }
1193  }
1194  }
1195  delete[] __offsets;
1196  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
int omp_get_num_threads(void) __GOMP_NOTHROW
#define _GLIBCXX_PARALLEL_LENGTH(__s)
Length of a sequence described by a pair of iterators.
Definition: multiway_merge.h:54
void multiseq_partition(_RanSeqs __begin_seqs, _RanSeqs __end_seqs, _RankType __rank, _RankIterator __begin_offsets, _Compare __comp=std::less< typename std::iterator_traits< typename std::iterator_traits< _RanSeqs >::value_type::first_type >::value_type >())
Splits several sorted sequences at a certain global __rank, resulting in a splitting point for each s...
Definition: multiseq_selection.h:122
_OutputIterator __equally_split(_DifferenceType __n, _ThreadIndex __num_threads, _OutputIterator __s)
function to split a sequence into parts of almost equal size.
Definition: equally_split.h:48
template<typename _LT , typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Compare >
_RAIter3 __gnu_parallel::multiway_merge_loser_tree ( _RAIterIterator  __seqs_begin,
_RAIterIterator  __seqs_end,
_RAIter3  __target,
_DifferenceTp  __length,
_Compare  __comp 
)

Multi-way merging procedure for a high branching factor, guarded case.

This merging variant uses a LoserTree class as selected by _LT.

Stability is selected through the used LoserTree class _LT.

At least one non-empty sequence is required.

Parameters
__seqs_beginBegin iterator of iterator pair input sequence.
__seqs_endEnd iterator of iterator pair input sequence.
__targetBegin iterator of output sequence.
__compComparator.
__lengthMaximum length to merge, less equal than the total number of elements available.
Returns
End iterator of output sequence.
495  {
496  _GLIBCXX_CALL(__length)
497 
498  typedef _DifferenceTp _DifferenceType;
499  typedef typename std::iterator_traits<_RAIterIterator>
500  ::difference_type _SeqNumber;
501  typedef typename std::iterator_traits<_RAIterIterator>
502  ::value_type::first_type
503  _RAIter1;
504  typedef typename std::iterator_traits<_RAIter1>::value_type
505  _ValueType;
506 
507  _SeqNumber __k = static_cast<_SeqNumber>(__seqs_end - __seqs_begin);
508 
509  _LT __lt(__k, __comp);
510 
511  // Default value for potentially non-default-constructible types.
512  _ValueType* __arbitrary_element = 0;
513 
514  for (_SeqNumber __t = 0; __t < __k; ++__t)
515  {
516  if(!__arbitrary_element
517  && _GLIBCXX_PARALLEL_LENGTH(__seqs_begin[__t]) > 0)
518  __arbitrary_element = &(*__seqs_begin[__t].first);
519  }
520 
521  for (_SeqNumber __t = 0; __t < __k; ++__t)
522  {
523  if (__seqs_begin[__t].first == __seqs_begin[__t].second)
524  __lt.__insert_start(*__arbitrary_element, __t, true);
525  else
526  __lt.__insert_start(*__seqs_begin[__t].first, __t, false);
527  }
528 
529  __lt.__init();
530 
531  _SeqNumber __source;
532 
533  for (_DifferenceType __i = 0; __i < __length; ++__i)
534  {
535  //take out
536  __source = __lt.__get_min_source();
537 
538  *(__target++) = *(__seqs_begin[__source].first++);
539 
540  // Feed.
541  if (__seqs_begin[__source].first == __seqs_begin[__source].second)
542  __lt.__delete_min_insert(*__arbitrary_element, true);
543  else
544  // Replace from same __source.
545  __lt.__delete_min_insert(*__seqs_begin[__source].first, false);
546  }
547 
548  return __target;
549  }
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define _GLIBCXX_PARALLEL_LENGTH(__s)
Length of a sequence described by a pair of iterators.
Definition: multiway_merge.h:54
template<typename UnguardedLoserTree , typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Compare >
_RAIter3 __gnu_parallel::multiway_merge_loser_tree_sentinel ( _RAIterIterator  __seqs_begin,
_RAIterIterator  __seqs_end,
_RAIter3  __target,
const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &  __sentinel,
_DifferenceTp  __length,
_Compare  __comp 
)

Multi-way merging procedure for a high branching factor, requiring sentinels to exist.

Template Parameters
UnguardedLoserTree_Loser Tree variant to use for the unguarded merging.
Parameters
__seqs_beginBegin iterator of iterator pair input sequence.
__seqs_endEnd iterator of iterator pair input sequence.
__targetBegin iterator of output sequence.
__compComparator.
__lengthMaximum length to merge, less equal than the total number of elements available.
Returns
End iterator of output sequence.
670  {
671  _GLIBCXX_CALL(__length)
672 
673  typedef _DifferenceTp _DifferenceType;
674  typedef std::iterator_traits<_RAIterIterator> _TraitsType;
675  typedef typename std::iterator_traits<_RAIterIterator>
676  ::value_type::first_type
677  _RAIter1;
678  typedef typename std::iterator_traits<_RAIter1>::value_type
679  _ValueType;
680 
681  _RAIter3 __target_end;
682 
683  for (_RAIterIterator __s = __seqs_begin; __s != __seqs_end; ++__s)
684  // Move the sequence ends to the sentinel. This has the
685  // effect that the sentinel appears to be within the sequence. Then,
686  // we can use the unguarded variant if we merge out as many
687  // non-sentinel elements as we have.
688  ++((*__s).second);
689 
690  __target_end = multiway_merge_loser_tree_unguarded<UnguardedLoserTree>
691  (__seqs_begin, __seqs_end, __target, __sentinel, __length, __comp);
692 
693 #if _GLIBCXX_ASSERTIONS
694  _GLIBCXX_PARALLEL_ASSERT(__target_end == __target + __length);
695  _GLIBCXX_PARALLEL_ASSERT(__is_sorted(__target, __target_end, __comp));
696 #endif
697 
698  // Restore the sequence ends so the sentinels are not contained in the
699  // sequence any more (see comment in loop above).
700  for (_RAIterIterator __s = __seqs_begin; __s != __seqs_end; ++__s)
701  --((*__s).second);
702 
703  return __target_end;
704  }
_RAIter3 multiway_merge_loser_tree_unguarded(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
Multi-way merging procedure for a high branching factor, unguarded case.
Definition: multiway_merge.h:574
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
bool __is_sorted(_IIter __begin, _IIter __end, _Compare __comp)
Check whether [__begin, __end) is sorted according to __comp.
Definition: checkers.h:51
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
template<typename _LT , typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Compare >
_RAIter3 __gnu_parallel::multiway_merge_loser_tree_unguarded ( _RAIterIterator  __seqs_begin,
_RAIterIterator  __seqs_end,
_RAIter3  __target,
const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &  __sentinel,
_DifferenceTp  __length,
_Compare  __comp 
)

Multi-way merging procedure for a high branching factor, unguarded case.

Merging is done using the LoserTree class _LT.

Stability is selected by the used LoserTrees.

Precondition
No input will run out of elements during the merge.
Parameters
__seqs_beginBegin iterator of iterator pair input sequence.
__seqs_endEnd iterator of iterator pair input sequence.
__targetBegin iterator of output sequence.
__compComparator.
__lengthMaximum length to merge, less equal than the total number of elements available.
Returns
End iterator of output sequence.
582  {
583  _GLIBCXX_CALL(__length)
584  typedef _DifferenceTp _DifferenceType;
585 
586  typedef typename std::iterator_traits<_RAIterIterator>
587  ::difference_type _SeqNumber;
588  typedef typename std::iterator_traits<_RAIterIterator>
589  ::value_type::first_type
590  _RAIter1;
591  typedef typename std::iterator_traits<_RAIter1>::value_type
592  _ValueType;
593 
594  _SeqNumber __k = __seqs_end - __seqs_begin;
595 
596  _LT __lt(__k, __sentinel, __comp);
597 
598  for (_SeqNumber __t = 0; __t < __k; ++__t)
599  {
600 #if _GLIBCXX_ASSERTIONS
601  _GLIBCXX_PARALLEL_ASSERT(__seqs_begin[__t].first
602  != __seqs_begin[__t].second);
603 #endif
604  __lt.__insert_start(*__seqs_begin[__t].first, __t, false);
605  }
606 
607  __lt.__init();
608 
609  _SeqNumber __source;
610 
611 #if _GLIBCXX_ASSERTIONS
612  _DifferenceType __i = 0;
613 #endif
614 
615  _RAIter3 __target_end = __target + __length;
616  while (__target < __target_end)
617  {
618  // Take out.
619  __source = __lt.__get_min_source();
620 
621 #if _GLIBCXX_ASSERTIONS
622  _GLIBCXX_PARALLEL_ASSERT(0 <= __source && __source < __k);
623  _GLIBCXX_PARALLEL_ASSERT(__i == 0
624  || !__comp(*(__seqs_begin[__source].first), *(__target - 1)));
625 #endif
626 
627  // Feed.
628  *(__target++) = *(__seqs_begin[__source].first++);
629 
630 #if _GLIBCXX_ASSERTIONS
631  ++__i;
632 #endif
633  // Replace from same __source.
634  __lt.__delete_min_insert(*__seqs_begin[__source].first, false);
635  }
636 
637  return __target;
638  }
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
template<bool __stable, typename _RAIterIterator , typename _Compare , typename _DifferenceType >
void __gnu_parallel::multiway_merge_sampling_splitting ( _RAIterIterator  __seqs_begin,
_RAIterIterator  __seqs_end,
_DifferenceType  __length,
_DifferenceType  __total_length,
_Compare  __comp,
std::vector< std::pair< _DifferenceType, _DifferenceType > > *  __pieces 
)

Sampling based splitting for parallel multiway-merge routine.

1041  {
1042  typedef typename std::iterator_traits<_RAIterIterator>
1043  ::difference_type _SeqNumber;
1044  typedef typename std::iterator_traits<_RAIterIterator>
1045  ::value_type::first_type
1046  _RAIter1;
1047  typedef typename std::iterator_traits<_RAIter1>::value_type
1048  _ValueType;
1049 
1050  // __k sequences.
1051  const _SeqNumber __k
1052  = static_cast<_SeqNumber>(__seqs_end - __seqs_begin);
1053 
1054  const _ThreadIndex __num_threads = omp_get_num_threads();
1055 
1056  const _DifferenceType __num_samples =
1058 
1059  _ValueType* __samples = static_cast<_ValueType*>
1060  (::operator new(sizeof(_ValueType) * __k * __num_samples));
1061  // Sample.
1062  for (_SeqNumber __s = 0; __s < __k; ++__s)
1063  for (_DifferenceType __i = 0; __i < __num_samples; ++__i)
1064  {
1065  _DifferenceType sample_index = static_cast<_DifferenceType>
1066  (_GLIBCXX_PARALLEL_LENGTH(__seqs_begin[__s])
1067  * (double(__i + 1) / (__num_samples + 1))
1068  * (double(__length) / __total_length));
1069  new(&(__samples[__s * __num_samples + __i]))
1070  _ValueType(__seqs_begin[__s].first[sample_index]);
1071  }
1072 
1073  // Sort stable or non-stable, depending on value of template parameter
1074  // "__stable".
1075  _SamplingSorter<__stable, _ValueType*, _Compare>()
1076  (__samples, __samples + (__num_samples * __k), __comp);
1077 
1078  for (_ThreadIndex __slab = 0; __slab < __num_threads; ++__slab)
1079  // For each slab / processor.
1080  for (_SeqNumber __seq = 0; __seq < __k; ++__seq)
1081  {
1082  // For each sequence.
1083  if (__slab > 0)
1084  __pieces[__slab][__seq].first = std::upper_bound
1085  (__seqs_begin[__seq].first, __seqs_begin[__seq].second,
1086  __samples[__num_samples * __k * __slab / __num_threads],
1087  __comp)
1088  - __seqs_begin[__seq].first;
1089  else
1090  // Absolute beginning.
1091  __pieces[__slab][__seq].first = 0;
1092  if ((__slab + 1) < __num_threads)
1093  __pieces[__slab][__seq].second = std::upper_bound
1094  (__seqs_begin[__seq].first, __seqs_begin[__seq].second,
1095  __samples[__num_samples * __k * (__slab + 1) / __num_threads],
1096  __comp)
1097  - __seqs_begin[__seq].first;
1098  else
1099  // Absolute end.
1100  __pieces[__slab][__seq].second =
1101  _GLIBCXX_PARALLEL_LENGTH(__seqs_begin[__seq]);
1102  }
1103 
1104  for (_SeqNumber __s = 0; __s < __k; ++__s)
1105  for (_DifferenceType __i = 0; __i < __num_samples; ++__i)
1106  __samples[__s * __num_samples + __i].~_ValueType();
1107  ::operator delete(__samples);
1108  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
int omp_get_num_threads(void) __GOMP_NOTHROW
#define _GLIBCXX_PARALLEL_LENGTH(__s)
Length of a sequence described by a pair of iterators.
Definition: multiway_merge.h:54
static _GLIBCXX_CONST const _Settings & get()
Get the global settings.
unsigned int merge_oversampling
Oversampling factor for merge.
Definition: settings.h:175
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::multiway_merge_sentinels ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
__gnu_parallel::sequential_tag   
)

Multiway Merge Frontend.

Merge the sequences specified by seqs_begin and __seqs_end into __target. __seqs_begin and __seqs_end must point to a sequence of pairs. These pairs must contain an iterator to the beginning of a sequence in their first entry and an iterator the _M_end of the same sequence in their second entry.

Ties are broken arbitrarily. See stable_multiway_merge for a variant that breaks ties by sequence number but is slower.

The first entries of the pairs (i.e. the begin iterators) will be moved forward accordingly.

The output sequence has to provide enough space for all elements that are written to it.

This function will merge the input sequences:

  • not stable
  • parallel, depending on the input size and Settings
  • using sampling for splitting
  • using sentinels

You have to take care that the element the _M_end iterator points to is readable and contains a value that is greater than any other non-sentinel value in all sequences.

Example:

  int sequences[10][11];
  for (int __i = 0; __i < 10; ++__i)
    for (int __j = 0; __i < 11; ++__j)
      sequences[__i][__j] = __j; // __last one is sentinel!
  int __out[33];
  std::vector<std::pair<int*> > seqs;
  for (int __i = 0; __i < 10; ++__i)
    { seqs.push(std::make_pair<int*>(sequences[__i],
                                     sequences[__i] + 10)) }
  multiway_merge(seqs.begin(), seqs.end(), __target, std::less<int>(), 33);
Precondition
All input sequences must be sorted.
Target must provide enough space to merge out length elements or the number of elements in all sequences, whichever is smaller.
For each __i, __seqs_begin[__i].second must be the end marker of the sequence, but also reference the one more __sentinel element.
Postcondition
[__target, return __value) contains merged __elements from the input sequences.
return __value - __target = min(__length, number of elements in all sequences).
See Also
stable_multiway_merge_sentinels
Template Parameters
_RAIterPairIteratoriterator over sequence of pairs of iterators
_RAIterOutiterator over target sequence
_DifferenceTpdifference type for the sequence
_Comparestrict weak ordering type to compare elements in sequences
Parameters
__seqs_begin__begin of sequence __sequence
__seqs_end_M_end of sequence __sequence
__targettarget sequence to merge to.
__compstrict weak ordering to use for element comparison.
__lengthMaximum length to merge, possibly larger than the number of elements available.
Returns
_M_end iterator of output sequence
1787  {
1788  typedef _DifferenceTp _DifferenceType;
1789  _GLIBCXX_CALL(__seqs_end - __seqs_begin)
1790 
1791  // catch special case: no sequences
1792  if (__seqs_begin == __seqs_end)
1793  return __target;
1794 
1795  // Execute multiway merge *sequentially*.
1797  </* __stable = */ false, /* __sentinels = */ true>
1798  (__seqs_begin, __seqs_end,
1799  __target, *(__seqs_begin->second), __length, __comp);
1800  }
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define true
Definition: stdbool.h:34
return(unsigned int) __res
_RAIter3 __sequential_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
Sequential multi-way merging switch.
Definition: multiway_merge.h:920
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::multiway_merge_sentinels ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
__gnu_parallel::exact_tag  __tag 
)
1813  {
1814  typedef _DifferenceTp _DifferenceType;
1815  _GLIBCXX_CALL(__seqs_end - __seqs_begin)
1816 
1817  // catch special case: no sequences
1818  if (__seqs_begin == __seqs_end)
1819  return __target;
1820 
1821  // Execute merge; maybe parallel, depending on the number of merged
1822  // elements and the number of sequences and global thresholds in
1823  // Settings.
1824  if ((__seqs_end - __seqs_begin > 1)
1826  ((__seqs_end - __seqs_begin) >=
1827  __gnu_parallel::_Settings::get().multiway_merge_minimal_k)
1828  && ((_SequenceIndex)__length >=
1829  __gnu_parallel::_Settings::get().multiway_merge_minimal_n)))
1831  </* __stable = */ false, /* __sentinels = */ true>
1832  (__seqs_begin, __seqs_end, __target,
1833  multiway_merge_exact_splitting</* __stable = */ false,
1834  typename std::iterator_traits<_RAIterPairIterator>
1835  ::value_type*, _Compare, _DifferenceTp>,
1836  static_cast<_DifferenceType>(__length), __comp,
1837  __tag.__get_num_threads());
1838  else
1840  </* __stable = */ false, /* __sentinels = */ true>
1841  (__seqs_begin, __seqs_end, __target,
1842  *(__seqs_begin->second), __length, __comp);
1843  }
#define _GLIBCXX_PARALLEL_CONDITION(__c)
Determine at compile(?)-time if the parallel variant of an algorithm should be called.
Definition: settings.h:95
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define true
Definition: stdbool.h:34
_RAIter3 parallel_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, _Splitter __splitter, _DifferenceTp __length, _Compare __comp, _ThreadIndex __num_threads)
Parallel multi-way merge routine.
Definition: multiway_merge.h:1225
return(unsigned int) __res
void multiway_merge_exact_splitting(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _DifferenceType __length, _DifferenceType __total_length, _Compare __comp, std::vector< std::pair< _DifferenceType, _DifferenceType > > *__pieces)
Exact splitting for parallel multiway-merge routine.
Definition: multiway_merge.h:1120
_RAIter3 __sequential_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
Sequential multi-way merging switch.
Definition: multiway_merge.h:920
uint64_t _SequenceIndex
Unsigned integer to index __elements. The total number of elements for each algorithm must fit into t...
Definition: types.h:117
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::multiway_merge_sentinels ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
sampling_tag  __tag 
)
1856  {
1857  typedef _DifferenceTp _DifferenceType;
1858  _GLIBCXX_CALL(__seqs_end - __seqs_begin)
1859 
1860  // catch special case: no sequences
1861  if (__seqs_begin == __seqs_end)
1862  return __target;
1863 
1864  // Execute merge; maybe parallel, depending on the number of merged
1865  // elements and the number of sequences and global thresholds in
1866  // Settings.
1867  if ((__seqs_end - __seqs_begin > 1)
1869  ((__seqs_end - __seqs_begin) >=
1870  __gnu_parallel::_Settings::get().multiway_merge_minimal_k)
1871  && ((_SequenceIndex)__length >=
1872  __gnu_parallel::_Settings::get().multiway_merge_minimal_n)))
1874  </* __stable = */ false, /* __sentinels = */ true>
1875  (__seqs_begin, __seqs_end, __target,
1876  multiway_merge_sampling_splitting</* __stable = */ false,
1877  typename std::iterator_traits<_RAIterPairIterator>
1878  ::value_type*, _Compare, _DifferenceTp>,
1879  static_cast<_DifferenceType>(__length), __comp,
1880  __tag.__get_num_threads());
1881  else
1883  </* __stable = */false, /* __sentinels = */ true>(
1884  __seqs_begin, __seqs_end, __target,
1885  *(__seqs_begin->second), __length, __comp);
1886  }
#define _GLIBCXX_PARALLEL_CONDITION(__c)
Determine at compile(?)-time if the parallel variant of an algorithm should be called.
Definition: settings.h:95
void multiway_merge_sampling_splitting(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _DifferenceType __length, _DifferenceType __total_length, _Compare __comp, std::vector< std::pair< _DifferenceType, _DifferenceType > > *__pieces)
Sampling based splitting for parallel multiway-merge routine.
Definition: multiway_merge.h:1035
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define true
Definition: stdbool.h:34
_RAIter3 parallel_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, _Splitter __splitter, _DifferenceTp __length, _Compare __comp, _ThreadIndex __num_threads)
Parallel multi-way merge routine.
Definition: multiway_merge.h:1225
return(unsigned int) __res
_RAIter3 __sequential_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
Sequential multi-way merging switch.
Definition: multiway_merge.h:920
uint64_t _SequenceIndex
Unsigned integer to index __elements. The total number of elements for each algorithm must fit into t...
Definition: types.h:117
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::multiway_merge_sentinels ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
parallel_tag  __tag = parallel_tag(0) 
)
1899  {
1901  (__seqs_begin, __seqs_end, __target, __length, __comp,
1902  exact_tag(__tag.__get_num_threads()));
1903  }
_RAIterOut multiway_merge_sentinels(_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, default_parallel_tag __tag)
Definition: multiway_merge.h:1911
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::multiway_merge_sentinels ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
default_parallel_tag  __tag 
)
1916  {
1918  (__seqs_begin, __seqs_end, __target, __length, __comp,
1919  exact_tag(__tag.__get_num_threads()));
1920  }
_RAIterOut multiway_merge_sentinels(_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, default_parallel_tag __tag)
Definition: multiway_merge.h:1911
template<bool __stable, bool __sentinels, typename _RAIterIterator , typename _RAIter3 , typename _DifferenceTp , typename _Splitter , typename _Compare >
_RAIter3 __gnu_parallel::parallel_multiway_merge ( _RAIterIterator  __seqs_begin,
_RAIterIterator  __seqs_end,
_RAIter3  __target,
_Splitter  __splitter,
_DifferenceTp  __length,
_Compare  __comp,
_ThreadIndex  __num_threads 
)

Parallel multi-way merge routine.

The _GLIBCXX_PARALLEL_DECISION is based on the branching factor and runtime settings.

Must not be called if the number of sequences is 1.

Template Parameters
_Splitterfunctor to split input (either __exact or sampling based)
__stableStable merging incurs a performance penalty.
__sentinelIgnored.
Parameters
__seqs_beginBegin iterator of iterator pair input sequence.
__seqs_endEnd iterator of iterator pair input sequence.
__targetBegin iterator of output sequence.
__compComparator.
__lengthMaximum length to merge, possibly larger than the number of elements available.
Returns
End iterator of output sequence.
1232  {
1233 #if _GLIBCXX_ASSERTIONS
1234  _GLIBCXX_PARALLEL_ASSERT(__seqs_end - __seqs_begin > 1);
1235 #endif
1236 
1237  _GLIBCXX_CALL(__length)
1238 
1239  typedef _DifferenceTp _DifferenceType;
1240  typedef typename std::iterator_traits<_RAIterIterator>
1241  ::difference_type _SeqNumber;
1242  typedef typename std::iterator_traits<_RAIterIterator>
1243  ::value_type::first_type
1244  _RAIter1;
1245  typedef typename
1246  std::iterator_traits<_RAIter1>::value_type _ValueType;
1247 
1248  // Leave only non-empty sequences.
1249  typedef std::pair<_RAIter1, _RAIter1> seq_type;
1250  seq_type* __ne_seqs = new seq_type[__seqs_end - __seqs_begin];
1251  _SeqNumber __k = 0;
1252  _DifferenceType __total_length = 0;
1253  for (_RAIterIterator __raii = __seqs_begin;
1254  __raii != __seqs_end; ++__raii)
1255  {
1256  _DifferenceTp __seq_length = _GLIBCXX_PARALLEL_LENGTH(*__raii);
1257  if(__seq_length > 0)
1258  {
1259  __total_length += __seq_length;
1260  __ne_seqs[__k++] = *__raii;
1261  }
1262  }
1263 
1264  _GLIBCXX_CALL(__total_length)
1265 
1266  __length = std::min<_DifferenceTp>(__length, __total_length);
1267 
1268  if (__total_length == 0 || __k == 0)
1269  {
1270  delete[] __ne_seqs;
1271  return __target;
1272  }
1273 
1274  std::vector<std::pair<_DifferenceType, _DifferenceType> >* __pieces;
1275 
1276  __num_threads = static_cast<_ThreadIndex>
1277  (std::min<_DifferenceType>(__num_threads, __total_length));
1278 
1279 # pragma omp parallel num_threads (__num_threads)
1280  {
1281 # pragma omp single
1282  {
1283  __num_threads = omp_get_num_threads();
1284  // Thread __t will have to merge pieces[__iam][0..__k - 1]
1285  __pieces = new std::vector<
1286  std::pair<_DifferenceType, _DifferenceType> >[__num_threads];
1287  for (_ThreadIndex __s = 0; __s < __num_threads; ++__s)
1288  __pieces[__s].resize(__k);
1289 
1290  _DifferenceType __num_samples =
1292  * __num_threads;
1293 
1294  __splitter(__ne_seqs, __ne_seqs + __k, __length, __total_length,
1295  __comp, __pieces);
1296  } //single
1297 
1298  _ThreadIndex __iam = omp_get_thread_num();
1299 
1300  _DifferenceType __target_position = 0;
1301 
1302  for (_SeqNumber __c = 0; __c < __k; ++__c)
1303  __target_position += __pieces[__iam][__c].first;
1304 
1305  seq_type* __chunks = new seq_type[__k];
1306 
1307  for (_SeqNumber __s = 0; __s < __k; ++__s)
1308  __chunks[__s] = std::make_pair(__ne_seqs[__s].first
1309  + __pieces[__iam][__s].first,
1310  __ne_seqs[__s].first
1311  + __pieces[__iam][__s].second);
1312 
1313  if(__length > __target_position)
1314  __sequential_multiway_merge<__stable, __sentinels>
1315  (__chunks, __chunks + __k, __target + __target_position,
1316  *(__seqs_begin->second), __length - __target_position, __comp);
1317 
1318  delete[] __chunks;
1319  } // parallel
1320 
1321 #if _GLIBCXX_ASSERTIONS
1323  __is_sorted(__target, __target + __length, __comp));
1324 #endif
1325 
1326  __k = 0;
1327  // Update ends of sequences.
1328  for (_RAIterIterator __raii = __seqs_begin;
1329  __raii != __seqs_end; ++__raii)
1330  {
1331  _DifferenceTp __length = _GLIBCXX_PARALLEL_LENGTH(*__raii);
1332  if(__length > 0)
1333  (*__raii).first += __pieces[__num_threads - 1][__k++].second;
1334  }
1335 
1336  delete[] __pieces;
1337  delete[] __ne_seqs;
1338 
1339  return __target + __length;
1340  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
int omp_get_num_threads(void) __GOMP_NOTHROW
#define _GLIBCXX_PARALLEL_LENGTH(__s)
Length of a sequence described by a pair of iterators.
Definition: multiway_merge.h:54
const _Tp & min(const _Tp &__a, const _Tp &__b)
Equivalent to std::min.
Definition: base.h:144
bool __is_sorted(_IIter __begin, _IIter __end, _Compare __comp)
Check whether [__begin, __end) is sorted according to __comp.
Definition: checkers.h:51
static _GLIBCXX_CONST const _Settings & get()
Get the global settings.
unsigned int merge_oversampling
Oversampling factor for merge.
Definition: settings.h:175
#define _GLIBCXX_PARALLEL_ASSERT(_Condition)
Definition: base.h:422
int omp_get_thread_num(void) __GOMP_NOTHROW
template<bool __stable, bool __exact, typename _RAIter , typename _Compare >
void __gnu_parallel::parallel_sort_mwms ( _RAIter  __begin,
_RAIter  __end,
_Compare  __comp,
_ThreadIndex  __num_threads 
)

PMWMS main call.

Parameters
__beginBegin iterator of sequence.
__endEnd iterator of sequence.
__compComparator.
__num_threadsNumber of threads to use.
398  {
399  _GLIBCXX_CALL(__end - __begin)
400 
401  typedef std::iterator_traits<_RAIter> _TraitsType;
402  typedef typename _TraitsType::value_type _ValueType;
403  typedef typename _TraitsType::difference_type _DifferenceType;
404 
405  _DifferenceType __n = __end - __begin;
406 
407  if (__n <= 1)
408  return;
409 
410  // at least one element per thread
411  if (__num_threads > __n)
412  __num_threads = static_cast<_ThreadIndex>(__n);
413 
414  // shared variables
415  _PMWMSSortingData<_RAIter> __sd;
416  _DifferenceType* __starts;
417  _DifferenceType __size;
418 
419 # pragma omp parallel num_threads(__num_threads)
420  {
421  __num_threads = omp_get_num_threads(); //no more threads than requested
422 
423 # pragma omp single
424  {
425  __sd._M_num_threads = __num_threads;
426  __sd._M_source = __begin;
427 
428  __sd._M_temporary = new _ValueType*[__num_threads];
429 
430  if (!__exact)
431  {
432  __size =
433  (_Settings::get().sort_mwms_oversampling * __num_threads - 1)
434  * __num_threads;
435  __sd._M_samples = static_cast<_ValueType*>
436  (::operator new(__size * sizeof(_ValueType)));
437  }
438  else
439  __sd._M_samples = 0;
440 
441  __sd._M_offsets = new _DifferenceType[__num_threads - 1];
442  __sd._M_pieces
443  = new std::vector<_Piece<_DifferenceType> >[__num_threads];
444  for (_ThreadIndex __s = 0; __s < __num_threads; ++__s)
445  __sd._M_pieces[__s].resize(__num_threads);
446  __starts = __sd._M_starts = new _DifferenceType[__num_threads + 1];
447 
448  _DifferenceType __chunk_length = __n / __num_threads;
449  _DifferenceType __split = __n % __num_threads;
450  _DifferenceType __pos = 0;
451  for (_ThreadIndex __i = 0; __i < __num_threads; ++__i)
452  {
453  __starts[__i] = __pos;
454  __pos += ((__i < __split)
455  ? (__chunk_length + 1) : __chunk_length);
456  }
457  __starts[__num_threads] = __pos;
458  } //single
459 
460  // Now sort in parallel.
461  parallel_sort_mwms_pu<__stable, __exact>(&__sd, __comp);
462  } //parallel
463 
464  delete[] __starts;
465  delete[] __sd._M_temporary;
466 
467  if (!__exact)
468  {
469  for (_DifferenceType __i = 0; __i < __size; ++__i)
470  __sd._M_samples[__i].~_ValueType();
471  ::operator delete(__sd._M_samples);
472  }
473 
474  delete[] __sd._M_offsets;
475  delete[] __sd._M_pieces;
476  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
std::size_t __size(__stack_t __stack)
Definition: profiler_node.h:68
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
int omp_get_num_threads(void) __GOMP_NOTHROW
return(unsigned int) __res
template<bool __stable, bool __exact, typename _RAIter , typename _Compare >
void __gnu_parallel::parallel_sort_mwms_pu ( _PMWMSSortingData< _RAIter > *  __sd,
_Compare &  __comp 
)

PMWMS code executed by each thread.

Parameters
__sdPointer to algorithm data.
__compComparator.
310  {
311  typedef std::iterator_traits<_RAIter> _TraitsType;
312  typedef typename _TraitsType::value_type _ValueType;
313  typedef typename _TraitsType::difference_type _DifferenceType;
314 
316 
317  // Length of this thread's chunk, before merging.
318  _DifferenceType __length_local =
319  __sd->_M_starts[__iam + 1] - __sd->_M_starts[__iam];
320 
321  // Sort in temporary storage, leave space for sentinel.
322 
323  typedef _ValueType* _SortingPlacesIterator;
324 
325  __sd->_M_temporary[__iam] =
326  static_cast<_ValueType*>(::operator new(sizeof(_ValueType)
327  * (__length_local + 1)));
328 
329  // Copy there.
330  std::uninitialized_copy(__sd->_M_source + __sd->_M_starts[__iam],
331  __sd->_M_source + __sd->_M_starts[__iam]
332  + __length_local,
333  __sd->_M_temporary[__iam]);
334 
335  __possibly_stable_sort<__stable, _SortingPlacesIterator, _Compare>()
336  (__sd->_M_temporary[__iam],
337  __sd->_M_temporary[__iam] + __length_local,
338  __comp);
339 
340  // Invariant: locally sorted subsequence in sd->_M_temporary[__iam],
341  // __sd->_M_temporary[__iam] + __length_local.
342 
343  // No barrier here: Synchronization is done by the splitting routine.
344 
345  _DifferenceType __num_samples =
346  _Settings::get().sort_mwms_oversampling * __sd->_M_num_threads - 1;
347  _SplitConsistently<__exact, _RAIter, _Compare, _SortingPlacesIterator>()
348  (__iam, __sd, __comp, __num_samples);
349 
350  // Offset from __target __begin, __length after merging.
351  _DifferenceType __offset = 0, __length_am = 0;
352  for (_ThreadIndex __s = 0; __s < __sd->_M_num_threads; __s++)
353  {
354  __length_am += (__sd->_M_pieces[__iam][__s]._M_end
355  - __sd->_M_pieces[__iam][__s]._M_begin);
356  __offset += __sd->_M_pieces[__iam][__s]._M_begin;
357  }
358 
359  typedef std::vector<
360  std::pair<_SortingPlacesIterator, _SortingPlacesIterator> >
361  _SeqVector;
362  _SeqVector __seqs(__sd->_M_num_threads);
363 
364  for (_ThreadIndex __s = 0; __s < __sd->_M_num_threads; ++__s)
365  {
366  __seqs[__s] =
367  std::make_pair(__sd->_M_temporary[__s]
368  + __sd->_M_pieces[__iam][__s]._M_begin,
369  __sd->_M_temporary[__s]
370  + __sd->_M_pieces[__iam][__s]._M_end);
371  }
372 
373  __possibly_stable_multiway_merge<
374  __stable, typename _SeqVector::iterator,
375  _RAIter, _Compare, _DifferenceType>()(__seqs.begin(), __seqs.end(),
376  __sd->_M_source + __offset, __comp,
377  __length_am);
378 
379 # pragma omp barrier
380 
381  for (_DifferenceType __i = 0; __i < __length_local; ++__i)
382  __sd->_M_temporary[__iam][__i].~_ValueType();
383  ::operator delete(__sd->_M_temporary[__iam]);
384  }
uint16_t _ThreadIndex
Unsigned integer to index a thread number. The maximum thread number (for each processor) must fit in...
Definition: types.h:123
int omp_get_thread_num(void) __GOMP_NOTHROW
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::stable_multiway_merge ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
__gnu_parallel::sequential_tag   
)
1564  {
1565  typedef _DifferenceTp _DifferenceType;
1566  _GLIBCXX_CALL(__seqs_end - __seqs_begin)
1567 
1568  // catch special case: no sequences
1569  if (__seqs_begin == __seqs_end)
1570  return __target;
1571 
1572  // Execute multiway merge *sequentially*.
1574  </* __stable = */ true, /* __sentinels = */ false>
1575  (__seqs_begin, __seqs_end, __target,
1576  *(__seqs_begin->second), __length, __comp);
1577  }
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define true
Definition: stdbool.h:34
return(unsigned int) __res
_RAIter3 __sequential_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
Sequential multi-way merging switch.
Definition: multiway_merge.h:920
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::stable_multiway_merge ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
__gnu_parallel::exact_tag  __tag 
)
1590  {
1591  typedef _DifferenceTp _DifferenceType;
1592  _GLIBCXX_CALL(__seqs_end - __seqs_begin)
1593 
1594  // catch special case: no sequences
1595  if (__seqs_begin == __seqs_end)
1596  return __target;
1597 
1598  // Execute merge; maybe parallel, depending on the number of merged
1599  // elements and the number of sequences and global thresholds in
1600  // Settings.
1601  if ((__seqs_end - __seqs_begin > 1)
1603  ((__seqs_end - __seqs_begin) >=
1604  __gnu_parallel::_Settings::get().multiway_merge_minimal_k)
1605  && ((_SequenceIndex)__length >=
1606  __gnu_parallel::_Settings::get().multiway_merge_minimal_n)))
1608  </* __stable = */ true, /* __sentinels = */ false>
1609  (__seqs_begin, __seqs_end, __target,
1610  multiway_merge_exact_splitting</* __stable = */ true,
1611  typename std::iterator_traits<_RAIterPairIterator>
1612  ::value_type*, _Compare, _DifferenceTp>,
1613  static_cast<_DifferenceType>(__length), __comp,
1614  __tag.__get_num_threads());
1615  else
1617  </* __stable = */ true, /* __sentinels = */ false>
1618  (__seqs_begin, __seqs_end, __target,
1619  *(__seqs_begin->second), __length, __comp);
1620  }
#define _GLIBCXX_PARALLEL_CONDITION(__c)
Determine at compile(?)-time if the parallel variant of an algorithm should be called.
Definition: settings.h:95
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define true
Definition: stdbool.h:34
_RAIter3 parallel_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, _Splitter __splitter, _DifferenceTp __length, _Compare __comp, _ThreadIndex __num_threads)
Parallel multi-way merge routine.
Definition: multiway_merge.h:1225
return(unsigned int) __res
void multiway_merge_exact_splitting(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _DifferenceType __length, _DifferenceType __total_length, _Compare __comp, std::vector< std::pair< _DifferenceType, _DifferenceType > > *__pieces)
Exact splitting for parallel multiway-merge routine.
Definition: multiway_merge.h:1120
_RAIter3 __sequential_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
Sequential multi-way merging switch.
Definition: multiway_merge.h:920
uint64_t _SequenceIndex
Unsigned integer to index __elements. The total number of elements for each algorithm must fit into t...
Definition: types.h:117
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::stable_multiway_merge ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
sampling_tag  __tag 
)
1633  {
1634  typedef _DifferenceTp _DifferenceType;
1635  _GLIBCXX_CALL(__seqs_end - __seqs_begin)
1636 
1637  // catch special case: no sequences
1638  if (__seqs_begin == __seqs_end)
1639  return __target;
1640 
1641  // Execute merge; maybe parallel, depending on the number of merged
1642  // elements and the number of sequences and global thresholds in
1643  // Settings.
1644  if ((__seqs_end - __seqs_begin > 1)
1646  ((__seqs_end - __seqs_begin) >=
1647  __gnu_parallel::_Settings::get().multiway_merge_minimal_k)
1648  && ((_SequenceIndex)__length >=
1649  __gnu_parallel::_Settings::get().multiway_merge_minimal_n)))
1651  </* __stable = */ true, /* __sentinels = */ false>
1652  (__seqs_begin, __seqs_end, __target,
1653  multiway_merge_sampling_splitting</* __stable = */ true,
1654  typename std::iterator_traits<_RAIterPairIterator>
1655  ::value_type*, _Compare, _DifferenceTp>,
1656  static_cast<_DifferenceType>(__length), __comp,
1657  __tag.__get_num_threads());
1658  else
1660  </* __stable = */ true, /* __sentinels = */ false>
1661  (__seqs_begin, __seqs_end, __target,
1662  *(__seqs_begin->second), __length, __comp);
1663  }
#define _GLIBCXX_PARALLEL_CONDITION(__c)
Determine at compile(?)-time if the parallel variant of an algorithm should be called.
Definition: settings.h:95
void multiway_merge_sampling_splitting(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _DifferenceType __length, _DifferenceType __total_length, _Compare __comp, std::vector< std::pair< _DifferenceType, _DifferenceType > > *__pieces)
Sampling based splitting for parallel multiway-merge routine.
Definition: multiway_merge.h:1035
#define false
Definition: stdbool.h:35
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define true
Definition: stdbool.h:34
_RAIter3 parallel_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, _Splitter __splitter, _DifferenceTp __length, _Compare __comp, _ThreadIndex __num_threads)
Parallel multi-way merge routine.
Definition: multiway_merge.h:1225
return(unsigned int) __res
_RAIter3 __sequential_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
Sequential multi-way merging switch.
Definition: multiway_merge.h:920
uint64_t _SequenceIndex
Unsigned integer to index __elements. The total number of elements for each algorithm must fit into t...
Definition: types.h:117
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::stable_multiway_merge ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
parallel_tag  __tag = parallel_tag(0) 
)
1676  {
1677  return stable_multiway_merge
1678  (__seqs_begin, __seqs_end, __target, __length, __comp,
1679  exact_tag(__tag.__get_num_threads()));
1680  }
_RAIterOut stable_multiway_merge(_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, default_parallel_tag __tag)
Definition: multiway_merge.h:1688
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::stable_multiway_merge ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
default_parallel_tag  __tag 
)
1693  {
1694  return stable_multiway_merge
1695  (__seqs_begin, __seqs_end, __target, __length, __comp,
1696  exact_tag(__tag.__get_num_threads()));
1697  }
_RAIterOut stable_multiway_merge(_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, default_parallel_tag __tag)
Definition: multiway_merge.h:1688
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::stable_multiway_merge_sentinels ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
__gnu_parallel::sequential_tag   
)
1934  {
1935  typedef _DifferenceTp _DifferenceType;
1936  _GLIBCXX_CALL(__seqs_end - __seqs_begin)
1937 
1938  // catch special case: no sequences
1939  if (__seqs_begin == __seqs_end)
1940  return __target;
1941 
1942  // Execute multiway merge *sequentially*.
1944  </* __stable = */ true, /* __sentinels = */ true>
1945  (__seqs_begin, __seqs_end, __target,
1946  *(__seqs_begin->second), __length, __comp);
1947  }
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define true
Definition: stdbool.h:34
return(unsigned int) __res
_RAIter3 __sequential_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
Sequential multi-way merging switch.
Definition: multiway_merge.h:920
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::stable_multiway_merge_sentinels ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
__gnu_parallel::exact_tag  __tag 
)
1960  {
1961  typedef _DifferenceTp _DifferenceType;
1962  _GLIBCXX_CALL(__seqs_end - __seqs_begin)
1963 
1964  // catch special case: no sequences
1965  if (__seqs_begin == __seqs_end)
1966  return __target;
1967 
1968  // Execute merge; maybe parallel, depending on the number of merged
1969  // elements and the number of sequences and global thresholds in
1970  // Settings.
1971  if ((__seqs_end - __seqs_begin > 1)
1973  ((__seqs_end - __seqs_begin) >=
1974  __gnu_parallel::_Settings::get().multiway_merge_minimal_k)
1975  && ((_SequenceIndex)__length >=
1976  __gnu_parallel::_Settings::get().multiway_merge_minimal_n)))
1978  </* __stable = */ true, /* __sentinels = */ true>
1979  (__seqs_begin, __seqs_end, __target,
1980  multiway_merge_exact_splitting</* __stable = */ true,
1981  typename std::iterator_traits<_RAIterPairIterator>
1982  ::value_type*, _Compare, _DifferenceTp>,
1983  static_cast<_DifferenceType>(__length), __comp,
1984  __tag.__get_num_threads());
1985  else
1987  </* __stable = */ true, /* __sentinels = */ true>
1988  (__seqs_begin, __seqs_end, __target,
1989  *(__seqs_begin->second), __length, __comp);
1990  }
#define _GLIBCXX_PARALLEL_CONDITION(__c)
Determine at compile(?)-time if the parallel variant of an algorithm should be called.
Definition: settings.h:95
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define true
Definition: stdbool.h:34
_RAIter3 parallel_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, _Splitter __splitter, _DifferenceTp __length, _Compare __comp, _ThreadIndex __num_threads)
Parallel multi-way merge routine.
Definition: multiway_merge.h:1225
return(unsigned int) __res
void multiway_merge_exact_splitting(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _DifferenceType __length, _DifferenceType __total_length, _Compare __comp, std::vector< std::pair< _DifferenceType, _DifferenceType > > *__pieces)
Exact splitting for parallel multiway-merge routine.
Definition: multiway_merge.h:1120
_RAIter3 __sequential_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
Sequential multi-way merging switch.
Definition: multiway_merge.h:920
uint64_t _SequenceIndex
Unsigned integer to index __elements. The total number of elements for each algorithm must fit into t...
Definition: types.h:117
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::stable_multiway_merge_sentinels ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
sampling_tag  __tag 
)
2004  {
2005  typedef _DifferenceTp _DifferenceType;
2006  _GLIBCXX_CALL(__seqs_end - __seqs_begin)
2007 
2008  // catch special case: no sequences
2009  if (__seqs_begin == __seqs_end)
2010  return __target;
2011 
2012  // Execute merge; maybe parallel, depending on the number of merged
2013  // elements and the number of sequences and global thresholds in
2014  // Settings.
2015  if ((__seqs_end - __seqs_begin > 1)
2017  ((__seqs_end - __seqs_begin) >=
2018  __gnu_parallel::_Settings::get().multiway_merge_minimal_k)
2019  && ((_SequenceIndex)__length >=
2020  __gnu_parallel::_Settings::get().multiway_merge_minimal_n)))
2022  </* __stable = */ true, /* __sentinels = */ true>
2023  (__seqs_begin, __seqs_end, __target,
2024  multiway_merge_sampling_splitting</* __stable = */ true,
2025  typename std::iterator_traits<_RAIterPairIterator>
2026  ::value_type*, _Compare, _DifferenceTp>,
2027  static_cast<_DifferenceType>(__length), __comp,
2028  __tag.__get_num_threads());
2029  else
2031  </* __stable = */ true, /* __sentinels = */ true>
2032  (__seqs_begin, __seqs_end, __target,
2033  *(__seqs_begin->second), __length, __comp);
2034  }
#define _GLIBCXX_PARALLEL_CONDITION(__c)
Determine at compile(?)-time if the parallel variant of an algorithm should be called.
Definition: settings.h:95
void multiway_merge_sampling_splitting(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _DifferenceType __length, _DifferenceType __total_length, _Compare __comp, std::vector< std::pair< _DifferenceType, _DifferenceType > > *__pieces)
Sampling based splitting for parallel multiway-merge routine.
Definition: multiway_merge.h:1035
#define _GLIBCXX_CALL(__n)
Macro to produce log message when entering a function.
Definition: compiletime_settings.h:44
#define true
Definition: stdbool.h:34
_RAIter3 parallel_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, _Splitter __splitter, _DifferenceTp __length, _Compare __comp, _ThreadIndex __num_threads)
Parallel multi-way merge routine.
Definition: multiway_merge.h:1225
return(unsigned int) __res
_RAIter3 __sequential_multiway_merge(_RAIterIterator __seqs_begin, _RAIterIterator __seqs_end, _RAIter3 __target, const typename std::iterator_traits< typename std::iterator_traits< _RAIterIterator >::value_type::first_type >::value_type &__sentinel, _DifferenceTp __length, _Compare __comp)
Sequential multi-way merging switch.
Definition: multiway_merge.h:920
uint64_t _SequenceIndex
Unsigned integer to index __elements. The total number of elements for each algorithm must fit into t...
Definition: types.h:117
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::stable_multiway_merge_sentinels ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
parallel_tag  __tag = parallel_tag(0) 
)
2048  {
2050  (__seqs_begin, __seqs_end, __target, __length, __comp,
2051  exact_tag(__tag.__get_num_threads()));
2052  }
_RAIterOut stable_multiway_merge_sentinels(_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, default_parallel_tag __tag)
Definition: multiway_merge.h:2060
template<typename _RAIterPairIterator , typename _RAIterOut , typename _DifferenceTp , typename _Compare >
_RAIterOut __gnu_parallel::stable_multiway_merge_sentinels ( _RAIterPairIterator  __seqs_begin,
_RAIterPairIterator  __seqs_end,
_RAIterOut  __target,
_DifferenceTp  __length,
_Compare  __comp,
default_parallel_tag  __tag 
)
2065  {
2067  (__seqs_begin, __seqs_end, __target, __length, __comp,
2068  exact_tag(__tag.__get_num_threads()));
2069  }
_RAIterOut stable_multiway_merge_sentinels(_RAIterPairIterator __seqs_begin, _RAIterPairIterator __seqs_end, _RAIterOut __target, _DifferenceTp __length, _Compare __comp, default_parallel_tag __tag)
Definition: multiway_merge.h:2060

Variable Documentation

const int __gnu_parallel::_CASable_bits = std::numeric_limits<_CASable>::digits
static

Number of bits of _CASable.

const _CASable __gnu_parallel::_CASable_mask
static
Initial value:
=
((_CASable(1) << (_CASable_bits / 2)) - 1)
static const int _CASable_bits
Number of bits of _CASable.
Definition: types.h:130
int64_t _CASable
Longest compare-and-swappable integer type on this platform.
Definition: types.h:127

_CASable with the right half of bits set to 1.