sampling Package
This module implements an incremental sampler, used to approximate the task by randomly selecting a portion of the triplets.
sampler Module
The sampler class implements incremental sampling without replacement. Incremental means that you do not have to draw the whole sample at once; at any time you can request the next piece of the sample, of a size you specify. This is useful for very large sample sizes.
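To make the idea of incremental sampling concrete, here is a minimal sketch, not the ABXpy implementation. The function name `incremental_sample_sketch` and its `chunk` and `rng` parameters are hypothetical, and the approach shown (a hypergeometric draw for the count in each chunk, followed by a uniform choice inside the chunk) is one standard way to obtain a uniform sample piece by piece.

```python
import numpy as np


def incremental_sample_sketch(N, K, chunk, rng=None):
    """Yield absolute indices of a uniform K-subset of range(N), chunk by chunk."""
    rng = np.random.default_rng() if rng is None else rng
    position = 0   # number of items already visited
    remaining = K  # number of items still to be sampled
    while position < N:
        left = N - position      # items not yet visited
        n = min(chunk, left)     # size of the next chunk
        if remaining == 0:
            k = 0
        elif remaining == left:
            k = n
        else:
            # How many of the remaining picks fall among the next n items.
            k = int(rng.hypergeometric(remaining, left - remaining, n))
        # Which of the n items are kept, as sorted offsets within the chunk.
        kept = np.sort(rng.choice(n, size=k, replace=False))
        yield position + kept
        position += n
        remaining -= k
```

Each yielded array can be processed and discarded before the next one is requested, so the full sample never has to be held in memory.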
- class ABXpy.sampling.sampler.IncrementalSampler(N, K, step=None, relative_indexing=True, dtype=<Mock id='140696957940240'>)
Bases: object
- sample(n, dtype=<Mock id='140696957942224'>)
Fast implementation of the sampling function.
Get all samples from the next n items in a way that avoids running rejection sampling on samples that are too large, that is, samples whose expected number of sampled items exceeds 10**5.
Parameters: n : int
the size of the chunk
Returns: sample : numpy.array
the indices to keep, given either relative to the current position in the sample or absolutely, depending on the value of relative_indexing specified when initialising the sampler (default value is True)
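The two indexing conventions differ only by an offset. The sketch below assumes that "current position" means the number of items already consumed before the chunk being sampled; the helper names `to_absolute` and `to_relative` are hypothetical and not part of ABXpy.

```python
import numpy as np


def to_absolute(relative_indices, position):
    """Convert indices counted from the start of the chunk into absolute ones."""
    return np.asarray(relative_indices) + position


def to_relative(absolute_indices, position):
    """Convert absolute indices into indices counted from the start of the chunk."""
    return np.asarray(absolute_indices) - position


# Assumed scenario: 200 items were already consumed, and the sampler kept
# the items at offsets 3, 17 and 42 of the next chunk.
position = 200
relative = np.array([3, 17, 42])
absolute = to_absolute(relative, position)
```

Relative indices are convenient when each chunk is loaded into its own array; absolute indices are convenient when indexing the full dataset.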
- ABXpy.sampling.sampler.Knuth_sampling(n, N, dtype=<Mock id='140696957940496'>)
This is the usual sampling function when n is comparable to N.
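As an illustration, the sketch below implements selection sampling (Algorithm S from Knuth's The Art of Computer Programming), on the assumption that this is the technique the function name refers to. The name `selection_sampling_sketch` and the `rng` parameter are hypothetical. The method visits items in order, so its cost grows with N rather than n, which is acceptable when n is comparable to N.

```python
import numpy as np


def selection_sampling_sketch(n, N, rng=None):
    """Pick n items of range(N) uniformly, in increasing order, in a single pass."""
    if not 0 <= n <= N:
        raise ValueError("need 0 <= n <= N")
    rng = np.random.default_rng() if rng is None else rng
    sample = np.empty(n, dtype=np.int64)
    seen = 0      # items examined so far
    selected = 0  # items kept so far
    while selected < n:
        # Keep item `seen` with probability (n - selected) / (N - seen).
        if (N - seen) * rng.random() < n - selected:
            sample[selected] = seen
            selected += 1
        seen += 1
    return sample
```

No duplicates can occur because each item is considered exactly once, and the output is already sorted.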
- ABXpy.sampling.sampler.hypergeometric_sample(N, K, n)
This function returns the number of elements to sample from the next n items.
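The count described here follows a hypergeometric distribution. The sketch below assumes that N is the number of items left, K the number of items still to be sampled among them, and n the size of the next chunk; this reading of the parameters is an assumption, and `count_in_next_items` is a hypothetical name. It relies on NumPy's `Generator.hypergeometric(ngood, nbad, nsample)`.

```python
import numpy as np


def count_in_next_items(N, K, n, rng=None):
    """Draw how many of the K sampled items fall among the next n of N items."""
    rng = np.random.default_rng() if rng is None else rng
    # K "marked" items, N - K unmarked ones, n items inspected.
    return int(rng.hypergeometric(K, N - K, n))
```

On average the draw equals n * K / N, so with N = 1000, K = 100 and n = 50 the counts cluster around 5.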
- ABXpy.sampling.sampler.rejection_sampling(n, N, dtype=<Mock id='140696957940688'>)
Uses rejection sampling to keep good performance when n << N.
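A minimal sketch of the idea, not the ABXpy code: draw candidates with replacement, discard duplicates, and redraw only the missing ones. When n << N, collisions are rare, so the loop usually finishes in one or two rounds. The name `rejection_sample_sketch` and the `rng` parameter are hypothetical, and returning the result sorted is a choice made here for simplicity.

```python
import numpy as np


def rejection_sample_sketch(n, N, rng=None):
    """Return n distinct integers from range(N), sorted, via draw-and-reject."""
    if not 0 <= n <= N:
        raise ValueError("need 0 <= n <= N")
    rng = np.random.default_rng() if rng is None else rng
    chosen = np.empty(0, dtype=np.int64)
    while chosen.size < n:
        # Draw only as many candidates as are still missing.
        draws = rng.integers(0, N, size=n - chosen.size)
        # union1d drops duplicates (within the draw and against `chosen`).
        chosen = np.union1d(chosen, draws)
    return chosen
```

The cost grows with n rather than N, which is why this style of method pays off for small sampling fractions and degrades as n approaches N.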
- ABXpy.sampling.sampler.sample_without_replacement(n, N, dtype=<Mock id='140696957940368'>)
Returns uniform samples in [0, N-1] without replacement. It uses Knuth sampling or rejection sampling depending on the parameters n and N.
Note
The values 0.6 and 100 are based on empirical tests of the functions and would need to be changed if the functions are changed.
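The documentation does not say how the two values are used. The sketch below shows one plausible dispatch rule, purely as an assumption: 0.6 is treated as a cutoff on the sampling fraction n / N and 100 as a minimum population size, below which the single-pass method is used regardless. The function `choose_strategy` and its parameter names are hypothetical.

```python
def choose_strategy(n, N, ratio_threshold=0.6, size_threshold=100):
    """Name the method a dispatcher might pick for drawing n of N items."""
    # ASSUMPTION: the roles of both thresholds are guessed, not taken from ABXpy.
    # Rejection sampling is picked only for large populations and
    # small enough sampling fractions.
    if N > size_threshold and n < ratio_threshold * N:
        return "rejection"
    return "knuth"
```

Under this assumed rule, `choose_strategy(10, 10**6)` selects rejection sampling, while `choose_strategy(900, 1000)` and `choose_strategy(5, 50)` fall back to the single-pass method.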