Class AVLTreeDigest
java.lang.Object
com.tdunning.math.stats.TDigest
com.tdunning.math.stats.AbstractTDigest
com.tdunning.math.stats.AVLTreeDigest
- All Implemented Interfaces:
Serializable
Field Summary
Fields
Modifier and Type, Field, Description
private final double compression
private long count
private static final int SMALL_ENCODING
private AVLGroupTree summary
private static final int VERBOSE_ENCODING
Fields inherited from class AbstractTDigest
gen, recordAllData
Constructor Summary
Constructors
Constructor, Description
AVLTreeDigest(double compression) A histogram structure that will record a sketch of a distribution.
Method Summary
Methods
Modifier and Type, Method, Description
void add(double x, int w) Adds a sample to a histogram.
(package private) void add
void add
void add
void asBytes(ByteBuffer buf) Outputs a histogram as bytes using a particularly cheesy encoding.
void asSmallBytes(ByteBuffer buf) Serialize this TDigest into a byte buffer.
int byteSize() Returns an upper bound on the number of bytes that will be required to represent this histogram.
double cdf(double x) Returns the fraction of all points added which are <= x.
int centroidCount()
centroids A Collection that lets you go through the centroids in ascending order by mean.
void compress() Re-examines a t-digest to determine whether some centroids are redundant.
double compression() Returns the current compression factor.
static AVLTreeDigest fromBytes(ByteBuffer buf) Reads a histogram from a byte buffer.
double quantile(double q) Returns an estimate of the cutoff such that a specified fraction of the data added to this TDigest would be less than or equal to the cutoff.
recordAllData Sets up so that all centroids will record all data assigned to them.
long size() Returns the number of samples represented in this histogram.
int smallByteSize() Returns an upper bound on the number of bytes that will be required to represent this histogram in the tighter representation.
Methods inherited from class AbstractTDigest
add, add, createCentroid, decode, encode, interpolate, isRecording, quantile, weightedAverage
Methods inherited from class TDigest
checkValue, createAvlTreeDigest, createDigest, createMergingDigest, getMax, getMin, setMinMax
Field Details
-
compression
private final double compression
summary
-
count
private long count
VERBOSE_ENCODING
private static final int VERBOSE_ENCODING
SMALL_ENCODING
private static final int SMALL_ENCODING
-
Constructor Details
-
AVLTreeDigest
public AVLTreeDigest(double compression)
A histogram structure that will record a sketch of a distribution.
Parameters:
compression - How should accuracy be traded for size? A value of N here will give quantile errors almost always less than 3/N, with considerably smaller errors expected for extreme quantiles. Conversely, you should expect to track about 5 N centroids for this accuracy.
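The rule of thumb for the compression parameter can be turned into quick arithmetic when choosing a value. The sketch below only restates the 3/N and 5 N figures quoted above; the class and method names are hypothetical and are not part of the t-digest library.

```java
// Hypothetical helper: restates the documented rules of thumb, nothing more.
final class CompressionRuleOfThumb {
    /** Approximate bound on quantile error for compression N (the 3/N figure). */
    static double approxMaxQuantileError(double compression) {
        return 3.0 / compression;
    }

    /** Approximate number of centroids to expect for compression N (the 5 N figure). */
    static long approxCentroidCount(double compression) {
        return Math.round(5.0 * compression);
    }

    public static void main(String[] args) {
        for (double n : new double[] {50, 100, 200}) {
            System.out.printf("compression=%.0f  error bound ~ %.4f  centroids ~ %d%n",
                    n, approxMaxQuantileError(n), approxCentroidCount(n));
        }
    }
}
```

Doubling the compression roughly halves the error bound and doubles the expected number of centroids, which is the size-versus-accuracy trade the parameter description refers to.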
-
-
Method Details
-
recordAllData
Description copied from class: AbstractTDigest
Sets up so that all centroids will record all data assigned to them. For testing only, really.
Overrides:
recordAllData in class AbstractTDigest
Returns:
This TDigest so that configurations can be done in fluent style.
-
centroidCount
public int centroidCount()
Specified by:
centroidCount in class TDigest
-
add
Specified by:
add in class AbstractTDigest
-
add
-
add
-
add
-
compress
public void compress()
Description copied from class: TDigest
Re-examines a t-digest to determine whether some centroids are redundant. If your data are perversely ordered, this may be a good idea. Even if not, this may save 20% or so in space. The cost is roughly the same as adding as many data points as there are centroids. The number of centroids is typically < 10 * compression, but could be as high as 100 * compression. This is a destructive operation that is not thread-safe.
size
-
cdf
public double cdf(double x)
Description copied from class: TDigest
Returns the fraction of all points added which are <= x.
quantile
public double quantile(double q)
Description copied from class: TDigest
Returns an estimate of the cutoff such that a specified fraction of the data added to this TDigest would be less than or equal to the cutoff.
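To make the meaning of cdf and quantile concrete, here is a minimal exact reference computed from fully sorted data. It is not how the digest works internally (the digest estimates these values from centroids); the class name ExactReference and its methods are hypothetical and only illustrate the quantities being estimated.

```java
import java.util.Arrays;

// Hypothetical exact baseline; a t-digest approximates these answers.
final class ExactReference {
    /** Exact fraction of samples that are <= x; the array must be sorted ascending. */
    static double cdf(double[] sorted, double x) {
        if (sorted.length == 0) return Double.NaN;
        int count = 0;
        while (count < sorted.length && sorted[count] <= x) count++;
        return (double) count / sorted.length;
    }

    /** Smallest sample v such that at least a fraction q of the samples are <= v. */
    static double quantile(double[] sorted, double q) {
        if (sorted.length == 0) return Double.NaN;
        if (q < 0 || q > 1) throw new IllegalArgumentException("q must be in [0, 1]");
        int index = (int) Math.ceil(q * sorted.length) - 1;
        return sorted[Math.max(0, Math.min(sorted.length - 1, index))];
    }

    public static void main(String[] args) {
        double[] data = {4, 1, 3, 2};
        Arrays.sort(data);
        System.out.println("cdf(2.0) = " + cdf(data, 2.0));
        System.out.println("quantile(0.5) = " + quantile(data, 0.5));
    }
}
```

The two functions are approximate inverses of each other: quantile(q) returns a cutoff whose cdf is at least q. A digest trades this exactness for a bounded memory footprint.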
centroids
Description copied from class: TDigest
A Collection that lets you go through the centroids in ascending order by mean. Centroids returned will not be re-used, but may or may not share storage with this TDigest.
compression
public double compression()
Description copied from class: TDigest
Returns the current compression factor.
Specified by:
compression in class TDigest
Returns:
The compression factor originally used to set up the TDigest.
-
byteSize
-
smallByteSize
public int smallByteSize()
Returns an upper bound on the number of bytes that will be required to represent this histogram in the tighter representation.
Specified by:
smallByteSize in class TDigest
Returns:
The number of bytes required.
-
asBytes
Outputs a histogram as bytes using a particularly cheesy encoding.
asSmallBytes
Description copied from class: TDigest
Serialize this TDigest into a byte buffer. Some simple compression is used, such as using variable byte representation to store the centroid weights and using delta-encoding on the centroid means so that floats can be reasonably used to store the centroid means.
Specified by:
asSmallBytes in class TDigest
Parameters:
buf - The byte buffer into which the TDigest should be serialized.
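The two techniques named in the asSmallBytes description, variable byte weights and delta-encoded float means, can be sketched with java.nio.ByteBuffer alone. This is an illustration under assumed conventions, not the library's actual wire format: the layout (a count, then deltas, then weights), the class SmallEncodingSketch, and all of its members are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of variable-byte and delta encoding; NOT the t-digest wire format.
final class SmallEncodingSketch {
    /** Holder for decoded centroid data. */
    static final class Decoded {
        final double[] means;
        final int[] weights;
        Decoded(double[] means, int[] weights) { this.means = means; this.weights = weights; }
    }

    /** Writes n using 7 bits per byte; the high bit marks "more bytes follow". */
    static void putVarInt(ByteBuffer buf, int n) {
        while ((n & ~0x7F) != 0) {
            buf.put((byte) ((n & 0x7F) | 0x80));
            n >>>= 7;
        }
        buf.put((byte) n);
    }

    static int getVarInt(ByteBuffer buf) {
        int result = 0;
        int shift = 0;
        byte b;
        do {
            b = buf.get();
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    /** Means must be ascending, so each delta is small enough to store as a float. */
    static void encode(ByteBuffer buf, double[] means, int[] weights) {
        if (means.length != weights.length) throw new IllegalArgumentException("length mismatch");
        buf.putInt(means.length);
        double prev = 0;
        for (double mean : means) {
            float delta = (float) (mean - prev);
            buf.putFloat(delta);
            prev += delta; // track the value the decoder will see, so rounding does not accumulate
        }
        for (int w : weights) putVarInt(buf, w);
    }

    static Decoded decode(ByteBuffer buf) {
        int n = buf.getInt();
        double[] means = new double[n];
        int[] weights = new int[n];
        double prev = 0;
        for (int i = 0; i < n; i++) {
            prev += buf.getFloat();
            means[i] = prev;
        }
        for (int i = 0; i < n; i++) weights[i] = getVarInt(buf);
        return new Decoded(means, weights);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        encode(buf, new double[] {1.0, 2.5, 1000.25}, new int[] {1, 300, 70000});
        System.out.println("bytes used: " + buf.position());
        buf.flip();
        Decoded d = decode(buf);
        System.out.println(java.util.Arrays.toString(d.means) + " " + java.util.Arrays.toString(d.weights));
    }
}
```

Small weights take a single byte instead of four, and storing differences between neighboring means keeps the stored values small, which is why 4-byte floats lose less precision than they would on the raw means.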
-
fromBytes
Reads a histogram from a byte buffer.
Returns:
The new histogram structure
-