A good tutorial for understanding Bloom filters: http://billmill.org/bloomfilter-tutorial/
The use case for a Bloom filter is the following:
You have a very large data set that typically doesn't fit in memory, and you want to check whether it contains a given element. A Bloom filter works especially well for detecting that an element is not in the set, because a negative answer is always correct.
If the Bloom filter gives a hit: the item is probably in the set (false positives are possible)
If the Bloom filter gives a miss: the item is certainly not in the set
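To make the hit/miss behaviour above concrete, here is a minimal sketch of a Bloom filter in plain Java. The class name SimpleBloomFilter, the double-hashing scheme built on String.hashCode(), and the sizes used in main are all assumptions chosen for illustration; this is not how any particular library implements it.

```java
import java.util.BitSet;

/** Minimal illustrative Bloom filter for strings; not a production implementation. */
class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;   // m: number of bits in the filter
    private final int numHashes; // k: number of hash functions

    SimpleBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Derives the i-th bit index from two base hashes (double hashing).
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = (h1 >>> 16) | 1; // hypothetical second hash, forced odd so it is never zero
        return Math.floorMod(h1 + i * h2, numBits);
    }

    // Sets the k bits that correspond to the item.
    void put(String item) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(item, i));
        }
    }

    // false means "certainly not inside"; true means "probably inside".
    boolean mightContain(String item) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1024, 3);
        filter.put("apple");
        filter.put("banana");
        System.out.println(filter.mightContain("apple"));  // always true for an inserted item
        System.out.println(filter.mightContain("cherry")); // usually false, but could be a false positive
    }
}
```

Because put only ever sets bits and never clears them, an inserted item can never produce a miss. A hit for an item that was never inserted happens when other items have already set all of its bits.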
How can I use it in Java? Guava provides a Bloom filter implementation in its com.google.common.hash package.
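A small usage sketch with Guava's BloomFilter might look like the following. It assumes Guava is on the classpath. The class name GuavaBloomFilterExample, the helper method build, the sample strings, and the sizing values (one million expected insertions, 1% false positive rate) are made up for illustration.

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

class GuavaBloomFilterExample {
    // Builds a filter sized for the expected number of insertions (n)
    // and the desired false positive rate (e).
    static BloomFilter<String> build(int expectedInsertions, double falsePositiveRate) {
        return BloomFilter.create(
                Funnels.stringFunnel(StandardCharsets.UTF_8),
                expectedInsertions,
                falsePositiveRate);
    }

    public static void main(String[] args) {
        BloomFilter<String> filter = build(1_000_000, 0.01);
        filter.put("alice@example.com");

        // Always true for an element that was put into the filter.
        System.out.println(filter.mightContain("alice@example.com"));
        // Usually false for an element that was never inserted; true would be a false positive.
        System.out.println(filter.mightContain("bob@example.com"));
    }
}
```

With this API you pass only the expected number of insertions and the target false positive rate. Guava chooses the bit count and number of hash functions internally.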
The sizing of a Bloom filter is described by four parameters:
m denotes the number of bits in the Bloom filter (bitSize)
n denotes the number of elements inserted into the Bloom filter (maxKeys)
k represents the number of hash functions used (nbHash)
e represents the desired false positive rate for the Bloom filter (err)
If we fix the error rate (e) and know the number of entries (n), then the optimal Bloom filter size is:
m = -n * ln(e) / (ln(2))^2 ≈ n * ln(e) / ln(0.6185)
The probability of false positives is minimized when k = (m / n) * ln(2).
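As a worked example of the two formulas above, the following sketch computes m and k for a given n and e in plain Java. The class name BloomSizing, the method names, and the rounding choices (round m up, round k to the nearest integer with a minimum of 1) are assumptions for illustration.

```java
class BloomSizing {
    // m = -n * ln(e) / (ln 2)^2, rounded up to a whole number of bits.
    static long optimalNumOfBits(long n, double e) {
        return (long) Math.ceil(-n * Math.log(e) / (Math.log(2) * Math.log(2)));
    }

    // k = (m / n) * ln 2, rounded to the nearest integer and never below 1.
    static int optimalNumOfHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 1_000_000; // expected number of elements
        double e = 0.01;    // desired false positive rate
        long m = optimalNumOfBits(n, e);
        int k = optimalNumOfHashes(n, m);
        System.out.printf("n=%d, e=%.2f: m=%d bits (%.1f bits per element), k=%d%n",
                n, e, m, (double) m / n, k);
    }
}
```

For n = 1,000,000 and e = 0.01 this gives roughly 9.6 bits per element and k = 7, so the whole filter needs a little over 1 MB regardless of how large the elements themselves are.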
Talat Uyarer, 6 August 2016