SZaru is a library that makes Google's Sawzall aggregators available in pure C++, Ruby, and Python.
Sawzall aggregators use memory-efficient, one-pass algorithms to compute popular statistics approximately. For example, a naive 'top N' computation requires O(K) memory, where K is the number of unique elements. SZaru requires only O(N) memory (in most cases N < K), at the cost of some accuracy.
Therefore, SZaru may be useful for large-scale or streaming data processing.
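
To illustrate the memory/accuracy trade-off described above, here is a minimal Python sketch of a top-N counter that keeps a fixed number of counters no matter how many unique elements it sees. It uses the Space-Saving algorithm, chosen here only for illustration; it is not necessarily what SZaru or szl implements. The class name `SpaceSavingTopN` and its methods are hypothetical and are not part of the SZaru API.

```python
class SpaceSavingTopN:
    """Approximate top-N counter with bounded memory (Space-Saving).

    Hypothetical illustration only: not SZaru's API or its algorithm.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity  # maximum number of counters kept
        self.counts = {}          # element -> estimated count

    def add(self, elem, weight=1):
        """Process one element of the stream in a single pass."""
        if elem in self.counts:
            self.counts[elem] += weight
        elif len(self.counts) < self.capacity:
            self.counts[elem] = weight
        else:
            # Evict the element with the smallest estimate and let the
            # newcomer inherit its count. Estimates can therefore be too
            # high, but never too low, for elements still being tracked.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            self.counts[elem] = floor + weight

    def top(self, n):
        """Return up to n (element, estimated count) pairs, largest first."""
        ranked = sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]


if __name__ == "__main__":
    est = SpaceSavingTopN(capacity=10)
    for word in ["a", "b", "a", "c", "a", "b"]:
        est.add(word)
    print(est.top(2))
```

Memory stays proportional to `capacity` rather than to the number of unique elements. The linear scan for the smallest counter keeps the sketch short; a production implementation would typically use a heap or a similar structure for eviction.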
Currently, I have imported the following 3 aggregators from szl (OSS implementation of Sawzall):
    git clone git://github.com/llamerada/SZaru.git
    cd SZaru
    ./waf configure
    ./waf
    sudo ./waf install

    # After installing the core library
    sudo gem install szaru

    # Change the current directory to the core library directory
    cd SZaru
    cd python
    python setup.py build
    sudo python setup.py install