Too Many NumLumps
Adam Ashenfelter's Project Blog
Wednesday, August 16, 2017
A Simple Traffic/Speed Detector
A bit about a simple traffic detector I built in Clojure combining a webcam and anomaly detection:
https://blog.bigml.com/2017/08/16/a-stupidly-easy-speed-detector/
Tuesday, September 6, 2016
Friday, January 16, 2015
cljx-sampling: A Clojure(script) library for sampling and random numbers
For a current hobby project I need the ability to generate seeded random numbers and/or sample items from collections in either the JVM or the browser. The PPRNG lib offers seedable random numbers, but it uses different generators depending on the environment. Given a seed, I want to generate the same sequence of numbers regardless of where the code is running.
So I've open-sourced a little library, cljx-sampling, that uses a seedable 32-bit Xorshift random number generator for consistent results in both Clojure and ClojureScript. I also reused some of my code from bigml/sampling and combined it with the Xorshift RNG to allow for convenient (and still consistent) in-memory samples over collections. Maybe you'll find it useful?
https://github.com/ashenfad/cljx-sampling
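To illustrate why a 32-bit Xorshift generator gives the same sequence everywhere, here is a minimal sketch in Python. It is not the cljx-sampling code or its API. The function names `xorshift32` and `sample` are hypothetical, the shift constants (13, 17, 5) are Marsaglia's classic triple and are an assumption about what such a generator might use, and the sampler is a plain partial Fisher-Yates shuffle chosen for brevity.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def xorshift32(seed):
    """Yield an endless stream of unsigned 32-bit ints from a nonzero seed."""
    state = seed & MASK32
    if state == 0:
        # An all-zero state would stay zero forever.
        raise ValueError("seed must be nonzero in its low 32 bits")
    while True:
        state ^= (state << 13) & MASK32
        state ^= state >> 17
        state ^= (state << 5) & MASK32
        yield state


def sample(items, n, seed):
    """Seeded sample without replacement (partial Fisher-Yates shuffle).

    Uses a simple modulo to pick indices, which carries a slight bias;
    acceptable for a sketch, not for statistical work.
    """
    pool = list(items)
    rng = xorshift32(seed)
    n = min(n, len(pool))
    for i in range(n):
        j = i + next(rng) % (len(pool) - i)
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:n]


# The same seed always produces the same sample.
first = sample(range(100), 5, seed=42)
second = sample(range(100), 5, seed=42)
```

Because the generator uses only shifts, XORs, and a 32-bit mask, any runtime that can do 32-bit integer arithmetic reproduces the sequence exactly, which is the property that makes results line up across the JVM and the browser.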
Wednesday, September 3, 2014
Sketching/hashing Algorithms in Clojure
Just a short note that I (and BigML) have open-sourced a library of hashing- and sketching-based stream summarizers for Clojure.
Specifically, the library includes techniques that take streams of items and return summaries that can be queried for set membership (bloom filters), set similarity (min-hashes), item occurrence counts (count-min sketches), and the number of distinct items (with my favorite, the magical HyperLogLog).
This library was largely an educational exercise for me, as I wanted to better understand the world of streaming summaries for categorical data. It's written in almost pure Clojure and backed by plain Clojure data structures. So it's (hopefully!) easy to use and easy to serialize. All the summaries are merge-friendly, making them a nice fit for distributed settings. The big caveat is that I didn't spend much effort optimizing for speed. Those who need to squeeze out every CPU cycle may need to look elsewhere.
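As a rough picture of what "merge-friendly" means for one of these summaries, here is a minimal Bloom filter sketch in Python. It is not the library's implementation or API. The class name, the default sizes, and the use of SHA-256 to derive bit positions are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter backed by a Python int used as a bitset."""

    def __init__(self, bits=1024, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.field = 0

    def _positions(self, item):
        # Derive one bit position per hash by salting the item with an index.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, item):
        for pos in self._positions(item):
            self.field |= 1 << pos

    def __contains__(self, item):
        # May report false positives, never false negatives.
        return all((self.field >> pos) & 1 for pos in self._positions(item))

    def merge(self, other):
        """Union of two filters built with the same parameters."""
        if (self.bits, self.hashes) != (other.bits, other.hashes):
            raise ValueError("filters must share bits and hashes to merge")
        merged = BloomFilter(self.bits, self.hashes)
        merged.field = self.field | other.field
        return merged


# Two workers summarize different parts of a stream, then combine results.
left, right = BloomFilter(), BloomFilter()
left.add("cat")
right.add("dog")
combined = left.merge(right)
```

Merging is a single bitwise OR, so partial summaries built on separate machines can be combined in any order and give the same result as one filter that saw the whole stream.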