Python date parsing from unstructured text

posted Feb 15, 2019, 7:58 AM by Chris G   [ updated Feb 15, 2019, 7:59 AM ]

Duckling is a Clojure library that parses text into structured data:

“the first Tuesday of October” => {:value "2014-10-07T00:00:00.000-07:00"
                                   :grain :day}

See our blog post announcement for more context.

Duckling is shipped with modules that parse temporal expressions in English, Spanish, French, Italian and Chinese (experimental, thanks to Zhe Wang). It recognizes dates and times described in many ways:

  • today at 5pm
  • 2014-10-01
  • the last Tuesday of October 2012
  • twenty five minutes ago
  • the day before labor day 2020
  • June 10-11 (interval)
  • third monday after christmas 1980
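To make the results above concrete, here is a minimal standard-library sketch of the calendar arithmetic behind phrases such as "the first Tuesday of October" and "the last Tuesday of October 2012". This is not Duckling or its Python wrapper, and it does no text parsing; the helper name `nth_weekday` is made up for this illustration.

```python
import calendar
from datetime import date


def nth_weekday(year, month, weekday, n):
    """Return the n-th given weekday of a month.

    n=1 is the first occurrence, n=-1 is the last one.
    `weekday` uses the calendar module constants (calendar.MONDAY == 0).
    """
    # itermonthdates() also yields padding days from adjacent months,
    # so keep only the dates that belong to the requested month.
    matches = [
        d
        for d in calendar.Calendar().itermonthdates(year, month)
        if d.month == month and d.weekday() == weekday
    ]
    return matches[n - 1] if n > 0 else matches[n]


first_tuesday = nth_weekday(2014, 10, calendar.TUESDAY, 1)
last_tuesday = nth_weekday(2012, 10, calendar.TUESDAY, -1)
print(first_tuesday.isoformat())  # 2014-10-07
print(last_tuesday.isoformat())   # 2012-10-30
```

A parser like Duckling additionally has to recognize the phrase in free text and pick the reference year, time zone and grain, which this sketch takes as given.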

Python wrapper for the Duckling Clojure library

GitHub Gist

posted Feb 4, 2019, 8:46 AM by Chris G   [ updated Feb 4, 2019, 8:46 AM ]

Instantly share code, notes, and snippets.

Binder - launch Jupyter notebooks from GitHub

posted Feb 4, 2019, 8:41 AM by Chris G   [ updated Feb 4, 2019, 8:42 AM ]

Turn a Git repo into a collection of interactive notebooks

Have a repository full of Jupyter notebooks? With Binder, open those notebooks in an executable environment, making your code immediately reproducible by anyone, anywhere.

Zappa - Fast and easy serverless deployment of Python code

posted Aug 30, 2018, 6:28 PM by Chris G   [ updated Aug 30, 2018, 6:28 PM ]

Zappa makes it super easy to build and deploy server-less, event-driven Python applications (including, but not limited to, WSGI web apps) on AWS Lambda + API Gateway. Think of it as "serverless" web hosting for your Python apps. That means infinite scaling, zero downtime, zero maintenance - and at a fraction of the cost of your current deployments!

If you've got a Python web app (including Django and Flask apps), it's as easy as:

$ pip install zappa
$ zappa init
$ zappa deploy

and now you're server-less! Wow!

Python Click

posted Jul 11, 2018, 6:52 AM by Chris G   [ updated Jul 11, 2018, 6:52 AM ]

Click is a Python package for creating beautiful command line interfaces in a composable way with as little code as necessary. It’s the “Command Line Interface Creation Kit”. It’s highly configurable but comes with sensible defaults out of the box.

It aims to make the process of writing command line tools quick and fun while also preventing any frustration caused by the inability to implement an intended CLI API.

Click in three points:

  • arbitrary nesting of commands
  • automatic help page generation
  • supports lazy loading of subcommands at runtime
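A small sketch of the first two points, assuming the `click` package is installed. The command names `cli` and `hello` and the `--name` option are made up for this example; `CliRunner` is Click's own test helper and is used here so the snippet can run without a real terminal.

```python
import click
from click.testing import CliRunner


@click.group()
def cli():
    """Example tool with nested subcommands."""


@cli.command()
@click.option("--name", default="world", help="Who to greet.")
def hello(name):
    """Print a greeting."""
    click.echo(f"Hello, {name}!")


# Invoke the group in-process instead of from a shell.
runner = CliRunner()
result = runner.invoke(cli, ["hello", "--name", "Click"])
help_result = runner.invoke(cli, ["--help"])

print(result.output)       # the greeting printed by the subcommand
print(help_result.output)  # the automatically generated help page
```

The docstrings double as help text, which is where the automatic help page comes from. More subcommands or nested groups can be attached to `cli` in the same way.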

Python visualization library - bokeh

posted Jul 11, 2018, 6:50 AM by Chris G   [ updated Jul 11, 2018, 6:51 AM ]

Bokeh is an interactive visualization library that targets modern web browsers for presentation. Its goal is to provide elegant, concise construction of versatile graphics, and to extend this capability with high-performance interactivity over very large or streaming datasets. Bokeh can help anyone who would like to quickly and easily create interactive plots, dashboards, and data applications.

Python web applications with gunicorn

posted May 12, 2018, 8:46 AM by Chris G   [ updated May 12, 2018, 8:46 AM ]

The server built into Python libraries such as Flask is meant for development, not production. It works fine for local testing but will not allow your web app to scale. It is a good idea to serve the Flask app with gunicorn:

Gunicorn 'Green Unicorn' is a Python WSGI HTTP Server for UNIX. It uses a pre-fork worker model. The Gunicorn server is broadly compatible with various web frameworks, simply implemented, light on server resources, and fairly speedy.
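Gunicorn serves any WSGI callable, and a Flask application object is one such callable. The sketch below uses only the standard library to show what that interface looks like; the names `app` and `call_app` and the file name `hello.py` mentioned afterwards are assumptions for this example.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A minimal WSGI application: one callable taking environ and start_response."""
    body = b"Hello from a WSGI app\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(path="/"):
    """Call the app in-process the way a WSGI server would, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app()
print(status, body)
```

If this were saved as `hello.py`, something like `gunicorn --workers 4 hello:app` would start four pre-forked worker processes that each serve `app`.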

A Gentle Introduction to Bloom Filter

posted May 12, 2018, 8:40 AM by Chris G   [ updated May 12, 2018, 8:41 AM ]

Bloom Filter

Bloom filters are space-efficient probabilistic data structures. They are similar to hash tables, but they are used exclusively to test whether an element is a member of a set. Their most powerful property is that they let you trade space against the false-positive rate of that membership test. Because of this trade-off, a Bloom filter is called a probabilistic data structure: a lookup can report a false positive, but never a false negative.
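A minimal sketch of the idea in plain Python, not taken from the linked article. The class name and parameters are made up for illustration. Sizing uses the standard formulas m = -n ln p / (ln 2)^2 bits and k = (m / n) ln 2 hash functions, and the k bit positions are derived from one SHA-256 digest by double hashing.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, capacity, error_rate=0.01):
        # Number of bits (m) and hash functions (k) for the target false-positive rate.
        self.size = max(1, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.size / capacity * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item):
        # Double hashing: combine two 64-bit values from one digest into k positions.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly in the set"; False means "definitely not in the set".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


bf = BloomFilter(capacity=1000, error_rate=0.01)
for word in ("flask", "celery", "bokeh"):
    bf.add(word)

print("celery" in bf)   # True: added items are always found
print(bf.size, bf.num_hashes)
```

Lowering `error_rate` makes the bit array larger and adds hash functions, which is the space versus false-positive trade-off described above.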

Celery - Distributed Task Queue

posted May 12, 2018, 8:39 AM by Chris G   [ updated May 12, 2018, 8:39 AM ]

Celery: Distributed Task Queue

Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.

The execution units, called tasks, are executed concurrently on one or more worker servers using multiprocessing, Eventlet, or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).
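The difference between running a task in the background and waiting until it is ready can be sketched without a message broker. The example below is a stand-in that uses the standard library's `concurrent.futures`, not Celery itself; the task function `add` is made up for illustration. In Celery the analogous calls are `delay()` on a task and `get()` on the returned `AsyncResult`.

```python
from concurrent.futures import ThreadPoolExecutor


def add(x, y):
    """The unit of work; in Celery this would be a function registered as a task."""
    return x + y


with ThreadPoolExecutor(max_workers=2) as pool:
    # Asynchronous: submit() returns immediately with a handle to the pending result.
    future = pool.submit(add, 2, 3)

    # The caller is free to do other work here while the task runs.

    # Synchronous: result() blocks until the task has finished (or the timeout expires).
    background_result = future.result(timeout=5)

print(background_result)
```

Unlike this in-process pool, Celery sends the task through a message broker, so the worker can run on a different machine from the caller.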

Celery is used in production systems to process millions of tasks a day.

Concurrency is not parallelism

posted May 12, 2018, 8:19 AM by Chris G   [ updated May 12, 2018, 8:20 AM ]

When people hear the word concurrency they often think of parallelism, a related but quite distinct concept. In programming, concurrency is the composition of independently executing processes, while parallelism is the simultaneous execution of (possibly related) computations. Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once.
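A small Python sketch of concurrency without parallelism, using `asyncio`. The coroutine and variable names are made up for this example. Two workers are composed and make progress in turns, yet every step runs on the same thread, so nothing executes simultaneously.

```python
import asyncio
import threading


async def worker(name, steps, log):
    for i in range(steps):
        # Record which worker ran and on which thread.
        log.append((name, i, threading.get_ident()))
        # Hand control back to the event loop so the other worker can run.
        await asyncio.sleep(0)


async def main():
    log = []
    await asyncio.gather(worker("A", 3, log), worker("B", 3, log))
    return log


log = asyncio.run(main())
order = [(name, i) for name, i, _ in log]
threads = {thread_id for _, _, thread_id in log}

print(order)         # steps of A and B are interleaved
print(len(threads))  # 1: a single thread did all the work
```

Getting parallelism would mean running the workers on separate cores, for example with `multiprocessing` or `concurrent.futures.ProcessPoolExecutor`.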
