hyperjson

A hyper-fast, safe Python module to read and write JSON data. Works as a drop-in replacement for Python's built-in json module. This is alpha software and there will be bugs, so maybe don't deploy to production just yet.


Installation

pip install hyperjson

Usage

hyperjson is meant as a drop-in replacement for Python's json module:

>>> import hyperjson
>>> hyperjson.dumps([{"key": "value"}, 81, True])
'[{"key":"value"},81,true]'
>>> hyperjson.loads("""[{"key": "value"}, 81, true]""")
[{u'key': u'value'}, 81, True]
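Because the API mirrors the standard library, one way to try a drop-in module is an import with a fallback. The following is a minimal sketch, assuming hyperjson may or may not be installed; the alias `json` and the helper `roundtrip` are illustrative names, not part of hyperjson.

```python
# Prefer hyperjson when it is installed; otherwise fall back to the
# standard library. The rest of the code only uses the shared API
# (dumps and loads), so it works with either module.
try:
    import hyperjson as json
except ImportError:
    import json


def roundtrip(obj):
    """Serialize obj to a JSON string and parse it back (hypothetical helper)."""
    return json.loads(json.dumps(obj))


data = [{"key": "value"}, 81, True]
print(roundtrip(data))
```

Keeping the fallback in one place makes it easy to switch back to the built-in module if a bug shows up in the alpha release.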

Motivation

Parsing JSON is a solved problem; so, no need to reinvent the wheel, right?
Well, unless you care about performance and safety.

It turns out that parsing JSON correctly is a hard problem. Thanks to Rust, however, we can minimize the risk of running into stack overflows or segmentation faults.

hyperjson is a thin wrapper around Rust's serde-json and pyo3. It is compatible with Python 3 (and 2 on a best-effort basis).

For a more in-depth discussion, watch the talk about this project recorded at the Rust Cologne Meetup in August 2018.

Goals

  • Compatibility: Support the full feature-set of Python's json module.
  • Safety: No segfaults, panics, or overflows.
  • Performance: Significantly faster than json and as fast as ujson (both written in C).

Non-goals

  • Support ujson and simplejson extensions:
    Custom extensions like encode(), __json__(), or toDict() are not
    supported. The reason is that they go against PEP8 (e.g. dunder methods
    are restricted to the standard library, camelCase is not Pythonic) and are not
    available in Python's json module.
  • Whitespace preservation: Whitespace padding in JSON text is not preserved,
    mainly because JSON is a whitespace-agnostic format and serde-json strips
    it out by default. In practice this should not be a problem, since your
    application should not depend on whitespace padding, but it's something to be
    aware of.
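The whitespace point can be illustrated with a short sketch. It uses the standard library's json module as a stand-in (an assumption made so the example runs without hyperjson installed) and shows that padding between tokens does not affect the parsed value, and that compact separators reproduce the unpadded output shape shown in the Usage section.

```python
import json  # standard library, used here as a stand-in for hyperjson

padded = '[ {"key" :   "value"} ,\n  81 , true ]'
compact = '[{"key":"value"},81,true]'

# Whitespace between tokens does not change the parsed value.
same_value = json.loads(padded) == json.loads(compact)
print(same_value)

# The standard library pads separators by default; passing compact
# separators yields output without any padding.
unpadded = json.dumps(json.loads(padded), separators=(",", ":"))
print(unpadded)
```

Code that compares serialized output byte for byte (for example in tests or when hashing) is the main place where the difference in padding can matter; comparing parsed values avoids the problem.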

Benchmark

We are not fast yet. That said, we haven't done any big optimizations.
In the long term we might explore features of newer CPUs like multiple cores and SIMD.
That's one area other (C-based) JSON extensions haven't touched yet, because it might
make code harder to debug and prone to race conditions. In Rust, this is feasible due to crates like faster or rayon.

So there's a chance that the following measurements might improve soon.
If you want to help, check the instructions in the Development Environment section below.
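For a rough local comparison, a timing harness along the following lines can be used. This is only a sketch and not the project's benchmark suite: the payload, the run count, and the helper name `bench` are assumptions, and hyperjson is included only when it can be imported.

```python
import json
import timeit

try:
    import hyperjson  # optional; skipped when not installed
except ImportError:
    hyperjson = None

# Hypothetical sample payload; real benchmarks should use representative data.
payload = [{"key": "value", "n": i, "ok": True} for i in range(1000)]
encoded = json.dumps(payload)


def bench(module, number=100):
    """Return (serialize_seconds, deserialize_seconds) for `number` runs."""
    ser = timeit.timeit(lambda: module.dumps(payload), number=number)
    de = timeit.timeit(lambda: module.loads(encoded), number=number)
    return ser, de


results = {"json": bench(json)}
if hyperjson is not None:
    results["hyperjson"] = bench(hyperjson)

for name, (ser, de) in results.items():
    print(f"{name:10s} serialize {ser:.4f}s  deserialize {de:.4f}s")
```

Timings from a harness like this vary with the machine, the Python version, and the shape of the data, so they are best read as relative numbers on one system.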

Test machine:
MacBook Pro 15 inch, Mid 2015 (2.2 GHz Intel Core i7, 16 GB RAM) Darwin 17.6.18

serialize

deserialize

GitHub

https://github.com/mre/hyperjson