@calgray calgray commented Sep 18, 2023

JMESPath.py is limited in that it supports only the dict- and list-derived containers returned by the built-in json library in the object hierarchy, because it relies on isinstance checks. A notable arraylike type that does not derive directly from these containers is numpy.ndarray, which can be deserialized using the JSON-like msgpack library together with msgpack_numpy.

This changeset aims to add support for arraylike containers (list, tuple and numpy.ndarray) in place of parsed JSON arrays, without adding any dependency on the numpy library. It does so using the documented numpy array interface protocol, which many other arraylike libraries also adhere to, such as xarray, dask, astropy and cupy.

(pandas.Series is also arraylike, but it is limited to 1D: a multidimensional Series isn't an intended use case and has slicing issues.)

@calgray changed the title from "Add support 'arraylike' objects as JSON arrays" to "Add support for 'arraylike' objects as JSON arrays" Sep 18, 2023
@calgray marked this pull request as ready for review September 18, 2023 04:44


def is_array(arg):
    return hasattr(arg, "__array__") and arg.shape != ()
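For readers following the diff, here is a minimal sketch of how this check behaves on a few inputs. `FakeArray` is a hypothetical stand-in invented for illustration (it is not a type from numpy or from this PR), so the snippet runs without numpy installed.

```python
def is_array(arg):
    # Same check as in the diff above.
    return hasattr(arg, "__array__") and arg.shape != ()


class FakeArray:
    """Hypothetical stand-in for an ndarray-like object (assumption, not a real library type)."""

    def __init__(self, shape):
        self.shape = shape

    def __array__(self, dtype=None, copy=None):
        # is_array only tests for the attribute; it never calls this method.
        raise NotImplementedError("never called by is_array")


results = {
    "1-D arraylike": is_array(FakeArray((3,))),
    "0-D arraylike": is_array(FakeArray(())),
    "plain list": is_array([1, 2, 3]),
}
print(results)
```

Under these assumptions, only the 1-D stand-in is reported as an array: a zero-dimensional shape `()` is excluded, and a plain list has no `__array__` attribute, so it would have to be recognized by a separate check.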

Worth linking to https://numpy.org/doc/stable/user/basics.interoperability.html#the-array-method in a comment?

Also, is it guaranteed that an object with __array__ will always have a shape attribute defined?

Author

The numpy docs state that the __array__ method, if it exists, should always return an np.ndarray instance (ideally with zero copy), and that type always has a shape attribute (tested with dask and astropy).

__array_interface__ is a bit more array-library agnostic and is explicitly documented to require shape, but __array__ is already being used so that this project doesn't need to explicitly import numpy to perform np.array(value.__array_interface__, copy=False).
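As an illustration of the reviewer's question about `shape`, the sketch below shows what the check does with an object that defines `__array__` but exposes no `shape` attribute itself. `NoShape` and `is_array_guarded` are hypothetical names introduced only for this example; they are not part of the PR, and treating a missing `shape` as "not an array" is one possible design choice, not the author's.

```python
def is_array(arg):
    # Check as written in the diff under review.
    return hasattr(arg, "__array__") and arg.shape != ()


def is_array_guarded(arg):
    # Hypothetical variant: a missing shape is treated as "not an array".
    shape = getattr(arg, "shape", None)
    return hasattr(arg, "__array__") and shape is not None and shape != ()


class NoShape:
    """Hypothetical object exposing __array__ but no shape attribute."""

    def __array__(self, dtype=None, copy=None):
        raise NotImplementedError("not needed for this check")


try:
    is_array(NoShape())
    original_outcome = "returned"
except AttributeError:
    original_outcome = "AttributeError"

guarded_outcome = is_array_guarded(NoShape())
print(original_outcome, guarded_outcome)
```

With such an object the original check raises `AttributeError`, while the guarded variant returns `False`. Whether objects like this occur in practice depends on the libraries involved; the author reports that dask and astropy objects do carry `shape`.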
