123

Github issues

3.500

Github stars

8

Days since last commit

23

Stackoverflow questions

123

Github issues

3.500

Github stars

8

Last commit days ago

23

Stackoverflow questions

Vaex in one line: Vaex is Pandas with lazy evaluation and memory mapping.

What is Vaex?

Vaex is a data wrangling Python library. It helps developers work with “uncomfortably large” datasets on a single machine using lazy evaluation, memory mapping, and integrations with C++ code. It’s specifically designed to work “out of core” – to process data that’s too big to be loaded into memory (RAM) all at once. 

What problem does Vaex solve?

Pandas is a popular, high-level library for handling tabular data, but it often has efficiency issues. Both Python and Pandas value simplicity over efficiency, and Python code in particular is often difficult to properly parallelize because of the GIL.

Vaex solves this problem by rebuilding a Pandas-like library “from the ground up,” taking advantage of lower-level C++ integrations for parallelization and lazy evaluation. Opening a 100GB dataset on a normal laptop is difficult with Pandas, but Vaex can do this efficiently, allowing developers to analyze larger datasets without compute clusters.

Used by:

Top articles for

vaex

No items found.