[Python, Rust]ConnectorX - load data from DBs in the fastest and most memory efficient way.

'ConnectorX enables you to load data from databases into Python in the fastest and most memory efficient way.'

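A data loading library implemented in Rust. Judging from the documentation alone, the performance is ...
The documentation is still quite lacking.

So far I have only tested it with MariaDB:

Options cannot be set directly in the DB URI. With the documentation this sparse, I don't yet know whether there is another way to configure them.
read_sql raises an error when a column value is NULL (as a temporary workaround, wrapping the column in the IFNULL function in the query does work).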
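
A rough sketch of the IFNULL workaround mentioned above. The helper names (ifnull_select, load_with_connectorx), the users table, and the placeholder URI are made up for illustration. The only ConnectorX call used is cx.read_sql(uri, query); the mysql:// scheme for MariaDB is my assumption. The generated SQL is checked against in-memory SQLite (which also has IFNULL), so the query-building part can be tried without a MariaDB server.

```python
import sqlite3


def ifnull_select(table, defaults):
    """Build a SELECT that wraps every column in IFNULL(column, default).

    `defaults` maps column name -> SQL literal to use when the value is NULL.
    Names are interpolated as-is, so only pass trusted identifiers.
    """
    cols = ", ".join(
        f"IFNULL({col}, {literal}) AS {col}" for col, literal in defaults.items()
    )
    return f"SELECT {cols} FROM {table}"


def load_with_connectorx(uri, query):
    """Hypothetical wrapper: needs connectorx installed and a reachable DB."""
    import connectorx as cx  # imported lazily so the rest runs without it

    return cx.read_sql(uri, query)


# Hypothetical table and defaults, for illustration only.
query = ifnull_select("users", {"name": "''", "age": "0"})

# Check the generated SQL on an in-memory SQLite table containing NULLs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("kim", 30), (None, None)])
rows = conn.execute(query).fetchall()
conn.close()

# Against MariaDB it would be something like (placeholder URI, assumed scheme):
# df = load_with_connectorx("mysql://user:password@localhost:3306/testdb", query)
```

The downside is that the substituted default ('' or 0) can no longer be told apart from a real value, so this is only a stopgap.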

GitHub - sfu-db/connector-x: Fastest library to load data from DB to DataFrames in Rust and Python

github.com

https://towardsdatascience.com/connectorx-the-fastest-way-to-load-data-from-databases-a65d4d4062d5

ConnectorX: The fastest library for loading your Python data frame

Accelerate Pandas read_sql by 10x with one line of code

towardsdatascience.com

Three main reasons make ConnectorX achieve this performance:

Written in native language: Unlike other libraries, ConnectorX is written in Rust, which avoids the additional cost of implementing data-intensive applications in Python.
Copy exactly once: While existing solutions more or less do data copy multiple times when downloading the data from databases, the implementation of ConnectorX follows the “zero-copy” principle. We manage to copy the data exactly once, directly from the Source to Destination even under parallelism.
CPU cache efficient: We apply several optimizations to make ConnectorX CPU cache-friendly. Other than “zero-copy” implementation, data processing in ConnectorX is conducted in a streaming fashion to reduce cache miss. Another example is that when we construct strings in Python, we write a batch of strings into one pre-allocated buffer instead of allocating separated locations for each one.
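
To get a feel for the "one pre-allocated buffer" idea in the last point, here is a toy sketch in pure Python. This is not ConnectorX's code (that is Rust, and I have not read it); pack_strings and get_string are names I made up. It only shows the layout: one contiguous buffer for the whole batch plus a list of offsets, instead of a separate allocation per string.

```python
def pack_strings(strings):
    """Pack a batch of strings into one buffer and return (buffer, offsets).

    offsets[i] and offsets[i + 1] delimit the UTF-8 bytes of string i.
    """
    encoded = [s.encode("utf-8") for s in strings]
    # Single allocation sized for the whole batch.
    buf = bytearray(sum(len(b) for b in encoded))
    offsets = [0]
    pos = 0
    for b in encoded:
        buf[pos:pos + len(b)] = b  # same-length slice assignment, no resize
        pos += len(b)
        offsets.append(pos)
    return buf, offsets


def get_string(buf, offsets, i):
    """Decode string i back out of the shared buffer."""
    return bytes(buf[offsets[i]:offsets[i + 1]]).decode("utf-8")


buf, offsets = pack_strings(["id", "name", "가격"])
third = get_string(buf, offsets, 2)
```

In Python the temporary encoded bytes objects are still allocated one by one, so this gains nothing in speed; it only illustrates the memory layout.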


힘껏차라
