https://github.com/sfu-db/connector-x
'ConnectorX enables you to load data from databases into Python in the fastest and most memory efficient way.'
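- A data loading library implemented in Rust. Judging from the documentation alone, the performance is ...
- The documentation is still quite thin.
So far I have only tested it with MariaDB:
- Options cannot be set directly in the DB URI. Because the documentation is thin, I do not yet know whether there is another way to configure them.
- read_sql raises an error when a column value is null (as a stopgap, wrapping the column in the ifnull function in the query works).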
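The ifnull workaround above can be sketched as follows. This is a minimal sketch, not something from the ConnectorX docs: the helper names `build_null_safe_query` and `load`, and the table and column names, are hypothetical. The only ConnectorX call used is `cx.read_sql(conn, query)`, and it is imported lazily so the query builder can be tried without a database.

```python
from typing import Mapping, Optional


def build_null_safe_query(table: str, columns: Mapping[str, Optional[str]]) -> str:
    """Build a SELECT that wraps nullable columns in IFNULL.

    `columns` maps a column name to the SQL literal to use in place of NULL,
    or to None when the column needs no wrapping. (Hypothetical helper.)
    """
    parts = []
    for name, default in columns.items():
        if default is None:
            parts.append(name)
        else:
            # Replace NULL with the given literal and keep the original column name.
            parts.append(f"IFNULL({name}, {default}) AS {name}")
    return f"SELECT {', '.join(parts)} FROM {table}"


def load(conn_uri: str, query: str):
    """Run the query through ConnectorX (needs connectorx and a reachable database)."""
    import connectorx as cx  # lazy import: only required when actually loading

    return cx.read_sql(conn_uri, query)


# Hypothetical table and columns, for illustration only.
query = build_null_safe_query("orders", {"id": None, "note": "''", "amount": "0"})
print(query)
```

To actually load the data, something like `load("mysql://user:password@host:3306/db", query)` should work, assuming the `mysql://` scheme is what your ConnectorX version expects for MariaDB. Note that replacing NULL with a default changes the data, so pick defaults you can tell apart from real values.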
https://towardsdatascience.com/connectorx-the-fastest-way-to-load-data-from-databases-a65d4d4062d5
Three main reasons make ConnectorX achieve this performance:
- Written in native language: Unlike other libraries, ConnectorX is written in Rust, which avoids the additional cost of implementing data-intensive applications in Python.
- Copy exactly once: While existing solutions more or less do data copy multiple times when downloading the data from databases, the implementation of ConnectorX follows the “zero-copy” principle. We manage to copy the data exactly once, directly from the Source to Destination even under parallelism.
- CPU cache efficient: We apply several optimizations to make ConnectorX CPU cache-friendly. Other than “zero-copy” implementation, data processing in ConnectorX is conducted in a streaming fashion to reduce cache miss. Another example is that when we construct strings in Python, we write a batch of strings into one pre-allocated buffer instead of allocating separated locations for each one.
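The last point, writing a batch of strings into one pre-allocated buffer, can be illustrated roughly like this. It is only a sketch of the memory layout in plain Python, not ConnectorX's actual Rust implementation. The function names `pack_strings` and `get_string` are hypothetical, and the Python version does not reproduce the performance benefit, since encoding each string still allocates temporary objects.

```python
from typing import List, Tuple


def pack_strings(strings: List[str]) -> Tuple[bytearray, List[int]]:
    """Store all strings back to back in one buffer, plus their boundaries."""
    encoded = [s.encode("utf-8") for s in strings]
    total = sum(len(b) for b in encoded)

    buffer = bytearray(total)  # one allocation for the whole batch
    offsets = [0]              # offsets[i]..offsets[i + 1] delimits string i
    pos = 0
    for b in encoded:
        buffer[pos:pos + len(b)] = b  # same-length slice assignment, no resize
        pos += len(b)
        offsets.append(pos)
    return buffer, offsets


def get_string(buffer: bytearray, offsets: List[int], i: int) -> str:
    """Decode the i-th string from the shared buffer."""
    return buffer[offsets[i]:offsets[i + 1]].decode("utf-8")


buffer, offsets = pack_strings(["ab", "", "cde"])
print(offsets)                         # [0, 2, 2, 5]
print(get_string(buffer, offsets, 2))  # cde
```

Keeping the bytes contiguous means a reader walks one memory region sequentially instead of chasing a pointer per string, which is the cache-friendliness the quoted passage refers to.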