https://pyo3.rs/latest/parallelism.html
https://github.com/PyO3/pyo3/tree/main/examples/word-count
cargo new --lib word-count
Cargo.toml
[package]
name = "word-count"
version = "0.1.0"
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
rayon = "1.5.1"
[lib]
name = "word_count"
crate-type = ["cdylib"]
[dependencies.pyo3]
version = "0.14.5"
features = ["extension-module"]
lib.rs is copied from the GitHub example.
The implementation of count_line is omitted here.
Rust parallel version (rayon)
// par_lines() comes from rayon's prelude.
use rayon::prelude::*;
fn search(contents: &str, needle: &str) -> usize {
    contents
        .par_lines()
        .map(|line| count_line(line, needle))
        .sum()
}
Rust sequential version
fn search_sequential(contents: &str, needle: &str) -> usize {
    contents.lines().map(|line| count_line(line, needle)).sum()
}
Rust sequential version, to be run in parallel from Python
fn search_sequential_allow_threads(py: Python, contents: &str, needle: &str) -> usize {
    // allow_threads releases the GIL while the closure runs.
    py.allow_threads(|| search_sequential(contents, needle))
}
Build
cargo build --release
This produces target/release/word_count.dll (the remaining steps are omitted).
For comparison, a pure Python version
def search_py(contents: str, needle: str) -> int:
    total = 0
    for line in contents.splitlines():
        for word in line.split(" "):
            if word == needle:
                total += 1
    return total
Running the benchmarks as in the official example, the timings of
- test_word_count_rust_parallel (search)
- test_word_count_rust_sequential (search_sequential)
- test_word_count_rust_sequential_twice_with_threads (run_rust_sequential_twice)
- test_word_count_python_sequential (search_py)
came out in a ratio of roughly 1 : 3 : 4.5 : 18.
Here, twice looks roughly like this.
from concurrent.futures import ThreadPoolExecutor
executor = ThreadPoolExecutor(max_workers=2)
future_1 = executor.submit(
    search_sequential_allow_threads, contents, needle
)
future_2 = executor.submit(
    search_sequential_allow_threads, contents, needle
)
result_1 = future_1.result()
result_2 = future_2.result()
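The snippet above can be wrapped in a small helper so that the same harness runs with any search function. This is a minimal sketch, not the official benchmark code: `run_twice` is a hypothetical name, and a pure Python `search_py` stands in for the Rust functions so that the block runs without the compiled module.

```python
from concurrent.futures import ThreadPoolExecutor


def search_py(contents: str, needle: str) -> int:
    """Pure Python stand-in for the Rust search functions (assumption)."""
    total = 0
    for line in contents.splitlines():
        for word in line.split(" "):
            if word == needle:
                total += 1
    return total


def run_twice(executor, search_fn, contents, needle):
    """Submit the same search to the pool twice and wait for both results."""
    future_1 = executor.submit(search_fn, contents, needle)
    future_2 = executor.submit(search_fn, contents, needle)
    return future_1.result(), future_2.result()


if __name__ == "__main__":
    text = "a b a\nc a d\n"
    with ThreadPoolExecutor(max_workers=2) as executor:
        print(run_twice(executor, search_py, text, "a"))
```

Assuming the extension is importable as `word_count`, passing `word_count.search_sequential_allow_threads` or `word_count.search_sequential` as `search_fn` would give the two variants compared in this post.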
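This is not I/O-bound work, yet running it twice on threads does not slow things down much. If twice is implemented with search_sequential instead of search_sequential_allow_threads, it takes about twice as long.
In other words, Python::allow_threads releases the GIL, so the two calls actually run in parallel.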
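One way to check ratios like these without a benchmark plugin is a small standard-library harness. The following is only a sketch under stated assumptions: `measure` and `relative_times` are hypothetical helper names, and the built-in `sum` is used purely as a placeholder workload where the `word_count` functions would normally go.

```python
import timeit


def measure(fn, *args, repeat=5):
    """Return the best wall-clock time in seconds over `repeat` single calls."""
    return min(timeit.repeat(lambda: fn(*args), number=1, repeat=repeat))


def relative_times(timings):
    """Scale timings so that the fastest entry becomes 1.0."""
    fastest = min(timings.values())
    return {name: seconds / fastest for name, seconds in timings.items()}


if __name__ == "__main__":
    # Placeholder workloads; swap in the word_count functions to compare them.
    timings = {
        "small": measure(sum, range(10_000)),
        "large": measure(sum, range(1_000_000)),
    }
    print(relative_times(timings))
```

Taking the minimum over several repeats reduces noise from other processes, which matters when the claim being checked is a factor of about two.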