Parrot
Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without unnecessary intermediate materializations.
Features
- Fused Operations - Operations are fused when possible
- GPU Acceleration - Built on CUDA/Thrust for high performance
- Chainable API - Clean, expressive syntax for complex operations
- Header-Only - Simple integration with #include "parrot.hpp"
¹ "Lazy-ish" means that any operation that can be lazily evaluated is lazily evaluated.
Quick Start
#include "parrot.hpp"
int main() {
    // Create arrays
    auto A = parrot::array({3, 4, 0, 8, 2});
    auto B = parrot::array({6, 7, 2, 1, 8});
    auto C = parrot::array({2, 5, 7, 4, 3});
    // Chain operations
    (B * C + A).print(); // Output: 15 39 14 12 26
}
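To make the idea of fusion concrete, here is a minimal CPU-only sketch in plain C++ of how a chain like B * C + A can be evaluated in a single pass with no intermediate array. It is an illustration of the general expression-template technique, not Parrot's implementation; every name in it (Leaf, Mul, Add, evaluate, fused_multiply_add) is hypothetical and not part of the Parrot API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout: a CPU illustration of fusion,
// not Parrot's API or implementation.

// Leaf node: reads from an existing vector by reference (no copy).
struct Leaf {
    const std::vector<int>& data;
    int operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }
};

// Lazy element-wise product: stores its operands, computes on access.
template <typename L, typename R>
struct Mul {
    L lhs;
    R rhs;
    int operator[](std::size_t i) const { return lhs[i] * rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Lazy element-wise sum: same pattern as Mul.
template <typename L, typename R>
struct Add {
    L lhs;
    R rhs;
    int operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Materialization: the only place a loop runs and memory is allocated.
template <typename Expr>
std::vector<int> evaluate(const Expr& e) {
    std::vector<int> out(e.size());
    for (std::size_t i = 0; i < e.size(); ++i) {
        out[i] = e[i];  // computes b[i] * c[i] + a[i] in one step
    }
    return out;
}

// Builds b * c + a lazily, then evaluates it in a single pass.
std::vector<int> fused_multiply_add(const std::vector<int>& a,
                                    const std::vector<int>& b,
                                    const std::vector<int>& c) {
    assert(a.size() == b.size() && b.size() == c.size());
    Leaf la{a}, lb{b}, lc{c};
    Mul<Leaf, Leaf> product{lb, lc};              // nothing computed yet
    Add<Mul<Leaf, Leaf>, Leaf> sum{product, la};  // still nothing computed
    return evaluate(sum);
}
```

Calling fused_multiply_add with the three Quick Start arrays yields 15 39 14 12 26, matching the output shown above. The product B * C is never stored as its own array; each output element is computed directly from the three inputs.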
Advanced Example: Row-wise Softmax
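#include "parrot.hpp"
using namespace parrot::literals;

// Row-wise softmax: exp(x - row_max) / sum(exp(x - row_max)) for each row
auto softmax(auto matrix) {
    auto cols = matrix.ncols();
    // Subtract each row's maximum so exp() cannot overflow
    auto z = matrix - matrix.maxr(2_ic).replicate(cols);
    auto num = z.exp();
    // Per-row sum of the exponentials
    auto den = num.sum(2_ic);
    return num / den.replicate(cols);
}

int main() {
    // 2x3 float matrix built from a range of 6 values
    auto matrix = parrot::range(6).as<float>().reshape({2, 3});
    softmax(matrix).print();
}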
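For checking results on the host, the same computation can be written as a plain C++ reference. This is a sketch that assumes a row-major matrix stored in a flat std::vector<float>; the function name softmax_rows_reference is hypothetical and not part of Parrot.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Parrot: row-wise softmax over a
// row-major matrix stored in a flat vector.
std::vector<float> softmax_rows_reference(const std::vector<float>& m,
                                          std::size_t rows,
                                          std::size_t cols) {
    assert(cols > 0 && m.size() == rows * cols);
    std::vector<float> out(m.size());
    for (std::size_t r = 0; r < rows; ++r) {
        const float* in = m.data() + r * cols;
        float* o = out.data() + r * cols;

        // Subtract the row maximum before exponentiating (numerical stability).
        const float row_max = *std::max_element(in, in + cols);

        // Exponentiate and accumulate the row's denominator.
        float denom = 0.0f;
        for (std::size_t c = 0; c < cols; ++c) {
            o[c] = std::exp(in[c] - row_max);
            denom += o[c];
        }

        // Normalize so the row sums to 1.
        for (std::size_t c = 0; c < cols; ++c) {
            o[c] /= denom;
        }
    }
    return out;
}
```

Because softmax is unchanged when a constant is added to a row, any row of three consecutive values (for example 0, 1, 2 or 3, 4, 5) produces approximately 0.0900, 0.2447, 0.6652.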
Building
Prerequisites
- CMake (3.10+)
- C++ compiler with C++20 support
- NVIDIA CUDA Toolkit 13.0 or later
- NVIDIA GPU with compute capability 7.0+
Build Steps
# Clone the repository
git clone <repository-url>
cd parrot
# Create build directory
mkdir build && cd build
# Configure and build
cmake ..
cmake --build . -j$(nproc)
# Run tests
ctest
For detailed build instructions, see BUILDING.md.
Comparisons
Parrot provides significant code reduction compared to other CUDA libraries:
| Library | Code Reduction |
|---|---|
| Thrust | ~10x less code |
See detailed comparisons in our documentation.
Documentation
- Full Documentation - Complete API reference and examples
- Examples - Standalone Parrot examples
- vs Thrust - Side-by-side comparisons with Thrust
Examples
The examples/ directory contains:
- getting_started/ - Simple examples to get started
- thrust/ - Parrot implementations of Thrust examples
Development
Running Tests
# All tests
ctest
# Individual test categories
./test_basic # Basic operations
./test_sorting # Sorting algorithms
./test_math # Mathematical operations
./test_reductions # Reduction operations
Code Quality
# Run clang-tidy
./scripts/run-clang-tidy.sh
# Auto-fix issues
./scripts/run-clang-tidy.sh --fix
Building Documentation
# Install dependencies
pip install -r requirements.txt
# Build docs
cd scripts && ./build-docs.sh
Repository Structure
parrot/
├── parrot.hpp              # Main header (single-file library)
├── thrustx.hpp             # Extended Thrust utilities
├── examples/               # Example code
│   ├── getting_started/    # Simple getting-started examples
│   ├── thrust/             # Parrot versions of Thrust examples
│   └── real_world/         # Parrot versions of examples from open source projects
├── tests/                  # Unit tests
├── docs/                   # Documentation source
└── scripts/                # Development scripts
Contributing
We welcome contributions! Please see our CONTRIBUTING.md guide for details on:
- Developer Certificate of Origin (DCO) requirements
- Setting up the development environment
- Code standards and review process
- Running tests and code quality checks
- Building documentation
All contributions must be signed off under the DCO and comply with the Apache License 2.0.
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
Third-Party Licenses
This project includes third-party software components. See THIRD_PARTY_LICENSES for complete license information and attributions.
Acknowledgments
Built on top of NVIDIA Thrust and CUDA.