⚡ Apache Spark
Specific
PySpark, Spark SQL, distributed computing, big data
When AI builds itself 👷, AI is not a line item 📝, local LLMs for agentic coding 🤖
🐍
Python
tldr.tech
·
5 days ago
Build stateful streaming applications with Apache Spark 4.0 on Amazon EMR Serverless
🔄
Data Pipelines
Content type:
Blog
aws.amazon.com
·
1 day ago
Operationalizing Property-Based Testing for Data-Intensive Scalable Computing Systems
🔄
Data Pipelines
Content type:
Academic
arxiv.org
·
15 hours ago
RATrain: A Resource-Aware Training Runtime for Large Language Models on Bandwidth-Constrained Heterogeneous Supercomputing Platforms
🔄
Data Pipelines
Content type:
Academic
arxiv.org
·
15 hours ago
Run Interactive Workloads on Amazon EMR Serverless with Spark Connect
🔄
Data Pipelines
aws.amazon.com
·
1 day ago
ASTRA-sim 3.0: Next-Level Distributed Machine Learning Simulations via High-Fidelity GPU and Infrastructure Modeling
🔄
Data Pipelines
Content type:
Academic
arxiv.org
·
15 hours ago
Achieving Cloud-Grade SLOs for Local Mixture-of-Experts Inference through CPU-GPU Hybrid Design
🔄
Data Pipelines
Content type:
Academic
arxiv.org
·
15 hours ago
Update canonical GitHub project links (#3177)
🐍
Python
Content type:
Code
github.com
·
6 days ago
FairWave: A Fairness-Aware Asynchronous DAG-BFT Consensus
🔄
Data Pipelines
Content type:
Academic
arxiv.org
·
15 hours ago
FlashCP: Load-Balanced Communication-Efficient Context Parallelism for LLM Training
🔄
Data Pipelines
Content type:
Academic
arxiv.org
·
1 day ago
Generalizing LCL Complexity Gaps to Unbounded Degree via Monadic Second-Order Properties
🛠️
Data Engineering
Content type:
Academic
arxiv.org
·
15 hours ago
Terastal: Layer-Variant-based Scheduling for Real-Time Multi-DNN Workloads on Heterogeneous Accelerators
🔄
Data Pipelines
Content type:
Academic
arxiv.org
·
2 days ago
Rectangular Matrix Multiplication in the Low-Bandwidth Model
🛠️
Data Engineering
Content type:
Academic
arxiv.org
·
6 days ago
Engineering Scalable Distributed List Ranking
🛠️
Data Engineering
Content type:
Academic
arxiv.org
·
1 day ago
aayush4vedi/drift-spark: Spark-native embedding lifecycle: produce, CDC refresh, model-migrate, audit.
🔄
Data Pipelines
Content type:
Code
github.com
·
7 hours ago
·
Hacker News
When More Cores Hurts: The Vector Database Scaling Paradox in HPC
🔄
Data Pipelines
Content type:
Academic
arxiv.org
·
1 day ago
Resource-aware Computation-Communication Overlap for multi-GPU ML Workloads
🔄
Data Pipelines
Content type:
Academic
arxiv.org
·
1 day ago
IN2P3 Computing Center 2024 Workload Dataset
🔄
Data Pipelines
Content type:
Academic
arxiv.org
·
5 days ago
AutoPilot: Learning to Steer High Speed Robust BFT
🔄
Data Pipelines
Content type:
Academic
arxiv.org
·
1 day ago
Demystifying NVSHMEM: A System-Level Analysis on Symmetric Memory and Device-Initiated Operations in GPU Communication
🔄
Data Pipelines
Content type:
Academic
arxiv.org
·
5 days ago