Introducing distributed dynamic data‐intensive (D3) science: Understanding applications and infrastructure

S Jha, DS Katz, A Luckow, N Chue Hong… - Concurrency and …, 2017 - Wiley Online Library
Concurrency and Computation: Practice and Experience, 2017 - Wiley Online Library
Summary
A common feature across many science and engineering applications is the amount and diversity of data and computation that must be integrated to yield insights. Datasets are growing larger and becoming distributed; their location, availability, and properties are often time‐dependent. Collectively, these characteristics give rise to dynamic distributed data‐intensive applications. While “static” data applications have received significant attention, the characteristics, requirements, and software systems for the analysis of large volumes of dynamic, distributed data, and data‐intensive applications have received comparatively little attention. This paper surveys several representative dynamic distributed data‐intensive application scenarios, provides a common conceptual framework for understanding them, and examines the infrastructure used to support these applications.