Publication
DoomDB: kill the query
Carsten Binnig; Abdallah Salama; Erfan Zamanian
In: Curtis E. Dyreson; Feifei Li; M. Tamer Özsu (Eds.). International Conference on Management of Data. ACM SIGMOD International Conference on Management of Data (SIGMOD-2014), June 22-27, Snowbird, UT, USA, Pages 913-916, ACM, 2014.
Abstract
In parallel database systems, fault tolerance is typically handled by restarting a query completely when a node fails. However, when a parallel database is deployed on a cluster of commodity machines or on IaaS offerings such as Amazon's Spot Instances, node failures are common. This calls for a more fine-grained fault-tolerance scheme. Therefore, most recent parallel data management platforms such as Hadoop or Shark use a fine-grained fault-tolerance scheme that materializes all intermediate results in order to recover from mid-query faults. While such a scheme handles node failures efficiently for complex, long-running queries, it is not optimal for short-running, latency-sensitive queries, since the additional cost of materialization often outweighs the cost of actually executing the query. In this demo, we showcase our novel cost-based fault-tolerance scheme in XDB. It selects which intermediate results to materialize such that the overall query runtime is minimized in the presence of node failures. For the demonstration, we present a computer game called DoomDB. DoomDB is designed as a first-person shooter in which the player's goal is to kill nodes in an XDB database cluster and thus prevent a given query from producing its final result within a given time frame. One interesting use case of DoomDB is crowdsourcing the testing of XDB.
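The idea of choosing which intermediate results to materialize can be illustrated with a small sketch. It is not the cost model used in XDB, which the abstract does not specify. It assumes a linear pipeline of operators, node failures arriving as a Poisson process with rate `failure_rate`, and a restart from the last materialized result after each failure. Under these assumptions, a segment of length T that is retried until it succeeds takes (e^(λT) - 1)/λ in expectation. All names (`run_costs`, `mat_costs`, `best_plan`) are hypothetical.

```python
import math
from itertools import product


def expected_segment_time(duration, failure_rate):
    """Expected time to finish an uninterrupted run of length `duration`
    when failures arrive at `failure_rate` and each failure restarts the run.
    Assumption: Poisson failures, no extra restart overhead."""
    if failure_rate == 0:
        return duration
    return math.expm1(failure_rate * duration) / failure_rate


def expected_runtime(run_costs, mat_costs, materialize, failure_rate):
    """Expected runtime of a linear operator pipeline for one plan.
    materialize[i] is True if the output of operator i is written out,
    which ends the current restartable segment."""
    total = 0.0
    segment = 0.0
    for run, mat, flag in zip(run_costs, mat_costs, materialize):
        segment += run
        if flag:
            segment += mat  # writing the result is part of the segment
            total += expected_segment_time(segment, failure_rate)
            segment = 0.0
    # Remaining operators after the last materialization point.
    total += expected_segment_time(segment, failure_rate)
    return total


def best_plan(run_costs, mat_costs, failure_rate):
    """Enumerate all materialization plans and return (plan, expected cost).
    The last operator yields the final result, so it is never materialized.
    Exhaustive search is exponential; it is only meant for small pipelines."""
    n = len(run_costs)
    best = None
    for flags in product([False, True], repeat=max(n - 1, 0)):
        plan = flags + (False,)
        cost = expected_runtime(run_costs, mat_costs, plan, failure_rate)
        if best is None or cost < best[1]:
            best = (plan, cost)
    return best
```

With `failure_rate` set to 0 the search materializes nothing, which matches the short-running query case in the abstract. With long operators and frequent failures it starts inserting materialization points, which matches the long-running case.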