BigData Conference 2013: Santa Clara, CA, USA
- Xiaohua Hu, Tsau Young Lin, Vijay V. Raghavan, Benjamin W. Wah, Ricardo Baeza-Yates, Geoffrey C. Fox, Cyrus Shahabi, Matthew Smith, Qiang Yang, Rayid Ghani, Wei Fan, Ronny Lempel, Raghunath Nambiar:
2013 IEEE International Conference on Big Data (IEEE BigData 2013), 6-9 October 2013, Santa Clara, CA, USA. IEEE Computer Society 2013, ISBN 978-1-4799-1292-6
Conference Paper Presentations
- Amgad Madkour, Walid G. Aref, Saleh M. Basalamah:
Knowledge cubes - A proposal for scalable and semantically-guided management of Big Data. 1-7 - Pascal Bianchi, Stéphan Clémençon, Gemma Morral, Jérémie Jakubowicz:
On-line learning gossip algorithm in multi-agent systems with local decision rules. 6-14 - Peter Sanders, Sebastian Schlag, Ingo Müller:
Communication efficient algorithms for fundamental big data problems. 15-23 - Upa Gupta, Leonidas Fegaras:
Map-based graph analysis on MapReduce. 24-30 - Tao Luo, Yin Liao, Guoliang Chen, Yunquan Zhang:
P-DOT: A model of computation for big data. 31-37 - En-Hui Yang, Xiang Yu:
Transparent composite model for large scale image/video processing. 38-44 - Rui Han, Lei Nie, Moustafa Ghanem, Yike Guo:
Elastic algorithms for guaranteeing quality monotonicity in big data mining. 45-50 - Mario Pastorelli, Antonio Barbuzzi, Damiano Carra, Matteo Dell'Amico, Pietro Michiardi:
HFSP: Size-based scheduling for Hadoop. 51-59 - Benedikt Elser, Alberto Montresor:
An evaluation study of BigData frameworks for graph processing. 60-67 - Bryan N. Lawrence, Victoria L. Bennett, J. Churchill, Martin Juckes, Philip Kershaw, Stephen Pascoe, Sam Pepler, M. Pritchard, Ag Stephens:
Storing and manipulating environmental big data with JASMIN. 68-75 - Hieu Hanh Le, Satoshi Hikida, Haruo Yokota:
Efficient gear-shifting for a power-proportional distributed data-placement method. 76-84 - Patrick Leyshock, David Maier, Kristin Tufte:
Agrios: A hybrid approach to big array analytics. 85-93 - Chun-Hsiang Lee, David Birch, Chao Wu, Dilshan Silva, Orestis Tsinalis, Yang Li, Shulin Yan, Moustafa Ghanem, Yike Guo:
Building a generic platform for big sensor data application. 94-102 - Jialin Liu, Bradly Crysler, Yin Lu, Yong Chen:
Locality-driven high-level I/O aggregation for processing scientific datasets. 103-111 - Dheeraj Kumar, Marimuthu Palaniswami, Sutharshan Rajasegarar, Christopher Leckie, James C. Bezdek, Timothy C. Havens:
clusiVAT: A mixed visual/numerical clustering algorithm for big data. 112-117 - Toshimori Honjo, Kazuki Oikawa:
Hardware acceleration of Hadoop MapReduce. 118-124 - Mian Lu, Lei Zhang, Huynh Phung Huynh, Zhongliang Ong, Yun Liang, Bingsheng He, Rick Siow Mong Goh, Richard Huynh:
Optimizing the MapReduce framework on Intel Xeon Phi coprocessor. 125-130 - Eugen Feller, Lavanya Ramakrishnan, Christine Morin:
On the performance and energy efficiency of Hadoop deployment models. 131-136 - D. Michael Freemon:
Optimizing throughput on guaranteed-bandwidth WAN networks for the Large Synoptic Survey Telescope (LSST). 137-142 - Takuya Araki, Kazuyo Narita, Hiroshi Tamano:
Feliss: Flexible distributed computing framework with light-weight checkpointing. 143-149 - Jonas Dias, Eduardo S. Ogasawara, Daniel de Oliveira, Fábio Porto, Patrick Valduriez, Marta Mattoso:
Algebraic dataflows for big data analysis. 150-155 - Wei Yan, Yuan Xue, Bradley A. Malin:
Scalable and robust key group size estimation for reducer load balancing in MapReduce. 156-162 - Chao Yin, Jianzong Wang, Changsheng Xie, Jiguang Wan, Changlin Long, Wenjuan Bi:
Robot: An efficient model for big data storage systems based on erasure coding. 163-168 - Chao Chen, Michael Lang, Yong Chen:
Multilevel Active Storage for big data applications in high performance computing. 169-174 - Chandima Hewa Nadungodage, Yuni Xia, Jaehwan John Lee, Myungcheol Lee, Choon Seo Park:
GPU accelerated item-based collaborative filtering for big-data applications. 175-180 - GuiXin Guo, Shuang Qiu, Zhiqiang Ye, Bingqiang Wang, Lin Fang, Mian Lu, Simon See, Rui Mao:
GPU-accelerated adaptive compression framework for genomics data. 181-186 - Deepal Jayasinghe, Josh Kimball, Tao Zhu, Siddharth Choudhary, Calton Pu:
An infrastructure for automating large-scale performance studies and data processing. 187-192 - Li-Yung Ho, Tsung-Han Li, Jan-Jan Wu, Pangfeng Liu:
Kylin: An efficient and scalable graph data processing system. 193-198 - Qunzhi Zhou, Yogesh Simmhan, Viktor K. Prasanna:
Towards hybrid online on-demand querying of realtime data with stateful complex event processing. 199-205 - Jiaran Zhang, Xiaohui Yu, Yang Liu, Liwei Lin:
DDSN: Duplicate detection to reduce both storage and bandwidth consumption. 206-211 - Aalap Tripathy, Ka Chon Ieong, Atish Patra, Rabi N. Mahapatra:
A reconfigurable computing architecture for semantic information filtering. 212-218 - Oyindamola O. Akande, Philip J. Rhodes:
Iteration aware prefetching for unstructured grids. 219-227 - Elad Yom-Tov, Mounia Lalmas, Ricardo Baeza-Yates, Georges Dupret, Janette Lehmann, Pinar Donmez:
Measuring inter-site engagement. 228-236 - Ting Chen, Kenjiro Taura:
A selective checkpointing mechanism for query plans in a parallel database system. 237-245 - Kyumars Sheykh Esmaili, Lluis Pamies-Juarez, Anwitaman Datta:
CORE: Cross-object redundancy for efficient data repair in storage systems. 246-254 - Nikolaos Papailiou, Ioannis Konstantinou, Dimitrios Tsoumakos, Panagiotis Karras, Nectarios Koziris:
H2RDF+: High-performance distributed joins over large-scale RDF graphs. 255-263 - Austin R. Benson, David F. Gleich, James Demmel:
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures. 264-272 - Radu Tudoran, Alexandru Costan, Ramin Rezai Rad, Goetz Brasche, Gabriel Antoniu:
Adaptive file management for scientific workflows on the Azure cloud. 273-281 - Tian Guo, Thanasis G. Papaioannou, Karl Aberer:
Model-view sensor data management in the cloud. 282-290 - Anthony D. Fox, Christopher N. Eichelberger, James N. Hughes, Skylar Lyon:
Spatio-temporal indexing in non-relational distributed databases. 291-299 - Lefteris Sidirourgos, Martin L. Kersten, Peter A. Boncz:
Scientific discovery through weighted sampling. 300-306 - Stefan Pröll, Andreas Rauber:
Scalable data citation in dynamic, large databases: Model and reference implementation. 307-312 - Krish K. R., Aleksandr Khasymski, Guanying Wang, Ali Raza Butt, Gaurav Makkar:
On the use of shared storage in shared-nothing environments. 313-318 - Alexander Artikis, Matthias Weidlich, Avigdor Gal, Vana Kalogeraki, Dimitrios Gunopulos:
Self-adaptive event recognition for intelligent transport management. 319-325 - Leonardo Arturo Bautista-Gomez, Franck Cappello:
Improving floating point compression through binary masks. 326-331 - Junjie Chen, Philip C. Roth, Yong Chen:
Using pattern-models to guide SSD deployment for Big Data applications in HPC systems. 332-337 - Zhiquan Liu, Luo Luo, Wu-Jun Li:
Robust crowdsourced learning. 338-343 - Jialin Liu, Surendra Byna, Yong Chen:
Segmented analysis for reducing data movement. 344-349 - Simon Chan, Philip C. Treleaven, Licia Capra:
Continuous hyperparameter optimization for large-scale recommender systems. 350-358 - Hoang Vu Nguyen, Emmanuel Müller, Klemens Böhm:
4S: Scalable subspace search scheme overcoming traditional Apriori processing. 359-367 - Lars Arge, Michael T. Goodrich, Freek van Walderveen:
Computing betweenness centrality in external memory. 368-375 - Rong Gu, Furao Shen, Yihua Huang:
A parallel computing platform for training large scale neural networks. 376-384 - Raghvendra Mall, Rocco Langone, Johan A. K. Suykens:
Self-tuned kernel spectral clustering for large scale networks. 385-393 - Yuichiro Yasui, Katsuki Fujisawa, Kazushige Goto:
NUMA-optimized parallel breadth-first search on multicore single-node system. 394-402 - Arash Fard, M. Usman Nisar, Lakshmish Ramaswamy, John A. Miller, Matthew Saltz:
A distributed vertex-centric approach for pattern matching in massive graphs. 403-411 - Lee Parnell Thompson, Weijia Xu, Daniel P. Miranker:
Fast scalable selection algorithms for large scale data. 412-420 - Yoshiki Sakai, Kenji Yamanishi:
An NML-based model selection criterion for general relational data modeling. 421-429 - Rajiv Khanna, Liang Zhang, Deepak Agarwal, Bee-Chung Chen:
Parallel matrix factorization for binary response. 430-438 - Desheng Zhang, Tian He, Yunhuai Liu, John A. Stankovic:
CallCab: A unified recommendation system for carpooling and regular taxicab services. 439-447 - Abhirup Chakraborty:
Top-K aggregation over a large graph using shared-nothing systems. 448-457 - Nemanja Djuric, Mihajlo Grbovic, Slobodan Vucetic:
Distributed confidence-weighted classification on MapReduce. 458-466 - Zhiwei Yu, Raymond K. Wong, Chi-Hung Chi:
Scalable context-aware role mining with MapReduce. 467-474 - Yusheng Xie, Zhengzhang Chen, Kunpeng Zhang, Chen Jin, Yu Cheng, Ankit Agrawal, Alok N. Choudhary:
Elver: Recommending Facebook pages in cold start situation without content features. 475-479 - Paul Logasa Bogen, Christopher T. Symons, Amber McKenzie, Robert M. Patton, Robert E. Gillen:
Massively scalable near duplicate detection in streams of documents using MDSH. 480-486 - Ahmet Erdem Sariyüce, Kamer Kaya, Erik Saule, Ümit V. Çatalyürek:
Incremental algorithms for closeness centrality. 487-492 - Bo Zhang, Zhongzhi Shi:
Classification of big velocity data via cross-domain Canonical Correlation Analysis. 493-498 - Frank K. H. A. Dehne, Q. Kong, Andrew Rau-Chaplin, Hamidreza Zaboli, R. Zhou:
A distributed tree data structure for real-time OLAP on cloud architectures. 499-505 - Jiangling Yin, Andrew Foran, Jun Wang:
DL-MPI: Enabling data locality computation for MPI-based data-intensive applications. 506-511 - Chenxia Wu, Haiqin Yang, Jianke Zhu, Jiemi Zhang, Irwin King, Michael R. Lyu:
Sparse Poisson coding for high dimensional document clustering. 512-517 - Martin Weidner, Jonathan Dees, Peter Sanders:
Fast OLAP query execution in main memory on large data in a cluster. 518-524 - Xudong Zhang, Wayne Xin Zhao, Dongdong Shan, Hongfei Yan:
Group-Scheme: SIMD-based compression algorithms for web text data. 525-530 - Chun-Chieh Chen, Kuan-Wei Lee, Chih-Chieh Chang, De-Nian Yang, Ming-Syan Chen:
Efficient large graph pattern mining for big data in the cloud. 531-536 - Rui Wang, Kenneth Chiu:
A stream partitioning approach to processing large scale distributed graph datasets. 537-542 - Richard McCreadie, Craig Macdonald, Iadh Ounis, Miles Osborne, Sasa Petrovic:
Scalable distributed event detection for Twitter. 543-549 - Barbara Furletti, Lorenzo Gabrielli, Chiara Renso, Salvatore Rinzivillo:
Analysis of GSM calls data for understanding user mobility behavior. 550-555 - Haizhou Fu, HyeongSik Kim, Kemafor Anyanwu:
Scaling concurrency of personalized Semantic search over Large RDF data. 556-562 - Hui Miao, Xiangyang Liu, Bert Huang, Lise Getoor:
A hypergraph-partitioned vertex programming approach for large-scale consensus optimization. 563-568 - Simon Price, Peter A. Flach:
A Higher-order data flow model for heterogeneous Big Data. 569-574 - Daniel Trabold, Henrik Grosskreutz:
Parallel subgroup discovery on computing clusters - First results. 575-579 - Darakhshan J. Mir, Sibren Isaacman, Ramón Cáceres, Margaret Martonosi, Rebecca N. Wright:
DP-WHERE: Differentially private modeling of human mobility. 580-588 - Min-Sheng Lin, Chien-Yi Chiu, Yuh-Jye Lee, Hsing-Kuo Pao:
Malicious URL filtering - A big data application. 589-596 - Maryam Shoaran, Alex Thomo, Jens H. Weber-Jahnke:
Zero-knowledge private graph summarization. 597-605 - Lei Shi, Qi Liao, Xiaohua Sun, Yarui Chen, Chuang Lin:
Scalable network traffic visualization using compressed graphs. 606-612 - Duncan Hodges, Sadie Creese:
Breaking the Arc: Risk control for Big Data. 613-621 - Tim Hegeman, Bogdan Ghit, Mihai Capota, Jan Hidders, Dick H. J. Epema, Alexandru Iosup:
The BTWorld use case for big data analytics: Description, MapReduce logical workflow, and empirical evaluation. 622-630 - Bin Liu, Haifeng Chen, Abhishek B. Sharma, Guofei Jiang, Hui Xiong:
Modeling heterogeneous time series dynamics to profile big sensor data in complex physical systems. 631-638 - Wei Lu, Gang Chen, Anthony K. H. Tung, Feng Zhao:
Efficiently extracting frequent subgraphs using MapReduce. 639-647 - Diego Pennacchioli, Michele Coscia, Salvatore Rinzivillo, Dino Pedreschi, Fosca Giannotti:
Explaining the product range effect in purchase data. 648-656 - Natasha Balac, Tamara B. Sipes, Nicole Wolter, Kenneth Nunes, Robert S. Sinkovits, Homa Karimabadi:
Large Scale predictive analytics for real-time energy management. 657-664 - Geoffrey C. Fox, Deepak R. Mani, Saumyadipta Pyne:
Parallel deterministic annealing clustering and its application to LC-MS data analysis. 665-673 - Diana Moise, Denis Shestakov, Gylfi Þór Gudmundsson, Laurent Amsaleg:
Terabyte-scale image similarity search: Experience and best practice. 674-682 - Matthieu-P. Schapranow, Hasso Plattner:
HIG - An in-memory database platform enabling real-time analyses of genome data. 691-696 - András Garzó, András A. Benczúr, Csaba István Sidló, Daniel Tahara, Erik Francis Wyatt:
Real-time streaming mobility analytics. 697-702 - Andrew Rau-Chaplin, Blesson Varghese, Duane Wilson, Zhimin Yao, Norbert Zeh:
QuPARA: Query-driven large-scale portfolio aggregate risk analysis on MapReduce. 703-709 - Mauricio A. Hernández, Kirsten Hildrum, Prateek Jain, Rohit Wagle, Bogdan Alexe, Rajasekar Krishnamurthy, Ioana Roxana Stanoi, Chitra Venkatramani:
Constructing consumer profiles from social media data. 710-716 - Chien-Chih Chen, Yu-Jung Chang, Wei-Chun Chung, Der-Tsai Lee, Jan-Ming Ho:
CloudRS: An error correction algorithm of high-throughput sequencing data based on scalable framework. 717-722 - Jungsuk Kwac, Ram Rajagopal:
Demand response targeting using big data analytics. 683-690 - Adrian Albert, Ram Rajagopal:
Building dynamic thermal profiles of energy consumption for individuals and neighborhoods. 723-728 - Peter Bajcsy, Antoine Vandecreme, Julien Amelot, Phuong Nguyen, Joe Chalfoun, Mary Brady:
Terabyte-sized image computations on Hadoop cluster platforms. 729-737 - Ron Begleiter, Yuval Elovici, Yona Hollander, Ori Mendelson, Lior Rokach, Roi Saltzman:
A fast and scalable method for threat detection in large-scale DNS logs. 738-741 - Matthew Hayes, Sam Shah:
Hourglass: A library for incremental processing on Hadoop. 742-752 - Qi Guo, Yan Li, Tao Liu, Kun Wang, Guancheng Chen, Xiaoming Bao, Wentao Tang:
Correlation-based performance analysis for full-system MapReduce optimization. 753-761 - Mihajlo Grbovic, Jon Malkin, Hirakendu Das:
Large scale ad latency analysis. 762-767 - Alessandro Morari, Vito Giovanni Castellana, David Haglin, John Feo, Jesse Weaver, Antonino Tumeo, Oreste Villa:
Accelerating semantic graph databases on commodity clusters. 768-772 - Peter Lubell-Doughtie, Jon Sondag:
Practical distributed classification using the Alternating Direction Method of Multipliers algorithm. 773-776 - Varun Sharma, Jeremy Carroll, Abhi Khune:
Scaling deep social feeds at Pinterest. 777-783 - Thibaud Chardonnens, Philippe Cudré-Mauroux, Martin Grund, Benoit Perroud:
Big data analytics on high Velocity streams: A case study. 784-787
Workshop 1: Distributed Storage Systems and Coding for Big Data
- Iryna Andriyanova, Alan Jule, Emina Soljanin:
The Code rebalancing problem for a storage-flexible Data Center Network. 1-6 - Wasim Ahmad Bhat, S. M. K. Quadri:
suvfs: A virtual file system in userspace that supports large files. 7-11 - Antonio Campello, Vinay A. Vaishampayan:
Reliability of erasure coded storage systems: A geometric approach. 12-16 - Yih-Farn Chen, Scott Daniels, Marios Hadjieleftheriou, Pingkai Liu, Chao Tian, Vinay A. Vaishampayan:
Distributed storage evaluation on a three-wide inter-data center deployment. 17-22 - Vinay Deolalikar:
Paired-replicas with constant repair time: Loss functions and memorylessness. 23-27 - Kyumars Sheykh Esmaili, Aatish Chiniah, Anwitaman Datta:
Efficient updates in cross-object erasure-coded storage systems. 28-32 - Hanxu Hou, Kenneth W. Shum, Hui Li:
Construction of exact-BASIC codes for distributed storage systems at the MSR point. 33-38 - Xianxia Huang, Hui Li, Tai Zhou, Yumeng Zhang, Han Guo, Hanxu Hou, Huayu Zhang, Kai Lei:
Minimum storage BASIC codes: A system perspective. 39-43 - Youngjae Kim, Scott Atchley, Geoffroy Vallée, Galen M. Shipman:
Layout-aware I/O Scheduling for terabits data movement. 44-51
Workshop 2: Big Data and the Humanities
- Alberto Acerbi, Vasileios Lampos, R. Alexander Bentley:
Robustness of emotion extraction from 20th century English books. 1-8 - Neal Audenaert, Natalie M. Houston:
VisualPage: Towards large scale analysis of nineteenth-century print culture. 9-16 - Tobias Blanke, Michael Bryant, Mark Hedges:
Back to our data - Experiments with NoSQL technologies in the Humanities. 17-20 - Sheryl Grant, Richard Marciano, Priscilla Ndiaye, Kristan E. Shawgo, Jeff Heard:
The human face of crowdsourcing: A citizen-led crowdsourcing case study. 21-24 - Kathleen Kerr, Bernice L. Hausman, Samah Gad, Waqas Javen:
Visualization and rhetoric: Key concerns for utilizing big data in humanities research: A case study of vaccination discourses: 1918-1919. 25-32 - Amalia S. Levi:
Humanities 'big data': Myths, challenges, and lessons. 33-36 - Ben Miller, Ayush Shrestha, Jason Derby, Jennifer Olive, Karthikeyan Umapathy, Fuxin Li, Yanjun Zhao:
Digging into human rights violations: Data modelling and collective memory. 37-45 - Vu Dung Nguyen, Blesson Varghese, Adam Barker:
The royal birth of 2013: Analysing and visualising public sentiment in the UK using Twitter. 46-54 - Andrew Prescott:
Bibliographic records as humanities big data. 55-58 - C. J. Rupp, Paul Rayson, Alistair Baron, Christopher Donaldson, Ian N. Gregory, Andrew Hardie, Patricia Murrieta-Flores:
Customising geoparsing and georeferencing for historical texts. 59-62 - Jedrzej Rybicki, Benedikt von St. Vieth, Daniel Mallmann:
A concept of Generic Workspace for Big Data Processing in Humanities. 63-70 - W. Brent Seales, Steve Crossan, Mark Yoshitake, Sertan Girgin:
From assets to stories via the Google Cultural Institute Platform. 71-76 - Susan Brown, John Simpson:
The curious identity of Michael Field and its implications for humanities research with the semantic web. 77-85 - David A. Smith, Ryan Cordell, Elizabeth Maddock Dillon:
Infectious texts: Modeling text reuse in nineteenth-century newspapers. 86-94 - Ted Underwood, Michael L. Black, Loretta Auvil, Boris Capitanu:
Mapping mutable genres in structurally complex volumes. 95-103 - Lu Xiao, Yan Luo, Steven High:
CKM: A shared visual analytical tool for large-scale analysis of audio-video interviews. 104-112 - Weijia Xu, Maria Esteva, Jessica Trelogan, Todd Swinson:
A case study on entity Resolution for Distant Processing of big Humanities data. 113-120
Workshop 3: Workshop on Big Data and Society
- Vinay Deolalikar:
Enterprise pre-sales forums: A preliminary study of metadata and content. 1-4 - Roman Ferrando-Llopis, David López-Berzosa, Catherine Mulligan:
Advancing value creation and value capture in data-intensive contexts. 5-9 - Wen-Chiao Hsu, Jyun-Yao Huang, Chi-Hao Chen, Chien-Yu Su, Hsiao-Chen Shih, Tzu-Ya Liao, I-En Liao:
A cloud service for the evaluation of company's financial health using XBRL-based financial statements. 10-14 - Janez Kranjc, Vid Podpecan, Nada Lavrac:
Real-time data analysis in ClowdFlows. 15-22 - Udo Kroon:
Ma3tch: Privacy and knowledge: 'Dynamic networked collective intelligence'. 23-31 - F. Canari Pembe Muhtaroglu, Seniz Demir, Murat Obali, Canan Girgin:
Business model canvas perspective on big data applications. 32-37 - Pantelis Koutroumpis, Aija Leiponen:
Understanding the value of (big) data. 38-42 - Slobodanka Dana Kathrin Tomic, Anna Fensel:
OpenFridge: A platform for data economy for energy efficiency data. 43-47 - Wen Zhou, Shutao Ye, Xiaolong Lu:
A study of innovation network database Construction by using big data and an enterprise strategy model. 48-52 - Chao Wu, Yike Guo:
Enhanced user data privacy with pay-by-data model. 53-57 - Helen X. Xiang:
Query optimization over a heterogeneously distributed scientific database. 58-64 - Wuheng Luo:
Enterprise data economy: A Hadoop-driven model and strategy. 65-70
Workshop 4: The First Workshop on Benchmarks, Performance Optimization, and Emerging hardware of Big Data Systems and Applications (BPOE 2013)
- Wei-Chun Chung, Yu-Jung Chang, Chien-Chih Chen, Der-Tsai Lee, Jan-Ming Ho:
Optimizing a MapReduce module of preprocessing high-throughput DNA sequencing data. 1-6 - Tyler Clemons, S. M. Faisal, Shirish Tatikonda, Charu C. Aggarwal, Srinivasan Parthasarathy:
Hash in a flash: Hash tables for flash devices. 7-14 - Martin Dimitrov, Karthik Kumar, Patrick Lu, Vish Viswanathan, Thomas Willhalm:
Memory system characterization of big data workloads. 15-22 - Yaakoub El Khamra, Niall Gaffney, David Walling, Eric A. Wernert, Weijia Xu, Hui Zhang:
Performance evaluation of R with Intel Xeon Phi coprocessor. 23-30 - Jing Quan, Yingjie Shi, Ming Zhao, Wei Yang:
The implications from benchmarking three big data systems. 31-38 - Taoying Liu, Jing Liu, Hong Liu, Wei Li:
A performance evaluation of Hive for scientific data management. 39-46 - Shengyuan Liu, Jungang Xu, Zongzhen Liu, Xu Liu:
Evaluating task scheduling in Hadoop-based cloud systems. 47-53 - Xi Luo, Walid A. Najjar, Vagelis Hristidis:
Efficient near-duplicate document detection using FPGAs. 54-61 - Stephan Müller, Lars Butzmann, Stefan Klauck, Hasso Plattner:
Workload-aware aggregate maintenance in columnar in-memory databases. 62-69 - Fengfeng Ning, Chuliang Weng, Yuan Luo:
Virtualization I/O optimization based on shared memory. 70-77 - Pengfei Chen, Yong Qi, Xinyi Li, Li Su:
An ensemble MIC-based approach for performance diagnosis in big data platform. 78-85 - Shinichi Yamagiwa, Hiroshi Sakamoto:
A reconfigurable stream compression hardware based on static symbol-lookup table. 86-93 - Dong Yang, Xiang Zhong, Dong Yan, Fangqin Dai, Xusen Yin, Cheng Lian, Zhongliang Zhu, Weihua Jiang, Gansha Wu:
NativeTask: A Hadoop compatible framework for high performance. 94-101 - Tao Zhong, Kshitij A. Doshi, Xi Tang, Ting Lou, Zhongyan Lu, Hong Li:
On mixing high-speed updates and in-memory queries: A big-data architecture for real-time analytics. 102-109 - Runlin Zhou, Yingjie Shi, Chunge Zhu:
AxPUE: Application level metrics for power usage effectiveness in data centers. 110-117 - Wen Xiong, Zhibin Yu, Zhendong Bei, Juanjuan Zhao, Fan Zhang, Yubin Zou, Xue Bai, Ye Li, Cheng-Zhong Xu:
A characterization of big data benchmarks. 118-125
Workshop 5: The First Workshop on Big Data Visualization
- Leilani Battle, Michael Stonebraker, Remco Chang:
Dynamic reduction of query result sets for interactive visualization. 1-8 - Joseph A. Cottam, Andrew Lumsdaine, Peter Wang:
Overplotting: Unified solutions under Abstract Rendering. 9-16 - Alex Endert, Russ Burtner, Nick Cramer, Ralph Perko, Shawn D. Hampton, Kristin A. Cook:
Typograph: Multiscale spatial exploration of text documents. 17-24 - Jean-Francois Im, Felix Giguere Villegas, Michael J. McGuffin:
VisReduce: Fast and responsive incremental information visualization of large datasets. 25-32 - Peter Kristof, Bedrich Benes, Carol X. Song, Lan Zhao:
A system for large-scale visualization of streaming Doppler data. 33-40 - Milos Krstajic, Daniel A. Keim:
Visualization of streaming data: Observing change and context in information visualization techniques. 41-47 - Xiaotong Liu, Yifan Hu, Stephen C. North, Han-Wei Shen:
CompactMap: A mental map preserving visual interface for streaming text data. 48-55 - Chris Muelder, Tarik Crnovrsanin, Arnaud Sallaberry, Kwan-Liu Ma:
Egocentric storylines for visual analysis of large dynamic graphs. 56-62 - Eric Papenhausen, Bing Wang, Sungsoo Ha, Alla Zelenyuk, Dan Imre, Klaus Mueller:
GPU-accelerated incremental correlation clustering of large data with visual feedback. 63-70 - Florian Reichl, Marc Treib, Rüdiger Westermann:
Visualization of big SPH simulations via compressed octree grids. 71-78 - Zhangye Wang, Chang Chen, Juanxia Zhou, Jiyuan Liao, Wei Chen, Ross Maciejewski:
A novel visual analytics approach for clustering large-scale social data. 79-86 - Frederik Wiehr, Vidya Setlur, Alark Joshi:
DriveSense: Contextual handling of large-scale route map data for the automobile. 87-94
Workshop 6: Big Data and Science: Infrastructure and Services
- Sandro Fiore, Cosimo Palazzo, Alessandro D'Anca, Ian T. Foster, Dean N. Williams, Giovanni Aloisio:
A big data analytics framework for scientific data management. 1-8 - Eloy Gonzales, Bun Theang Ong, Koji Zettsu:
Searching inter-disciplinary scientific big data based on latent correlation analysis. 9-12 - Kulsawasd Jitkajornwanich, Upa Gupta, Sakthi Kumaran Shanmuganathan, Ramez Elmasri, Leonidas Fegaras, John McEnery:
Complete storm identification algorithms from big raw rainfall data using MapReduce framework. 13-20 - Wei Tang, Jared Wilkening, Narayan Desai, Wolfgang Gerlach, Andreas Wilke, Folker Meyer:
A scalable data analysis platform for metagenomics. 21-26 - Karan Vahi, Mats Rynge, Gideon Juve, Rajiv Mayani, Ewa Deelman:
Rethinking data management for big data scientific workflows. 27-35 - Pengfei Xuan, Yueli Zheng, Sapna Sarupria, Amy W. Apon:
SciFlow: A dataflow-driven model architecture for scientific computing using Hadoop. 36-44
Workshop 7: Scalable Machine Learning: Theory and Applications
- Mohammadreza Babaee, Mihai Datcu, Gerhard Rigoll:
Assessment of dimensionality reduction based on communication channel model; application to immersive information visualization. 1-6 - Bonny Banerjee, Jayanta K. Dutta:
Hierarchical feature learning from sensorial data by spherical clustering. 7-13 - Bonny Banerjee, Jayanta K. Dutta:
Efficient learning from explanation of prediction errors in streaming data. 14-20 - Karl Branting:
Distributed Pivot Clustering with arbitrary distance functions. 21-27 - Søren Dahlgaard, Christian Igel, Mikkel Thorup:
Nearest neighbor classification using bottom-k sketches. 28-34 - Ciro Donalek, S. George Djorgovski, Ashish Mahabal, Matthew J. Graham, Andrew J. Drake, Arun Kumar A., N. Sajeeth Philip, Thomas J. Fuchs, Michael J. Turmon, Michael Ting-Chang Yang, Giuseppe Longo:
Feature selection strategies for classifying high dimensional astronomical data sets. 35-41 - Majed Farrash, Wenjia Wang:
How data partitioning strategies and subset size influence the performance of an ensemble? 42-49 - William Gu, Jaesik Choi, Ming Gu, Horst D. Simon, Kesheng Wu:
Fast Change Point Detection for electricity market analysis. 50-57 - Hong Gu, Junzhe Cao:
A novel integrated method for human multiplex protein subcellular localization prediction. 58-62 - Hisao Ishibuchi, Masakazu Yamane, Yusuke Nojima:
Learning from multiple data sets with different missing attributes and privacy policies: Parallel distributed fuzzy genetics-based machine learning approach. 63-70 - Jiaoyan Chen, Huajun Chen, Xi Chen, Guozhou Zheng, Zhaohui Wu:
Data chaos: An entropy based MapReduce framework for scalable learning. 71-78 - Anthony Kleerekoper, Mikel Luján, Gavin Brown:
Exploring sketches for probability estimation with sublinear memory. 79-86 - Koji Kumanami, Kazuhiro Seki, Kuniaki Uehara:
Agglomerative co-clustering for synonymous phrases based on common effects and influences. 87-94 - Zhiyuan Lin, Duen Horng (Polo) Chau, U Kang:
Leveraging memory mapping for fast and scalable graph computation on a PC. 95-98 - Bingwei Liu, Erik Blasch, Yu Chen, Dan Shen, Genshe Chen:
Scalable sentiment classification for Big Data analysis using Naïve Bayes Classifier. 99-104 - Xuan Liu, Xiaoguang Wang, Stan Matwin, Nathalie Japkowicz:
Meta-learning for large scale machine learning with MapReduce. 105-110 - Sandy Moens, Emin Aksehirli, Bart Goethals:
Frequent Itemset Mining for Big Data. 111-118 - Haoruo Peng, Ding Liang, Cyrus Choi:
Evaluating parallel logistic regression models. 119-126 - Mahmudur Rahman, Mohammad Al Hasan:
Approximate triangle counting algorithms on multi-cores. 127-133 - Anton Slutsky, Xiaohua Hu, Yuan An:
Tree Labeled LDA: A Hierarchical model for web summaries. 134-140 - Kristoffer Stensbo-Smidt, Christian Igel, Andrew Zirm, Kim Steenstrup Pedersen:
Nearest neighbour regression outperforms model-based prediction of specific star formation rate. 141-144 - Naveen C. Tewari, Hari M. Koduvely, Sarbendu Guha, Arun Yadav, Gladbin David:
MapReduce implementation of Variational Bayesian Probabilistic Matrix Factorization algorithm. 145-152 - Xusen Yin, Bin Wu, Xiuqin Lin:
A unified framework for predicting attributes and links in social networks. 153-160 - Zijian Zhang, Timothy C. Havens:
Scalable approximation of kernel fuzzy c-means. 161-168 - Yun Zhu, Yanqing Zhang, Yi Pan:
Large-scale restricted Boltzmann machines on single GPU. 169-174
Workshop 8: Big Data in Bioinformatics and Health Informatics
- Ankit Agrawal, Reda Al-Bahrani, Mark J. Russo, Jaishankar Raman, Alok N. Choudhary:
Lung transplant outcome prediction using UNOS data. 1-8 - Reda Al-Bahrani, Ankit Agrawal, Alok N. Choudhary:
Colon cancer survival prediction using ensemble data mining on SEER data. 9-16 - Raghunath Nambiar, Ruchie Bhardwaj, Adhiraaj Sethi, Rajesh Vargheese:
A look at challenges and opportunities of Big Data analytics in healthcare. 17-22 - Mario A. Bochicchio, Antonella Longo, Lucia Vaira, Antonio Malvasi, Andrea Tinelli:
Multidimensional analysis of fetal growth curves. 23-28 - Xi Chen, Huajun Chen, Ningyu Zhang, Jiaoyan Chen, Zhaohui Wu:
OWL reasoning over big biomedical data. 29-36 - Aaron Smalter Hall, Jun Huan:
KUChemBio: A repository of computational chemical biology data sets. 37-42 - Shinya Hayashi, Kenjiro Taura:
Parallel and memory-efficient Burrows-Wheeler transform. 43-50 - Meeyoung Park, Hariprasad Sampathkumar, Bo Luo, Xue-wen Chen:
Content-based assessment of the credibility of online healthcare information. 51-58 - Christian Seebode, Matthias Ort, Christian R. A. Regenbrecht, Martin Peuker:
BIG DATA infrastructures for pharmaceutical research. 59-63 - Kiyana Zolfaghar, Naren Meadem, Ankur Teredesai, Senjuti Basu Roy, Si-Chi Chin, Brian Muckian:
Big data solutions for predicting risk-of-readmission for congestive heart failure patients. 64-71
Workshop 9: Scholarly Big Data: Challenges & Issues
- Martine De Cock, Senjuti Basu Roy, Swapna Savvana, Vani Mandava, Brian Dalessandro, Claudia Perlich, William Cukierski, Benjamin Hamner:
The Microsoft Academic Search challenges at KDD Cup 2013. 1-4 - Philipp Mayr, Peter Mutschke:
Bibliometric-enhanced retrieval models for big scholarly information systems. 5-8 - Michael E. Payne, Linh Bao Ngo, Amy W. Apon:
Academic publishing as a social media paradigm. 9-12 - (Withdrawn) Big spatial data mining. 13-21
Workshop 10: Scalable Cloud Data Management
- Karamjit Kaur, Rinkle Rani:
Modeling and querying data in NoSQL databases. 1-7 - Lipyeow Lim:
Elastic data partitioning for cloud-based SQL processing systems. 8-16 - Jiamin Lu, Ralf Hartmut Güting:
Parallel SECONDO: Practical and efficient mobility data processing in the cloud. 17-25 - Mahsa Mofidpoor, Nematollaah Shiri, Thiruvengadam Radhakrishnan:
Index-based join operations in Hive. 26-33 - Katerina Stamou, Verena Kantere, Jean-Henry Morin:
SLA data management criteria. 34-42
Workshop 11: Big Data and Smarter Cities
- Harish S. Bhat, Garnet Jason Vaz, Juan C. Meza:
Fast solution of load shedding problems via a sequence of linear programs. 1-6 - Hongfei Li, Buyue Qian, Dhaivat Parikh, Arun Hampapur:
Alarm prediction in large-scale sensor networks - A case study in railroad. 7-14 - Alice Marascu, Pascal Pompey, Eric Bouillet, Olivier Verscheure, Michael Wurst, Martin Grund, Philippe Cudré-Mauroux:
MiSTRAL: An architecture for low-latency analytics on MasSive time series. 15-21 - Timothy H. Savage, Huy T. Vo:
Yellow cabs as red corpuscles. 22-28 - Yogesh Simmhan, Muhammad Usman Noor:
Scalable prediction of energy consumption using incremental time series clustering. 29-36 - M. Anil Yazici, Camille Kamga, Abhishek Singhal:
A big data driven model for taxi drivers' airport pick-up decisions in New York City. 37-44
Workshop 12: Knowledge management and Big Data Analytics
- Ruiwen Chen:
Managing massive graphs in relational DBMS. 1-8 - Benoît Denis, Amine Ghrab, Sabri Skhiri:
A distributed approach for graph-oriented multidimensional analysis. 9-16 - Yucong Duan, Yongzhi Wang, Jinpeng Wei, Ajay Kattepur, Wencai Du:
Constructing E-Tourism platform based on service value broker: A knowledge management perspective. 17-24 - Zhenwen Wang, Weidong Xiao, Bin Ge, Hao Xu:
ADraw: A novel social network visualization tool with attribute-based layout and coloring. 25-32 - Yongzhi Wang, Jinpeng Wei, Mudhakar Srivatsa, Yucong Duan, Wencai Du:
IntegrityMR: Integrity assurance framework for big data analytics and management applications. 33-40 - Helen X. Xiang:
Local join optimization over a heterogeneously distributed scientific database. 41-45 - Hao Xu, Weidong Xiao, Daquan Tang, Jiuyang Tang, Zhenwen Wang:
Core-based community evolution in mobile social networks. 46-51 - Xinran Yu, Turgay Korkmaz:
Super-sequence frequent pattern mining on sequential dataset. 52-59 - Yun Wei Zhao, Willem-Jan van den Heuvel, Xiaojun Ye:
Exploring big data in small forms: A multi-layered knowledge extraction of social networks. 60-67 - Xiang Zhao, Bin Ge, Jiuyang Tang, Weidong Xiao, Haichuan Shang:
Provenance comparison for large-scale knowledge discovery. 68-75
Posters
- Peter Bajcsy, Antoine Vandecreme, Mary Brady:
Re-projection of terabyte-sized images. 1 - Daniel Cheng, Peter Schretlen, Nathan Kronenfeld, Neil Bozowsky, William Wright:
Tile based visual analytics for Twitter big data exploratory analysis. 2-4 - HyeongSik Kim, Kemafor Anyanwu:
Optimizing queries over semantically integrated datasets on MapReduce platforms. 5-6 - Hye-Chung Kum, Ashok Kumar Krishnamurthy, Darshana Pathak, Michael K. Reiter, Stanley C. Ahalt:
Secure Decoupled Linkage (SDLink) system for building a social genome. 7-11 - Lin Li, Saeed Bagheri, Helena Goote, Asif Hasan, Gregg Hazard:
Risk adjustment of patient expenditures: A big data analytics approach. 12-14 - Yunlong Ma, Peng Zhang, Yanan Cao, Li Guo:
Parallel auto-encoder for efficient outlier detection. 15-17 - Teng-Sheng Moh, SivaNaga Prasad Shola:
New factors for identifying influential bloggers. 18-27 - Masaharu Munetomo, Shintaro Bando:
A scalable infrastructure of interactive evolutionary computation to evolve services online with data. 28 - Anmol Rajpurohit:
Big data for business managers - Bridging the gap between potential and value. 29-31 - Shusaku Tsumoto, Shoji Hirano, Haruko Iwata:
Granularity-based temporal data mining in hospital information system. 32-40 - Mengmeng Yang, Yi Zhou, Qu Zhou, Kai Chen, Jianhua He, Xiaokang Yang:
Observation of Matthew Effects in Sina Weibo microblogger. 41-43 - Jin Soung Yoo, Douglas Boulware:
A framework of spatial co-location mining on MapReduce. 44 - Wenrong Zeng, Yuhao Yang, Bo Luo:
Access control for big data using data content. 45-47