
Two guest speakers at Lunch Seminar: Johannes Gehrke, Cornell University and Themis Palpanas, University of Trento


What
  • Systems Group Event
When Dec 09, 2011
from 12:00 PM to 02:00 PM
Where CAB E 72

Speaker 1: Johannes Gehrke, Cornell University

Title 1: Declarative Data-Driven Coordination Through Entanglement

Abstract:

There are many web applications that require users to coordinate and communicate. Friends want to coordinate travel plans, students want to jointly enroll in the same set of courses, and busy professionals want to coordinate their schedules. These tasks are difficult to program using existing abstractions provided by database systems since they all require some type of coordination between users. However, this type of information flow is fundamentally incompatible with classical isolation in database transactions. In this talk, I will argue that it is time to look beyond isolation towards principled and elegant abstractions that allow for communication and coordination between some notion of (suitably generalized) transactions. This new area of declarative data-driven coordination is motivated by many novel applications and is full of challenging research problems. This talk describes joint work with Gabriel Bender, Nitin Gupta, Christoph Koch, Lucja Kot, Milos Nikolic, and Sudip Roy.

Short bio:

Johannes Gehrke is a Professor in the Department of Computer Science at Cornell University. Johannes' research interests are in the areas of database systems, data mining, data privacy, and applications of database and data mining technology to marketing and the sciences. Johannes has received a National Science Foundation Career Award, an Alfred P. Sloan Fellowship, an IBM Faculty Award, the Cornell College of Engineering James and Mary Tien Excellence in Teaching Award, the Cornell University Provost's Award for Distinguished Scholarship, a Humboldt Research Award from the Alexander von Humboldt Foundation, and the 2011 IEEE Computer Society Technical Achievement Award. He is the author of numerous publications on data mining and database systems, and he co-authored the undergraduate textbook Database Management Systems (McGraw-Hill, 2002; currently in its third edition), used at universities all over the world. Johannes is also an Adjunct Faculty Member at the University of Tromsø in Norway. Johannes was co-Chair of the 2003 ACM SIGKDD Cup, Program co-Chair of the 2004 ACM International Conference on Knowledge Discovery and Data Mining (KDD 2004), Program Chair of the 33rd International Conference on Very Large Data Bases (VLDB 2007), and Program co-Chair of the 28th IEEE International Conference on Data Engineering (ICDE 2012). From 2007 to 2008, he was Chief Scientist at FAST, A Microsoft Subsidiary.

 

Speaker 2: Themis Palpanas, University of Trento

Title 2: Indexing and Mining Scientific Data: Beyond One Billion Data Series

Abstract:

Applications in many domains increasingly need techniques that can index and mine very large collections of data series. Examples of such applications come from astronomy, biology, the web, and other domains. It is not unusual for these applications to involve numbers of data series on the order of hundreds of millions to billions. However, none of the relevant techniques proposed in the literature so far has considered data collections much larger than one million data series.

In this paper, we describe iSAX 2.0 and its improvements, iSAX 2.0 Clustered and iSAX2+, three methods designed for indexing and mining truly massive collections of data series. We show that the main bottleneck in mining such massive datasets is the time taken to build the index, and we thus introduce a novel bulk loading mechanism, the first of its kind specifically tailored to a data series index. We show how our methods allow mining on datasets that would otherwise be completely untenable, including the first published experiments to index one billion data series, and experiments in mining massive data from domains as diverse as entomology, DNA, and web-scale image collections.

Short Bio:

Themis Palpanas is a professor of computer science at the University of Trento, Italy. He received the BS degree from the National Technical University of Athens, Greece, and the MSc and PhD degrees from the University of Toronto, Canada. Before joining the University of Trento, he worked at the IBM T.J. Watson Research Center. He has also been a Visiting Professor at the National University of Singapore, worked for the University of California, Riverside, and visited Microsoft Research and the IBM Almaden Research Center. His interests include data management, data analysis, streaming algorithms, and business process management. His research solutions have been implemented in world-leading commercial data management products and he is the author of five US patents, three of which are part of commercial products in multi-billion dollar markets. He is the recipient of two Best Paper awards (ICDE 2010 and ADAPTIVE 2009).

He is a founding member of the Event Processing Technical Society, and is serving on the Editorial Advisory Board of the Information Systems Journal and as an Associate Editor of the Journal of Intelligent Data Analysis. He is a General Co-Chair for VLDB 2013, has served on the program committees of several top database and data mining conferences, including SIGMOD, VLDB, ICDE, KDD, and ICDM, and has been a member of the IBM Academy of Technology Study on Event Processing. He is serving as a reviewer for the European Commission Framework Programme, the Natural Sciences and Engineering Research Council of Canada (NSERC), and the Netherlands Organisation for Scientific Research (NWO). His research has been funded by the National Science Foundation (USA), the 7th Framework Programme (EU), the Autonomous Province of Trento (Italy), and IBM.

 

 
