Two guest speakers at Lunch Seminar: Johannes Gehrke, Cornell University and Themis Palpanas, University of Trento
| What | |
|---|---|
| When | Dec 09, 2011 from 12:00 PM to 02:00 PM |
| Where | CAB E 72 |
Speaker 1: Johannes Gehrke, Cornell University
Title 1: Declarative Data-Driven Coordination Through Entanglement
Abstract:
There are many web applications that require users to coordinate and
communicate. Friends want to coordinate travel plans, students want to
jointly enroll in the same set of courses, and busy professionals want
to coordinate their schedules. These tasks are difficult to program
using existing abstractions provided by database systems since they all
require some type of coordination between users. However, this type of
information flow is fundamentally incompatible with classical isolation
in database transactions. In this talk, I will argue that it is time to
look beyond isolation towards principled and elegant abstractions that
allow for communication and coordination between some notion of
(suitably generalized) transactions. This new area of declarative
data-driven coordination is motivated by many novel applications and is
full of challenging research problems. This talk describes joint work
with Gabriel Bender, Nitin Gupta, Christoph Koch, Lucja Kot, Milos
Nikolic, and Sudip Roy.
Short bio:
Johannes Gehrke is a Professor in the Department of Computer Science at
Cornell University. Johannes' research interests are in the areas of
database systems, data mining, data privacy, and applications of
database and data mining technology to marketing and the sciences.
Johannes has received a National Science Foundation CAREER Award, an
Alfred P. Sloan Fellowship, an IBM Faculty Award, the Cornell College of
Engineering James and Mary Tien Excellence in Teaching Award, the
Cornell University Provost's Award for Distinguished Scholarship, a
Humboldt Research Award from the Alexander von Humboldt Foundation, and
the 2011 IEEE Computer Society Technical Achievement Award. He is the
author of numerous publications on data mining and database systems, and
he co-authored the undergraduate textbook Database Management Systems
(McGraw-Hill, 2002; currently in its third edition), used at
universities all over the world. Johannes is also an Adjunct Faculty
Member at the University of Tromsø in Norway. Johannes was co-Chair of
the 2003 ACM SIGKDD Cup, Program co-Chair of the 2004 ACM International
Conference on Knowledge Discovery and Data Mining (KDD 2004), Program
Chair of the 33rd International Conference on Very Large Data Bases
(VLDB 2007), and Program co-Chair of the 28th IEEE International
Conference on Data Engineering (ICDE 2012). From 2007 to 2008, he was
Chief Scientist at FAST, A Microsoft Subsidiary.
Speaker 2: Themis Palpanas, University of Trento
Title 2: Indexing and Mining Scientific Data: Beyond One Billion Data Series
Abstract:
There is an increasingly pressing need, across applications in diverse domains, for techniques that can index and mine very large collections of data series. Examples of such applications come from astronomy, biology, the web, and other domains. It is not unusual for these applications to involve hundreds of millions to billions of data series. However, none of the relevant techniques proposed in the literature so far has considered data collections much larger than one million data series.
In this paper, we describe iSAX 2.0 and its improvements, iSAX 2.0 Clustered and iSAX2+, three methods designed for indexing and mining truly massive collections of data series. We show that the main bottleneck in mining such massive datasets is the time taken to build the index, and we thus introduce a novel bulk loading mechanism, the first of its kind specifically tailored to a data series index. We show how our methods allow mining on datasets that would otherwise be completely untenable, including the first published experiments to index one billion data series, and experiments in mining massive data from domains as diverse as entomology, DNA, and web-scale image collections.
Short bio:
Themis Palpanas is a professor of computer science at the University of Trento, Italy. He received the BS degree from the National Technical University of Athens, Greece, and the MSc and PhD degrees from the University of Toronto, Canada. Before joining the University of Trento, he worked at the IBM T.J. Watson Research Center. He has also been a Visiting Professor at the National University of Singapore, worked for the University of California, Riverside, and visited Microsoft Research and the IBM Almaden Research Center. His interests include data management, data analysis, streaming algorithms, and business process management. His research solutions have been implemented in world-leading commercial data management products and he is the author of five US patents, three of which are part of commercial products in multi-billion dollar markets. He is the recipient of two Best Paper awards (ICDE 2010 and ADAPTIVE 2009).
He is a founding member of the Event Processing Technical Society, and is serving on the Editorial Advisory Board of the Information Systems Journal and as an Associate Editor of the Journal of Intelligent Data Analysis. He is a General Co-Chair for VLDB 2013, has served on the program committees of several top database and data mining conferences, including SIGMOD, VLDB, ICDE, KDD, and ICDM, and has been a member of the IBM Academy of Technology Study on Event Processing. He is serving as a reviewer for the European Commission Framework Programme, the Natural Sciences and Engineering Research Council of Canada (NSERC), and the Netherlands Organisation for Scientific Research (NWO). His research has been funded by the National Science Foundation (USA), the 7th Framework Program (EU), the Autonomous Province of Trento (Italy), and IBM.



