Selected Tutorials

S.No.  Tutorial Title
1.  Make Wise Decisions for Your DBMSs: Workload Forecasting and Performance Prediction Before Execution
2.  Reachability on Large-scale Graphs: Models, Techniques, and Trends
3.  A tutorial on biomedical image segmentation using deep learning
4.  AI Meets NoSQL Database: Methods, Opportunities and Challenges
5.  Time Series Anomaly Detection Toolkit for AI Applications

1. Make Wise Decisions for Your DBMSs: Workload Forecasting and Performance Prediction Before Execution

Speakers:

  • Zhengtong Yan (University of Helsinki)
  • Jiaheng Lu (University of Helsinki)
  • Qingsong Guo (University of Helsinki)
  • Gongsheng Yuan (University of Helsinki)
  • Calvin Sun (Huawei Toronto)
  • Steven Yuan (Huawei Toronto)

Brief outline of the tutorial:

In this tutorial, we will focus on 1) how to forecast future workloads (e.g., workload shift detection, arrival rate prediction, and next query prediction), and 2) how to analyze the behavior of workloads (e.g., execution time prediction and resource usage estimation). We will provide a comprehensive overview and detailed introduction of the two topics, from state-of-the-art methods and real-world applications to open problems and future directions. Specifically, we will not only discuss traditional methods, such as time-series analysis, Markov modeling, analytical modeling, and experiment-driven methods, but also cover state-of-the-art AI techniques, including machine learning, deep learning, reinforcement learning, and graph embedding.

Speakers Bio

Zhengtong Yan is a doctoral student at the University of Helsinki. His research topics include autonomous multi-model databases and cross-model query optimization.
Jiaheng Lu is a professor at the University of Helsinki. His main research interests lie in database systems, specifically in the challenge of efficient data processing from real-life, massive data repositories and the Web. He has written four books on Hadoop and NoSQL databases, and published more than 100 papers in venues such as SIGMOD, VLDB, TODS, and TKDE. He has given several tutorials on multi-model data management and autonomous databases at the VLDB, CIKM, and EDBT conferences. He frequently serves as a PC member for conferences including SIGMOD, VLDB, ICDE, EDBT, and CIKM.
Qingsong Guo is a postdoctoral researcher at the University of Helsinki. His research interests include multi-model databases and automatic management of big data with deep learning.
Gongsheng Yuan is a doctoral student at the University of Helsinki. His research topics lie in databases with quantum theory or reinforcement learning.
Calvin Sun is the Chief Database Architect at Huawei Cloud. He has 20+ years of experience in developing database systems, ranging from embedded databases and large-scale distributed databases to cloud-native databases.
Steven Yuan is the Director of Huawei Toronto Distributed Scheduling and Data Engine Lab. He leads a research team in the big data and cloud domain, focusing on distributed scheduling and distributed databases, from IaaS to PaaS.

2. Reachability on Large-scale Graphs: Models, Techniques, and Trends

Speakers:

  • Xiaoshuang Chen (Guangzhou University)
  • Long Yuan (Nanjing University of Science and Technology)
  • Wenjie Zhang (University of New South Wales)
  • Ying Zhang (University of Technology Sydney)

Brief outline of the tutorial:

In this tutorial, we will first show the importance and challenges of studying reachability queries. Then, we will introduce the existing reachability models defined over different graphs. The computing algorithms regarding different models and environmental settings will also be presented. Finally, we will discuss future research directions in this important research area.
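As a point of reference for what a reachability query asks, the following is a minimal sketch, assuming a plain directed graph stored as an adjacency dictionary. The function name `is_reachable` and the sample graph are hypothetical and are not taken from the tutorial; the index-free breadth-first search shown here is only the simplest baseline, not one of the techniques the speakers present.

```python
from collections import deque


def is_reachable(adj, source, target):
    """Return True if `target` can be reached from `source` along directed edges.

    `adj` maps each vertex to an iterable of its out-neighbours (hypothetical layout).
    """
    if source == target:
        return True
    visited = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt == target:
                return True
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return False


# Illustrative graph: d -> a -> b -> c
graph = {"a": ["b"], "b": ["c"], "c": [], "d": ["a"]}
```

Each query of this kind costs time linear in the size of the graph in the worst case, which is one reason large-scale settings call for more specialised approaches.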

Speakers Bio

Xiaoshuang Chen is an Associate Professor in the Cyberspace Institute of Advanced Technology, Guangzhou University. Before that, she was a Postdoctoral Fellow in the School of Computer Science and Engineering, University of New South Wales. Her research interest lies in large-scale graph data analysis. She has published several papers in ICDE, VLDB and VLDBJ since 2017.
Long Yuan is a Professor in the School of Computer Science and Engineering, Nanjing University of Science and Technology, China. His research focuses on graph data management and analysis. He has published papers in top venues such as VLDB, WWW, ICDE, VLDBJ, and TKDE.
Wenjie Zhang is an Australian ARC Future Fellow (2021-2025) and Professor in the School of Computer Science and Engineering, University of New South Wales. Her research interests include spatial-temporal data analysis and graph data processing. She has published more than 100 papers in top venues such as TKDE, TODS, VLDBJ, SIGMOD, VLDB, and ICDE. She received the Discovery Early Career Researcher Award in 2011 and the prestigious Chris Wallace Award in 2019.
Ying Zhang is an Australian ARC Future Fellow and Professor at the University of Technology Sydney (UTS). He has been the head of the database group at the Centre for Artificial Intelligence (CAI) since 2014. His research focuses on efficient query processing and analytics on big data. He has published more than 80 papers in top venues. He has received seven ARC grants under the National Competitive Grants Programme (NCGP), including one ARC ADP fellowship, one ARC DECRA fellowship, one ARC Future Fellowship and four ARC Discovery Projects.

3. A tutorial on biomedical image segmentation using deep learning

Speakers:

  • Sonali Agarwal (IIIT Allahabad)
  • Krishna Pratap Singh (IIIT Allahabad)
  • Sanjay Kumar Sonbhadra (ITER Bhubaneswar)
  • Narinder Singh Punn (IIIT Allahabad)

Brief outline of the tutorial:

Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images. Most medical applications require identifying and localizing the objects or regions (damaged tissues, cells or nuclei) found in medical imaging such as CAT scans, X-rays, and ultrasound for diagnosis, monitoring and treatment. This delineation is generally performed by expert clinicians or radiologists, which is a complex and time-consuming task. In recent studies, transfer learning and U-Net-based approaches have demonstrated state-of-the-art performance in different applications for the development of computer-aided diagnosis systems. These systems localize infected or damaged tissues or cells in the body using various modalities for early diagnosis and treatment of diseases such as brain tumors, lung cancer, Alzheimer's disease, and breast cancer. With this motivation, this tutorial focuses on state-of-the-art deep learning approaches, a critical discussion of open challenges and directions for future research in the area of biomedical image segmentation.

Speakers Bio

Sonali Agarwal is an Associate Professor in the Information Technology Department of the Indian Institute of Information Technology (IIIT), Allahabad, India. She received her Ph.D. degree at IIIT Allahabad and joined its faculty, where she has been teaching since October 2009. She holds a Bachelor of Engineering (B.E.) degree in Electrical Engineering from Bhilai Institute of Technology, Bhilai, (C.G.) India and a Master of Engineering (M.E.) degree in Computer Science from Motilal Nehru National Institute of Technology (MNNIT), Allahabad, India. Her main research interests are in the areas of Artificial Intelligence and Big Data. She is the head of the Big Data Analytics Lab at IIIT Allahabad, India.
Krishna Pratap Singh is an Associate Professor in the Information Technology Department at IIIT Allahabad, India. He received his PhD and Master's degrees from IIT Roorkee. He has been working at IIIT Allahabad since 2009. His main research areas are Machine Learning, Transfer Learning and Optimization.
Sanjay Kumar Sonbhadra is an Assistant Professor in the Computer Science and Engineering Department of ITER, Shiksha ‘O’ Anusandhan, Bhubaneswar, Odisha, India. He mainly works on one-class classification, anomaly detection, target-class-guided dimensionality reduction, training sample selection techniques, and big data analytics. During 2017-2021, he worked as a senior member of the “Big Data Analytics Lab” at IIIT Allahabad, India. He has published many articles on machine learning applications that address recent challenges of COVID-19. He has experience in applying machine learning algorithms to the challenging problem of target-specific learning with limited target samples.
Narinder Singh Punn is working as a Teaching Research Assistant (TRA) in the Information Technology Department of Indian Institute of Information Technology (IIIT), Allahabad, India. Narinder’s main research includes Medical Imaging segmentation, Deep learning and Artificial Intelligence techniques in healthcare. He is a senior member of “Big Data Analytics Lab” at IIIT Allahabad, India. His recent publications cover applications of deep learning in the detection and prevention of COVID-19, while also exploiting the potential of self-supervised learning in biomedical image segmentation.

4. AI Meets NoSQL Database: Methods, Opportunities and Challenges

Speakers:

  • Hongzhi Wang (Harbin Institute of Technology)
  • Zhixin Qi (Harbin Institute of Technology)
  • Yu Yan (Harbin Institute of Technology)

Brief outline of the tutorial:

This tutorial is planned for 1.5 hours and consists of the following parts.
(1) Background and Motivation (10’): We introduce the background of AI for databases and motivate the need for applying AI techniques to NoSQL databases with several scenarios.
(2) Cost Estimation for Graph and Document Databases (20’): We discuss how AI techniques estimate query costs for graph and document databases.
(3) Physical Design for Key-Value and Graph Databases (20’): We introduce physical design methods for key-value and graph databases based on AI techniques.
(4) Index Recommendation for Key-Value and Document Databases (15’): We discuss existing AI-based index recommendation approaches for key-value and document databases.
(5) Opportunities and Challenges (20’): We present the research opportunities and challenges for NoSQL database management based on AI techniques.
(6) Summary (5’): We summarize this tutorial and give our critical thoughts on AI for NoSQL databases.

Speakers Bio

Hongzhi Wang is a Professor and PhD supervisor at Harbin Institute of Technology, where he is the head of the massive data computing center and the vice dean of the honors school. He is the secretary general of ACM SIGMOD China, an outstanding CCF member, a standing committee member of the CCF database committee, and a member of the CCF big data committee. His research fields include big data management and analysis, database systems, knowledge engineering and data quality. He was a “starring track” visiting professor at MSRA and a postdoctoral fellow at the University of California, Irvine. Prof. Wang has been PI for more than 10 national or international projects, including NSFC key project, NSFC projects and National Technical support project, and co-PI for more than 10 national projects, including 973 project, 863 project and NSFC key projects. He also serves as a member of the ACM Data Science Task Force. He has won the First Natural Science Prize of Heilongjiang Province, the MOE Technological First Award, a Microsoft Fellowship, an IBM PhD Fellowship and the Chinese Excellent Database Engineer award. His publications include over 300 papers in journals and conferences such as VLDB Journal, IEEE TKDE, VLDB, SIGMOD, ICDE and SIGIR, 6 books and 6 book chapters. His PhD thesis was selected as an outstanding PhD dissertation of CCF and Harbin Institute of Technology. He serves as a reviewer for more than 20 international journals, including VLDB Journal and IEEE TKDE, and as a PC member of over 50 international conferences, including SIGMOD, VLDB, KDD, ICML, NeurIPS, and ICDE. His papers have been cited more than 3000 times. His personal website is http://homepage.hit.edu.cn/wang.
Zhixin Qi is currently a PhD student in the School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China. She received her bachelor's degree from Harbin Engineering University in 2016 and her master's degree from Harbin Institute of Technology in 2018. Her research interests include AI4DB, knowledge graphs, and graph data management. She was awarded the National Scholarship for PhD students in 2021, the National Scholarship for master students in 2017, and the National Scholarship for undergraduates in 2014. She has published more than 10 papers in international journals and conferences, including TKDE, KAIS, KBS, Neurocomputing, JCST, CIKM, and DASFAA.
Yu Yan is currently a PhD student in the School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China. She received her master's degree from Harbin Institute of Technology in 2021. She is committed to research on database tuning, multi-model databases, and database auto-management. She received the National Scholarship for master students in 2020. She has published many papers in international conferences and journals, such as Information Sciences and APWeb.

5. Time Series Anomaly Detection Toolkit for AI Applications

Speakers:

  • Dhaval Patel (IBM Research)
  • Dzung Phan (IBM Research)
  • Markus Mueller (IBM Germany)

Brief outline of the tutorial:

The tutorial is organized as a sequence of three sections: Introduction, Theory, and Hands-on Demo. In part one, we will briefly discuss the foundations of time series datasets with the help of real-world examples and present a broad taxonomy of time series datasets. We will also present a general definition of anomalies in time series data and discuss three common variants of the anomaly/outlier detection problem. Next, we discuss basic machine learning primitives, such as Estimator, Transformer, and Data Stationarizer, that are useful for building an anomaly detection pipeline. In the machine learning field, these components form the backbone of complex model learning pipelines. We will formally introduce the key API methods, such as "fit", "predict", and "decision_function", to the participants with the help of 30+ different anomaly detection algorithms. Apart from providing a categorization of these algorithms, we will also discuss one algorithm, namely the Gaussian Graphical Model, for interpretable anomaly detection. Access to the toolkit is available via the IBM API Hub Platform (https://developer.ibm.com/apis/catalog/ai4industry--anomaly-detection-product/Introduction). The example notebooks are accessible at IBM's public GitHub (https://github.com/IBM/anomaly-detection-code-pattern/). The tutorial finally analyzes open issues and future directions in this vibrant and rapidly evolving research area.
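To make the fit / predict / decision_function pattern concrete, here is a minimal sketch of a toy detector that exposes those three method names. It is not the IBM toolkit's API: the class `ZScoreDetector`, its `threshold` parameter, the z-score rule, and the convention that `predict` returns 1 for an anomaly and 0 otherwise are all assumptions made for illustration.

```python
from statistics import mean, pstdev


class ZScoreDetector:
    """Toy estimator (hypothetical): flags values far from the training mean."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold  # assumed cut-off on the absolute z-score

    def fit(self, values):
        # Learn the location and spread of the "normal" training data.
        self.mean_ = mean(values)
        self.std_ = pstdev(values) or 1.0  # avoid division by zero on constant data
        return self

    def decision_function(self, values):
        # Anomaly score: absolute z-score (larger means more anomalous).
        return [abs(v - self.mean_) / self.std_ for v in values]

    def predict(self, values):
        # Assumed label convention: 1 = anomaly, 0 = normal.
        return [1 if s > self.threshold else 0 for s in self.decision_function(values)]


train = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0]
detector = ZScoreDetector(threshold=3.0).fit(train)
labels = detector.predict([10.05, 14.0])  # second value lies far outside the training range
```

Separating the continuous score (`decision_function`) from the thresholded label (`predict`) lets a pipeline swap detectors or retune the threshold without changing downstream code.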

Speakers Bio

Dhaval Patel has worked at IBM Research since 2016. He holds a PhD in Computer Science from the National University of Singapore and a master's degree in Information Technology from the Indian Institute of Technology – Kharagpur. Dr. Patel is an expert in Data Mining, Machine Learning, and Time Series Data Analysis. The significance of his research contributions has been demonstrated in 60+ published papers (10 journal papers and 50+ conference papers) in high-impact, refereed venues in Database, Data Mining, Big Data and Machine Learning, 1 issued US patent and 18 patent applications. He is a recipient of 9 outstanding technical/research accomplishment awards from IBM for advancing AI technology to solve several real-world industrial problems. He is a key contributor to many flagship IBM products, including AutoAI-TS and Maximo Application Suite for Anomaly Detection at Scale.
Dzung Phan has been a Research Staff Member at IBM Research, New York, USA since 2010. He received a Ph.D. degree in applied mathematics from the University of Florida in 2010, an M.S. degree in computational engineering from the National University of Singapore in 2004, and a B.S. degree in mathematics from Vietnam National University, Hanoi in 2001. His research interests include optimization theory and algorithms, machine learning, and operations research. In particular, he is currently working on anomaly and change detection, sparse learning, and data-driven decision making. He has published more than 40 technical papers in refereed conferences and journals, including top machine learning/data mining conferences such as ICML, NeurIPS, IJCAI, and ICDM. He has also filed about 40 U.S. patents. He received the 2012 Pat Goldberg Best Paper Award and was a 2020 INFORMS Wagner Prize semi-finalist.
Markus Mueller studied Math, Operations Research and Computer Science and has over two decades of experience in IT in different roles, mainly as a software architect. He shifted towards Machine Learning in 2018, first as an architect for an NLU-related offering, then as a data scientist for an offering in the IIoT space. Markus worked at IBM Watson Center, Munich.

Important Dates

Proposal submission due: Dec 16, 2021 (extended from Dec 7, 2021)
Acceptance notification: Dec 28, 2021

*All deadlines are 23:59 Anywhere on Earth (AoE) time