Big Data Representation and Visualization

at the IEEE International Conference on Data Mining 2019

Beijing, China, November 8-11, 2019

About the Workshop

The first International Workshop on Big Data Representation and Visualization (BDVR) will be held at the IEEE International Conference on Data Mining in Beijing. The workshop is aimed at researchers who are developing techniques for representing and exploring high dimensional biomedical data, including EHR and genomic data types, through visualization, data summarization, and representations that bring out structure and patterns in data. The dimensionality and complexity of biomedical and EHR data are increasing, in large part due to novel measurement technologies and record collection practices. This growth necessitates representations that can intuitively reveal structure in the data to clinicians and practitioners. Representations can range from low dimensional visualizations and archetypal and factor analysis to high dimensional trajectory mapping in datasets. In addition, noise and collection artifacts in the data can preclude certain types of data visualization or analysis; therefore, preprocessing and denoising methods are also of interest at this workshop.

Topics of interest include but are not limited to:

  • Dimensionality reduction
  • Data visualization
  • Data denoising
  • Novel data description and analyses for hypothesis generation
  • Causal and mutual-information-based analysis of features in EHR data
  • Deep learning methods for embedding and visualization
  • Archetypal and factor analysis
  • Manifold learning of data
  • Data imputation and normalization

Important dates

  • Paper submissions due:

    August 1, 2019

  • Paper notifications:

    September 4, 2019

  • Camera-ready deadline:

    September 29, 2019

  • Workshop date:

    November 8, 2019

Paper submission

Submissions are limited to 4 pages and must be formatted according to the IEEE 2-column format, using the IEEE Manuscript Templates.

Papers will be evaluated using the criteria of the main IEEE ICDM 2019 conference for research papers. In particular, papers must present original research that has not yet been published and is not under consideration at the ICDM '19 main conference.

Accepted papers will be listed on the workshop website, but this workshop does not have proceedings and is non-archival. (This means you can submit a future version of your work to a conference with proceedings.)

To submit a paper, please use the online submission system (to be announced).

Organizing committee


Dr. Samah Fodeh is an Assistant Professor in the Department of Emergency Medicine and the Yale Center for Medical Informatics at the Yale School of Medicine. She is also affiliated with the Veterans Administration (VA) Connecticut Healthcare Systems. She has extensive experience developing algorithms that exploit complementary data modalities to enhance knowledge discovery. Dr. Fodeh’s research focuses on developing unsupervised machine learning methods to transform big clinical text data into a structured representation useful for subsequent analysis such as prediction and visualization. Dr. Fodeh also has experience working with social media data. She has published articles that leverage data from social media outlets to better understand critical healthcare problems such as suicide and opioid overdose events. At the VA Center of Innovation Pain Research, Informatics, Medical co-morbidities, and Education (PRIME) Center, she investigates different approaches to improve understanding of the complex interactions between pain, opioids, and associated chronic disease and behavioral health factors and to develop efficacious interventions. Dr. Fodeh was trained as a computer scientist with a Ph.D. from Michigan State University, where her research focused on developing new approaches that utilize multiple and complementary data sources to improve document clustering. In that work, she leveraged different types of ontologies to incorporate semantic knowledge.

Dr. Smita Krishnaswamy is an Assistant Professor in the Department of Genetics at the Yale School of Medicine and the Department of Computer Science in the Yale School of Applied Science and Engineering. She is also affiliated with the Yale Center for Biomedical Data Science, Yale Cancer Center, and Program in Applied Mathematics. Smita’s research focuses on developing unsupervised machine learning methods (especially graph signal processing and deep learning) to denoise, impute, visualize, and extract structure, patterns, and relationships from big, high throughput, high dimensional biomedical data. Her methods have been applied to a variety of datasets from many systems, including embryoid body differentiation, the epithelial-to-mesenchymal transition in breast cancer, lung cancer immunotherapy, infectious disease data, gut microbiome data, and patient data. Smita completed her postdoctoral training at Columbia University in the systems biology department, where she focused on learning computational models of cellular signaling from single-cell mass cytometry data. She was trained as a computer scientist with a Ph.D. from the University of Michigan’s EECS department, where her research focused on algorithms for automated synthesis and probabilistic verification of nanoscale logic circuits. Following her time in Michigan, Smita spent 2 years at IBM’s TJ Watson Research Center as a researcher in the systems division, where she worked on automated bug finding and error correction in logic.

Dr. Guy Wolf is an Assistant Professor in the Department of Mathematics and Statistics at the Université de Montréal. He is also affiliated with IVADO, the Institute for Data Valorization in Montreal. He holds an M.Sc. and a Ph.D. in computer science from Tel Aviv University. Prior to joining the Université de Montréal, he was a postdoctoral researcher in the Department of Computer Science at École Normale Supérieure in Paris (France) and a Gibbs Assistant Professor in the Applied Mathematics Program at Yale University. His research focuses on manifold learning and geometric deep learning for exploratory data analysis, including methods for dimensionality reduction, visualization, denoising, data augmentation, and coarse graining. He is particularly interested in applications of such methods to biomedical data exploration, e.g., in single cell genomics/proteomics and neuroscience.

Tentative Program Committee

  • Matthew Hirn, Michigan State University
  • Gal Mishne, University of California, San Diego
  • Mauro Maggioni, Johns Hopkins University
  • Yuval Kluger, Yale University
  • Kevin Moon, Utah State University
  • Elana Fertig, Johns Hopkins University
  • Jian Tang, HEC Montreal

Contact the organizers

If you have any questions, contact the organizers at