Predictive Analytics using Structured and Unstructured Data

Duration: Half day

Short Description: Data mining and text analytics provide a rich repository of algorithms and techniques for mining large volumes of unstructured text and extracting structured information by linking, grouping and summarizing content and uncovering patterns in the data. Once structured, this content can be used effectively for predictive tasks. Unstructured data such as call-center logs, support and service center communications, emails exchanged within an organization, or even information from external news sources can provide critical signals for predicting future events. This requires quantifying the information extracted from unstructured sources. The quantified information can then be fed into an analytics system that combines it with factors derived from structured data in an appropriately designed predictive model.
This tutorial is targeted at people who want to learn how business-critical information extracted from unstructured documents can be merged with information from structured sources and used within a predictive modeling framework to predict future events or to forecast the values of critical variables that serve as business performance indicators.
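The flow described above (quantify text, merge it with structured fields, feed a predictive model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the tutorial material: the customer-churn scenario, the `NEGATIVE_TERMS` lexicon, the record fields `monthly_spend` and `call_log`, the toy data, and the choice of a plain logistic regression fitted by gradient descent.

```python
import math
import re

# Hypothetical lexicon of complaint terms; a real system would curate or learn this.
NEGATIVE_TERMS = {"outage", "refund", "broken", "cancel", "delay"}

def text_score(text):
    """Quantify free text as the fraction of tokens that are complaint terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(t in NEGATIVE_TERMS for t in tokens) / len(tokens)

def build_features(record):
    """Merge one structured field with the quantified text signal."""
    return [record["monthly_spend"] / 100.0, text_score(record["call_log"])]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(model, x):
    """Predicted probability of the positive class for one feature vector."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Toy, made-up records: one structured field, one unstructured field, one label.
records = [
    {"monthly_spend": 80, "call_log": "Service outage again, I want a refund", "churned": 1},
    {"monthly_spend": 120, "call_log": "Thanks, the upgrade works well", "churned": 0},
    {"monthly_spend": 60, "call_log": "Broken router and another delay, will cancel", "churned": 1},
    {"monthly_spend": 95, "call_log": "Question about my invoice date", "churned": 0},
]
X = [build_features(r) for r in records]
y = [r["churned"] for r in records]
model = train_logistic(X, y)

# Score a new (hypothetical) customer.
new_customer = {"monthly_spend": 90, "call_log": "Another outage, please cancel my plan"}
risk = predict_proba(model, build_features(new_customer))
```

In practice the keyword ratio would likely be replaced by richer text-derived features (topics, entities, sentiment), and the model by whatever suits the forecasting task; the sketch only shows where the quantified text signal enters the feature vector alongside structured data.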


Lipika Dey
Dr. Lipika Dey is a Senior Consultant and Principal Scientist at Tata Consultancy Services, India. She heads the Research and Innovation program on Real Time Contextually Aware Enterprises. Her focus is on the seamless integration of social intelligence and business intelligence using multi-structured data analytics. Her research interests are in the areas of Natural Language Processing, Text and Data Mining, Machine Learning and Semantic Search. She serves on the program committees of several data mining and NLP conferences, such as KDD and WWW.
Lipika holds an Integrated M.Sc. in Mathematics, an M.Tech in Computer Science and Data Processing, and a Ph.D. in Computer Science and Engineering, all from IIT Kharagpur, India. Prior to joining TCS in 2007, she was a faculty member in the Department of Mathematics at IIT Delhi, India, from 1995 to 2007. Lipika has also been invited to speak at several business conferences, such as the Sentiment Analysis Symposium (San Francisco and New York), the Text Analytics Summit (Boston), and Language Technology Accelerate (Brussels).

Hardik Meisheri
Hardik Meisheri has been a researcher at TCS Innovation Labs, Delhi, since 2016. He received an M.Tech with a specialization in Machine Intelligence from DA-IICT, Gandhinagar. His research spans several domains of machine learning and data science. He has worked on and published research in EEG-based brain-computer interfaces. He also has experience with deep learning algorithms, specifically for Natural Language Processing, and has explored social media data from the perspective of trend detection and sentiment analysis. His project titled “Smart Wheelchair” was selected among the top 24 projects in India by ETNow and among the top 11 by the Government of Goa. He served as Chairperson of the ACM student chapter at DA-IICT in 2015-16.