NOTE: The interview this article is based on was conducted in March 2020.
For the research group of Ihab Ilyas, a professor with the David R. Cheriton School of Computer Science at the University of Waterloo, it’s not about building technology; it’s about finding a solution to an existing problem.
The first problem was data uncertainty. Ilyas has always focused on data science, not in a general sense, but on providing high-quality data for analytics. While working on uncertain and probabilistic databases, he and his team discovered that uncertainty mostly comes from dirtiness in data. In the first phase of the research, they tried to understand uncertainty in the data we collect, and in 2009 they started building systems for data cleaning.
Ilyas realized that you need to build a system to get to the deeper problem. The next issue became data profiling, to mine the underlying patterns and constraints that apply to a given data set. The group looked at different types of constraints on the data, how to represent them, and how to force the data to satisfy the constraints.
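To make the idea of a constraint concrete, here is a minimal sketch in Python. It is an illustration only, not the group's system. It assumes a simple rule of the form "one column determines another" (a functional dependency), and the column names `zip` and `city`, the sample rows, and both helper functions are hypothetical. The repair step uses a plain majority vote, a deliberately simple stand-in for the more careful repair methods that data cleaning research studies.

```python
from collections import Counter, defaultdict


def fd_violations(rows, lhs, rhs):
    """Find values of `lhs` that map to more than one value of `rhs`."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[lhs]].add(row[rhs])
    return {key: values for key, values in seen.items() if len(values) > 1}


def repair_by_majority(rows, lhs, rhs):
    """Return new rows where each `rhs` is the most common value for its `lhs`."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[lhs]][row[rhs]] += 1
    return [
        {**row, rhs: counts[row[lhs]].most_common(1)[0][0]}
        for row in rows
    ]


# Hypothetical sample data: one row misspells the city.
rows = [
    {"zip": "N2L 3G1", "city": "Waterloo"},
    {"zip": "N2L 3G1", "city": "Waterloo"},
    {"zip": "N2L 3G1", "city": "Watrloo"},
    {"zip": "M5S 1A1", "city": "Toronto"},
]

violations = fd_violations(rows, "zip", "city")
repaired = repair_by_majority(rows, "zip", "city")
```

Here `violations` flags the postal code that appears with two different city spellings, and `repaired` satisfies the rule again. A majority vote can of course pick the wrong value when errors outnumber correct entries, which is one reason the problem is harder than it looks.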
In 2016, the availability of so many machine learning tools triggered a shift in their research. Along with their collaborators, they discovered that the problems they struggled with could be solved using machine learning tools, and a new line of research came into view: data curation as a statistical inference problem.
Leading a problem-focused group made it natural to work with industry. Ilyas started consulting early in his career because of his keen interest in applying his research. Collaborations followed with companies including IBM and Google. With access to an entrepreneurial ecosystem in Waterloo Region, he started collaborating with several tech startups.
With a unique intellectual property policy at the University of Waterloo, a vibrant entrepreneurial ecosystem, and an opportunity to team up with a sister group at the Massachusetts Institute of Technology (MIT), Ilyas co-founded a startup in 2013. Ilyas, Michael Stonebraker, and Andy Palmer led a team of researchers working on data mastering at scale and started Tamr. The company provides a way for large enterprises to consume accurate, up-to-date, unified data from several sources. It now employs around 150 people, and the founders have raised more than USD 70M.
“We took what we had done through our research and got this massive opportunity to talk to customers,” said Ilyas. “I’ve talked to a lot of Fortune 1000 companies over the last six to seven years. It led to them adopting the product, but it also inspired us to solve other problems that the companies weren’t solving.”
New problems led to another startup, Inductiv, based in Waterloo. Started in 2019 with some of his students, along with Theodoros Rekatsinas at the University of Wisconsin and Christopher Ré at Stanford University, Inductiv builds on a new wave of modern machine learning platforms for predictions on structured data. The team recognized a market gap beyond data integration and data quality: the ability to understand how structured data is generated, and scalable engines that can abstract much of it at once. The new methods treat most data problems, including cleaning, missing value imputation, error detection, and prediction, as predictions on structured data.
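As a rough illustration of what "missing value imputation as a prediction" can mean, the Python sketch below fills an empty cell by predicting it from the other values in the same row. It is a toy under stated assumptions and does not describe Inductiv's platform: the column names, sample rows, and the functions `fit_cooccurrence` and `predict` are all hypothetical, and the model is a simple co-occurrence vote rather than a real machine learning model.

```python
from collections import Counter, defaultdict


def fit_cooccurrence(rows, target):
    """Count how often each target value appears with each (attribute, value) pair."""
    counts = defaultdict(Counter)
    for row in rows:
        if row.get(target) is None:
            continue  # rows with a missing target carry no training signal
        for attr, value in row.items():
            if attr != target and value is not None:
                counts[(attr, value)][row[target]] += 1
    return counts


def predict(counts, row, target):
    """Sum the votes from the row's known attributes; return the top value or None."""
    votes = Counter()
    for attr, value in row.items():
        if attr != target and value is not None:
            votes.update(counts.get((attr, value), {}))
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical sample data: the last row is missing its city.
rows = [
    {"zip": "N2L", "province": "ON", "city": "Waterloo"},
    {"zip": "N2L", "province": "ON", "city": "Waterloo"},
    {"zip": "M5S", "province": "ON", "city": "Toronto"},
    {"zip": "N2L", "province": "ON", "city": None},
]

counts = fit_cooccurrence(rows, "city")
imputed_city = predict(counts, rows[3], "city")
```

The same framing extends to error detection: if the predicted value for a filled cell disagrees strongly with what is stored there, the cell is a candidate error.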
Ilyas is known as a problem-solver, and his work with industry and his customers has opened new doors for him. He has opportunities to share his research with audiences beyond academia. He regularly writes blog posts for O’Reilly, Medium, and Towards Data Science and gives general lectures for industry practitioners at conferences such as the O’Reilly Data and Machine Learning Summits.
The flexibility and freedom that the University of Waterloo gives inventors to take their ideas to the next level encouraged Ilyas to co-found not one but two companies based on his research. Always problem-driven instead of technology-driven, Ilyas looks at problems differently, especially after his experiences with his customers. Many academics focus on elegant solutions and how to abstract the bigger problem, but Ilyas has learned to listen to his customers to see something others can’t see and to shape the technology that millions will adopt to solve their problems.
“Working with industry gives us access to problems that nobody else has, data and resources we don’t have. And in a research setting, we can explore it quickly,” said Ilyas. “It might be normal for a non-academic entrepreneur to solve problems, but it’s important for an academic to know they’re solving someone’s problem.”