Talk
10 Things I Hate About Feature Engineering for AI
"In Data Science, 80% of time spent prepare data, 20% of time spent complain about need for prepare data." -- BigDataBorat. This continues to be true 12 years later. Teams building AI models and applications love the Python ecosystem, but they often spend more time dealing with data infrastructure than experimenting with models. In this talk, let's explore all the ways in which feature engineering for AI data, especially multimodal data, really sucks. And we'll talk about how to deal with these issues, old and new. No guarantee I can stop at just 10.
About
Chang is the CEO and co-founder of LanceDB and has been making data tooling for ML/AI for almost two decades. One of the original co-authors of the pandas project, Chang started LanceDB to make it easy for AI teams to work with all of the data that doesn't fit neatly into dataframes: from embeddings to images, from audio to video, at petabyte scale.
