6 Divisions of Data Science

  • Data exploration
  • Data representation and transformation
  • Computing
  • Modeling
  • Data visualization
  • The science of data science

Prediction vs. Inference

Prediction

  • Accuracy with new data
  • “Black box” methods like neural networks
  • Rooted mainly in computer science fields

Inference

  • Understanding input-output relationships
  • Creates interpretable, well-fitting models
  • Belief in a “true” underlying model
  • Central to traditional statistics
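
The contrast between the two goals can be sketched by fitting one line and using it two ways. This is only an illustration on synthetic data: the "true" model (intercept 1, slope 2), the 80/20 split, and the helper names `fit_line` and `mse` are assumptions made for the example, not part of the notes above.

```python
import random

random.seed(0)

# Synthetic data (assumption): y = 1 + 2x + Gaussian noise
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [1.0 + 2.0 * x + random.gauss(0, 1) for x in xs]

# Hold out the last 20% to play the role of "new" data
split = int(0.8 * len(xs))
x_train, y_train = xs[:split], ys[:split]
x_test, y_test = xs[split:], ys[split:]

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def mse(y_true, y_pred):
    """Mean squared error between two equal-length lists."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

intercept, slope = fit_line(x_train, y_train)

# Inference: read the fitted parameters as statements about the relationship
print(f"estimated slope: {slope:.2f}, intercept: {intercept:.2f}")

# Prediction: judge the model only by its error on held-out data
test_mse = mse(y_test, [intercept + slope * x for x in x_test])
print(f"held-out MSE: {test_mse:.2f}")
```

The inference view asks whether the estimated slope is close to the underlying one; the prediction view ignores the parameters and looks only at the held-out error.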

Common Task Framework

Key to predictive modeling

  • Reducing prediction errors over time
  • Best algorithms vary with dataset type
  • Focus on predictive results, not on the true process that generated the data
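
As a rough sketch of how such a framework operates: every competitor trains on the same public data, and a referee scores each submission on a test set the competitors never see. Everything below is hypothetical and for illustration only: the synthetic data, the two competing methods, and the names `make_data` and `referee_score`.

```python
import random

random.seed(1)

def make_data(n):
    """Synthetic data (assumption): y = 3 + 0.5x + Gaussian noise."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [3.0 + 0.5 * x + random.gauss(0, 1) for x in xs]
    return xs, ys

train_x, train_y = make_data(100)  # released to all competitors
test_x, test_y = make_data(50)     # held back by the referee

def referee_score(predict):
    """Mean squared error on the hidden test set; lower is better."""
    errors = [(predict(x) - y) ** 2 for x, y in zip(test_x, test_y)]
    return sum(errors) / len(errors)

# Competitor A: always predict the training mean
mean_y = sum(train_y) / len(train_y)

def predict_mean(x):
    return mean_y

# Competitor B: least-squares line fitted on the training data
mean_x = sum(train_x) / len(train_x)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
         / sum((x - mean_x) ** 2 for x in train_x))
intercept = mean_y - slope * mean_x

def predict_line(x):
    return intercept + slope * x

# Leaderboard: every method is ranked by the same metric on the same data
leaderboard = sorted(
    [("mean baseline", referee_score(predict_mean)),
     ("least-squares line", referee_score(predict_line))],
    key=lambda entry: entry[1],
)
for name, score in leaderboard:
    print(f"{name}: {score:.3f}")
```

Because the metric and the test set are fixed, methods can be compared directly, which is what allows prediction error to be driven down over successive attempts.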

Project Ideas

https://www.kaggle.com/datasets/sveta151/tiktok-popular-songs-2022/data

  • Predict how popular a song will be on TikTok from its features
  • Merge lyrics data into the dataset and use NLP topic modeling to find common topics, phrases, etc.

https://www.kaggle.com/datasets/ludmin/billboard

  • Sentiment analysis to determine whether popular songs tend to be happy, sad, romantic, etc.
  • Correlation between chart rankings (Billboard, radio, Hot 100, etc.)?
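
For the chart-correlation idea, one possible starting point is a rank correlation, since chart positions are ranks rather than measurements. The sketch below computes Spearman's rho in plain Python. The chart positions are made-up toy values, not taken from the dataset, and the helpers `ranks` and `spearman` are hypothetical names that assume no tied values.

```python
def ranks(values):
    """Rank values from 1..n (assumes no ties, to keep the sketch short)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(a, b):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(a)
    ra, rb = ranks(a), ranks(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Made-up peak positions for five songs on two charts (not real data)
chart_a = [1, 4, 2, 9, 7]
chart_b = [2, 5, 1, 10, 6]

rho = spearman(chart_a, chart_b)
print(f"Spearman rho: {rho:.2f}")
```

Real chart data will contain ties and songs missing from one chart, so a full analysis would need tie-aware ranking and a decision about how to handle unmatched songs.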