6 Divisions of Data Science
- Data exploration
- Data representation and transformation
- Computing with data
- Data modeling
- Data visualization
- The science of data science
Prediction vs. Inference
Prediction
- Goal: accurate predictions on new, unseen data
- “Black box” methods like neural networks
- Rooted mainly in computer science, especially machine learning
Inference
- Understanding input-output relationships
- Creates interpretable, well-fitting models
- Assumes a “true” underlying model generated the data
- Central to traditional statistics
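
The contrast above can be made concrete with a small sketch. It uses synthetic data from an assumed "true" model (y = 2x + 1 plus noise); the data, the function names, and the choice of k-nearest neighbors as a stand-in for a "black box" method are all illustrative assumptions, not part of the notes. The fitted line supports inference (its slope and intercept can be read and interpreted), while both methods are judged for prediction by their error on new data.

```python
import random
import statistics

def make_data(n, rng):
    """Synthetic data from an assumed 'true' model y = 2x + 1 + noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def knn_predict(train_x, train_y, x, k=5):
    """Average the targets of the k nearest training points.

    There are no coefficients to interpret; the method only predicts.
    """
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return statistics.fmean(y for _, y in nearest)

def mse(preds, ys):
    """Mean squared error between predictions and observed values."""
    return statistics.fmean((p - y) ** 2 for p, y in zip(preds, ys))

rng = random.Random(0)
train_x, train_y = make_data(200, rng)
new_x, new_y = make_data(100, rng)  # "new data" never used for fitting

# Inference: estimate and interpret the parameters of the assumed model.
slope, intercept = fit_line(train_x, train_y)

# Prediction: compare methods only by their error on the new data.
line_mse = mse([slope * x + intercept for x in new_x], new_y)
knn_mse = mse([knn_predict(train_x, train_y, x) for x in new_x], new_y)

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"line MSE={line_mse:.2f}, k-NN MSE={knn_mse:.2f}")
```

Because the data here really do come from a line, the interpretable model should also predict well; on real data the two goals can pull apart.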
Common Task Framework
Key to predictive modeling
- Reducing prediction errors over time
- Best algorithms vary with dataset type
- Focus on predictive performance, not on the true process that generated the data
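
A minimal sketch of how such a framework is often set up, assuming a shared training set, a test set that only a scoring "referee" sees, and a leaderboard ranked by error. The toy task (y = 3x plus noise), the two submissions, and all names are hypothetical and chosen only for illustration.

```python
import random
import statistics

rng = random.Random(1)

def sample(n):
    """Hypothetical task: predict y from x, where y = 3x + noise."""
    xs = [rng.uniform(0, 1) for _ in range(n)]
    return [(x, 3.0 * x + rng.gauss(0, 0.1)) for x in xs]

train = sample(100)        # shared with every competitor
hidden_test = sample(50)   # seen only by the referee

def referee(predict):
    """Score a submitted prediction rule by mean squared error on the hidden test set."""
    return statistics.fmean((predict(x) - y) ** 2 for x, y in hidden_test)

# Two hypothetical submissions, both built from the shared training data only.
mean_y = statistics.fmean(y for _, y in train)

def predict_mean(x):
    """Baseline: ignore x and always predict the training mean."""
    return mean_y

slope = sum(x * y for x, y in train) / sum(x * x for x, _ in train)

def predict_line(x):
    """Least-squares line through the origin."""
    return slope * x

# Leaderboard: lowest error first.
leaderboard = sorted(
    [("always-mean", referee(predict_mean)),
     ("line-through-origin", referee(predict_line))],
    key=lambda entry: entry[1],
)
for name, score in leaderboard:
    print(f"{name}: {score:.4f}")
```

The referee never checks whether a submission matches how the data were generated; it only ranks prediction error, which mirrors the focus on results described above.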
Project Ideas
https://www.kaggle.com/datasets/sveta151/tiktok-popular-songs-2022/data
- Predict how popular a song will be on TikTok from its features
- Merge lyrics data into the dataset and use topic modeling (NLP) to find common topics, phrases, etc.
https://www.kaggle.com/datasets/ludmin/billboard
- Sentiment analysis to determine whether popular songs tend to be happy, sad, romantic, etc.
- Correlation between chart rankings (Billboard, radio, Hot 100, etc.)?
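
One way to approach the chart-correlation idea is a rank correlation, since chart positions are ranks rather than measurements. The sketch below computes Spearman's rho from scratch; the chart positions are made up for illustration (not real chart data), and the variable names are hypothetical rather than columns from the Kaggle dataset.

```python
import statistics

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(a, b):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

# Made-up positions of the same five songs on two charts (not real data).
hot100_pos = [1, 2, 3, 4, 5]
radio_pos = [2, 1, 3, 5, 4]

rho = spearman(hot100_pos, radio_pos)
print(f"Spearman rho = {rho:.2f}")
```

With the real data, the comparison would need to be restricted to songs that appear on both charts in the same week, and ties would need a rank method that handles them.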