Data & Data Types
Data
Anything that can be stored on a computer
Variables and Features
Both terms describe a measured attribute of the data: "variable" is the term used in statistical analysis, "feature" is the term used in machine learning.
Quantitative vs. Qualitative
Quantitative: What you can measure objectively
- Integer, Discrete
- Continuous
Qualitative: What you can’t easily measure, but can observe subjectively
- Nominal
- Ordinal
- Character
- Binomial
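A minimal sketch of how these types might show up in code. The record, its field names, and the size ordering are hypothetical and chosen only for illustration.

```python
# Hypothetical record; the field names and values are illustrative assumptions.
record = {
    "visits": 3,               # quantitative, discrete (integer count)
    "height_cm": 172.5,        # quantitative, continuous
    "eye_color": "brown",      # qualitative, nominal (no natural order)
    "shirt_size": "M",         # qualitative, ordinal (ordered categories)
    "comment": "fits well",    # qualitative, character (free text)
    "is_member": True,         # qualitative, binomial (two possible values)
}

# Ordinal data needs an explicit order; alphabetical sorting gives the wrong result.
SIZE_ORDER = ["S", "M", "L", "XL"]


def size_rank(size: str) -> int:
    """Return the position of a size within the assumed ordering."""
    return SIZE_ORDER.index(size)


sizes = ["L", "S", "XL", "M"]
sorted_sizes = sorted(sizes, key=size_rank)
print(sorted_sizes)
```

Nominal values such as eye color can only be compared for equality, which is why no ordering is defined for them here.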
Date-Time Data
Date Components, Separator, Time Components
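One way to see the three parts is to split a timestamp string. This sketch assumes the ISO 8601 layout (date, then the separator "T", then time), which the notes do not name explicitly; the sample timestamp is made up.

```python
from datetime import datetime

# Assumed ISO 8601 layout: <date components>T<time components>
stamp = "2024-03-15T14:30:00"

# Split the raw text into date components, separator, and time components.
date_part, separator, time_part = stamp.partition("T")

# Parse the same text into a datetime object to access each component.
parsed = datetime.fromisoformat(stamp)

print(date_part, separator, time_part)
print(parsed.year, parsed.month, parsed.day, parsed.hour, parsed.minute, parsed.second)
```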
We categorize data storage by how structured it is
- Semi-structured
  - Spreadsheets (tabular data)
  - JSON & XML
- Structured
  - Relational databases (SQL)
- Unstructured
  - Everything else: video, audio, images, websites, apps, text, etc.
Semi-structured Data
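Any .csv file can be saved as .xlsx, but not every .xlsx file can be saved as .csv without losing something (for example multiple sheets, formulas, or formatting).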
RFC 4180 CSV "standards"
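A short sketch of reading CSV text with Python's standard `csv` module. The sample rows are invented; they use the common quoting conventions, where a field containing a comma is wrapped in double quotes and a literal quote is doubled.

```python
import csv
import io

# Invented sample data: a header row plus one record with a quoted comma
# and an escaped (doubled) double quote.
raw = 'name,city,note\r\n"Lovelace, Ada",London,"said ""hello"""\r\n'

# newline="" leaves line endings untouched so the csv module can handle them.
rows = list(csv.reader(io.StringIO(raw, newline="")))

header, first = rows
print(header)
print(first)
```

Every value comes back as a string, since CSV itself carries no type information, which is part of why an .xlsx file can hold more than a .csv file can.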
JSON - JavaScript Object Notation
A language-agnostic data-interchange format
ECMA-404 JSON Standards
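As a rough illustration of "language agnostic": the JSON text below could come from any language, and Python's standard `json` module maps it onto native types. The document contents are made up.

```python
import json

# Invented JSON document: an object with a string, array, boolean, number, and null.
text = '{"name": "Ada", "languages": ["en", "fr"], "active": true, "age": 36, "manager": null}'

# Decode JSON text into Python objects (dict, list, bool, int, None).
data = json.loads(text)

# Encode back to JSON text; sort_keys makes the output order predictable.
round_trip = json.dumps(data, sort_keys=True)

print(data["languages"])
print(round_trip)
```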
XML - Extensible Markup Language
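A small sketch of reading XML with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`people`, `person`, `id`, `name`) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: nested elements, with an attribute on each <person>.
doc = """<people>
  <person id="1"><name>Ada</name></person>
  <person id="2"><name>Grace</name></person>
</people>"""

root = ET.fromstring(doc)

# Collect the text of each <name> child and the id attribute of each <person>.
names = [person.findtext("name") for person in root.findall("person")]
ids = [person.get("id") for person in root.findall("person")]

print(names)
print(ids)
```

Attribute values are returned as strings, so converting an id to a number is left to the reader of the data.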
Structured Data
Types of Databases
- Relational
- Analytical
- Key-Value
- Column-Family
- Graph
- Document
How to get data from an RDB (Relational Database): Use a query (SQL code)
RDB
A piece of software that enables you to relate tables of data together and perform actions on that data
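A minimal sketch of relating two tables and querying them, using SQLite through Python's standard `sqlite3` module. The tables (`authors`, `books`) and their rows are hypothetical, and SQLite is only one of many relational databases.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two hypothetical tables related through books.author_id -> authors.id.
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE books (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    author_id INTEGER REFERENCES authors(id)
);
INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO books VALUES (1, 'Notes', 1), (2, 'Compilers', 2), (3, 'Sketches', 1);
""")

# The query: join the two tables to pair each book with its author's name.
rows = conn.execute("""
    SELECT authors.name, books.title
    FROM books
    JOIN authors ON authors.id = books.author_id
    ORDER BY books.id
""").fetchall()

conn.close()
print(rows)
```

The `JOIN ... ON` clause is where the tables are related: each book row is matched to the author row whose `id` equals the book's `author_id`.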