Data & Data Types

Data

Anything that can be stored on a computer

Variables and Features

Both terms name a measured attribute of the data: "variable" is the term used in statistical analysis, "feature" is the term used in machine learning.

Quantitative vs. Qualitative

Quantitative: What you can measure objectively

  • Integer, Discrete
  • Continuous

Qualitative: What you can’t easily measure, but can observe subjectively

  • Nominal
  • Ordinal
  • Character
  • Binomial
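
A minimal sketch of how these types might look in code. The field names and values are made up for illustration and are not from the notes. It also shows why ordinal data needs an explicit ranking: sorting the labels alphabetically would give the wrong order.

```python
# Illustrative values only; the field names are hypothetical.
record = {
    "visits": 12,           # quantitative, discrete (a count)
    "height_cm": 172.4,     # quantitative, continuous (a measurement)
    "eye_color": "brown",   # qualitative, nominal (no natural order)
    "shirt_size": "M",      # qualitative, ordinal (ordered categories)
}

# Ordinal categories carry an order that the labels themselves do not encode.
SIZE_ORDER = ["S", "M", "L", "XL"]


def sort_sizes(sizes):
    """Sort size labels by their rank in SIZE_ORDER, not alphabetically."""
    return sorted(sizes, key=SIZE_ORDER.index)


print(sort_sizes(["XL", "S", "L", "M"]))  # ['S', 'M', 'L', 'XL']
```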

Date-Time Data

Date Components, Separator, Time Components
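
The three parts can be seen in an ISO 8601 style timestamp. The notes do not name a specific format, so ISO 8601 is an assumption here, and the timestamp value is made up. A small sketch:

```python
from datetime import datetime

# Example value (hypothetical): date components, "T" separator, time components.
stamp = "2024-03-15T13:45:30"

# Split on the separator to see the two halves as plain text.
date_part, time_part = stamp.split("T")

# Parse the whole string into a datetime object.
parsed = datetime.fromisoformat(stamp)

print(date_part)                                   # 2024-03-15
print(time_part)                                   # 13:45:30
print(parsed.year, parsed.month, parsed.day)       # 2024 3 15
print(parsed.hour, parsed.minute, parsed.second)   # 13 45 30
```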

We categorize data storage by how structured it is

  • Semi-structured
    • Spreadsheets (tabular data)
    • JSON & XML
  • Structured
    • Relational databases (SQL)
  • Unstructured
    • Everything else: video, audio, images, websites, apps, text, etc.

Semi-structured Data

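Every .csv file can be saved as .xlsx, but not every .xlsx file can be saved as .csv without loss: a workbook can hold multiple sheets, formulas, and formatting, which plain CSV cannot represent.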

RFC 4180 CSV "standards"
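
One rule from the CSV RFC worth seeing in action: a field that contains a comma is wrapped in double quotes. A short sketch using Python's built-in csv module; the sample rows are made up for illustration.

```python
import csv
import io

# Sample text (hypothetical data). The quoted field contains a comma,
# and records end with CRLF line breaks.
raw = 'name,city\r\n"Doe, Jane",Boston\r\n'

# newline="" leaves the line endings untouched so the csv module handles them.
rows = list(csv.reader(io.StringIO(raw, newline="")))

print(rows)  # [['name', 'city'], ['Doe, Jane', 'Boston']]
```

Splitting each line on "," by hand would break "Doe, Jane" into two fields, which is why a real CSV parser is usually the safer choice.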

JSON - JavaScript Object Notation

Known as a data-interchange format; language-agnostic

ECMA-404 JSON Standards
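
A small sketch of JSON as an interchange format, using Python's built-in json module. The record is made up for illustration. The text produced here could be read by a JSON parser in any other language.

```python
import json

# Hypothetical record, for illustration only.
record = {
    "name": "Ada",
    "languages": ["Python", "SQL"],
    "active": True,
    "manager": None,
}

# Serialize to JSON text: True becomes true, None becomes null.
text = json.dumps(record)
print(text)

# Parse the text back into Python objects.
restored = json.loads(text)
print(restored == record)  # True
```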

XML - Extensible Markup Language
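
For comparison, a minimal sketch of reading XML with Python's built-in xml.etree.ElementTree module. The document and its tag names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: nested tags, with an attribute on each <person>.
doc = """
<people>
  <person id="1"><name>Ada</name></person>
  <person id="2"><name>Ben</name></person>
</people>
"""

root = ET.fromstring(doc)

# Collect the id attribute and the text of the <name> child for each person.
people = [(p.get("id"), p.findtext("name")) for p in root.findall("person")]

print(root.tag)  # people
print(people)    # [('1', 'Ada'), ('2', 'Ben')]
```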

Structured Data

Types of Databases

  • Relational
  • Analytical
  • Key-Value
  • Column-Family
  • Graph
  • Document

How to get data from an RDB (Relational Database): Use a query (SQL code)
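
A minimal sketch of a query, assuming SQLite through Python's built-in sqlite3 module. The notes do not name a database product, and the table and its rows are made up for illustration.

```python
import sqlite3

# In-memory database so the example leaves no file behind.
conn = sqlite3.connect(":memory:")

# Hypothetical table and data.
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, grade INTEGER)"
)
conn.executemany(
    "INSERT INTO students (name, grade) VALUES (?, ?)",
    [("Ada", 91), ("Ben", 78), ("Cy", 85)],
)

# The query: which students scored 85 or higher?
high_scorers = conn.execute(
    "SELECT name FROM students WHERE grade >= ? ORDER BY name", (85,)
).fetchall()

print(high_scorers)  # [('Ada',), ('Cy',)]
conn.close()
```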

RDB

A piece of software that enables you to relate tables of data together and perform actions on that data
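
"Relating tables together" usually means joining them on a shared key. A small sketch, again assuming SQLite via Python's sqlite3 module, with hypothetical customers and orders tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical tables; orders.customer_id points at customers.id.
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    total REAL
);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Ben');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5);
""")

# Relate the tables on the shared key, then total the orders per customer.
totals = conn.execute("""
SELECT c.name, SUM(o.total)
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.name
ORDER BY c.name
""").fetchall()

print(totals)  # [('Ada', 65.0), ('Ben', 15.5)]
conn.close()
```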