MATH 3080

What does it take to be a data scientist?

Skills and traits needed by a data scientist (in order)

  1. Passionate about some field
    • Data science (DS) for its own sake is generally not helpful; you need a field to apply it to
    • Pi-shaped (π) skills model: depth in both an application field and data science
    • Many team leads hold PhDs (often in fields other than data science)
  2. Social skills
    • Working in teams
  3. Sense of humor
  4. Math knowledge
    • algebra, geometry, calculus, and basic statistics for data analysis (DA)
    • additional statistics and linear algebra for DS
  5. Technical skills
    • computational thinking
    • some skills can be learned later, but a basic technical background is expected

Data Science Process and Tools

CRISP-DM: Cross-Industry Standard Process for Data Mining

  1. Business Understanding
  2. Data Understanding
    • Data Management
  3. Data Preparation
    • Data Integration and Transformation
    • ETL (Extract, Transform, and Load)
    • Data Visualization
  4. Modeling (MATH 3280 and 3480)
  5. Evaluation (MATH 3280 and 3480)
  6. Deployment
    • Model Monitoring and Assessment
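
To make the ETL item under Data Preparation concrete, here is a minimal sketch in Python using only the standard library. The sample data, the table name `people`, and the function names `extract`, `transform`, and `load` are all hypothetical and chosen for illustration; a real pipeline would more likely read from files or databases and use a dedicated tool.

```python
import csv
import io
import sqlite3

# Hypothetical raw data; in practice this would come from a file or an API.
RAW_CSV = """name,height_cm
Ada,170
Grace,
Alan,183
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing height and convert cm to m."""
    cleaned = []
    for row in rows:
        if not row["height_cm"]:
            continue  # skip incomplete records
        cleaned.append((row["name"], float(row["height_cm"]) / 100))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS people (name TEXT, height_m REAL)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", records)
    conn.commit()


# Run the three stages against an in-memory database (an assumption
# made so the sketch needs no external setup).
conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
rows = conn.execute("SELECT name, height_m FROM people ORDER BY name").fetchall()
print(rows)
```

Keeping the three stages as separate functions means each can be tested or swapped on its own, for example replacing `extract` with a database query while leaving `transform` and `load` unchanged.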

Tools

  1. Data Asset Management
  2. Code Asset Management (e.g. GitHub)
  3. Development Environments (or IDEs)
  4. Execution Environments