Quantitative Data Management

CRMW Workshop 2

Charles Lanfear

27 Feb 2025
Updated: 24 Feb 2025

Today

A research question:

Controlling for density, how is deprivation related to crime in London?



Today we will:

  • Create a basic project
  • Load data we will need
  • Prepare the data for analysis

Next time we will:

  • Visualize our data
  • Model our outcomes
  • Diagnose and (try to) address problems

Setup

  1. Open RStudio

  2. In the project menu (top right) select New Project...

  3. Select New Directory

    • Place it wherever you want
  4. Using the Files tab of the bottom-right panel:

    • Make sure you are in your project's main directory
    • Create a new code folder
    • Create a new folder called data
    • Create folders called raw and derived inside data
  5. Browse to this lecture on the course website
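
The folder layout from step 4 can also be created from the R console instead of the Files tab. This is a minimal sketch, assuming the working directory is the project's main directory and that the code folder is simply named code (the slides do not say so explicitly):

```r
# Sketch: create the folder layout from step 4 with base R.
# Assumes the working directory is the project's main directory
# (RStudio sets this when a project is open).
# The folder name "code" is an assumption.
folders <- c(
  "code",
  file.path("data", "raw"),
  file.path("data", "derived")
)

for (folder in folders) {
  # recursive = TRUE also creates the parent "data" folder;
  # showWarnings = FALSE keeps this quiet if the folder already exists
  dir.create(folder, recursive = TRUE, showWarnings = FALSE)
}

# Confirm that every folder now exists
dir.exists(folders)
```

Running it twice is harmless, since existing folders are left untouched.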

The Data

Save these to the data directory in your project

Get to Work!

We want to produce analysis-ready data:

  • Cross-sectional (one row per unit)
  • Columns for predictors
  • Columns for outcomes
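
As a rough illustration of this target shape, the sketch below builds a tiny made-up table and checks that it has one row per unit. The column names and values are hypothetical, not the workshop data:

```r
# Toy data for illustration only; unit_id and all values are hypothetical
analysis_data <- data.frame(
  unit_id     = c("A", "B", "C"),
  deprivation = c(0.2, 0.5, 0.9),  # predictor
  density     = c(40, 75, 120),    # predictor (control)
  crime       = c(12, 30, 41)      # outcome
)

# anyDuplicated() returns 0 when no value of unit_id repeats,
# which is what "one row per unit" requires
is_cross_sectional <- anyDuplicated(analysis_data$unit_id) == 0
stopifnot(is_cross_sectional)
```

A check like this is cheap to leave at the end of a script, so a bad join that duplicates rows fails loudly.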

The process:

  • Load and clean up each file with separate scripts
  • Save derived data as separate files
  • Join together in another script and save the analysis data
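
The three steps above might look roughly like this in base R. Everything here is a stand-in: the data frames replace real raw files, the column names are invented, and `tempdir()` takes the place of the derived data folder so the sketch runs anywhere:

```r
# Toy stand-ins for two raw files; all names and values are hypothetical
raw_crime <- data.frame(
  unit_id  = c("A", "A", "B", "C"),
  offences = c(5, 7, 30, 41)
)
raw_deprivation <- data.frame(
  unit_id     = c("A", "B", "C"),
  deprivation = c(0.2, 0.5, 0.9)
)

# Cleaning script (sketch): collapse crime records to one row per unit
crime <- aggregate(offences ~ unit_id, data = raw_crime, FUN = sum)

# Save each derived table as its own file
# (tempdir() stands in for the derived data folder)
derived_dir <- tempdir()
saveRDS(crime, file.path(derived_dir, "crime.rds"))
saveRDS(raw_deprivation, file.path(derived_dir, "deprivation.rds"))

# Joining script (sketch): read the derived files and join on the unit ID
analysis_data <- merge(
  readRDS(file.path(derived_dir, "crime.rds")),
  readRDS(file.path(derived_dir, "deprivation.rds")),
  by = "unit_id"
)
saveRDS(analysis_data, file.path(derived_dir, "analysis_data.rds"))
```

Keeping each cleaning step in its own script means one raw file can be re-processed without re-running the others.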

We'll have at least four scripts!

  • We'll start by making 1_process-metro.R
    • Numbers make run order clear
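
One way to see why the numeric prefix helps: the run order can be recovered from the file names alone. In this sketch only `1_process-metro.R` comes from the slides; the other two script names are hypothetical:

```r
# Hypothetical script names; only 1_process-metro.R comes from the slides
scripts <- c("3_join-data.R", "1_process-metro.R", "2_process-other.R")

# Pull out the leading number and order by it numerically,
# so that a 10_ prefix would still come after 9_
prefix <- as.integer(sub("_.*$", "", scripts))
run_order <- scripts[order(prefix)]
run_order

# In a real project, each script could then be run in turn, e.g.:
# for (s in run_order) source(file.path("code", s))
```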

Let's work on this together!

Cleaning Data


Wrap-Up

Practice!

  • Data management is the hardest and most time-consuming part of any project
  • You get good with practice and intentional improvement