class: center, top, title-slide .title[ # Writing Up Quantitative Research ] .subtitle[ ## CRMW Lecture 6 ] .author[ ### Charles Lanfear ] .date[ ### 1 Feb 2024
Updated: 31 Jan 2024 ] --- # Today ## Quant Writing ## Describing Methods ## Presenting Results --- class: inverse # Quant Writing <br> ![:width 80%](img/stress.jpg) --- ## Target your writing Professional (academic) writing is purposeful **communication** to an audience -- * Know the purpose * Know the audience -- .pull-left[ *To show other academics that results from an experiment falsify a proposition of a theory* ] .pull-right[ *To explain to policy-makers that a policy was not effective in reducing crime* ] -- * Remember: * **Clarity comes first** * Numbers can easily overwhelm or mislead * Numbers are boring—people only care about what they describe -- .text-center[ *Everything you write should serve your purpose and be understood by your audience—numbers included* ] --- ## Matsueda (2019) .pull-left-70[ * Quantitative articles have a common structure + Efficient communication—*deviate at your peril* + **Hourglass** shape * Start with big picture * Narrow to what you did * Relate back to big picture * **Mimic others**: * Prominent outlets * Good writers * Highlight your contributions * Literature reviews are **important** * Position your work in the conversation ] .pull-right-30[ <br> <br> <br> ![:width 95%](img/ross.jpg) ] --- # Structure Empirical articles rarely stray far from this: 1. Abstract 2. Introduction 3. Literature Review / Background 4. **Methods** 5. **Results** 6. **Discussion** 7. 
Conclusion<sup>1</sup> .footnote[ [1] Often part of the discussion rather than a separate section ] -- <br> .text-center[ *We'll mainly focus on methods, results, and discussion* ] --- ## Abstract * An elevator speech: Summarize the entire paper in 200 words -- ## Introduction * **Hook the reader** * State the research question * Highlight contribution or main findings (optional) * Foreshadow the structure of the paper -- ## Literature Review / Background * **Motivate the research question and design** * This section is an **argument**, not a summary * Support key assumptions and arguments * *Your theoretical model and every variable should be covered* * Lay out hypotheses * *This is often done in its own section between background and methods* --- class: inverse # Data and Methods <br> ![:width 80%](img/klm.png) .footnote[[Knox, Lowe, & Mummolo (2020)](https://doi.org/10.1017/S0003055420000039)] --- ## What to Include .pull-left[ From reading: * **Be complete** + *Describe everything you did* * **Be clear** + *Describe each thing in sufficient detail for the average reader to understand what was done* * **Be credible** + *Justify your choices* + Data and method + What you did should follow logically from your **research design** ] -- .pull-right[ From me: * **Know your audience** + *Journal of Quantitative Criminology* + *Criminology* + Policy reports + Public scholarship * **Lean on references** + "Following Sampson et al. (1997), I..." * **Compartmentalize** + *Make it easy to skip the scary stuff* * **Do not oversell** ] --- ## Data * **Where** and **when** it was collected + e.g., London in 2014–2018 -- * **How** it was collected + *Sampling design and measurement method* + e.g., random-digit-dial phone survey -- * What are the **units**? 
+ *Both for collection and analysis* + e.g., individual survey responses aggregated to wards -- * **Excluded** or **missing** data + *Both what is missing and what you did about it* + Include non-response and loss to follow-up -- * What **transformations** or **modifications** were made + *What you did and why* + e.g., standardization --- ## Methods * Statistical method + *What you did and why—and why **not** other ways* + e.g.: + "Crime was modelled using Poisson regression because the outcome is a strictly non-negative integer." + "We control for unstructured socialization as it is a known confounder for..." + "I did not control for ambient population, because it is a mediator for..." -- * The more established the approach, the less detail you need * Linear regression needs a sentence * A structural equation needs a paragraph * New approaches can require an entire paper -- .text-center[ *Level of detail and justification is context-specific; a common approach in one subfield may be new to another* ] --- ## Communicate the three quantities .pull-left-60[ * **Estimand**: the real-world quantity we want to measure + Your **literature review** justifies interest in the estimand ] .pull-right-40[ ![:width 75%](img/estimation1.png) ] .footnote[Image credit: [Richard McElreath](https://twitter.com/rlmcelreath/status/1582368904529137672)] --- count: false ## Communicate the three quantities .pull-left-60[ * **Estimand**: the real-world quantity we want to measure + Your **literature review** justifies interest in the estimand * **Estimator**: the procedure we use to calculate the estimate using data + Your **methods section** explains and justifies the use of the estimator ] .pull-right-40[ ![:width 75%](img/estimation2.png) ] .footnote[Image credit: [Richard McElreath](https://twitter.com/rlmcelreath/status/1582368904529137672)] --- count: false ## Communicate the three quantities .pull-left-60[ * **Estimand**: the real-world quantity we want to measure + Your 
**literature review** justifies interest in the estimand * **Estimator**: the procedure we use to calculate the estimate using data + Your **methods section** explains and justifies the use of the estimator * **Estimate**: the calculated quantity approximating our estimand + Your **results section** presents this, under the assumption your estimator is correct ] .pull-right-40[ ![:width 75%](img/estimation.png) ] .footnote[Image credit: [Richard McElreath](https://twitter.com/rlmcelreath/status/1582368904529137672)] --- class: inverse # Presenting Results <br> ![:width 85%](img/fig-2-survival-curves.png) .footnote[ [1] Source: [Lanfear, Bucci, Sampson, & Kirk (2023)](https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2804655) ] --- ## Where numbers go .container[ .column[ .text-center[ ### Text ] * **Key evidence** + Main estimates + Big "hooks" * Important standalone values + e.g., sample size * **Very few at a time** * **Nothing unimportant** ] .column[ .text-center[ ### Table ] * Precise values for look up + **Descriptive statistics** + Model estimates * Many values of different kinds + Data tables ] .column[ .text-center[ ### Figure ] * **Everything else** * Especially: + Comparisons with similar data + Multivariable relationships + Anything hard to explain verbally ] ] --- # Text For values in text (and tables)... 
.pull-left[ * Use appropriate **precision** + Bad: `\(3.1445 \pm 0.2331\)` + Good: `\(3.14 \pm 0.23\)` * Use appropriate **scaling** + Bad: `\(\beta=0.0025\text{km}\)` + Good: `\(\beta=2.5 \text{m}\)` ] -- .pull-right[ * Use **units** + "30 more burglaries" + "1 SD higher robbery (23 robberies)" + "A 5 percentage point difference" + "25 percent more thefts" ] -- * Recognize **substantive** and **statistical** significance * Detecting something and caring about it are different things -- * .text-center[ *Present only numbers that are related to your objective and meaningful to your audience* ] --- ## Writing about relationships Your design determines if estimated relationships are plausibly causal or not -- .pull-left[ ‍**Causal** language: * "increases" * "affects" * "influences" ] -- .pull-right[ **Non-causal** language: * "is associated with" * "correlates with" * "is positively related to" ] -- <br> .text-center[ *Do not try to be clever about this* ] -- <br> In either case, note when this relationship is found controlling for other variables: > "Conditional on sociodemographic structure of neighborhoods, we find a 1 standard deviation increase in collective efficacy to be associated with a 17.5% reduction in violent crime in the next year." --- ## Text Examples Highlighting big hooks: > Race and ethnic differences in exposure to firearm violence are pronounced, with both forms of exposure highest for Black respondents (Figure 2A). By age 40, an estimated 7.47% of Black and 7.05% of Hispanic respondents had been shot, with one Black and one Hispanic respondent fatally shot. 
In contrast, an estimated 3.13% of White respondents were victims of a shooting by age 40.<sup>1</sup> .footnote[ [1] [Lanfear, Bucci, Sampson, & Kirk (2023)](https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2804655)<br><br> ] -- Main estimates: > A 1 standard deviation higher level of abandoned buildings—21 percent more block faces with abandoned buildings on and around that block—is expected to be associated with, on average, approximately 20 percent more homicides and gun assaults than an otherwise similar block.<sup>2</sup> .footnote[ [2] [Lanfear (2022)](https://doi.org/10.1111/1745-9125.12304) ] --- # Tables *Tables should stand alone* .pull-left[ * Title, headings, notes, and structure should convey: + Purpose + Context + Units + Source of data + Definitions of abbreviations ] -- .pull-right[ * Make them **simple** and **intuitive** * No ornamentation + Lines only to guide eyes * Values with proper scale and precision * Order rows and columns purposefully ] -- Creating tables: .container[ .pull-left[ * Non-Programmatic * Word / Google Docs * Excel / Google Sheets ] .pull-right[ * Programmatic * R / R Markdown * `\(\LaTeX\)` (plz no) ] ] --- ![](img/apa_table_diagram.jpg) .footnote[ Source: APA Publication Manual, 7th Ed. 
] --- ## An example

Table 1. Actions by treatment condition

| Condition | Walk-By | Mail | Theft |
|:----------|:--------|:-----|:------|
| Control | 1496 (88.0%) | 176 (10.4%) | 28 (1.6%) |
| Treatment | 1617 (90.8%) | 136 (7.6%) | 28 (1.6%) |

*Note: counts with row percentages in parentheses.*

**Causal** design: * "There was no **effect** of treatment on theft." * "The treatment **reduced** mailing by 2.8 percentage points." --- ![:height 600px](img/ce-be-factors.png) .footnote[Source: [Lanfear (2022)](https://doi.org/10.1111/1745-9125.12304)] --- # Figures * What do we put in figures? + *Nearly everything, if possible* + Model results + **Maps!** -- * *Figures should stand alone* + Use informative titles and labels + Include detailed captions -- * Visualization is hard! + Too involved to even scratch the surface today + Take courses on visualization, e.g.: * [Chris Adolph's free materials](http://faculty.washington.edu/cadolph/index.php?page=22) * [Nathan Yau's affordable *Flowing Data*](https://flowingdata.com/) + *Read books* --- .pull-left-40[ ![:width 100%](img/fig-3-model-results.png) ] .pull-right-60[ .text-72[ ## An example! Figure 3. Model Estimates of Disparities in Exposure to Gun Violence to Age 40 Hazard ratios (HR) and incidence rate ratios (IRR) greater than 1 indicate higher estimated exposure to gun violence. HRs were estimated with a semi-parametric Turnbull proportional hazards model. IRRs were estimated with a negative binomial regression. Models are weighted to permit generalizing estimates of exposure to the population of children coming of age in Chicago in the 1990s. Case weights applied during estimation are constructed from a combination of original survey design weights and weights for attrition constructed using predictions from binary logistic regression models. The final weights used in the analysis are the product of the survey design weights and attrition weights at each wave. Observations receive the combined weight at their final follow-up period (e.g., an individual observed through wave 3 would receive a weight equal to the product of the survey design weight, their wave 2 attrition weight, and their wave 3 attrition weight). 
Results are insensitive to trimming weights to the center 90% of the distribution. Results are also similar as alternative specifications of survival models (see eFigure 2) and without weights (see eFigure 3). Sample sizes are 2,417 for seen shot, 2,135 for been shot, and 649 for nearby shootings. ] ] .footnote[Source: Lanfear, Bucci, Sampson, & Kirk (2023)] --- ## Put plots in tables! ![:width 50%](img/ce-be.png) .footnote[Source: [Lanfear (2022)](https://doi.org/10.1111/1745-9125.12304)] --- ## Use visual examples! ![:width 60%](img/maps-plot.png) .text-72[ Figure 4. Frequency of Nearby Shootings in the Past Year by Respondent Race A, C, and E depict race-specific quintiles of frequencies of shootings within 250 m of respondents' residence in the past year. B, D, and F depict representative locations with counts of nearby shootings approximately equal to the mean of each group's fifth quintile (7 shootings for Black respondents, 2 shootings for Hispanic respondents, and 1 shooting for White respondents). Red dots are shootings. Circles are 250-m radii around respondent places of residence but are shifted to protect anonymity. Only race differences are shown because nearby shooting frequencies did not notably differ by sex or age cohort. Analysis was restricted to Black, Hispanic, and White respondents of wave 5 (649 respondents). Map tiles by Stamen Design; map data by OpenStreetMap (OpenStreetMap Foundation). 
] .pull-right-70[ .footnote[Source: Lanfear, Bucci, Sampson, & Kirk (2023)] ] --- ## Book Recommendations .pull-left[ ![:height 480px](img/dv-cover-pupress.jpg) * With programming (in R) ] .pull-right[ ![:height 480px](img/wilke_cover.png) * Without programming ] --- # Discussion * Limitations * *Be honest but not self-flagellating* * Don't present only trivial weaknesses * Do note if you improve on past methods * Specify how future research can address them -- * Implications * New or unanswered research questions * Connect back to theories * Resist making policy prescriptions (unless that is the goal of the paper) * Drawing *attention* to a problem is fine -- * Finish strong and **upbeat** * "Despite these limitations, this study contributes to our understanding..." * Reiterate the most significant takeaways * *Some people skip right to the conclusion, so use **hooks** here too* --- class: inverse # Moving Forward .pull-left[ * Read books on: * Writing in general * Writing with numbers * Making figures * Read articles * Pay attention to **style** in addition to content * Take note of effective figures and tables * **Practice!** ] .pull-right-40[ ![](img/miller.jpg) ]
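---

## Percentage points vs. percent: a sketch

The distinction between "a 5 percentage point difference" and "25 percent more thefts" from the slide on units, and the Table 1 statement about mailing, can be checked with a few lines of arithmetic. What follows is only an illustrative sketch and not part of the lecture materials: it is written in Python rather than R, the counts are copied from Table 1, and the helper names (`row_shares`, `pp_difference`, `percent_change`) are hypothetical.

```python
# Counts copied from Table 1 (actions by treatment condition).
counts = {
    "Control": {"Walk-By": 1496, "Mail": 176, "Theft": 28},
    "Treatment": {"Walk-By": 1617, "Mail": 136, "Theft": 28},
}


def row_shares(row):
    """Return each action's share of the row total, in percent."""
    total = sum(row.values())
    return {action: 100 * n / total for action, n in row.items()}


# Unrounded row percentages for each condition.
shares = {condition: row_shares(row) for condition, row in counts.items()}


def pp_difference(action):
    """Treatment minus control share, in percentage points (absolute)."""
    return shares["Treatment"][action] - shares["Control"][action]


def percent_change(action):
    """Treatment share relative to the control share, in percent (relative)."""
    control = shares["Control"][action]
    return 100 * (shares["Treatment"][action] - control) / control


# Round only when printing, never before subtracting.
for action in ("Mail", "Theft"):
    print(
        f"{action}: {pp_difference(action):+.1f} percentage points, "
        f"{percent_change(action):+.0f} percent"
    )
```

Computed from the raw counts, the mail difference is about 2.7 percentage points, or roughly 26 percent in relative terms. Subtracting the already rounded table entries (10.4 and 7.6) gives 2.8. The two units describe the same gap very differently, and rounding only at the final step avoids small drifts like this one.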