Question 1

What is the PHQ-9 and who developed it?

Accepted Answer

The Patient Health Questionnaire-9 (PHQ-9) is a validated, self-administered tool for depression screening and severity measurement, derived from the longer PRIME-MD diagnostic instrument. It was developed by Drs. Robert Spitzer, Janet Williams, and Kurt Kroenke, and its validation study was published in 2001 in the Journal of General Internal Medicine. The nine items map directly to the nine DSM-IV diagnostic criteria for major depressive disorder (MDD): depressed mood, anhedonia, sleep disturbance, fatigue, appetite change, guilt or worthlessness, concentration difficulty, psychomotor changes, and suicidal ideation. Each item is rated on a 4-point scale (0 = not at all, 1 = several days, 2 = more than half the days, 3 = nearly every day) for the preceding two weeks, so total scores range from 0 to 27. The PHQ-9 serves two purposes: as a diagnostic aid (in the original validation study, a score of 10 or above had a sensitivity of 88% and a specificity of 88% for MDD) and as a severity measure for monitoring treatment response over time. It is one of the most widely used mental health instruments in primary care and is freely available without licensing fees.
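The scoring arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not an official scoring implementation: the function names `phq9_total` and `screens_positive` are hypothetical, and the default cut-point of 10 is taken from the answer above.

```python
from typing import Sequence


def phq9_total(item_ratings: Sequence[int]) -> int:
    """Sum nine item ratings (each 0-3) into a total between 0 and 27."""
    if len(item_ratings) != 9:
        raise ValueError("the PHQ-9 has exactly nine items")
    for position, rating in enumerate(item_ratings, start=1):
        if rating not in (0, 1, 2, 3):
            raise ValueError(f"item {position} must be rated 0-3, got {rating!r}")
    return sum(item_ratings)


def screens_positive(total: int, cut_point: int = 10) -> bool:
    """True when the total meets the cut-point (10 by default, as stated above)."""
    return total >= cut_point


# Hypothetical ratings for items 1 through 9, in order.
ratings = [2, 2, 1, 2, 1, 1, 1, 0, 0]
total = phq9_total(ratings)
print(total, screens_positive(total))
```

Validating each rating before summing keeps a mis-keyed answer (for example a 4) from silently inflating the total.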

Question 2

How do PHQ-9 score ranges correspond to depression severity?

Accepted Answer

The PHQ-9 uses five severity bands established in the original 2001 validation study by Kroenke, Spitzer, and Williams. A score of 0–4 indicates minimal depression; symptoms at this level rarely require active treatment, though watchful waiting and reassessment are appropriate. Scores of 5–9 represent mild depression; clinical judgment is needed, and psychoeducation or watchful waiting with a follow-up in 4 weeks is recommended. Scores of 10–14 indicate moderate depression; this is the band at which pharmacotherapy or structured psychotherapy, such as cognitive behavioural therapy (CBT) or interpersonal therapy (IPT), is usually first considered. Scores of 15–19 represent moderately severe depression and warrant active treatment with pharmacotherapy (typically an SSRI), psychotherapy, or both, with follow-up within 1–2 weeks. Scores of 20–27 indicate severe depression and require immediate, intensive intervention; psychiatric referral should be considered at this level, particularly when suicidal ideation is present. Item 9 (suicidal ideation) should always be reviewed regardless of the total score, because even a rating of 1 on that item warrants direct clinical inquiry.
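The five bands and the item 9 rule can be expressed as a small lookup. The sketch below is illustrative only; the names `SEVERITY_BANDS`, `phq9_severity`, and `item9_needs_review` are hypothetical, and the band boundaries are the ones listed in the answer above.

```python
# (upper bound of band, label), in ascending order, as listed in the answer above.
SEVERITY_BANDS = [
    (4, "minimal"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (27, "severe"),
]


def phq9_severity(total: int) -> str:
    """Map a PHQ-9 total (0-27) to its severity band label."""
    if not 0 <= total <= 27:
        raise ValueError(f"PHQ-9 total must be between 0 and 27, got {total!r}")
    for upper_bound, label in SEVERITY_BANDS:
        if total <= upper_bound:
            return label
    raise AssertionError("unreachable: every total in 0-27 falls in a band")


def item9_needs_review(item9_rating: int) -> bool:
    """Any non-zero rating on item 9 is flagged, regardless of the total score."""
    return item9_rating >= 1


print(phq9_severity(12), item9_needs_review(1))
```

Keeping the item 9 check separate from the band lookup reflects the point made above: a low total does not cancel a non-zero rating on that item.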

Question 3

What is the difference between PHQ-9 and PHQ-2?

Accepted Answer

The PHQ-2 is a two-item ultra-brief screening tool consisting of the first two questions of the PHQ-9: (1) little interest or pleasure in doing things, and (2) feeling down, depressed, or hopeless. It was validated by Kroenke, Spitzer, and Williams in a 2003 study and is designed as an initial screening step in high-throughput settings. At a cut-point of 3 or above out of 6 possible points, the PHQ-2 has a sensitivity of approximately 83% and a specificity of 92% for major depression. A positive PHQ-2 screen (score ≥3) should always be followed by the full PHQ-9 for detailed severity assessment. The PHQ-9, by contrast, can be used on its own as a screening instrument (with a cut-point of 10 for probable MDD), as a severity measure, and as a treatment response monitor. For population-level screening programs or annual wellness visits, the sequential approach (PHQ-2 first, PHQ-9 only if positive) balances efficiency with diagnostic accuracy. In settings with more time, the full PHQ-9 can be administered directly without a PHQ-2 pre-screen.
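The two-step screening sequence can be sketched as follows. This is a rough illustration under the cut-point stated above (PHQ-2 score of 3 or more); the names `PHQ2_CUT_POINT`, `phq2_total`, and `needs_full_phq9` are hypothetical.

```python
PHQ2_CUT_POINT = 3  # positive screen at 3 or above, as stated in the answer above


def phq2_total(little_interest: int, feeling_down: int) -> int:
    """Sum the two PHQ-2 item ratings (each 0-3) into a total between 0 and 6."""
    for name, rating in (("little_interest", little_interest),
                         ("feeling_down", feeling_down)):
        if rating not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be rated 0-3, got {rating!r}")
    return little_interest + feeling_down


def needs_full_phq9(phq2: int, cut_point: int = PHQ2_CUT_POINT) -> bool:
    """True when the PHQ-2 screen is positive and the full PHQ-9 should follow."""
    return phq2 >= cut_point


score = phq2_total(little_interest=2, feeling_down=1)
print(score, needs_full_phq9(score))
```

Because the PHQ-2 items are the first two PHQ-9 items, a workflow that goes on to the full questionnaire can reuse those two ratings rather than asking them again.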

Question 4

Is the PHQ-9 valid across different populations?

Accepted Answer

The PHQ-9 has been validated in numerous populations and cultural settings, making it one of the most extensively cross-validated mental health instruments available. The original validation was conducted in primary care and obstetric patients in the United States. Subsequent studies examined its psychometric properties in adolescents (PHQ-A, which adds a school performance item), older adults, perinatal women, patients with chronic medical illness such as diabetes and cardiovascular disease, cancer patients, and multiple non-Western cultural contexts. The PHQ-9 has been translated and validated in over 80 languages. Interpretation can differ by population. In perinatal women, a cut-point of 10 remains the most widely used, though some guidelines recommend 13 to improve specificity. In older adults, the somatic items (fatigue, sleep, appetite) may produce false positives because of comorbid medical conditions. Item 9 performance also varies across cultures: in some populations, passive death wishes are endorsed more often than active suicidal ideation, even at comparable overall severity. Guidance from NICE in the UK and the USPSTF in the US both cite the PHQ-9 among the validated depression instruments suitable for primary care. Clinicians should apply cultural and contextual judgment when interpreting scores.

Question 5

How is the PHQ-9 used to monitor treatment response?

Accepted Answer

The PHQ-9 is particularly valuable as a longitudinal treatment monitoring instrument, not just a one-time screen. Standardized response criteria allow clinicians to evaluate objectively whether treatment is working. A reduction of 50% or more from the baseline score is defined as a "response"; for example, a drop from a PHQ-9 score of 16 to 8 constitutes a treatment response. "Remission" is defined as a PHQ-9 score of 4 or below, regardless of baseline. "Partial response" is generally a 25–49% reduction from baseline, and "non-response" is a reduction of less than 25%. Measurement-based care (MBC), the systematic use of validated measures such as the PHQ-9 at every visit, has been shown to improve treatment outcomes compared with unstructured clinical assessment. The Texas Medication Algorithm Project and the STAR*D trial both built structured outcome measurement into their treatment protocols. Best practice is to repeat the PHQ-9 at every clinical contact (or at minimum every 4 weeks) during the acute phase of treatment, with less frequent monitoring during maintenance. Score trends are often plotted over time to make progress visible to both patient and clinician, and this transparency can also improve treatment adherence.
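The response criteria above reduce to simple arithmetic on the baseline and current totals. The sketch below is one possible reading, not a clinical standard: the function name `classify_response` is hypothetical, remission is checked first because it is defined "regardless of baseline", and any reduction of at least 25% but below 50% is treated as a partial response (an assumption covering the gap between 49% and 50%).

```python
def classify_response(baseline: int, current: int) -> str:
    """Classify treatment response from baseline and current PHQ-9 totals.

    Integer comparisons are used instead of division so that boundary cases
    (exactly 25% or exactly 50%) are not affected by floating-point rounding.
    """
    for name, total in (("baseline", baseline), ("current", current)):
        if not 0 <= total <= 27:
            raise ValueError(f"{name} must be between 0 and 27, got {total!r}")

    if current <= 4:                      # remission: score of 4 or below
        return "remission"

    reduction = baseline - current        # negative if the score got worse
    if 2 * reduction >= baseline:         # reduction of 50% or more
        return "response"
    if 4 * reduction >= baseline:         # reduction of at least 25%
        return "partial response"
    return "non-response"


# The worked example from the answer above: 16 at baseline, 8 at follow-up.
print(classify_response(baseline=16, current=8))
```

A patient whose score rises from baseline falls through to "non-response" here; a real monitoring workflow would likely want to flag worsening separately.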
