Age-adjusted estimates

Age adjustment is useful when comparing estimates across groups whose age distributions differ. In surveytable, age adjustment is turned on by passing two extra arguments when the survey is specified with set_survey(). The tabulation commands are the same ones used for crude estimates.
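The comparison problem can be illustrated with a few lines of base R. The sketch below uses direct standardization, the usual technique behind age adjustment. All numbers and variable names are made up for illustration and are not NHIS estimates, and the code is not meant to describe how surveytable computes its results internally.

```r
# Hypothetical age-specific prevalence (%), identical in both groups.
rate_a <- c("18-44" = 5, "45-64" = 10, "65+" = 20)
rate_b <- rate_a

# Hypothetical age distributions (shares sum to 1): group B is older.
share_a <- c(0.6, 0.3, 0.1)
share_b <- c(0.3, 0.3, 0.4)

# Hypothetical standard population shares.
std <- c(0.5, 0.3, 0.2)

# Crude estimates weight each age-specific rate by the group's own age shares.
crude_a <- sum(share_a * rate_a)
crude_b <- sum(share_b * rate_b)

# Age-adjusted estimates weight the same rates by the standard shares.
adj_a <- sum(std * rate_a)
adj_b <- sum(std * rate_b)

print(c(crude_a = crude_a, crude_b = crude_b))
print(c(adj_a = adj_a, adj_b = adj_b))
```

In this toy setup the crude estimates differ (8 versus 12.5) only because group B is older, while both adjusted estimates are 9.5.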

This example uses selected variables from the National Health Interview Survey (NHIS) 2024 Public Use File. The standard population is represented by uspop_example$age_group_std.

Crude estimates

First, calculate ordinary (crude) estimates. These reflect the actual age distribution of the population that the survey represents.

library(surveytable)

set_survey(nhis2024a)
Survey info {NHIS 2024 PUF (Adults)}
Variables: 16
Observations: 32,577
Design: Stratified 1 - level Cluster Sampling design (with replacement)
  With (662) clusters.
  nhis2024a = svydesign(data = nhis2024_df, ids = ~ppsu, strata = ~pstrat, weights = ~wtfa_a, nest = TRUE)
set_opts(mode = "nchs", adj = "nhis")
#> * Mode: NCHS.
#> * Korn and Graubard confidence intervals for proportions with an adjustment that might be required by NHIS.

tab("dis3_indicator")
Disability indicator (knowns only) {NHIS 2024 PUF (Adults)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
No difficulty 16,229 139,173 2,366 134,611 143,890 53.7 0.4 52.8 54.5
Some difficulty 12,607 94,730 1,618 91,611 97,954 36.5 0.4 35.8 37.3
A lot of difficulty 3,740 25,361 675 24,072 26,719 9.8 0.2 9.3 10.2
N = 32576. Checked NCHS presentation standards. Nothing to report.
tab("alot_diff")
Disability: a lot of difficulty (knowns only) {NHIS 2024 PUF (Adults)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 28,836 233,903 3,405 227,324 240,672 90.2 0.2 89.8 90.7
TRUE 3,740 25,361 675 24,072 26,719 9.8 0.2 9.3 10.2
N = 32576. Checked NCHS presentation standards. Nothing to report.

The same setup works for subgroups. For example, estimate each outcome by sex:

tab_subset("dis3_indicator", "sex_a")
Disability indicator (Sex of Sample Adult = Male) (knowns only) {NHIS 2024 PUF (Adults)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
No difficulty 7,786 70,468 1,385 67,805 73,237 55.9 0.6 54.7 57.0
Some difficulty 5,713 45,024 982 43,140 46,991 35.7 0.5 34.6 36.8
A lot of difficulty 1,463 10,648 403 9,886 11,468 8.4 0.3 7.9 9.0
N = 14962. Checked NCHS presentation standards. Nothing to report.
Disability indicator (Sex of Sample Adult = Female) (knowns only) {NHIS 2024 PUF (Adults)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
No difficulty 8,442 68,699 1,311 66,177 71,318 51.6 0.5 50.6 52.7
Some difficulty 6,891 49,662 966 47,804 51,593 37.3 0.5 36.4 38.3
A lot of difficulty 2,276 14,700 458 13,828 15,626 11.0 0.3 10.5 11.7
N = 17609. Checked NCHS presentation standards. Nothing to report.
tab_subset("alot_diff", "sex_a")
Disability: a lot of difficulty (sex_a = Male) (knowns only) {NHIS 2024 PUF (Adults)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 13,499 115,493 1,930 111,771 119,339 91.6 0.3 91.0 92.1
TRUE 1,463 10,648 403 9,886 11,468 8.4 0.3 7.9 9.0
N = 14962. Checked NCHS presentation standards. Nothing to report.
Disability: a lot of difficulty (sex_a = Female) (knowns only) {NHIS 2024 PUF (Adults)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 15,333 118,361 1,870 114,752 122,084 89 0.3 88.3 89.5
TRUE 2,276 14,700 458 13,828 15,626 11 0.3 10.5 11.7
N = 17609. Checked NCHS presentation standards. Nothing to report.

The age distribution itself can also be tabulated:

tab("age_group_std")
Age group (knowns only) {NHIS 2024 PUF (Adults)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
18-44 11,741 118,528 2,164 114,361 122,847 45.7 0.4 44.9 46.5
45-54 4,446 40,006 913 38,255 41,837 15.4 0.3 14.9 15.9
55-64 5,467 41,151 799 39,615 42,748 15.9 0.2 15.4 16.3
65-74 5,999 33,234 655 31,975 34,543 12.8 0.2 12.4 13.2
75+ 4,924 26,349 575 25,246 27,500 10.2 0.2 9.8 10.6
N = 32577. Checked NCHS presentation standards. Nothing to report.

Age-adjusted estimates

To produce age-adjusted estimates, call set_survey() with two additional arguments:

  • aa_vr: the age-group variable in the survey.
  • aa_pop: a data frame with Level and Population columns describing the standard population.

The Level values in aa_pop must exactly match the levels of aa_vr.
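Because the labels have to agree exactly, it can help to compare them before calling set_survey(). The following is a minimal base R sketch of such a pre-check. The age_group vector is a hypothetical stand-in for the survey variable, and levels_match() is a helper defined here for illustration, not a surveytable function.

```r
# Standard population with the Level and Population columns used in this example.
aa_pop <- data.frame(
  Level = c("18-44", "45-54", "55-64", "65-74", "75+"),
  Population = c(108151050, 37030152, 23961506, 18135514, 16573966),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the survey's age-group variable.
age_group <- factor(
  c("18-44", "75+", "45-54", "18-44"),
  levels = c("18-44", "45-54", "55-64", "65-74", "75+")
)

# Hypothetical helper: TRUE when the two sets of labels agree exactly.
levels_match <- function(vr, pop) {
  setequal(levels(vr), as.character(pop$Level))
}

levels_match(age_group, aa_pop)

# A label that differs only by a space no longer matches.
bad_pop <- aa_pop
bad_pop$Level[5] <- "75 +"
levels_match(age_group, bad_pop)

# List the survey levels that have no counterpart in the standard population.
setdiff(levels(age_group), bad_pop$Level)
```

The first call returns TRUE and the second FALSE. The final setdiff() call names the survey level that is missing from the standard population, which makes a mismatch easier to locate.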

uspop_example$age_group_std
#>   Level Population
#> 1 18-44  108151050
#> 2 45-54   37030152
#> 3 55-64   23961506
#> 4 65-74   18135514
#> 5   75+   16573966
set_survey(
  nhis2024a
  , aa_vr = "age_group_std"
  , aa_pop = uspop_example$age_group_std
)
Survey info {NHIS 2024 PUF (Adults)}
Variables: 16
Observations: 32,577
Age adjustment: Age-adjusted by age_group_std: 18-44, 45-54, 55-64, 65-74, 75+
Design: Stratified 1 - level Cluster Sampling design (with replacement)
  With (662) clusters.
  nhis2024a = svydesign(data = nhis2024_df, ids = ~ppsu, strata = ~pstrat, weights = ~wtfa_a, nest = TRUE)
set_opts(mode = "nchs", adj = "nhis")

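Now use the same tabulation commands. The table titles indicate that the estimates are age-adjusted. For example, the percentage reporting a lot of difficulty is 8.8 after age adjustment, compared with 9.8 in the crude table.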

tab("dis3_indicator")
Disability indicator (knowns only) {NHIS 2024 PUF (Adults) (age-adjusted)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
No difficulty 16,229 145,116 1,112 142,953 147,311 56.0 0.4 55.2 56.8
Some difficulty 12,607 91,217 1,046 89,189 93,291 35.2 0.4 34.4 35.9
A lot of difficulty 3,740 22,931 554 21,869 24,045 8.8 0.2 8.4 9.3
N = 32576. Checked NCHS presentation standards. Nothing to report.
tab("alot_diff")
Disability: a lot of difficulty (knowns only) {NHIS 2024 PUF (Adults) (age-adjusted)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 28,836 236,333 554 235,248 237,422 91.2 0.2 90.7 91.6
TRUE 3,740 22,931 554 21,869 24,045 8.8 0.2 8.4 9.3
N = 32576. Checked NCHS presentation standards. Nothing to report.
tab_subset("dis3_indicator", "sex_a")
Disability indicator (Sex of Sample Adult = Male) (knowns only) {NHIS 2024 PUF (Adults) (age-adjusted)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
No difficulty 7,786 72,882 714 71,495 74,296 57.8 0.6 56.7 58.9
Some difficulty 5,713 43,448 698 42,100 44,838 34.4 0.6 33.4 35.5
A lot of difficulty 1,463 9,811 357 9,135 10,537 7.8 0.3 7.2 8.4
N = 14962. Checked NCHS presentation standards. Nothing to report.
Disability indicator (Sex of Sample Adult = Female) (knowns only) {NHIS 2024 PUF (Adults) (age-adjusted)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
No difficulty 8,442 72,167 721 70,767 73,595 54.2 0.5 53.2 55.2
Some difficulty 6,891 47,796 678 46,485 49,145 35.9 0.5 35.0 36.9
A lot of difficulty 2,276 13,098 386 12,363 13,876 9.8 0.3 9.3 10.4
N = 17609. Checked NCHS presentation standards. Nothing to report.
tab_subset("alot_diff", "sex_a")
Disability: a lot of difficulty (sex_a = Male) (knowns only) {NHIS 2024 PUF (Adults) (age-adjusted)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 13,499 116,330 357 115,633 117,032 92.2 0.3 91.6 92.8
TRUE 1,463 9,811 357 9,135 10,537 7.8 0.3 7.2 8.4
N = 14962. Checked NCHS presentation standards. Nothing to report.
Disability: a lot of difficulty (sex_a = Female) (knowns only) {NHIS 2024 PUF (Adults) (age-adjusted)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 15,333 119,963 386 119,210 120,722 90.2 0.3 89.6 90.7
TRUE 2,276 13,098 386 12,363 13,876 9.8 0.3 9.3 10.4
N = 17609. Checked NCHS presentation standards. Nothing to report.

As a diagnostic, tabulating the age-adjustment variable should reproduce the standard age distribution. The standard errors of both the counts and the percentages are expected to be zero, because the age distribution has been fixed to the standard population.

tab("age_group_std")
Age group (knowns only) {NHIS 2024 PUF (Adults) (age-adjusted)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
18-44 11,741 137,551 0 137,551 137,551 53.1 0 52.5 53.6
45-54 4,446 47,097 0 47,097 47,097 18.2 0 17.7 18.6
55-64 5,467 30,475 0 30,475 30,475 11.8 0 11.4 12.1
65-74 5,999 23,066 0 23,066 23,066 8.9 0 8.6 9.2
75+ 4,924 21,079 0 21,079 21,079 8.1 0 7.8 8.4
N = 32577. Checked NCHS presentation standards. Nothing to report.
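The Percent column of this diagnostic table can be reproduced by hand from the standard population. The base R sketch below does that and also compares the standard shares with the crude age distribution. The vectors are copied from the output shown in this example, the variable names are illustrative, and the ratio is only one informal way to read the effect of the adjustment.

```r
# Standard population counts, copied from uspop_example$age_group_std.
std_pop <- c("18-44" = 108151050, "45-54" = 37030152, "55-64" = 23961506,
             "65-74" = 18135514, "75+" = 16573966)

# Crude weighted counts in thousands, copied from the crude age-group table.
crude_n <- c("18-44" = 118528, "45-54" = 40006, "55-64" = 41151,
             "65-74" = 33234, "75+" = 26349)

std_share <- std_pop / sum(std_pop)
crude_share <- crude_n / sum(crude_n)

# Expected Percent column of the age-adjusted table, to one decimal place.
round(100 * std_share, 1)

# Ratio above 1: the age group counts for more after adjustment; below 1: less.
round(std_share / crude_share, 2)
```

The rounded shares are 53.1, 18.2, 11.8, 8.9 and 8.1, matching the table. The ratios are above 1 for the two youngest groups and below 1 for the three oldest, which is consistent with the lower age-adjusted percentage for "A lot of difficulty" if that outcome is more common at older ages.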

Counts or proportions

The Population column in aa_pop can contain population counts or proportions. Values are normalized internally, so counts and proportions that describe the same standard population produce the same age-adjusted estimates.

uspop_example$age_group_std_prop
#>   Level Population
#> 1 18-44 0.53053662
#> 2 45-54 0.18165197
#> 3 55-64 0.11754353
#> 4 65-74 0.08896404
#> 5   75+ 0.08130384
set_survey(
  nhis2024a
  , aa_vr = "age_group_std"
  , aa_pop = uspop_example$age_group_std_prop
)
Survey info {NHIS 2024 PUF (Adults)}
Variables: 16
Observations: 32,577
Age adjustment: Age-adjusted by age_group_std: 18-44, 45-54, 55-64, 65-74, 75+
Design: Stratified 1 - level Cluster Sampling design (with replacement)
  With (662) clusters.
  nhis2024a = svydesign(data = nhis2024_df, ids = ~ppsu, strata = ~pstrat, weights = ~wtfa_a, nest = TRUE)
set_opts(mode = "nchs", adj = "nhis")

tab("dis3_indicator")
Disability indicator (knowns only) {NHIS 2024 PUF (Adults) (age-adjusted)}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
No difficulty 16,229 145,116 1,112 142,953 147,311 56.0 0.4 55.2 56.8
Some difficulty 12,607 91,217 1,046 89,189 93,291 35.2 0.4 34.4 35.9
A lot of difficulty 3,740 22,931 554 21,869 24,045 8.8 0.2 8.4 9.3
N = 32576. Checked NCHS presentation standards. Nothing to report.
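The equivalence of the two inputs can be checked by normalizing both by hand. This is a small base R sketch using the counts and proportions printed above. The variable names are illustrative, and the code mirrors the idea of normalization rather than surveytable's own implementation.

```r
# Counts and proportions as printed in this example.
counts <- c(108151050, 37030152, 23961506, 18135514, 16573966)
props <- c(0.53053662, 0.18165197, 0.11754353, 0.08896404, 0.08130384)

# Normalize each version so that it sums to 1.
w_counts <- counts / sum(counts)
w_props <- props / sum(props)

# The two sets of weights agree up to the rounding of the printed proportions.
all.equal(w_counts, w_props, tolerance = 1e-6)
max(abs(w_counts - w_props))
```

The comparison returns TRUE, and the largest absolute difference is on the order of the eight-decimal rounding in the printed proportions, which is why the age-adjusted table above is identical to the one produced from counts.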