September 7th

Recall:

  • Through the 2000 census, both short and long forms were sent out.
  • Beginning in 2010, only the short form is sent; the American Community Survey takes the place of the long form.
  • Bilingual or "swimlane" forms are automatically sent to heavily Hispanic neighborhoods.
  • Printed forms are available in English, Spanish, Chinese, Korean, Vietnamese, and Russian.
  • Supplementary material that explains how to fill out one of the printed forms is available in many other languages.

Bad Data & Imputation:

  • Level 1: A missing or questionable item on a person's form, but the person's other answers allow "correction". For example, sex is left blank but the name is Sarah, so the person is assumed female.
    • Correction of level 1 problems is called Assignment.
  • Level 2: A missing or questionable item on a person's form which cannot be determined from that person's other answers, but can be determined from the answers of other people in the same household. For example, person 2's age is left blank, but person 2 is listed as person 1's child and person 1 is 18, so person 2 can be assumed to be 0-5 years of age.
    • Correction of level 2 problems is called Allocation.
  • Level 3: A missing or questionable item on a person's form which cannot be determined from that person's other answers, nor from the answers of other people in the same household.
    • Correction of level 3 problems is called Substitution.
    • Substitution is done by looking at data of neighbors in similar households to determine weights and then semi-randomly fabricating the missing data.
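
A toy sketch of the three levels, following the examples in the list above. Everything in it is an assumption made for illustration and not the Census Bureau's actual procedure: the name lookup table, the minimum parent age of 13 (picked only because it reproduces the 0-5 range in the Level 2 example), and the function names are all hypothetical.

```python
import random
from collections import Counter

# Hypothetical lookup for the Level 1 example; real edits use richer rules.
SEX_BY_FIRST_NAME = {"Sarah": "F"}

# Hypothetical assumption: a parent was at least this old when the child was born.
MIN_PARENT_AGE_AT_BIRTH = 13


def assign_sex(person):
    """Level 1 (assignment): infer the item from the person's own other answers."""
    return SEX_BY_FIRST_NAME.get(person.get("name"))


def allocate_child_age_range(parent_age):
    """Level 2 (allocation): infer the item from another household member."""
    oldest_possible = max(0, parent_age - MIN_PARENT_AGE_AT_BIRTH)
    return (0, oldest_possible)


def substitute_age(similar_neighbor_ages, rng):
    """Level 3 (substitution): weighted random draw from similar neighbors."""
    counts = Counter(similar_neighbor_ages)
    ages = list(counts)
    weights = [counts[age] for age in ages]  # more common ages are more likely
    return rng.choices(ages, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(assign_sex({"name": "Sarah", "sex": None}))
print(allocate_child_age_range(18))
print(substitute_age([34, 34, 41, 29], rng))
```

The first two calls reproduce the examples from the notes (female for Sarah, 0-5 for the child of an 18-year-old). The third returns one of the neighbors' ages, with 34 twice as likely as the others.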

Accessing Data:

  • Imagine we wanted to determine what percent of the US population has an imputed age.
    • This can be done by the following steps:
      1. Go to the American FactFinder website
      2. Click "Get Data" under Decennial Census
      3. Use the SF1 data set and click "Detailed Tables"
      4. Click "Add" to add the United States and then click "Next"
      5. Click "By Keyword" and search for "Imputation"
      6. Select "P44. Imputation of Age (Not Substituted)" and click "Add"
      7. Search for "Substituted"
      8. Select "P39. Population Substituted (Total Population)" and click "Add" then "Show Results"
      9. Add together the number substituted (3,441,154) and the number with allocated ages (10,400,568), then divide by the total population (281,421,906) to arrive at the percent of the population with an imputed age.
\begin{align} \frac{10{,}400{,}568 + 3{,}441{,}154}{281{,}421{,}906} = \frac{13{,}841{,}722}{281{,}421{,}906} \approx 0.0492 = 4.92\% \end{align}
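
The same arithmetic can be checked in a few lines of Python. This is only a sketch using the three counts quoted in step 9; the variable names are ours.

```python
# Counts quoted in step 9 (SF1 tables P39 and P44, plus total population).
substituted = 3_441_154          # P39: population substituted
allocated_age = 10_400_568       # P44: age allocated (not substituted)
total_population = 281_421_906   # total population counted

imputed = substituted + allocated_age
share = imputed / total_population

print(f"{imputed:,} of {total_population:,} = {share:.2%}")
# prints: 13,841,722 of 281,421,906 = 4.92%
```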

We also discussed how to download and compare multiple geographies (e.g., Georgia counties) by using American FactFinder's option to download tables as comma-delimited files. (We could use some more notes on this…)
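
Once a table has been downloaded as a comma-delimited file, comparing geographies might look something like the sketch below. The column names (`county`, `total_population`, `age_allocated`, `substituted`) and every number in the sample are made up for illustration. A real FactFinder download will have different headers, which would need to be substituted in.

```python
import csv
import io

# Stand-in for a downloaded comma-delimited file. The column names and all
# numbers are placeholders for illustration, not real census figures.
SAMPLE_CSV = """county,total_population,age_allocated,substituted
County A,1000,40,10
County B,2000,50,30
County C,500,30,5
"""


def imputed_age_rates(csv_file):
    """Return {geography name: share of population with an imputed age}."""
    rates = {}
    for row in csv.DictReader(csv_file):
        imputed = int(row["age_allocated"]) + int(row["substituted"])
        rates[row["county"]] = imputed / int(row["total_population"])
    return rates


rates = imputed_age_rates(io.StringIO(SAMPLE_CSV))

# List geographies from the highest imputation rate to the lowest.
for county, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{county}: {rate:.1%}")
```

For a file on disk, `io.StringIO(SAMPLE_CSV)` would be replaced by something like `open(path, newline="")`, where `path` points to the downloaded file.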