Introduction to the Union Army Study
The Union Army Data Set consists of approximately 39,340 white males mustered into the Union
Army during the Civil War, for whom military, socio-economic, and medical information
from several sources throughout their lifetimes has been collected.
The Union Army Data Set is composed of three principal data sets that are based on
three different sources:
- The "Military, Pension, and Medical Records"
The largest data set is the "Military, Pension, and Medical Records" data set,
which is derived from military-related documents housed in the National
Archives in Washington, D.C. These include both war-time records and
applications made by veterans for pension support.
- The "Surgeons' Certificates" data set
Associated with these pension applications are detailed physical examinations, completed by
physicians, that certify the veterans' health and disability status.
Information from these examinations is collected in the second major
data set, known as the "Surgeons' Certificates" data set.
- The "Census Records" data set
The "Census Records" data set contains all information on the veterans
that is available in the U.S. Federal Censuses of 1850, 1860,
1900, and 1910, though not all veterans could be linked successfully to
the Census documents.
All individuals in the Union Army Data Set can be linked across data sets by means of a unique identification number.
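As a rough illustration of what linking across the three data sets by a shared identifier could look like, the sketch below groups rows from each source under one recruit. The field name `rec_id`, the other field names, and all values are hypothetical, invented for this example; they are not taken from the actual files, and this is not the project's own linkage procedure.

```python
# Hypothetical miniature extracts. The field name "rec_id" and every value
# below are invented for illustration only.
military = [
    {"rec_id": "A001", "enlist_year": 1862},
    {"rec_id": "A002", "enlist_year": 1863},
]
surgeons = [
    {"rec_id": "A001", "exam_year": 1890},
    {"rec_id": "A001", "exam_year": 1897},
]
census = [
    {"rec_id": "A002", "census_year": 1900},
]


def link_records(military, surgeons, census):
    """Group rows from the three sources under each recruit's identifier."""
    # Start from the military records: one entry per recruit.
    linked = {
        row["rec_id"]: {"military": row, "surgeons": [], "census": []}
        for row in military
    }
    # A recruit may have zero, one, or several examinations or census rows.
    for row in surgeons:
        if row["rec_id"] in linked:
            linked[row["rec_id"]]["surgeons"].append(row)
    for row in census:
        if row["rec_id"] in linked:
            linked[row["rec_id"]]["census"].append(row)
    return linked


linked = link_records(military, surgeons, census)
```

In this sketch a recruit with no matching census row simply keeps an empty list, which mirrors the point above that not every veteran could be linked to the Census documents.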
The following pages cover in more detail the various aspects of the Union Army Data Set.
The Background and Significance pages describe the motivation
for compiling the Union Army Data Set and briefly illustrate the wide range of issues the
data set has been and can be used to address, ranging from factors affecting the aging
process to economic determinants of labor force participation.
The Sample Design pages describe how the sample of
the approximately 39,340 observations included in the Union Army Data Set was drawn. These pages also
include a statistical analysis to assess how representative the sample of recruits is of the
Union Army and of the entire white male military-age population at the time. In addition, a
list of the companies that are covered in the sample is provided.
The Data Sources pages contain an in-depth description of the
three data sets that together constitute the Union Army Data Set, namely
the "Military, Pension, and Medical Records" data set, the "Surgeons' Certificates" data set,
and the "Census Records" data set. These pages explain how the data were collected at the time of the Civil War, as well as how they were recorded at
the time and preserved over a period of more than one hundred years.
The Variables pages list the
information (variables) available for each recruit (observation) in the Union Army Data Set.
These pages describe how information on the recruits that was found in the original
records was standardized and coded to convert the original information into machine-readable form.
The Download Data icon provides a link to the
Center's interactive platform that allows researchers to access the Union Army Data Set.
The data in the Union Army Data Set make up a portion of the historical data collected
by the project Early Indicators of
Later Work Levels, Disease, and Death (abbreviated EI),
sponsored by the National
Institutes of Health and the National Science Foundation (Grant Numbers
NIH P01 AG10120 and NSF SBR 9114981).
The data were collected
under the direction of the Department of Economics at Brigham Young University
(BYU) and processed by the Center for Population Economics (CPE) at the
University of Chicago. The goal of the project is to
construct data sets suitable for longitudinal studies of factors affecting
the aging process.