Introduction to the Census Records
The Early Indicators project attempted to find Census records
whenever there was enough information from other sources indicating where
to initiate a search. Collection began by extracting informative data
from the Military, Pension, and Medical Records data set. The 1900 Census
was searched first because it gave the year of immigration, if applicable.
If an immigrant veteran was found in 1900, the immigration information
could be used to determine whether or not it was possible to search for
the veteran in 1850 or 1860. After the 1850 and 1860 Census collections
were completed for an entire company, the data were sent to the Family
History Library in Salt Lake City for completion of the 1910 Census
collection because the BYU library did not have complete soundex films for
1910.
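The immigration check described above can be sketched as a small rule. This is only an illustration: the function name and the year-only comparison are assumptions, and the project's researchers would also have weighed the exact enumeration date and other evidence.

```python
from typing import List, Optional

EARLY_CENSUS_YEARS = (1850, 1860)


def searchable_early_years(immigration_year: Optional[int]) -> List[int]:
    """Return the early Census years in which a veteran could plausibly appear.

    immigration_year is the year reported in the 1900 Census, or None for a
    veteran with no immigration year recorded (assumed here to be native born).
    A veteran who arrived in a Census year itself is kept as a candidate,
    which is a deliberately generous assumption.
    """
    if immigration_year is None:
        return list(EARLY_CENSUS_YEARS)
    return [year for year in EARLY_CENSUS_YEARS if year >= immigration_year]


print(searchable_early_years(1855))
```

Under these assumptions, a veteran who reported arriving in 1855 would be searched for in 1860 only, and one who arrived after 1860 would not be searched for in either early Census.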
The search for veterans in the 1900 and 1910 Censuses was usually
more successful than in 1850 and 1860 for two reasons. First, because Census
households were indexed by head of household, veterans' names were much
easier to find in 1900 or 1910, when they were likely the head of
household, than in 1850 or 1860, when they were usually children or
adolescents. Second, data collectors searched the 1900 and 1910 Census
only for veterans who received Civil War pensions and who were assumed,
based on information in the pension record (PEN), to be alive in these
later Census years. Without the information in the PEN, not enough
identifying information was known about the recruit to make a positive
identification in the later Census years; often, there was insufficient
information to even begin searching since the 40-50 years that had passed
were probably his years of greatest mobility and change. Pension records
often contained the veteran's wife's name, marriage date, children's names
and birth dates, and residence information. This kind of information aided
significantly in accurately identifying individuals in the later Censuses.
For the 1850 and 1860 searches, however, researchers looked for everyone
with sufficient information from the Military, Pension, and Medical
Records data set indicating where to begin a search, even if the only
information was the place of enlistment.
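The paragraphs below give further information on the following
topics:
- Searching the 1850 and 1860 Censuses
- Searching the 1900 and 1910 Censuses
- The Walker Collection Data
- Census Information
- The Quality Code System
- Linkage Rates for Census Records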
Searching the 1850 and 1860 Censuses
For the collection of the 1850 and 1860 Censuses, a printed index
listing only heads of households (alphabetically by surname) was used for
each state. In some of the more populous states the indexes were divided
by region within the state. Researchers searched for every recruit with
sufficient information from the Military, Pension, and Medical Records
data set. In general, the county of birth and surrounding counties were
searched in 1850, and the county of enlistment and surrounding counties
were searched in 1860. Many recruits were searched for in both places in
both years.
Locating recruits in the 1850 and 1860 Censuses was challenging
not only because of the difficulties described below (Searching the 1900 and
1910 Censuses), but also because so few recruits were heads of households.
Therefore, it was necessary to know additional information about the
recruit's family. Knowledge of the father's name was particularly
important, since the Census records were indexed by the name of the head
of household. Unfortunately, for the majority of recruits, the father's
name was unknown. As a result, it was necessary to search all households
in a county or town with the same surname. For example, a researcher
searching for a recruit in 1850 named Edwin Church in Philadelphia County,
Pennsylvania, father's name unknown, would have found thirteen listings
with the surname Church. All thirteen would have to have been checked for
an Edwin of the approximate age. If the name being searched for was more
common, there would have been even more listings to be considered.
Obviously with some names there were too many possibilities to search in a
reasonable amount of time. Surnames such as Smith, Jones, Anderson,
Baker, Cook, or Miller had pages of listings in the index, and when the
father's given name was not known, these men could not be searched for
because of time constraints.
If the recruit was living in the county of his birth or if there
was residence information in the pension records, then it was
significantly more likely that he would have been located in the Census.
Even when this information was present, however, it was sometimes
difficult to make successful linkages because there were often numerous
variant spellings in the records. Researchers were careful to check the
index for other possible spellings of a surname and all known variations
found in the Military, Pension, and Medical Records data set.
Searching the 1900 and 1910 Censuses
The Census soundex indexing system was used to find pensioners in
the 1900 and 1910 Censuses. Under the soundex system, surnames were
converted to a code consisting of the first letter of the surname and a
three-digit suffix. If there were not enough consonants in a name to
produce 3 digits, 0's were added to make 3 digits. If there were more
than 3 coded consonants, only the first 3 were used. All vowels, as well as
W, H, and Y, were dropped in the soundexing system, and doubled letters
counted as single letters. (For example, the surname Bassett would
have been converted to a soundex code of B-230.) Within a specific
soundex surname code, individuals were identified alphabetically by given
name.
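The coding rules above can be sketched in a few lines of Python. This is a minimal illustration, not the indexers' actual procedure: it assumes the standard American Soundex letter groups (B, F, P, V = 1; C, G, J, K, Q, S, X, Z = 2; D, T = 3; L = 4; M, N = 5; R = 6) and the common convention that letters with the same code separated only by H or W are coded once. The function name is hypothetical.

```python
# Standard American Soundex letter groups (assumed; not listed in the text above).
SOUNDEX_DIGITS = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex(surname: str) -> str:
    """Return a code such as 'B-230': first letter plus a three-digit suffix."""
    letters = [ch for ch in surname.upper() if ch.isalpha()]
    if not letters:
        raise ValueError("surname must contain at least one letter")

    first = letters[0]
    digits = []
    # Track the previous code so doubled or adjacent same-code letters count once.
    previous = SOUNDEX_DIGITS.get(first, "")
    for ch in letters[1:]:
        code = SOUNDEX_DIGITS.get(ch, "")  # vowels, W, H, and Y have no code
        if code and code != previous:
            digits.append(code)
        if ch not in "HW":  # H and W do not separate same-code letters
            previous = code

    # Keep the first three digits and pad with zeros when there are fewer.
    suffix = "".join(digits)[:3].ljust(3, "0")
    return f"{first}-{suffix}"


print(soundex("Bassett"))  # B-230
print(soundex("Church"))   # C-620
```

Because variant spellings such as Basset and Bassett share one code, a researcher could scan a single soundex group rather than every spelling separately. The sketch does nothing for names whose first letter was recorded incorrectly.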
The soundex indexing system was extremely valuable for finding
individuals in the Census records. Unfortunately, however, even with this
tool there were problems with the Census records that made it difficult to
identify individuals. Some difficulties included the following:
- Spelling Variations. Different spellings of veterans' names
on pension and Census records were common, making identification
difficult, particularly if the first letter was incorrect.
- Multiple given (first) names. Within a specific code
individuals were searched for by given name. The researcher had to know
the given name used in the Census.
- Incorrect coding due to illegible records. Some Census
records were extremely difficult to read because of poor handwriting, poor
filming, or damage to the original record. Occasionally a person found in
the soundex could not be found on the microfilm because of these problems.
Also, sometimes the Census film was too dark or too light to read, and
only partial information could be obtained.
- Lack of location/residence information. State of residence
had to be known in order to locate the recruit in the records.
- Missing head of household. If a recruit was living with a
relative of the same surname and was not the head of household, a
researcher had to know the name of the head of household since only the
head of household was soundexed by his or her given name within the
soundex code.
Although the methods for searching the 1900 and 1910 Censuses were
similar, only 33 states in the 1910 Census had been soundexed. This posed
considerable difficulty in finding veterans in the unsoundexed states. In
some places they could not be searched for at all because of time
constraints. In the unsoundexed states researchers used city directories,
maps, street number indexes, and other library resources. If information
from the pension records indicated that a veteran had lived in a town for
several years before and after the Census year, that town was searched
completely if it could be done in a reasonable amount of time. Using
alternate methods of searching added to the time required to find a
person, and so each case was analyzed to determine whether or not the time
should be spent.
The Walker Collection Data
The Walker data were collected during the summers of 1980 and 1981
by Kent and Mini (Marion) Walker. From the muster rolls in Ohio and New
York, the Walkers collected file number, record number, first name, middle
initial, last name, age, height, place of enlistment, length of
enlistment, occupation, birthplace, skin, hair, eyes, comments regarding
wounds, discharge, death date and place, cause of death, promotions, and
battles. They then searched the 1850 and 1860 censuses for those
recruits, using the age, place of enlistment, birthplace, and occupation
as verifiers. During the collection for the Early Indicators
project, these data were compared against information collected from the
pension (PEN) and military (MSR) records to determine correct matches. If
a recruit was not considered a verifiable match, he was deleted from the
database and searched for again using current collection methods. Those
recruits considered to be verifiable finds in the Census were not searched
again, but no attempt was made to assign quality codes (see below, The
Quality Code System). The Walker data set was then merged into the
current Census Records data set. There is a binary variable in the
Early Indicators Census Records data set, walker, indicating
whether or not the observation originally came from the Walker collection.
Census Information
The U.S. Census has changed many times since its inception. These
changes are reflected in the variables that were collected in each Census
year. Below is an alphabetical list of the variables and the Census years
that contain the indicated variable. Of course, the Census documents were
incomplete in some cases, so not all of the information below was
available for every recruit. A complete layout of variables is given in
Section III, which also contains the number of non-missing values for each
variable. Detailed variable descriptions are given in Appendix A.
- Identification of Individuals
- Identification Number (1850, 1860, 1900, 1910)
- Name (1850, 1860, 1900, 1910)
- Relationship to Household Head (1900, 1910)
- Demographic and Socio-Economic Variables
- Age
- At Time of Census (1850, 1860)
- At Last Birthday (1900, 1910)
- Year, Month, and Place of Birth
- Birth Year (1900)
- Birth Month (1900)
- Birthplace (1850, 1860, 1900, 1910)
- Children
- Number of Living Children (1900, 1910)
- Number of Children (1900, 1910)
- Color of Skin (1850, 1860, 1900, 1910)
- Disability
- Deaf, Dumb, Blind, or Insane (1850, 1860)
- Deaf and Dumb (1910)
- Blind in Both Eyes (1910)
- Education
- Attended School Within the Last Year (1850, 1860)
- Number of Months in School Since 09/01/1899 (1900)
- School Attended Since 09/09/1909 (1910)
- Employment Status
- Number of Months Unemployed Within Year (1900)
- Unemployed on 4/15/1910 (1910)
- Employment Status (Worker or Employer) (1910)
- Number of Weeks Unemployed in 1909 (1910)
- Gender (1850, 1860, 1900, 1910)
- Immigration / Naturalization
- Number of Years in U.S. (1900)
- Year of Immigration to the U.S. (1900, 1910)
- Naturalization Status (1900, 1910)
- Language
- Speaks English (1900)
- English or Other Language (1910)
- Literacy
- Household Member over 20 is Illiterate (1850, 1860)
- Reads (1900, 1910)
- Writes (1900, 1910)
- Marital Status
- Married Within the Last Year (1850, 1860)
- Marital Status (1900, 1910)
- Number of Years Married (1900, 1910)
- Occupation
- Occupation, Trade, or Other Work (1850, 1860, 1900, 1910)
- Nature of Industry or Business (1910)
- Occupation Code (1850, 1860, 1900, 1910)
- Parents' Birthplace
- Father's Birthplace (1900, 1910)
- Mother's Birthplace (1900, 1910)
- Property / Home Ownership
- Owns or Rents Home (1900, 1910)
- Owns Property in Question Free or Mortgaged (1900, 1910)
- Farm or house (1900, 1910)
- Veteran
- Veteran of Union or Confederate Army (1910)
- Veteran of Union or Confederate Navy (1910)
- Wealth
- Real Estate Owned (1850, 1860)
- Personal Property (1860)
- Quality Codes and Remarks
- Quality of Link Code (1850, 1860, 1900, 1910)
- Remarks about Individuals
- Inputter Remarks
- Enumeration Date and Place
- Date (1850, 1860, 1900, 1910)
- House Number on Street (1900, 1910)
- Institution (1900, 1910)
- Street Address (1900, 1910)
- Post Office District (1860)
- Enumeration District (1900, 1910)
- Supervisor's District (1900, 1910)
- Political Ward (1900, 1910)
- Town
- Name of Town (1850, 1860)
- Name of Township (1900, 1910)
- Name of Subdivision (1900)
- Name of Incorporated City, Town, or Village (1910)
- County (1850, 1860, 1900, 1910)
- State (1850, 1860, 1900, 1910)
- Census Record Information
There are also variables in the current data submission that
reference the original Census record used for each recruit. The variables
are given below:
- Family Number (1850, 1860, 1900, 1910)
- Dwelling Number (1850, 1860, 1900, 1910)
- Farm Schedule Number (1900, 1910)
- Library Call Number for Film (1850, 1860, 1900, 1910)
- Page Number of Census Manuscript (1850, 1860, 1900, 1910)
- Sheet Number of Enumeration (1900, 1910)
The Quality Code System
A quality code of 1, 2, 3, or 4 was assigned each time a veteran
was successfully linked to one of the Census years, except those
previously found in the Walker collection, where codes were not assigned
(see above, The Walker Collection Data). The quality code indicates the
reliability of the linkage or, in other words, the extent to which
information from PEN and MSR verified the information in the Census. A
quality code of 1 indicates the strongest match and 4 the weakest.
Although effort was made to make the quality codes specific and objective,
some subjectivity was involved in each assignment, particularly in codes 3
and 4. In all cases an individual who was considered found in the Census
had to have had a name and an approximate age to match those in the
recruit's PEN and MSR. However, the name may have been any one of several
different names or spellings that appeared in the PEN and MSR.
Quality Code 1:
In addition to agreement of name and age, in order to earn a
quality code of 1 a person found in the Census had to have two or more of
the following identifying pieces of information:
- Place of birth (state or country)
- Father's name
- Mother's name
- Names of siblings
- Names of children
- Name of spouse
- Specific address
In 1850 and 1860, a quality code of 1 was justified by the
father's and mother's name or siblings' names, as well as birthplace. In
the 1850 Census, 1 was rarely assigned because the parents' names were
seldom known. However, by 1860 more of the men in the sample were married
and these were often a code 1 because the name of the spouse was known.
In 1900 and 1910, names of a wife and children, or a specific address with
house number and street justified a code of 1. The 1900 Census asked for
the number of years married. When this number corresponded to the actual
marriage date found in the pension records, it was used as an additional
piece of identifying information.
Quality Code 2:
A quality code of 2 was given when specific names of family
members were not known but there was corroborating information
indicating a strong link with the recruit. In addition to agreement of
recruit's name and age, at least two of the following criteria had to
exist to justify a quality code of 2:
- Living in the expected place to be found. This could be a
birthplace, enlistment place, or marriage place at a date close to the
Census year.
- Skilled occupation. A match occurred when a skilled
occupation was found that matched the occupation found in the PEN and MSR.
This criterion was used frequently in geographic areas where most of the
men on the Census were farmers. It was not as reliable in urban areas.
- Surname was unique. If in the county or township where the
recruit was expected to be found there were no other families of the same
surname, this criterion could be met. This could be determined from the
index.
- A very uncommon name. This could be either a surname or a
given name.
- Name and middle initial matched Pension information. If a
person was found where expected and his age and birthplace matched the
information found in the PEN and MSR, a name with middle initial could
have justified a quality code of 2.
- Living next door or in close proximity to other recruits in the
same company. Since companies were often formed by volunteers from
the same town, this criterion could have justified a quality code of 2.
Quality Code 3:
A quality code of 3 was given when a person was living where
expected and had matching information for name, age, and birthplace. A
code of 3 was considered to be a good link and very often the person on
the Census was thought by the researcher to be a "sure find," but lacking
names of other family members, a higher quality code could not be given.
Most veterans found in 1850 and 1860 are code 3 because they were usually
of minor age with no parent, sibling, or spouse information.
Quality Code 4:
A quality code of 4 indicates that a possible link exists because
the name and age matched the information from the PEN and MSR, but there
was not enough information for the researcher to justify a higher quality
code. This occurred frequently for recruits from large cities. The
name, age, and birthplace may have matched the PEN and MSR information,
but multiple possibilities caused the link to be uncertain. Immigrants
without parental information were especially difficult to link in 1850 and
1860 when the date of entry to the United States was unknown. This meant
that the only location available to the researcher was the place of
enlistment, which may not have been the permanent place of residence.
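The thresholds described for codes 1 through 4 can be summarized as a rough decision rule. The sketch below is an assumption-laden simplification: the class, field names, and function are hypothetical, the counts stand in for the itemized lists above, and the subjective judgment that researchers applied (especially for codes 3 and 4) is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkEvidence:
    """Evidence gathered for one candidate Census match (hypothetical structure)."""
    name_and_age_match: bool
    code1_identifiers: int = 0       # matched items from the Quality Code 1 list
    code2_criteria: int = 0          # satisfied items from the Quality Code 2 list
    living_where_expected: bool = False
    birthplace_matches: bool = False


def quality_code(evidence: LinkEvidence) -> Optional[int]:
    """Return 1 (strongest) to 4 (weakest), or None when there is no link."""
    if not evidence.name_and_age_match:
        return None  # name and approximate age had to agree in every case
    if evidence.code1_identifiers >= 2:
        return 1
    if evidence.code2_criteria >= 2:
        return 2
    if evidence.living_where_expected and evidence.birthplace_matches:
        return 3
    return 4  # name and age agree, but nothing more could be verified


print(quality_code(LinkEvidence(name_and_age_match=True, code1_identifiers=2)))
```

A rule like this is useful mainly for checking that assigned codes are internally consistent. It should not be read as the procedure the data collectors followed.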
Summary
Quality codes have been used with the Census data in an
attempt to indicate the accuracy of linkage. The codes were designed to
be as concise and objective as possible. However, there are many
subtleties of Census research that cannot be codified. The codes should,
nonetheless, prove to be valuable guides to data users.
Linkage Rates for Census Records
Some sample selection bias may arise in the use of the data in this
ICPSR submission due to linkage failures: the failure to find a given
individual from the main sample in the Census records. As noted earlier,
XX% of veterans are linked to at least one of the four Census years. The
direction and magnitude of the selection bias will depend on how closely
the variables in the linked data are correlated with the factors that
determine linkage to the Census manuscripts. Factors that are known to
influence linkage to the Census data include date of death, migration from
one state to another or within a state, movement into or out of different
households, and socio-economic status.
Users of the Census data should also take note of the differences
in variables across the different Census years. Some variables, such as
birthplace (recbpl) or occupation (recocc), can be traced
across all four Census years, while others, such as birth month
(recbmo) or blindness (recbnd), occur only in a particular
Census year (in these cases 1900 and 1910, respectively). Furthermore, it is possible that
the quality of data differs across locations, years, and census takers.
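One way to keep these differences in view is a small lookup of the Census years in which each variable was collected. The sketch below covers only the four variables named above. The years are taken from the Census Information list, the pairing of recbnd with the 1910 blindness item is an assumption, and the helper name is hypothetical.

```python
# Census years in which each variable was collected (illustrative subset only).
AVAILABILITY = {
    "recbpl": {1850, 1860, 1900, 1910},  # birthplace
    "recocc": {1850, 1860, 1900, 1910},  # occupation
    "recbmo": {1900},                    # birth month
    "recbnd": {1910},                    # blindness (assumed to be the 1910 item)
}


def comparable_years(*variables: str) -> list:
    """Return the sorted Census years in which all given variables were collected."""
    if not variables:
        return []
    years = set.intersection(*(AVAILABILITY[name] for name in variables))
    return sorted(years)


print(comparable_years("recbpl", "recocc"))  # [1850, 1860, 1900, 1910]
print(comparable_years("recbmo", "recbnd"))  # []
```

An empty result signals that the chosen variables never appear together in one Census year, so they cannot be compared within a single enumeration.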