How Couples Meet and Stay Together (HCMST)
- A totally new survey, HCMST 2017, fielded in the summer of 2017, with a fresh sample of 3,510 American adults, with lots of new questions about phone dating apps and other ways of meeting and dating. This new dataset is available on a separate page: https://data.stanford.edu/hcmst2017
How Couples Meet and Stay Together (HCMST) is a study of how Americans meet their spouses and romantic partners.
- The study is a nationally representative study of American adults.
- 4,002 adults responded to the survey; 3,009 of them had a spouse or main romantic partner.
- The study oversamples self-identified gay, lesbian, and bisexual adults.
- Follow-up surveys were implemented one and two years after the main survey, to study couple dissolution rates. Version 3.0 of the dataset includes two follow-up surveys, waves 2 and 3.
- Waves 4 and 5 are provided as separate data files that can be linked back to the main file via variable caseid_new.
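Linking a supplement back to the main file amounts to a left join on caseid_new. The following is a minimal sketch in plain Python, not the research team's own procedure. The rows are toy data, and the column names "partnered" and "w4_answer" are hypothetical placeholders; only caseid_new comes from the dataset documentation. The sketch assumes caseid_new appears at most once per file.

```python
def left_join_on_caseid(main_rows, supplement_rows, key="caseid_new"):
    """Attach supplement columns to each main-file row that shares the same key."""
    supplement_by_id = {}
    for row in supplement_rows:
        if row[key] in supplement_by_id:
            # Assumption: each respondent appears at most once in a supplement file.
            raise ValueError(f"duplicate {key} in supplement: {row[key]}")
        supplement_by_id[row[key]] = row

    merged = []
    for row in main_rows:
        extra = supplement_by_id.get(row[key], {})
        # Main-file values win if a column name appears in both files.
        merged.append({**extra, **row})
    return merged


# Toy rows; "partnered" and "w4_answer" are hypothetical column names.
main_rows = [
    {"caseid_new": 101, "partnered": 1},
    {"caseid_new": 102, "partnered": 0},
]
wave4_rows = [{"caseid_new": 101, "w4_answer": "yes"}]

merged = left_join_on_caseid(main_rows, wave4_rows)
```

Every main-file respondent is kept, so respondents who did not take part in a later wave simply carry no supplement columns. In a statistics package the same operation is a one-to-one merge keyed on caseid_new.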
The study will provide answers to the following research questions:
- Do traditional couples and nontraditional couples meet in the same way? What kinds of couples are more likely to have met online?
- Have the most recent marriage cohorts (especially the traditional heterosexual same-race married couples) met in the same way their parents and grandparents did?
- Does meeting online lead to greater or less couple stability?
- How do the couple dissolution rates of nontraditional couples compare to the couple dissolution rates of more traditional same-race heterosexual couples?
- How does the availability of civil union, domestic partnership or same-sex marriage rights affect couple stability for same-sex couples? This study will provide the first nationally representative data on the couple dissolution rates of same-sex couples.
Rosenfeld, Michael J.
Core funding from the US National Science Foundation, award SES-0751977
Supplementary funding from Stanford's Institute for Research in the Social Sciences
Supplementary funding from the UPS endowment at Stanford University
Waves 4 and 5 were funded by the US National Science Foundation, award SES-1153867
Wave 6 of HCMST 2017 was funded by the UPS Endowment at Stanford University
How to Cite this Dataset:
Rosenfeld, Michael J., Reuben J. Thomas, and Maja Falcon. 2018. How Couples Meet and Stay Together, Waves 1, 2, and 3: Public version 3.04, plus wave 4 supplement version 1.02 and wave 5 supplement version 1.0 and wave 6 supplement ver 1.0 [Computer files]. Stanford, CA: Stanford University Libraries.
The HCMST data are freely available to users who register with SSDS/ Stanford Libraries.
* Note [12/06/2011] data.stanford.edu has a new web server, so if you find that your old login and password do not work, please clear your browser cache and cookies and try again. Thanks, and sorry for any difficulties.
I acknowledge core funding support from the U.S. National Science Foundation, and supplementary funding from Stanford's Institute for Research in the Social Sciences, and the UPS endowment at Stanford University. For research assistance, I thank Reuben Jasper Thomas, Elizabeth McClintock, Esra Burak, Kate Weisshaar, Taylor Orth, Ariela Schachter, Maja Falcon, and Sonia Hausen. For Web design and assistance, I am grateful to Ron Nakao and the Stanford Library. The following consultants contributed to the development of the survey instrument and the research design: Gary Gates, Jon Krosnick, Brian Powell, Daniel Lichter, Matthijs Kalmijn, Timothy Biblarz, and the staff of Knowledge Networks/GfK.
The universe for the HCMST survey is English literate adults in the U.S.
Unit of Analysis:
Type of data collection:
Time of data collection:
Wave I, the main survey, was fielded between February 21 and April 2, 2009. Wave 2 was fielded March 12, 2010 to June 8, 2010. Wave 3 was fielded March 22, 2011 to August 29, 2011. Wave 4 was fielded between March and November of 2013. Wave 5 was fielded between November, 2014 and March, 2015. Dates for the background demographic surveys are described in the User's Guide, under documentation below.
United States of America
Smallest geographic unit:
The survey was carried out by survey firm Knowledge Networks (now called GfK). The survey respondents were recruited from an ongoing panel. Panelists are recruited via random digit dial phone survey. Survey questions were mostly answered online; some follow-up surveys were conducted by phone. Panelists who did not have internet access at home were given an internet access device (WebTV). For further information about how the Knowledge Networks hybrid phone-internet survey compares to other survey methodology, see attached documentation.
The dataset contains variables derived from several sources: variables from the Main Survey Instrument, variables created by the investigators after the Main Survey, and demographic background variables from Knowledge Networks which pre-date the Main Survey. Dates for the main survey and for the prior background surveys are included in the dataset for each respondent. The source for each variable is identified in the codebook, and in notes appended within the dataset itself (notes may only be available for the Stata version of the dataset).
Respondents who had no spouse or main romantic partner were dropped from the Main Survey. Unpartnered respondents remain in the dataset, and demographic background variables are available for them.
Sample response rate:
Response to the main survey in 2009 from subjects, all of whom were already in the Knowledge Networks panel, was 71%. If we include the prior initial Random Digit Dialing phone contact and agreement to join the Knowledge Networks panel (participation rate 32.6%), and the respondents’ completion of the initial demographic survey (56.8% completion), the composite overall response rate is a much lower 0.326 × 0.568 × 0.71 ≈ 13%. For further information on the calculation of response rates, and relevant citations, see the Note on Response Rates in the documentation. Response rates for the subsequent waves of the HCMST survey are simpler, using the denominator of people who completed wave 1 and who were eligible for follow-up. Response to wave 2 was 84.5%. Response rate to wave 3 was 72.9%. Response rate to wave 4 was 60.0%. Response rate to wave 5 was 46%. Response to wave 6 was 91.3%. Wave 6 was Internet only, so people who had left the GfK KnowledgePanel were not contacted.
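The composite wave 1 response rate is the product of the three stage rates quoted above. A short sketch of that arithmetic follows; the function and variable names are illustrative only and do not come from the HCMST documentation.

```python
import math


def composite_response_rate(*stage_rates):
    """Multiply the response rates of successive recruitment stages."""
    return math.prod(stage_rates)


rdd_recruitment = 0.326     # agreed to join the panel after random digit dial contact
profile_completion = 0.568  # completed the initial demographic survey
wave1_response = 0.71       # responded to the 2009 main survey

composite = composite_response_rate(rdd_recruitment, profile_completion, wave1_response)
print(f"{composite:.3f}")  # prints 0.131
```

The product is about 0.131, which rounds to the 13% reported above.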
See "Notes on the Weights" in the Documentation section.
Web site or document download link(s):
User's Guide with basic information about data sources and variable layout, version 5, June 16, 2015
Codebook with frequencies Waves 1, 2, and 3, data version 3.04, date December 16, 2011
Main Survey Instrument Wave I updated Feb 4 2011 (pdf)
Notes on the coding of open text Q24, updated for version 3 "How did you meet.." Date Sept 26, 2009 (pdf)
Notes on Knowledge Networks Survey Methodology, with references Date Sept 26, 2009 (pdf)
Selections from the Knowledge Networks Field Report, Main Survey Date September 26, 2009 (pdf)
Wave 2 (first follow-up) survey instrument Date Oct 24, 2010
Wave 3 (second follow-up) survey instrument Date July 30, 2011
Updated notes on use of weights in HCMST, revised 10/22/2012
Note on response rates for the various waves of HCMST, version 1, uploaded 2/9/2013
HCMST wave 4 survey instrument, from February 2013
Codebook for wave 4 supplement version 1.02
Questionnaire instrument for wave 5 supplement version 1.0
Codebook for wave 5 supplement, version 1.0
KN/GfK question text for background question, version 2
Questionnaire instrument for wave 6 supplement version 1.0
Codebook for wave 6, version 1.0
Data Use Agreement
- The data I download from the Data Archive will not be used to identify individuals.
- I will not charge a fee for the data if I distribute it to others.
- I will inform the contact person for each dataset about work I do using their dataset.
(This helps us keep an accurate bibliography. See each data page for its contact email.)
- I will cite the data appropriately. (See each data page for its bibliographic citation.)
Data Download Links
HCMST version 3.04, STATA format
HCMST version 3.04, SPSS format
HCMST version 3.04, SAS format
HCMST wave 4 supplement, version 1.02, STATA 12 format
HCMST wave 4 supplement, version 1.02, SPSS format
HCMST wave 4 supplement, version 1.02, SAS format
HCMST wave 5 supplement, ver 1.0, June 16, 2015, Stata 12 format
HCMST wave 5 supplement, ver 1.0, SPSS format
HCMST wave 5 supplement, ver 1.0, SAS format
HCMST wave 6 supplement, ver 1.0, Stata format
Note for SPSS and SAS users: we have replaced the portable versions of the SPSS and SAS files with the .sav and .sas7bdat versions, respectively, to accommodate the long variable names in the dataset. The Stanford research team does all their work in STATA, so if you find discrepancies between the SAS or SPSS versions of the dataset and the documentation, please let us know. Thus far we have found that SPSS truncates value labels to 32 characters.
Current Data Version 3.04 plus wave 4 supplement, version 1.02, wave 5 supplement version 1.0, and wave 6 supplement version 1.0
Schedule of Future Additions to the HCMST dataset
- All 5 budgeted waves of HCMST have been completed and publicly posted.
- HCMST wave 6 was fielded in summer, 2017, and will be posted soon.
- HCMST 2017, a new study with fresh subjects, was fielded in the summer of 2017 and will be posted soon.
Forthcoming and restricted HCMST data
- Disclosure of redacted full-text answers to q24 ("how couples meet") and q35 ("explain relationship quality"). Because of demand from users of HCMST, we have redacted the Q24 and Q35 text answers, and obtained IRB approval to share the redacted answers on a restricted basis. As of February, 2013, ICPSR is making the edited versions of full-text q24 and q35 available to researchers who get their own IRB approval to host the data. Contact ICPSR for access.
- We are planning (at a future date) to redact the text variables from waves 4 and 5 and append them to the restricted data hosted by ICPSR.
- Geographic codes for ZIP code, as well as a variety of state-based variables, which have been suppressed from the public dataset in order to preserve respondent confidentiality, are available from ICPSR for users who obtain IRB approval.
Frequently Asked Questions:
Q) There are a variety of different kinds of questions in the dataset about sexual identity, whether the respondent is part of a same-sex couple, and what gender of person the respondent is sexually attracted to. The answers to these questions sometimes seem to provide contradictory information. Why?
A) There is some inherent ambiguity in the realm of sexual identity and in identifying same-sex couples. There is also the possibility that a small number of respondents don't understand the questions. PI Rosenfeld created several new variables for the dataset: same_sex_couple, potential_partner_gender_recodes, and alt_partner_gender. These new variables represent the researcher's best guess as to the gender of the partner, and as to whether the couple is a same-sex couple. In creating these variables PI Rosenfeld relied mostly on the variables in the public data, and a little bit on the text answers that are not part of the public data.
Q) What is the variable that identifies the partnered respondents?
Q) Why do the variables for children in the household (such as ppt01, ppt25, etc) not yield exactly the same information about household members as the household roster variables (such as pphhcomp11_member2_relationship)?
A) There are several reasons for the discrepancies. First, the ppt01 and ppt25 variables were derived from answers provided by the head of household, while the household roster variables were derived from the survey respondent. Not all survey respondents are the head of household as far as Knowledge Networks is concerned (see variable pphhhead). Second, the household survey that was the source of the ppt01 and ppt** variables may not have taken place at the same time as the Core Adult Profile, which was the source of the household roster variables such as pphhcomp11_member2_relationship. Lastly, the ppt01 and ppt** variables are incremented over time, as the children in the household are presumed to age over time. So the ppt** variables are accurate for the time of wave 1 of the HCMST survey, whereas the household roster variables are accurate reports from the time of the Core Adult Profile, which took place earlier.
Changes, additions, and improvements to the dataset
Changes for version 2.0
- Version 2.0 of the dataset includes new variables from wave II of the survey, the one year follow-up, along with the previously available variables from wave I, the main survey. See the new User's Guide under documentation for more information about variable layout.
- Version 2.0 also includes a second round of background demographic data for most respondents in the dataset, see User's Guide for variable layout.
- The Stanford research team has added a new variable, how_met_online, which categorizes the prior social connections (if any) between respondent and partner for respondents who met their partners online, based largely on an exhaustive re-analysis of the respondents' open text answers to q24 (the open text answers are not yet available in the public dataset for respondent confidentiality reasons). See also the new variable either_internet_adjusted.
- Version 2.0 includes a new couple weight, weight_couples_coresident, see the updated documentation on weights for more details.
- For version 2.0 the Stanford research team has added two new date variables in YYYYMMDD format, HCM_main_interview_fin_date and w2_HCM_interview_fin_date. These variables are easier to read but less useful for analysis than the other date-time format variables already in the dataset.
- The variable partner_deceased has been updated to reflect the discovery of a few more cases of respondents whose partner was already deceased at the time of the main survey.
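The YYYYMMDD date variables can be made usable for analysis by converting them to calendar dates. The sketch below assumes the values are stored as integers or digit strings, which the documentation does not confirm; parse_yyyymmdd is an illustrative helper, not part of the dataset. The two example values are the wave 1 fielding dates given above (February 21 and April 2, 2009), not real respondent records.

```python
from datetime import date, datetime


def parse_yyyymmdd(value):
    """Convert a YYYYMMDD value (int or digit string) to a datetime.date."""
    return datetime.strptime(str(int(value)), "%Y%m%d").date()


# Example values: the start and end of the wave 1 fielding period.
start = parse_yyyymmdd(20090221)
end = parse_yyyymmdd("20090402")

print((end - start).days)  # prints 40
```

Once converted, the dates can be subtracted to obtain elapsed days, for instance between HCM_main_interview_fin_date and w2_HCM_interview_fin_date for the same respondent.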
Changes from version 2.0 to version 3.02
- Version 3 includes wave 3 of the survey (the second follow-up survey), with variables generally starting with w3_*, along with the third round of core adult profile data, with variables generally starting with pp3_*
- New variables describing the particular family members who played the intermediary role in respondent meeting partner, variables coded q24_fam*
- The documentation for earlier versions offered a not-quite-correct explanation for the variable ppnet. ppnet actually codes whether the subject had their own internet access at home at the time of the profile survey. This can change with each wave of the KN profile survey, so each profile survey carries a new version of ppnet (see pp2_ppnet and pp3_ppnet).
- The Stanford research team decided to include newer versions of profile data for subjects’ race and education with each new wave of profile data; see for instance pp2_ppeduc, pp3_ppeduc.
- Variable q18a_3 label and description were clarified.
- Variable w2_xss label and description were clarified.
- Labels for how_long_ago_first_met, how_long_ago_first_cohab, etc were clarified to make clear that the unit is years.
- Re-coded w2_broke_up and w3_brokeup_actual to distinguish between break-up and partner deceased.
- Added variables for interstate moves by subjects between pp1, pp2, and pp3, see for instance interstate_mover_pp1_pp2
- Many clarifications to documentation and variable and value labels
Research Papers Using HCMST
- Rosenfeld and Thomas 2012, Searching for a Mate: The Rise of the Internet as a Social Intermediary, published in the August, 2012 American Sociological Review 77 (4) 523-547. A pre-typeset version is also available.
- Thomas, 2011 How Americans (mostly don't) Find an Interracial Partner, working paper.
- Rosenfeld, 2014, Couple Longevity in the Era of Same-Sex Marriage in the US, Journal of Marriage and Family 76: 905-918
- Weisshaar, Kate, 2014, "Earnings Equality and Relationship Stability for Same-Sex and Heterosexual Couples," Social Forces 93(1): 93-123.
- Falcon, Maja, 2015, "Family Influences on Mate Selection: Outcomes for Homogamy and Same-Sex Coupling," working paper.
- Schachter, Ariela, 2015, "Measurement Error in Panel Data: A Comparison of Face-to-Face and Internet Survey Samples," working paper.
- Rosenfeld, 2015, "Who Wants the Breakup? Gender and Breakup in Heterosexual Couples," forthcoming in an edited volume Social Networks and the Life Course.
- Rosenfeld, 2017, "Marriage, Choice and Couplehood in the Age of the Internet." Published in Sociological Science, 4: 490-510.
- Rosenfeld, 2017, "How Tinder and the dating apps Are and are Not Changing dating and mating in the U.S," conference paper.
- USA Today, Feb 11, 2010 story by Sharon Jayson on friends, the Web, and How Couples Meet
- Stanford Report, Feb 11, 2010 a feature story on How Couples Meet, with video
- San Jose Mercury News, Feb 14, 2010 Growing Number of Singles Find Their Valentines Online (link currently unavailable).
- NPR, "Computers are Becoming Cupid's Best Weapon," story by Jennifer Ludden August 16, 2010
- Reuters Newswire Being Online can Boost Your Chances of Being In Love, August 16, 2010
- Radio Nacional de Colombia story, August 16, 2010
- The Economist story "Love at First Byte", December 29, 2010.
- The Discovery Channel story "Does Online Dating Work?", February 11, 2011
- The ABC News version of the Discovery Channel story, from February 12, 2011
- A New York Times article on online dating, "Love, Lies and What They Learned," from November 12 (online) and November 13 (print), 2011.
- Aziz Ansari and Eric Klinenberg's 2015 book, Modern Romance, draws usefully on the changing way couples meet, based on findings from the How Couples Meet and Stay Together dataset, and on the analyses in the Rosenfeld and Thomas 2012 American Sociological Review paper. See also the June, 2015 Time Magazine online story by Ansari and the June 14, 2015 Ansari and Klinenberg opinion piece on online dating in the New York Times.
- More recent links to press coverage of the HCMST project can be found on the press links page at Rosenfeld's Stanford website, here.