
Database on Ideology, Money in Politics, and Elections (DIME): Public version 2.0

Abstract: 

The Database on Ideology, Money in Politics, and Elections (DIME) is intended as a general resource for the study of campaign finance and ideology in American politics. The database was developed as part of the project on Ideology in the Political Marketplace, an ongoing effort to perform a comprehensive ideological mapping of political elites, interest groups, and donors using the common-space CFscore scaling methodology (Bonica 2014). Constructing the database required a large-scale effort to compile, clean, and process data on contribution records, candidate characteristics, and election outcomes from various sources. The resulting database contains over 130 million political contributions made by individuals and organizations to local, state, and federal elections spanning the period from 1979 to 2014. A corresponding database of candidates and committees provides additional information on state and federal elections.

The included measures of ideology have been extensively validated across several studies spanning a variety of institutional settings and types of actors. A compendium of these validation results can be accessed here.

The DIME+ data repository on congressional activity extends DIME to cover detailed data on legislative voting, lawmaking, and political rhetoric. (See https://data.stanford.edu/dime-plus for details.)

Principal Investigator: 

Adam Bonica

How to Cite this Dataset: 

Bonica, Adam. 2016. Database on Ideology, Money in Politics, and Elections: Public version 2.0 [Computer file]. Stanford, CA: Stanford University Libraries. <https://data.stanford.edu/dime>

Contact Email: 


Introduction: 

The database is intended to make data on campaign finance and elections (1) more centralized and accessible, (2) easier to work with, and (3) more versatile in terms of the types of questions that can be addressed. A list of the main value-added features of the database is below:

Data processing: Names, addresses, and occupation and employer titles have been cleaned and standardized.

Unique identifiers: Entity resolution techniques were used to assign unique identifiers for all individual and institutional donors included in the database. The contributor IDs make it possible to track giving by individuals across election cycles and levels of government.
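
As a rough illustration of what the contributor IDs make possible, the sketch below totals one donor's giving by election cycle. The records and field names (contributor_id, cycle, level, amount) are hypothetical stand-ins chosen for this example, not the actual DIME column names.

```python
from collections import defaultdict

# Hypothetical contribution records; the field names are illustrative
# assumptions, not the actual DIME column names.
records = [
    {"contributor_id": "c001", "cycle": 2008, "level": "federal", "amount": 500.0},
    {"contributor_id": "c001", "cycle": 2010, "level": "state", "amount": 250.0},
    {"contributor_id": "c002", "cycle": 2010, "level": "federal", "amount": 1000.0},
    {"contributor_id": "c001", "cycle": 2010, "level": "federal", "amount": 100.0},
]


def giving_by_cycle(rows):
    """Total amount per contributor ID and election cycle."""
    totals = defaultdict(lambda: defaultdict(float))
    for row in rows:
        # State and federal records share the same contributor ID,
        # so they accumulate under the same donor.
        totals[row["contributor_id"]][row["cycle"]] += row["amount"]
    # Convert to plain dicts so the result is easy to inspect and compare.
    return {cid: dict(by_cycle) for cid, by_cycle in totals.items()}


history = giving_by_cycle(records)
```

Because the ID is shared across levels of government, the state and federal contributions from the same donor in 2010 fall into a single total.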

Geocoding: Each record has been geocoded and placed into congressional districts. The geocoding scheme relies on the contributor IDs to assign a complete set of consistent geo-coordinates to donors that report their full address in some records but not in others. This is accomplished by combining information on self-reported address across records. The geocoding scheme further takes into account donors with multiple addresses. Geocoding was performed using the Data Science Toolkit maintained by Pete Warden and hosted at http://www.datasciencetoolkit.org/. Shape files for congressional districts are from Census.gov (http://www.census.gov/rdo/data).
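
The idea of combining self-reported addresses across a donor's records can be sketched as follows. This is a simplified illustration, not the procedure actually used to build the database: it assumes hypothetical fields (contributor_id, lat, lon) and fills records that lack coordinates with the donor's most frequently reported location, leaving donors with no known location untouched.

```python
from collections import Counter, defaultdict

# Hypothetical records; lat/lon are None where no full address was reported.
records = [
    {"contributor_id": "c001", "lat": 37.42, "lon": -122.17},
    {"contributor_id": "c001", "lat": None, "lon": None},
    {"contributor_id": "c001", "lat": 37.42, "lon": -122.17},
    {"contributor_id": "c001", "lat": 40.71, "lon": -74.01},
    {"contributor_id": "c002", "lat": None, "lon": None},
]


def fill_missing_coordinates(rows):
    """Return copies of rows with missing coordinates filled per donor."""
    # Count how often each coordinate pair appears for each contributor.
    known = defaultdict(Counter)
    for row in rows:
        if row["lat"] is not None and row["lon"] is not None:
            known[row["contributor_id"]][(row["lat"], row["lon"])] += 1

    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input records are not modified
        if new_row["lat"] is None or new_row["lon"] is None:
            counts = known.get(new_row["contributor_id"])
            if counts:
                # Assumption: use the donor's most frequently reported location.
                new_row["lat"], new_row["lon"] = counts.most_common(1)[0][0]
        filled.append(new_row)
    return filled


filled = fill_missing_coordinates(records)
```

Records that already carry coordinates are kept unchanged, so a donor with more than one address retains each reported location.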


Ideological measures: The common-space CFscores allow for direct distance comparisons of the ideal points of a wide range of political actors from state and federal politics spanning a 35-year period. In total, the database includes ideal point estimates for 70,871 candidates and 12,271 political committees as recipients and 14.7 million individuals and 1.7 million organizations as donors.

Corresponding data on candidates, committees, and elections: The recipient database includes information on voting records, fundraising statistics, election outcomes, gender, and other candidate characteristics. All candidates are assigned unique identifiers that make it possible to track candidates if they campaign for different offices. The recipient IDs can also be used to match against the database of contribution records. The database also includes entries for PACs, super PACs, party committees, leadership PACs, 527s, state ballot campaigns, and other committees that engage in fundraising activities.
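
Matching contribution records against the recipient database by recipient ID might look like the sketch below. The records and field names (recipient_id, name, office) are hypothetical and chosen only for illustration; the real files may use different names and formats.

```python
# Hypothetical recipient table keyed by recipient ID.
recipients = {
    "r100": {"name": "Candidate A", "office": "federal:house"},
    "r200": {"name": "Committee B", "office": None},
}

# Hypothetical contribution records referencing recipient IDs.
contributions = [
    {"contributor_id": "c001", "recipient_id": "r100", "amount": 500.0},
    {"contributor_id": "c002", "recipient_id": "r200", "amount": 250.0},
    {"contributor_id": "c003", "recipient_id": "r999", "amount": 75.0},
]


def attach_recipients(rows, lookup):
    """Split rows into those matched to a recipient and those left unmatched."""
    matched, unmatched = [], []
    for row in rows:
        info = lookup.get(row["recipient_id"])
        if info is None:
            # Keep unmatched rows separate rather than dropping them silently.
            unmatched.append(row)
        else:
            matched.append({**row, "recipient_name": info["name"]})
    return matched, unmatched


matched, unmatched = attach_recipients(contributions, recipients)
```

Returning the unmatched rows separately makes it easy to check how many contributions fail to link before any analysis is run.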

Identifying sets of important political actors: Contribution records have been matched onto other publicly available databases of important political actors. Examples include:

Fortune 500 directors and CEOs: (Data) (Paper)

Federal court judges: (Data) (Paper)

State supreme court justices: (Data) (Paper)

Executive appointees to federal agencies: (Data) (Paper)

Medical professionals: (Data) (Paper)

 

Validation: The ideological measures have been extensively validated across several studies spanning a variety of institutional settings and types of actors. A compendium of these validation results can be accessed here.

Acknowledgements: 

I thank the Sunlight Foundation, the National Institute on Money in State Politics, and the Center for Responsive Politics for making their data publicly accessible. I also thank Keith Poole, Howard Rosenthal, Charles Stewart, Jeff Lewis, Jonathan Woon, and Georgia Kernell for providing data. I also acknowledge the generous support I received as a fellow at the Institute for Research in the Social Sciences (IRISS) at Stanford University and the Hoover Institution. Lastly, I thank Ron Nakao for his generous assistance in hosting the database.


Methodology/Sampling

Universe: 

The contribution database contains records for political donations made by individuals and organizations to local, state, and federal elections. The candidate database includes entries for candidates, PACs, super PACs, leadership PACs, 527s, party committees, campaigns for state ballot measures, and other recipient committees that engage in fundraising activities.

Unit of Analysis: 

Individual donors, Political action committees (PACs), Candidates, Party committees, Ballot campaigns, Other political committees

Type of data collection: 

Campaign finance records, Candidate and district characteristics, Election outcomes

Time span: 

1979-2014

Time of data collection: 

2010-2015

Geographic coverage: 

United States

Smallest geographic unit: 

Geocoded addresses

 
 

Documentation


Data Use Agreement

You are free:

  • To Share: To copy, distribute and use the data.
  • To Create: To produce works from the data.
  • To Adapt: To modify, transform and build upon the data.

As long as you:

  • Attribute: You must attribute any public use of the data, or works produced from the data, in the manner specified in the license. For any use or redistribution of the data, or works produced from it, you must make clear to others the license of the data and keep intact any notices on the original data.

Read the full ODC-BY 1.0 license text for the exact terms that apply.

Data Download Links


Errata

Note: If you have issues uncompressing any of the zip files, you may need to download a third-party compression tool. For Windows users, 7-Zip can open the larger zip files. For Mac users, Unarchiver and Keka are both good options.

Data Notes: 

Data Sources:

Federal Elections: Contribution records, candidate and committee filings, and election outcomes for federal elections are provided by the Federal Election Commission.

State Elections: Contribution records, candidate and committee filings, and election outcomes for state elections are provided by the National Institute on Money in State Politics and the Sunlight Foundation. This data is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License by the National Institute on Money in State Politics. (See the link for details: http://followthemoney.org/Institute/about_data.phtml.) When using data on state elections, please attribute credit to the National Institute on Money in State Politics.

527s: Donation records to 527s are from the Center for Responsive Politics (2002-2010) and the IRS (2011-2012). The Center for Responsive Politics licenses its data under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. Please attribute credit accordingly.

New York City Elections: Contribution records for New York City elections were downloaded from the New York City Campaign Finance Board's website (http://www.nyccfb.info/). 

Other Data:

Data on industry and sector codings are from the Center for Responsive Politics (http://www.opensecrets.org). The Center for Responsive Politics licenses its data under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. Please attribute credit accordingly.

DW-NOMINATE scores are provided by Keith Poole and Howard Rosenthal and are available for download at http://www.voteview.com.
Cite: Poole, Keith T., and Howard Rosenthal. 2007. Ideology & Congress. 2nd rev. ed. New Brunswick: Transaction Publishers.

Committee membership data are provided by Charles Stewart and Jonathan Woon and are available for download at http://web.mit.edu/17.251/www/data_page.html.
Cite: Charles Stewart III and Jonathan Woon. Congressional Committee Assignments, 103rd to 112th Congresses, 1993–2011.

Data on district partisanship were made available by Georgia Kernell:
Cite: Kernell, Georgia. 2009. “Giving Order to Districts: Estimating Voter Distributions with National Election Returns.” Political Analysis 17(3): 215–35.