Often there are two different ways of splitting up a data collection into categories, and we want to know if those ways are related. For example, we will look at the members of the United States House of Representatives, who can be categorized either by political party or by what part of the country they represent.
In January of 2015, the United States House of Representatives had 435 members. Of those, 188 belonged to the Democratic Party and the other 247 belonged to the Republican Party. The number of data elements in a category is called a frequency, so the number of representatives belonging to each party is also referred to as the frequency of membership in that party.
The fraction of representatives who were Democrats was $$188/435$$, or 43.2% (rounded to the nearest tenth of a percent). A frequency divided by the total number of data elements like this is called a relative frequency, so 43.2% was the relative frequency of Democrats in the House of Representatives.
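As a quick check of the arithmetic above, here is a minimal Python sketch of the relative frequency calculation. The variable names are ours, chosen only for illustration; the counts are the ones given in the text.

```python
# Counts from the text: House membership in January 2015.
total_members = 435
democrats = 188  # frequency of Democrats

# Relative frequency = frequency / total number of data elements.
relative_frequency = democrats / total_members

# Express it as a percentage, rounded to the nearest tenth of a percent.
democrat_pct = round(100 * relative_frequency, 1)
print(f"{democrat_pct}%")  # 43.2%
```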
We would like to understand how party membership is related to other ways of splitting up the representatives, such as by geographical region.
The table to the left summarizes the party affiliations of the United States House of Representatives in January of 2015 in each census region. (Each census region consists of a group of nearby states. For example, the Northeast region consists of Maine, New Hampshire, Vermont, Massachusetts, Connecticut, Rhode Island, New York, New Jersey, and Pennsylvania.)
The number of Democrats from the Northeast was 48. A frequency of elements satisfying two criteria like this is called a joint frequency, so 48 was the joint frequency of being a Democrat and being from the Northeast.
A table giving the joint frequencies of two categorizations is called a two-way frequency table.
The total number of representatives in January of 2015 was 435. This means that the fraction of representatives who were Democrats from the Northeast was $$48/435$$, or 11.0% (rounded to the nearest tenth of a percent). A joint frequency divided by the total number of data elements like this is called a joint relative frequency, so 11.0% was the joint relative frequency of being a Democrat from the Northeast.
What was the joint relative frequency of each of the other seven groups of representatives shown in the table? Round each percentage to the nearest tenth of a percent.
The two-way frequency table for census region and party affiliation of the House of Representatives is shown again to the left. We have added the totals of each row and column in the table, giving the total number of representatives from each region and each party. Totals like this are known as the marginal frequency of each category, because they appear in the “margins” of the table. For example, the marginal frequency of representatives from the Northeast is 78.
The fraction of all representatives who belong to a given category is called the marginal relative frequency of that category. For example, the marginal relative frequency of representatives from the South is $$161/435 \approx 37.0\%$$ (rounded to the nearest tenth of a percent).
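The following Python sketch illustrates both ideas: a marginal frequency is the sum of the joint frequencies in its row, and a marginal relative frequency divides that sum by the overall total. The variable names are ours. The count of 30 Northeast Republicans is not stated in the text; we infer it as 78 − 48 from the row total.

```python
total_members = 435  # all representatives, January 2015

# Joint frequencies for the Northeast row. 48 Democrats is given in the text;
# 30 Republicans is inferred as 78 - 48 from the stated row total.
northeast_row = {"Democrat": 48, "Republican": 30}

# The marginal frequency of a row is the sum of its joint frequencies.
northeast_marginal = sum(northeast_row.values())
print(northeast_marginal)  # 78

# Marginal relative frequency = marginal frequency / overall total.
south_marginal = 161  # marginal frequency of the South, from the text
south_marginal_relative = round(100 * south_marginal / total_members, 1)
print(south_marginal_relative)  # 37.0
```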
In the table below, fill in the marginal relative frequency of representatives from each region.
The table above gives the marginal relative frequencies of the rows in the two-way frequency table. You can also look at the marginal relative frequencies of the columns in a table. In this case, these are just the percentages that you computed in frequency-qn:
We want to use a two-way frequency table to answer questions like: “Which region is more Democratic, the Northeast or the West?”
The table below shows the joint relative frequency of Democrats and Republicans in the Northeast and the West, as you computed in joint-frequency-qn.
In association-qn-1, you saw that the joint frequency of Democrats in the West was higher than the joint frequency of Democrats in the Northeast, but the joint frequency of Republicans in the West was also higher than the joint frequency of Republicans in the Northeast. The same is true of the joint relative frequencies. Both comparisons can favor the West because there are more total representatives from the West than from the Northeast.
In order to take into account the different sizes of the two regions, we’ll divide each joint frequency by the number of representatives from its region, rather than the total number of representatives.
The total number of representatives from the Northeast (the marginal frequency) is 78. Of those, 48 are Democrats (the joint frequency of Democrats from the Northeast). So the fraction of representatives from the Northeast who are Democrats is $$48/78 \approx 61.5\%$$ (rounded to the nearest tenth of a percent).
The fraction of representatives in one category (like a region) who also belong to a category of a different type (like a party) is called the conditional relative frequency of the second category within the first. So the conditional relative frequency of Democrats among representatives from the Northeast is 61.5%.
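To see how the choice of denominator changes the result, the sketch below computes the joint relative frequency and the conditional relative frequency from the same joint frequency. The helper `percent` is a hypothetical name of ours, not part of any library.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest tenth."""
    return round(100 * part / whole, 1)

total_members = 435        # all representatives
northeast_total = 78       # marginal frequency of the Northeast
northeast_democrats = 48   # joint frequency: Democrat and from the Northeast

# Joint relative frequency: divide by the overall total.
joint_relative = percent(northeast_democrats, total_members)

# Conditional relative frequency: divide by the region's own total.
conditional = percent(northeast_democrats, northeast_total)

print(joint_relative)  # 11.0
print(conditional)     # 61.5
```

Only the denominator differs: the conditional relative frequency ignores every representative outside the Northeast, which is what lets us compare regions of different sizes.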
In the table below, fill in the conditional relative frequency of both Democrats and Republicans among representatives from each region.
You can also compute conditional relative frequencies within a column. For example, there are a total of 188 Democrats in the House of Representatives. So the conditional relative frequency of representatives from the Northeast among Democrats is $$48/188 \approx 25.5\%$$ (rounded to the nearest tenth of a percent).
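The sketch below puts the row and column versions side by side, using only counts stated in the text. The function name `conditional_relative_frequency` is our own illustrative choice.

```python
def conditional_relative_frequency(joint, marginal):
    """Joint frequency divided by the marginal frequency of the
    category we condition on, as a percentage to the nearest tenth."""
    return round(100 * joint / marginal, 1)

northeast_democrats = 48   # joint frequency
northeast_total = 78       # row total (region)
democrat_total = 188       # column total (party)

# Among Northeast representatives, what share are Democrats?
dem_given_northeast = conditional_relative_frequency(
    northeast_democrats, northeast_total)

# Among Democrats, what share are from the Northeast?
northeast_given_dem = conditional_relative_frequency(
    northeast_democrats, democrat_total)

print(dem_given_northeast)  # 61.5
print(northeast_given_dem)  # 25.5
```

The two results answer different questions, so it matters which category you condition on.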
In the table below, fill in the conditional relative frequency of representatives from each region among each party.
In 2015, the South was very Republican, but that has not always been true. The table and graph to the left show the number of members of the House of Representatives from the South who belonged to each party at the beginning of each Congressional session (that is, right after each federal election) from 1881 through 2015. (In the table, the number of Democrats is labeled with a D and the number of Republicans with an R, to save space.)
In trend-qn-1, you saw that the number of Democrats in the South first increased and then decreased over the period from 1881 to 2015. However, this doesn’t necessarily mean the South got more Democratic and then less Democratic, because the total number of representatives from the South changed over time.