1 Introduction
Bipartite networks, where ties connect two distinct actor types without intra-type connections, are common in political and social research. Examples include ethnic group memberships (Larson 2017), U.S. state policy adoptions (Desmarais, Harden, and Boehmke 2015), and product-level trade (Kim, Liao, and Imai 2020). These affiliation networks also appear in many other domains, including customer–product relationships (Huang, Li, and Chen 2005), actor–movie ties (Peixoto 2014), and even document–word occurrences of the kind typically used in text-as-data analyses (e.g., Lancichinetti et al. 2015).
The ubiquity of bipartite structures explains the wide variety of probabilistic models explicitly constructed to handle these types of networks. Examples include approaches designed to identify a given bipartite network’s “backbone,” or core sub-graphs (e.g., Neal 2014); models to study the influence of nodes over others (e.g., Campbell et al. 2019); exponential random graph models that accommodate bipartite sufficient statistics in a regression context (e.g., Agneessens, Roose, and Waege 2004; Wang et al. 2009; Wang, Pattison, and Robins 2013); and community detection models that can uncover latent groups from observed bipartite relationships (e.g., Kim and Kunisky 2021; Zhou and Amini 2019).
Despite the availability of modeling strategies that can accommodate them, bipartite networks are often analyzed by first aggregating them into a unipartite network, focusing on relationships among only one node type. To understand this process better, consider the stylized example of a bipartite network depicted in panel (b) of Figure 1, in which legislators (circles) and bills (triangles) represent two separate types of nodes, and with cosponsorship ties occurring only between the two types rather than within each type. Researchers commonly “project” this onto a unipartite network of legislators, aggregating edges over one type of nodes (bills) and forming a new network among nodes of the other (senators). For instance, panel (a) of Figure 1 shows a projected unipartite network in which weighted edges between legislators indicate the number of cosponsored bills they share. Projections of this and similar kinds are common among applied researchers, ranging from examples in policy collaborations (Fischer and Sciarini 2016) to police community connections (Haim, Nanes, and Davidson 2021). As shown in Table A1, all but one of the 26 recently published articles in major political science journals that study bipartite networks project the relational data onto a unipartite graph.

Figure 1 Example networks for bill cosponsorship in bipartite and unipartite forms.
Note: Panels (b) and (c) show different bipartite networks that project to the same unipartite network in panel (a). This projection loses information about bill types (triangle colors) and cosponsorship details (e.g., number of cosponsors and number of bills). For instance, in (b), senators cosponsor many bills in total, including a set of 9 (gray) bills that each draw some bipartisan support, such that the ratio of bipartisan-supported bills to single-party bills is 3:4, whereas in (c), the Senate is much less productive and has a single (gray) bill that draws all senators in support, with a lower bipartisan ratio of 1:2.
Unfortunately, such projections are known to induce substantial loss of information (e.g., Campbell et al. 2019), possibly resulting in misleading estimates of network tie determinants. For example, panels (b) and (c) in Figure 1 depict entirely different bipartite graphs that nevertheless yield an identical unipartite network (depicted in panel (a)) when aggregated by summing all shared edges between legislators. In the bipartite network (b), there are many bills, and each is cosponsored by only two senators. In contrast, the bipartite network (c) has far fewer bills, but each bill is sponsored by many senators. Information about these differences is completely lost as a result of these two networks generating the same unipartite projection, leading to potentially incorrect conclusions.
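The loss of information can be checked on a small hypothetical example. The sketch below builds two made-up legislator-by-bill incidence matrices (they are illustrative and are not the networks drawn in Figure 1) and shows that summing shared bills yields the same weighted projection for both, even though their bill sizes differ. The function name `project` is an assumption introduced here.

```python
def project(incidence):
    """Return the weighted one-mode projection of a 0/1 incidence matrix.

    Rows are legislators and columns are bills. Entry (i, j) of the result
    counts the bills that legislators i and j both cosponsor; the diagonal
    holds each legislator's total number of cosponsored bills.
    """
    n = len(incidence)
    return [
        [sum(a * b for a, b in zip(incidence[i], incidence[j])) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical network A: three bills, each cosponsored by exactly two legislators.
many_small_bills = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]

# Hypothetical network B: one bill backed by everyone, plus one solo bill each.
one_big_bill = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
]

proj_a = project(many_small_bills)
proj_b = project(one_big_bill)
same_projection = proj_a == proj_b

# Number of cosponsors per bill, which the projection cannot recover.
bill_sizes_a = [sum(col) for col in zip(*many_small_bills)]
bill_sizes_b = [sum(col) for col in zip(*one_big_bill)]

print(same_projection)             # True
print(bill_sizes_a, bill_sizes_b)  # [2, 2, 2] [3, 1, 1, 1]
```

Both projections equal a matrix with 2 on the diagonal and 1 elsewhere, so any analysis that starts from the projection cannot tell the two cosponsorship environments apart.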
This poses a unique challenge for researchers interested in understanding the role that groups play in the formation of ties in a network: when projecting, the heterogeneity in the number of connections of the nodes being aggregated over is lost, which often results in inflated clustering coefficients driven by the presence of nodes of unusually high degree, such as highly popular bills (e.g., Guillaume and Latapy 2004; Newman, Strogatz, and Watts 2001). These artificially large clustering coefficients in turn translate into incorrect estimates of relevant group-related features, including the network’s community structure, levels of polarization across groups behind tie formation, and coalitional behaviors among actors (e.g., González-Bailón and Wang 2016; Larson et al. 2019; Sunstein 2009, 2018). If researchers are uninterested in understanding such group-related characteristics of the network, the loss of information induced by common projection strategies may be less problematic. But if the goal is to explore and understand how groups affect the formation of bipartite networks, researchers should pay careful attention to these and related issues.
To enable the evaluation of theories of group-driven edge formation in bipartite networks and avoid the need for projections altogether, we extend the mixed membership stochastic blockmodel (Airoldi et al. 2008; Olivella, Pratt, and Imai 2022) to bipartite networks in which groups are theorized to play an influential role. The proposed model, which we call bipartite Mixed membership Stochastic Blockmodel (biMMSBM), allows researchers to discover the groups of nodes, within each node type, that share common probabilistic patterns of edge formation (so-called stochastic equivalence classes). In the example of cosponsorship, biMMSBM categorizes legislators and bills into meaningful groups based on cosponsorship patterns, avoiding the possibility of discovering artificial hyper-partisanship in the U.S. Congress.
The biMMSBM is based on a mixed-membership (or admixture) structure, allowing nodes of one type to belong to multiple unobserved groups depending on interactions with nodes of the other type. This flexibility allows us to capture nuanced social interactions whereby actors adopt different roles when interacting with others. It also sets our model apart from most of the existing bipartite community detection models, which typically assume every node (or every edge) belongs to a single group (e.g., Kim and Kunisky 2021; Zhou and Amini 2019).Footnote 1
Our model also supports the use of covariates to explain the edge formation between nodes of different types (Razaee, Amini, and Li 2019; White and Murphy 2016) in two ways: (1) node-level covariates describe learned group memberships, like legislators’ ideology and partisanship, and bills’ policy content or author characteristics in the cosponsorship example and (2) dyadic covariates predict edge formation directly, relaxing the reliance on latent group structures alone in network generation. This accommodates theoretically relevant variables defined for pairs of nodes of different types, such as whether a legislator belongs to committee(s) a bill was referred to. In contrast, many existing modeling approaches force researchers to adopt a two-step analytic strategy, conducting standard regression analyses of network model outputs (e.g., Battaglini, Sciabolazza, and Patacchini 2020; Cao 2009; Handcock, Raftery, and Tantrum 2007; Maoz et al. 2006; Tam Cho and Fowler 2010; Zhang et al. 2008). In sum, our model offers a single-step, comprehensive approach to network analysis with well-behaved posterior distributions, facilitating research into how group memberships can predict network formation.
One disadvantage of MMSBM-type network models is that a fully Bayesian inference strategy relying on Markov chain Monte Carlo simulation is computationally prohibitive for networks of medium or large size. To overcome this, we develop a computationally efficient variational Bayes approximation to our model’s collapsed posterior that relies on stochastic optimization (Airoldi et al. 2008; Gopalan and Blei 2013; Hoffman et al. 2013; Olivella et al. 2022; Teh, Newman, and Welling 2007). We implement our algorithm in the open-source software package NetMix (Olivella et al. 2021). To demonstrate biMMSBM’s applicability, we fit the model to a network of cosponsorship decisions in the U.S. Congress. As coalitions are at the heart of legislative politics (Riker 1962), a model adept at identifying and explaining latent group memberships is ideal for understanding the politics of cosponsorship decisions. We study the patterns of cosponsorship during the penultimate instance of a perfectly split Senate in U.S. history: the 107th Congress. We model the bipartite network connecting Senators to legislation (or “bills”) through the discovery of latent groups, while examining the roles of Senator and bill characteristics, as well as Senator–bill dyadic features.
Contrary to the results of a unipartite network analysis and an analysis with existing cross-sectional bipartite models, our proposed strategy uncovers cross-party collaboration among senators occurring through low-stakes legislation, which later facilitates consideration of more contentious bills. Junior senators from both parties are notably predictive of this cooperation. Additionally, the model uncovers the role of shared committee memberships and of bill-specific reciprocity norms, which are often missed by analyses of the projected network.
We now discuss this motivating application—the politics of cosponsorship in the US—and explain the risk of misunderstanding the role played by groups in edge formation when projecting bipartite networks to unipartite ones (Section 2). We then detail our modeling approach in Section 3, and present empirical findings from the 107th Congress cosponsorship network in Section 4. Finally, in Section 5, we conclude with implications for other domains and future research.
2 The Cosponsorship Network Among Senators
Senator cosponsorship reveals legislative interests and goals, as it signifies public endorsement of specific legislation (see, e.g., Arnold, Deen, and Patterson 2000; Kirkland 2011; Koger 2003; Tam Cho and Fowler 2010). In the Senate, sponsorship constraints make cosponsorship crucial for indicating broader support, increasing media attention, and serving as a credible commitment device (Bernhard and Sulkin 2013; Krutz 2005). The 60-vote filibuster threshold further elevates the importance of cosponsorship, especially bipartisan support (Rippere 2016).
Collaboration among senators is crucial for legislative productivity and influence, with costs associated with reneging on cosponsorships (Bernhard and Sulkin 2013; Fowler 2006; Holman, Mahoney, and Hurler 2022). Although cosponsorship is not predictive of bill passage (Anderson, Box-Steffensmeier, and Sinclair-Chapman 2003; Wilson and Young 1997), it significantly impacts legislator effectiveness (e.g., Harbridge-Yong, Volden, and Wiseman 2023) and signals issue positions (Desposato, Kearney, and Crisp 2011; Lawless, Theriault, and Guthrie 2018).
Scholars have examined many factors influencing cosponsorship (see, e.g., Campbell 1982; Fong 2020; Grossmann and Pyle 2013; Krutz 2005). Building on these, our study seeks to understand bipartisan cooperation in the face of partisan gridlock using observed cosponsorship patterns as our outcome of interest, examining how groups of legislators interact with legislation of different types to increase the Senate’s ability to overcome hyper-partisanship.
Considering both different types of legislators and of legislation is important, since both can result in very different patterns of collaboration. Consider the two sample bipartite networks in Figure 2, drawn from the full network of cosponsorships during the 107th Congress (2001–2003). The networks in both panels contain 100 senators (left-side nodes) and two samples of bills (right-side nodes). The left panel depicts highly partisan bills; the right, highly bipartisan bills. We observe substantial heterogeneity in cosponsorship behaviors, even within the same session of Congress: while a few bills attract many cosponsors (multiple drawn edges to the bill), many more have relatively few.Footnote 2 If the partisan composition of cosponsorships is systematically associated with whatever brings about this heterogeneity, we risk painting an incorrect picture of how partisanship predicts collaboration among legislators when aggregating over it and omitting crucial bill-specific and senator-specific information (e.g., policy content, timing, and collaboration extent via popular legislation; see Kirkland and Gross 2014; Neal 2014, 2020).

Figure 2 Cosponsorship networks among senators in the 107th Congress.
Note: The figure shows two bipartite networks sampled from the 107th Congress, with 100 senators sorted by ideology (most conservative senators at top) and a sample of bills sorted by node degree. The left panel network shows bills with predominantly partisan cosponsorship; the right panel shows highly bipartisan bills, highlighting significant heterogeneity in bill cosponsorship composition and degree.
2.1 Projection onto a Unipartite Network Can Be Misleading
Aggregating over heterogeneity in bill- and senator-bill level data can lead to incorrect substantive takeaways. To see how this may be the case, revisit the stylized scenario presented in Figure 1. In it, two distinct bipartite networks (b and c) represent different collaborative environments—one with high productivity and cross-party collaboration, and the other with low productivity and limited cross-party work. Despite these differences, they both result in the same unipartite projection (a). In (b), the bipartisan-to-within-party cosponsorship ratio is 3:4, with a 0.43 probability of randomly selecting same-party cosponsors; in (c), this ratio is 1:2, and the probability is 0.84. These crucial differences, which speak to the degree of polarization in the underlying network, are obscured in the unipartite projection (a), which shows strong within-party ties regardless of the underlying bipartite structure.
Such distortions are palpable when considering the 107th Senate. Figure 3 shows the distribution of probabilities that a randomly selected pair of cosponsors belong to the same party, computed for each bill in the original bipartite cosponsorship network (left panel) and for each senator in the unipartite weighted projection of the same network (right panel). While the strongly bimodal distribution associated with the bipartite network (Figure 3, left panel) suggests we are roughly as likely to find perfectly bipartisan bills as we are to find perfectly partisan ones, the distribution associated with the projected unipartite network (Figure 3, right panel) paints a completely different picture. With an average same-party cosponsorship probability of 0.75 and a left-skewed distribution, the projection suggests the majority of senators collaborate with copartisans only.

Figure 3 Probability of copartisan cosponsors during the 107th Senate.
Note: The left panel shows the probabilities that any two distinct cosponsors of a bill are from the same party, and the right panel shows the probabilities that a senator’s randomly chosen pair of cosponsors are copartisans. The bipartite network reveals substantial bipartisan cosponsorship, while the weighted unipartite network among senators indicates less cooperation.
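The bill-level quantity in the left panel can be computed with a simple counting argument: among all unordered pairs of a bill's cosponsors, take the share whose two members belong to the same party. The sketch below is one way to do this; the function name and the toy party labels are assumptions made for illustration, not the authors' code or data.

```python
from math import comb


def same_party_pair_probability(party_labels):
    """Probability that two distinct cosponsors drawn at random share a party.

    `party_labels` lists the party of each cosponsor of a single bill.
    Returns None when the bill has fewer than two cosponsors, because no
    pair can be drawn in that case.
    """
    n = len(party_labels)
    if n < 2:
        return None
    counts = {}
    for party in party_labels:
        counts[party] = counts.get(party, 0) + 1
    # Same-party pairs are the within-party pairs summed over parties.
    same_party_pairs = sum(comb(c, 2) for c in counts.values())
    return same_party_pairs / comb(n, 2)


# Hypothetical bills: one with only Republican cosponsors, one evenly split.
partisan_bill = ["R", "R", "R", "R"]
balanced_bill = ["R", "R", "D", "D"]

p_partisan = same_party_pair_probability(partisan_bill)
p_balanced = same_party_pair_probability(balanced_bill)
print(p_partisan)  # 1.0
print(p_balanced)  # 2 same-party pairs out of 6, i.e. one third
```

Applying this function to every bill gives a distribution like the one described for the bipartite network; the senator-level quantity in the right panel requires the projected network and is not computed here.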
Unfortunately, this kind of strong, artificial clustering present in both the real projected network of the 107th Senate and the simple example in Figure 1(a) is a common phenomenon (Latapy, Magnien, and Del Vecchio 2008; Newman et al. 2001; Tam Cho and Fowler 2010), and can lead to incorrect conclusions about the extent and nature of polarization in Congress, as we show in Section 4 below. In general, only in the rare cases when degree and group composition are independent among nodes in the family being aggregated over can we expect the projection to have no effect on the conclusions that can be drawn from the projected network. To address these concerns, we introduce a new modeling strategy next.
3 The Proposed Methodology
This section describes the core intuition behind our model, which we refer to as biMMSBM. It presents the full modeling approach and discusses estimation strategies that enable the analysis of large networks.
3.1 Modeling Strategy
We represent an observed network as a bipartite graph, with two disjoint node sets (e.g., senators and bills) linked only by edges representing cosponsorships (no edges among legislators or bills).
The biMMSBM allows nodes to belong to one of several latent groups when interacting with each node of the other family. For any dyadic relationship between two nodes of different families, the latent group memberships of the nodes determine the likelihood of forming an edge. Thus, a senator may belong to different latent communities when deciding whether to cosponsor different legislation. Similarly, bills can be sorted into separate latent groups across senator–bill dyads. For instance, John McCain (R-AZ) might have behaved similarly to other Republicans when deciding whether to cosponsor bills related to national security, but might have acted differently when considering bills related to campaign finance reform, a pattern that could help us understand his reputation as a party maverick.
To capture this, we define a probabilistic model to account for diverse latent community memberships. Figure 4 schematically depicts our model’s mixed membership structure. Pie charts show the probability of four senators belonging to two communities (blue and orange) and five bills belonging to three (red, green, and yellow). Node-level covariates (e.g., senator ideology and bill policy area) explain these mixed memberships.

Figure 4 Mixed-membership stochastic blockmodel for bipartite networks.
Note: The schematic depicts a $2 \times 3$ latent community model, where senators exhibit mixed memberships across two communities (blue and orange) represented as pie charts to indicate probabilities in each community summing up to 1, and bills exhibit mixed memberships across three communities (yellow, red, and green). Community affinities are encoded in the block model matrix (right), illustrated by edge thickness (left).
The $2 \times 3$ matrix on the right of Figure 4 shows the blockmodel, indicating probabilities of cosponsorship between senator and bill communities. Certain community pairs (e.g., orange senators and red bills) show higher cosponsorship probabilities than others (e.g., blue senators and green bills), reflecting diverse coalitional strategies among senators toward legislation.
Cosponsorship networks often exhibit this stochastic equivalence, where coalitions of senators support similar legislative classes (e.g., Bratton and Rouse 2011). Similar group-based dynamics are found in other networks, like economic trade between countries and co-occurrence of words in documents. We now proceed with a formal presentation of our full model.
3.2 The Bipartite Mixed-Membership Stochastic Blockmodel
Formally, let $G(V_1,V_2,E)$ represent a bipartite graph, where ($V_1$, $V_2$) denote the two disjoint families of nodes ($V_1 \cap V_2 = \emptyset$), and $E$ represents the undirected edge set, or node pairs of different families. Suppose that family 1 has $N_1 = |V_1|$ total nodes, and that family 2 has $N_2=|V_2|$. For each dyad, let $z_{pq}\in \{1,\ldots ,K_1\}$ denote the latent group to which node $p \in V_1$ of family 1 belongs when interacting with node $q \in V_2$ of family 2, whose latent group membership is denoted by $u_{pq}\in \{1, \ldots , K_2\}$. Generally, we allow $K_1\neq K_2$. Further, we use $y_{pq} = 1$ to denote the existence of an edge between node pair $\{p,q\}\in E$ while $y_{pq} = 0$ indicates its absence.
We assume edge formation probability is a function of dyadic predictors $\boldsymbol{d}_{pq}$ and a blockmodel $\boldsymbol{B}$, which is a $K_1\times K_2$ matrix representing the log odds of edge formation between members of any two latent groups (Figure 4),

$$y_{pq} \mid z_{pq}, u_{pq}, \boldsymbol{B}, \boldsymbol{\gamma} \sim \text{Bernoulli}\left(\text{logit}^{-1}\left(B_{z_{pq}, u_{pq}} + \boldsymbol{d}_{pq}^\top \boldsymbol{\gamma}\right)\right), \qquad (1)$$

where $\boldsymbol{\gamma}$ is a dyad-level regression coefficient vector. Our dyadic predictors allow for varying edge formation probabilities even within the same latent group pairs, relaxing the stochastic equivalence assumption standard to stochastic blockmodels (SBMs). Substantively, this allows senator–bill dyads whose respective nodes sort into the same pairs of latent communities to be further differentiated by characteristics pertinent to their particular dyad (e.g., a senator’s history with a bill’s author).
As is common in mixed-membership SBMs, we define a categorical sampling model for the dyad-specific group memberships, $z_{pq}$ and $u_{pq}$, so that

$$z_{pq} \mid \boldsymbol{\pi}_p \sim \text{Categorical}(\boldsymbol{\pi}_p), \qquad u_{pq} \mid \boldsymbol{\psi}_q \sim \text{Categorical}(\boldsymbol{\psi}_q), \qquad (2)$$

where the probability that family 1 node $p$ (family 2 node $q$) belongs to a latent group on any possible interaction is given by $\boldsymbol{\pi}_p$ ($\boldsymbol{\psi}_q$), a $K_1$-dimensional ($K_2$-dimensional) probability vector usually known as the mixed-membership vector (represented as pie charts in Figure 4).
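Critically, our model incorporates node-level information (e.g., senator and bill level predictors) into the definition of the mixed-membership probabilities of latent groups. These covariates themselves predict the likelihood of an edge (e.g., cosponsorship) through the resulting instantiated element of the blockmodel. Specifically, we assume that the mixed-membership probability vectors are generated by a Dirichlet distribution with concentration parameters that are a function of node covariates,Footnote 3

$$\boldsymbol{\pi}_p \sim \text{Dirichlet}\left(\left\{\exp\left(\boldsymbol{x}_p^\top \boldsymbol{\beta}_{1g}\right)\right\}_{g=1}^{K_1}\right), \qquad \boldsymbol{\psi}_q \sim \text{Dirichlet}\left(\left\{\exp\left(\boldsymbol{x}_q^\top \boldsymbol{\beta}_{2h}\right)\right\}_{h=1}^{K_2}\right), \qquad (3)$$

where $\boldsymbol{x}_p$ and $\boldsymbol{x}_q$ denote node-level covariate vectors, and hyper-parameter vectors $\boldsymbol{\beta}_{1g}$ and $\boldsymbol{\beta}_{2h}$ contain regression coefficients associated with the $g$th and $h$th groups of vertex families 1 and 2, respectively.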
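The generative story described so far (covariate-dependent Dirichlet memberships, dyad-specific group indicators, and a logistic edge probability built from the blockmodel and dyadic predictors) can be simulated in a few lines. The sketch below is an illustration under stated assumptions rather than the NetMix implementation: the function names, the toy covariates, and all parameter values are made up, and only the Python standard library is used.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet by normalizing independent Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_bipartite_mmsbm(x1, x2, beta1, beta2, B, gamma, d, seed=0):
    """Simulate one bipartite network from the generative story above.

    x1[p], x2[q]: node covariate vectors for families 1 and 2.
    beta1[g], beta2[h]: coefficient vectors, one per latent group.
    B[g][h]: log odds of an edge between groups g and h.
    d[p][q]: dyadic covariate vector; gamma: dyadic coefficients.
    """
    rng = random.Random(seed)
    k1, k2 = len(beta1), len(beta2)
    # Mixed-membership vectors with covariate-dependent concentrations.
    pi = [sample_dirichlet([math.exp(dot(x, b)) for b in beta1], rng) for x in x1]
    psi = [sample_dirichlet([math.exp(dot(x, b)) for b in beta2], rng) for x in x2]
    y = []
    for p in range(len(x1)):
        row = []
        for q in range(len(x2)):
            # Each node picks a group anew for every dyad it takes part in.
            z = rng.choices(range(k1), weights=pi[p])[0]
            u = rng.choices(range(k2), weights=psi[q])[0]
            log_odds = B[z][u] + dot(d[p][q], gamma)
            prob = 1.0 / (1.0 + math.exp(-log_odds))
            row.append(1 if rng.random() < prob else 0)
        y.append(row)
    return pi, psi, y


# Toy inputs: 4 senators, 5 bills, K1 = 2 and K2 = 3 (all values hypothetical).
x1 = [[1.0, -1.0], [1.0, -0.5], [1.0, 0.5], [1.0, 1.0]]
x2 = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
beta1 = [[0.0, 1.0], [0.0, -1.0]]
beta2 = [[0.0, 0.5], [0.0, -0.5], [0.0, 0.0]]
B = [[2.0, -2.0, 0.0], [-2.0, 2.0, 0.0]]
gamma = [0.5]
d = [[[0.0] for _ in x2] for _ in x1]

pi, psi, y = simulate_bipartite_mmsbm(x1, x2, beta1, beta2, B, gamma, d)
```

Simulating from the model in this way is also a convenient check that a fitting routine can recover known memberships and blockmodel entries.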
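Putting it all together, the full joint distribution of data and latent variables is given by,

$$f(\boldsymbol{Y}, \boldsymbol{Z}, \boldsymbol{U}, \boldsymbol{\Pi}, \boldsymbol{\Psi} \mid \boldsymbol{B}, \boldsymbol{\beta}, \boldsymbol{\gamma}) = \prod_{p \in V_1} P(\boldsymbol{\pi}_p \mid \boldsymbol{\beta}_1) \prod_{q \in V_2} P(\boldsymbol{\psi}_q \mid \boldsymbol{\beta}_2) \prod_{p \in V_1}\prod_{q \in V_2} P(y_{pq} \mid z_{pq}, u_{pq}, \boldsymbol{B}, \boldsymbol{\gamma})\, P(z_{pq} \mid \boldsymbol{\pi}_p)\, P(u_{pq} \mid \boldsymbol{\psi}_q). \qquad (4)$$

This specification allows us to more formally describe the potential issues raised by aggregation illustrated informally in Section 2.1. A common strategy simply sums the number of connections to a member of family $V_2$ shared by two members of family $V_1$, forming an aggregated sociomatrix $\widetilde{\boldsymbol{Y}}=\boldsymbol{Y}\boldsymbol{Y}^\top$. Under this strategy, and in the absence of dyadic covariates, the model in Equation (4) implies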

where $\boldsymbol{\Pi}$ is an $N_1\times K_1$ matrix that stacks mixed memberships $\boldsymbol{\pi}_p$ for all $p\in V_1$, and similarly for $\boldsymbol{\Psi}$. The issue arises because the bracketed terms in Equation (5) (i.e., blockmodel and Family $V_2$ mixed-memberships) cannot be separately identified from the aggregated sociomatrix $\widetilde{\boldsymbol{Y}}$, leading to observational equivalence like the one illustrated in Figure 1. This can lead to misconstrued relationships among members of Family 1 when relying on aggregated data. Our model avoids this by directly modeling the bipartite network without the need to aggregate.
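The identification problem can be traced through a short derivation. The sketch below is a worked illustration rather than a quotation of the paper's display: it assumes no dyadic covariates, conditions on the mixed memberships, covers only off-diagonal entries ($p \neq p'$), and introduces the shorthand $\boldsymbol{\Theta} = \text{logit}^{-1}(\boldsymbol{B})$ (applied elementwise), which is not part of the original notation.

```latex
\begin{align*}
\mathbb{E}\left[y_{pq} \mid \boldsymbol{\pi}_p, \boldsymbol{\psi}_q\right]
  &= \sum_{g=1}^{K_1}\sum_{h=1}^{K_2} \pi_{pg}\,\Theta_{gh}\,\psi_{qh}
   = \boldsymbol{\pi}_p^\top \boldsymbol{\Theta}\, \boldsymbol{\psi}_q, \\
\mathbb{E}\left[\widetilde{Y}_{pp'} \mid \boldsymbol{\Pi}, \boldsymbol{\Psi}\right]
  &= \sum_{q \in V_2} \mathbb{E}\left[y_{pq} \mid \boldsymbol{\pi}_p, \boldsymbol{\psi}_q\right]
     \mathbb{E}\left[y_{p'q} \mid \boldsymbol{\pi}_{p'}, \boldsymbol{\psi}_q\right] \\
  &= \sum_{q \in V_2} \boldsymbol{\pi}_p^\top \boldsymbol{\Theta}\, \boldsymbol{\psi}_q
     \boldsymbol{\psi}_q^\top \boldsymbol{\Theta}^\top \boldsymbol{\pi}_{p'}
   = \boldsymbol{\pi}_p^\top \left[\boldsymbol{\Theta}\, \boldsymbol{\Psi}^\top \boldsymbol{\Psi}\, \boldsymbol{\Theta}^\top\right] \boldsymbol{\pi}_{p'}.
\end{align*}
```

The second line uses the fact that, given the mixed memberships, edges from $p$ and $p'$ to the same bill $q$ are independent because each dyad draws its own group indicators. The projection therefore depends on the blockmodel and the family 2 memberships only through the $K_1 \times K_1$ product in brackets, so any two configurations with the same product are indistinguishable after aggregation.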
3.3 Estimation
With the thousands of vertices and millions of potential edges involved in an application such as bill cosponsorships, sampling directly from the posterior distribution given in Equation (4) is computationally prohibitive. To obtain estimates of quantities of interest in a reasonable amount of time, we follow the computational strategy of Olivella et al. (2022) by first marginalizing latent variables and then defining a stochastic variational approximation to the full posterior. We briefly summarize these computational strategies here.
3.3.1 Marginalization
To reduce complexity, we collapse the full posterior over the mixed-membership vectors (i.e., $\boldsymbol{\Pi}$ and $\boldsymbol{\Psi}$):

where $\Gamma(\cdot)$ is the Gamma function; $\alpha_{pg} = \exp(\boldsymbol{x}_{p}^\top \boldsymbol{\beta}_{1g})$, $\xi_{p} = \sum_{g=1}^{K_1} \alpha_{pg}$ (and similarly for $\alpha_{qh}$ and $\xi_{q}$); $C_{pg}$ is a count representing the number of times node $p$ instantiates group $g$ across its interactions with nodes in family 2 (and similarly for $C_{qh}$); and $\theta_{pq, z_{pq}, u_{pq}} = \text{logit}^{-1}(B_{z_{pq}, u_{pq}} + \boldsymbol{d}_{pq}^\top \boldsymbol{\gamma})$ is the probability of a tie between the vertices in dyad $p,q$.
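For readers who want to see how these quantities fit together, the following sketch writes out the standard Dirichlet-multinomial integration that the definitions above describe. It is an illustration under assumptions, not necessarily the exact display used by the authors: it takes every one of the $N_1 N_2$ dyads to be modeled (so that $\sum_g C_{pg} = N_2$ and $\sum_h C_{qh} = N_1$), writes $\boldsymbol{Z}$ and $\boldsymbol{U}$ for the collections of group indicators, and omits any prior terms on $\boldsymbol{B}$, $\boldsymbol{\beta}$, and $\boldsymbol{\gamma}$.

```latex
\begin{align*}
P(\boldsymbol{Y}, \boldsymbol{Z}, \boldsymbol{U} \mid \boldsymbol{B}, \boldsymbol{\beta}, \boldsymbol{\gamma})
 &= \prod_{p \in V_1} \left[ \frac{\Gamma(\xi_p)}{\Gamma(\xi_p + N_2)}
      \prod_{g=1}^{K_1} \frac{\Gamma(\alpha_{pg} + C_{pg})}{\Gamma(\alpha_{pg})} \right] \\
 &\quad \times \prod_{q \in V_2} \left[ \frac{\Gamma(\xi_q)}{\Gamma(\xi_q + N_1)}
      \prod_{h=1}^{K_2} \frac{\Gamma(\alpha_{qh} + C_{qh})}{\Gamma(\alpha_{qh})} \right] \\
 &\quad \times \prod_{p \in V_1} \prod_{q \in V_2}
      \theta_{pq, z_{pq}, u_{pq}}^{\,y_{pq}}
      \left(1 - \theta_{pq, z_{pq}, u_{pq}}\right)^{1 - y_{pq}}.
\end{align*}
```

Each bracketed factor is the marginal probability of one node's group indicators after its mixed-membership vector has been integrated out, which is why only the counts $C_{pg}$ and $C_{qh}$ remain.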
3.3.2 Stochastic Variational Inference
Then, to enhance scalability, we employ two strategies. First, we rely on a mean-field variational approximation to the collapsed posterior in Equation (6) (Blei, Kucukelbir, and McAuliffe 2017), which first defines a lower bound $\mathcal{L}(\boldsymbol{\Phi})$ for this target, and then tightens the bound by updating the parameters $\boldsymbol{\Phi}$ of the approximating distributions by following a strategy similar to that of the EM algorithm. Previous studies indicate that marginalization approaches like the one described above enhance variational approximation quality (Teh et al. 2007).
Second, we rely on stochastic optimization to find the maximum of the lower bound (Dulac, Gaussier, and Largeron 2020; Foulds et al. 2013; Hoffman et al. 2013). To do so, our algorithm follows, with decreasing step sizes, a noisy estimate of the gradient of $\mathcal{L}(\boldsymbol{\Phi})$ formed by subsampling dyads in the original network.Footnote 4 Provided the schedule of step sizes satisfies the Robbins–Monro conditions, and the gradient estimate is unbiased, the procedure is guaranteed to find a local optimum of the variational target (Hoffman et al. 2013). Importantly, it does so while using a fraction of the available data at each iteration, thus dramatically improving estimation time. Details of our exact estimation procedures (including a description of how we compute measures of uncertainty, initialize all relevant parameters and latent variables, and sample dyads to form the sub-network on which gradient estimates are based) are available in Section S.1 of the Supplementary Material.Footnote 5
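The mechanics of following a noisy, subsampled gradient with decreasing step sizes can be shown on a deliberately simple stand-in problem. The sketch below does not optimize the variational bound: as a toy substitute, it estimates a single log-odds parameter for edge formation by maximizing an average Bernoulli log-likelihood, using mini-batches of dyads and a step schedule of the form $(t + \tau)^{-\kappa}$, which satisfies the Robbins–Monro conditions when $0.5 < \kappa \leq 1$. All names and settings (`step_size`, `fit_log_odds`, `tau`, `kappa`, the batch size, the simulated edges) are assumptions made for illustration.

```python
import math
import random


def step_size(t, tau=1.0, kappa=0.7):
    """Decreasing schedule whose sum diverges while its squared sum converges."""
    return (t + tau) ** (-kappa)


def fit_log_odds(edges, batch_size=50, iterations=2000, seed=0):
    """Stochastic gradient ascent on the average Bernoulli log-likelihood.

    `edges` is a list of 0/1 dyad outcomes. Each iteration subsamples dyads,
    forms an unbiased estimate of the full-data gradient, and takes a step.
    """
    rng = random.Random(seed)
    b = 0.0
    for t in range(1, iterations + 1):
        batch = rng.sample(edges, batch_size)
        p = 1.0 / (1.0 + math.exp(-b))
        # Gradient of the mean log-likelihood with respect to b, on the batch.
        grad = sum(y - p for y in batch) / batch_size
        b += step_size(t) * grad
    return b


# Simulated dyads with a true edge probability of 0.2 (hypothetical data).
data_rng = random.Random(1)
edges = [1 if data_rng.random() < 0.2 else 0 for _ in range(5000)]

b_hat = fit_log_odds(edges)
edge_share = sum(edges) / len(edges)
target = math.log(edge_share / (1.0 - edge_share))  # full-data maximizer
```

Each iteration touches only 50 of the 5,000 dyads, yet `b_hat` settles close to `target`, which is the same trade-off between per-iteration cost and accuracy that motivates subsampling dyads in large networks.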
3.4 Methodological Contributions
While the biMMSBM is an extension of the unipartite MMSBM (Airoldi et al. 2008) and its structural variant (Olivella et al. 2022), we believe it makes three methodological contributions. First, while the MMSBM is a popular modeling framework for network data across disciplines, there is no version of it that can be applied to bipartite network data, which are common in political science. The model we propose can take full advantage of information about both kinds of vertices involved in bipartite networks, without the need to aggregate and ignore either. Second, by avoiding projections that are common in practice, our model allows researchers to avoid biased results (such as artificially higher clustering) related to either of the types of vertices under study. Finally, and particularly by incorporating node-level predictors, our model allows researchers to make predictions about specific pairs of actors (e.g., which senators will support which specific bills). Such granular predictions are not possible when working with the projected network, which most prior models forced researchers to do.Footnote 6
4 Empirical Analysis of the 107th U.S. Senate
Before 2021, the Senate had been perfectly split only three other times, with the first months of the 107th session being the most recent instance of this rare event in the Senate’s history. Despite this, the 107th Senate was not unusual in terms of its productivity, passing about 17% of the 3,242 pieces of legislation introduced between 2001 and 2002 (close to the average 22% passage rate during the modern Senate) and adopting major legislation, including the Patriot Act and the so-called No Child Left Behind bill.
Such sustained productivity during times of narrow or non-existent partisan majorities is not uncommon, with many major bipartisan pieces of legislation in U.S. history passing under similar circumstances—including the legislation that made the interstate highway system possible, the National Housing Act of 1954, and the Civil Rights Act of 1957. In the 107th Senate, over 20,660 bills had cosponsors, with roughly half showing bipartisan support.
To explore the drivers of collaboration in cosponsorship, we use the proposed biMMSBM model to better understand why this session of the Senate remained legislatively active, avoiding the gridlock that many associate with partisan divisions. The model highlights junior, bipartisan senators as key collaborators, building consensus through low-stakes resolutions and popular programs. It also confirms the influence of quid pro quo behavior and committee experience on cosponsorship, supporting prior research.
We further show in the Supplementary Material that fitting a unipartite version of our model would make it impossible to identify these pathways to collaboration (see Section S.3.6 of the Supplementary Material). As we would expect, given the descriptive analysis in Section 2.1, the unipartite network model reveals little other than partisanship as the main driver of coalitional politics, making it hard to understand how a perfectly divided legislature was able to remain productive.Footnote 7
4.1 Model Specification and Fit
Our goal, then, is to understand the structural and contextual features that made collaboration possible during the 107th Senate. A rich literature on collaboration in Congress suggests that legislators make cosponsorship decisions based on partisanship, seniority, gender, and personal political history (Bratton and Rouse 2011; Holman and Mahoney 2018; Rippere 2016). Therefore, our model includes each senator’s party, ideology, seniority, and gender as predictors of community membership.
Harward and Moffett (2010) articulate that senators are more likely to cosponsor bills when they share closer preferences with the sponsor of the bill, and when they are more connected to their colleagues. To capture this, we model legislation groups as a function of their corresponding sponsors’ party, ideology, seniority, and gender (self-sponsorship dyads are excluded).
Lastly, senators tend to cosponsor bills within specific policy domains (Harward and Moffett 2010) and may opt into bipartisan cosponsorships based on legislative bill topics (Harbridge 2015). This inclination cannot be modeled in a senator-only unipartite network, but can be directly accounted for when modeling the bipartite structure. We address this by including the substantive topic as a bill covariate.Footnote 8
To capture the described shifts in the temporal context in which bills are introduced, we also include a bill-level covariate indicating whether a bill was presented in the first phase of the Congress (lasting only several months prior to Vermont Senator Jeffords leaving the Republican party in May 2001), in the second phase (post Jeffords leaving and prior to 9/11), or in the third phase (after 9/11). This temporal context would be lost in a unipartite network analysis (Kirkland and Gross 2014).
As we indicated earlier, we also pay close attention to two additional forces that can be expected to affect the likelihood of cosponsorship. First, we aim to capture reciprocity behaviors, or favor-trading on the Senate floor (Brandenberger 2018; Harbridge-Yong et al. 2023). The model includes a dyadic predictor: the log-transformed proportion of times a bill’s sponsor reciprocated cosponsorship in the previous Congress. As this proportion of reciprocity is heavily skewed and contains many zeros, we use the log transformation of non-zero values and an indicator variable for the cases of zeros.
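The zero-inflated transformation described above can be sketched as follows. This is a minimal illustration rather than the replication code: the function name `reciprocity_features`, the input values, and the choice to set the log term to 0 when the proportion is zero (so that the indicator alone carries the effect of zeros) are all assumptions made for the example.

```python
import math

def reciprocity_features(proportions):
    """Return (log_reciprocity, is_zero) pairs for a list of proportions.

    Non-zero proportions are log-transformed. Zeros receive a log term of 0.0
    (an assumed convention) and an indicator of 1.
    """
    features = []
    for p in proportions:
        if p < 0 or p > 1:
            raise ValueError("a proportion must lie in [0, 1]")
        if p == 0:
            features.append((0.0, 1))
        else:
            features.append((math.log(p), 0))
    return features

# Hypothetical reciprocity proportions for three senator-bill dyads.
example = reciprocity_features([0.0, 0.25, 1.0])
```

Under this convention, a dyad with no reciprocity and a dyad with full reciprocity both have a log term of zero, and only the indicator separates them, which is why the two predictors are entered together.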
Second, our dyadic model includes the number of committees shared by a senator and a piece of legislation. A greater number of shared committees indicates a higher chance that the senator has overseen the development of a bill and holds relevant substantive expertise. While the roles of committees have been studied previously (Cirone and Van Coppenolle 2018; Porter et al. 2005), our analysis directly examines how overlap in committees between legislator and legislation relates to cosponsorship. Relatedly, Gross and Kirkland (2019) find evidence of strong predictive power of shared committee leadership among the subset of ranking legislators when exploring cosponsorship decisions.
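As a rough sketch of how such a dyadic count could be built, the snippet below intersects two sets. It assumes that a committee is "shared" when the senator sits on a committee to which the bill was referred. That reading, the function name `shared_committees`, and the committee names are illustrative assumptions, not details taken from the article.

```python
def shared_committees(senator_committees, bill_committees):
    """Count committees that appear in both the senator's and the bill's lists."""
    return len(set(senator_committees) & set(bill_committees))

# Hypothetical memberships and referrals for one senator-bill dyad.
senator = {"Agriculture", "Finance", "Judiciary"}
bill = {"Agriculture", "Finance"}
count = shared_committees(senator, bill)
```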
With predictors at the monadic and dyadic levels in place, we determine the number of latent groups for senators and bills. We first randomly select 25% of the data as a test set and compare models with a range of possible latent group-size pairings using the area under the ROC curve (AUROC) for the out-of-sample edges. We select the group sizes offering the best fit according to this criterion, resulting in three groups each for legislators and bills, i.e., $K_1=K_2=3$.
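The selection procedure can be sketched in a few lines. The code below is a simplified illustration under stated assumptions: `predict` is a hypothetical callback standing in for a model fitted on the training dyads, the toy predictions are invented, and AUROC is computed with the pairwise (Mann-Whitney) identity rather than any particular library routine.

```python
import random

def auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def split_dyads(dyads, test_share=0.25, seed=0):
    """Randomly hold out a share of dyads; returns (train, test)."""
    rng = random.Random(seed)
    shuffled = list(dyads)
    rng.shuffle(shuffled)
    n_test = int(round(test_share * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]

def select_group_sizes(candidates, test_labels, predict):
    """Return the (K1, K2) pair with the highest held-out AUROC.

    predict(k1, k2) is a hypothetical callback that returns one predicted
    probability per held-out dyad.
    """
    scored = {pair: auroc(test_labels, predict(*pair)) for pair in candidates}
    best = max(scored, key=scored.get)
    return best, scored

# Toy held-out labels and invented predictions for two candidate pairings.
labels = [1, 0, 1, 0]
toy = {(2, 2): [0.6, 0.7, 0.8, 0.1], (3, 3): [0.9, 0.2, 0.8, 0.1]}
best, scores = select_group_sizes(list(toy), labels, lambda k1, k2: toy[(k1, k2)])
train, test = split_dyads(range(8))
```

In this toy setup, the pairing whose predictions rank every observed tie above every non-tie is selected; with real data, the grid of candidate pairings would be wider and each candidate would require a separate model fit.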
In Section S.3.3 of the Supplementary Material, we establish that the model generally fits the data well even out of sample (on posterior predictive goodness-of-fit checks and comparisons of network-level statistics). We obtain the estimates of all parameters and hyper-parameters in Equation (4) for this ${K_1=K_2=3}$ model fitted to the entire bipartite cosponsorship network. More specifically, we compute various quantities of interest in the form of predicted probabilities of block interactions and block memberships. As our discussion hinges on these derived quantities, we present all estimated values in Tables S.4 and S.5 in the Supplementary Material.
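To make these derived quantities concrete, the sketch below shows how block-level log-odds and mixed memberships could combine into a tie probability under a standard mixed-membership construction. It is an assumption-laden illustration: it ignores dyadic covariates, its notation may not match Equation (4), and every number in it is hypothetical rather than an estimate from the article.

```python
import math

def logistic(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def cosponsorship_probability(senator_membership, bill_membership, block_logodds):
    """Average the block interaction probabilities over both memberships.

    senator_membership: probabilities over the K1 senator groups.
    bill_membership: probabilities over the K2 bill groups.
    block_logodds: K1 x K2 nested list of block-level log-odds.
    """
    total = 0.0
    for g, pi_g in enumerate(senator_membership):
        for h, psi_h in enumerate(bill_membership):
            total += pi_g * psi_h * logistic(block_logodds[g][h])
    return total

# Hypothetical 3 x 3 blockmodel on the log-odds scale.
blocks = [[-3.0, -1.0, 0.5],
          [-3.5, -1.2, 0.4],
          [-2.0, 0.0, 1.0]]
senator = [0.2, 0.1, 0.7]   # hypothetical mixed membership of one senator
bill = [0.6, 0.3, 0.1]      # hypothetical mixed membership of one bill
p_tie = cosponsorship_probability(senator, bill, blocks)
```

When a senator and a bill each belong to a single group with probability one, the function reduces to the corresponding block interaction probability, which is the quantity shaded in a blockmodel plot.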
4.2 Pathways to Legislative Collaboration
What kinds of coalitions are at play when it comes to making cosponsorship decisions, and how do these coalitions interact when considering different types of legislation? Figure 5 presents the 107th Senate estimated blockmodel, showing cosponsorship probabilities between senator and bill groups. Node size reflects group frequency; edge shading, cosponsorship likelihood. Ideological distributions of senator and bill groups are also presented. The density for each senator group represents the distribution of ideal points of its members, while the density for a bill group is that of its members’ sponsors.

Figure 5 Blockmodel of senator and legislation latent group connection probabilities.
Note: Block size is proportional to the number of nodes expected to instantiate the corresponding latent group, and connections between them are shaded denoting cosponsorship probabilities between group members (darker shades indicate higher connection likelihoods). Senator groups tend to engage more with an “Uncontroversial” legislation group but less with a larger “Contentious” one. Next to each block, we also present the density of ideological positions of member senators (top row) and bill sponsors (bottom row), revealing that while ideology can help distinguish across types of senator coalitions, it cannot discriminate across relevant types of legislation.
As expected, Figure 5 shows senator groups aligning with party lines, including seasoned Republicans (e.g., Strom Thurmond [R-SC] and Jesse Helms [R-NC]) and Democrats (like Robert Byrd [D-WV] and Edward Kennedy [D-MA]). In addition, however, our model identifies a distinct senator group (depicted in purple) who stand out as having different cosponsorship patterns than their more partisan counterparts. Exemplars of this group, whom we call the junior power brokers, include Jon Corzine (D-NJ), Tom Carper (D-DE), Susan Collins (R-ME), Bill Frist (R-TN), Zell Miller (D-GA), and Hillary Clinton (D-NY)—all junior Senators at the time. Figure 6 presents the estimated mixed memberships of all Senators (i.e., their probability of acting as part of any of the discovered latent groups), highlighting a few of the most notable legislators of the session. Table S.2 in the Supplementary Material presents the top 10 members of each senator latent group by mixed membership probability.

Figure 6 Ternary plot of senator latent group membership probabilities.
Note: For clarity of presentation, example senators are colored by party. Senators in group 1 (top corner) are more likely to be Democrats, while senators in group 2 (right corner) are more likely to be Republicans; Group 3 (left corner) senators hail from both sides of the aisle and are likely to be junior and involved in cross-partisan bill sharing.
This third bipartisan group is likely to be formed by senators who have little experience in the Senate and who come from all over the ideological spectrum, as evidenced by the distribution of ideological positions depicted over the corresponding group in Figure 5.Footnote 9 We explore this in the left-most panel of Figure 7, depicting how the probability of group membership changes as a function of ideology. While the junior power brokers group (depicted in purple) is predicted to be composed primarily of left-leaners, positions along the second ideological dimension (seen in the central panel of Figure 7), often interpreted as capturing cross-cutting salient issues of the day (Poole and Rosenthal 2017), distinguish this group of senators from their staunch Democratic counterparts.

Figure 7 Predicted mixed memberships of senator predictors.
Note: The y-axes plot average predicted mixed memberships across the three possible senator latent groups, given each shift in the value of a senator predictor on the x-axes. For instance, at low values of Ideology (dimension) 1, the average predicted memberships for group 1 (Seasoned Democrats) and group 3 (junior power brokers) are highest; as Ideology 1 values increase (corresponding to a shift in the conservative direction), average predicted group 2 membership (Seasoned Republicans) increases and supplants group 1 entirely.
Many of these junior power brokers would become leaders within their parties. For instance, during the latter part of the 107th Congress, Republican Conference leader Trent Lott resigned and was swiftly replaced by Bill Frist (R-TN)—a top member of the junior power brokers identified by our model.
Similarly, many of them were pivotal “last” votes in large contentious bills that required just an extra nudge for passage. For example, consider the Farm Bill, designed to repeal the Freedom to Farm Act of 1996. While politics over agriculture had historically been regional rather than ideological, the Freedom to Farm Act was a significant deviation from that norm. Veteran senators Tom Daschle (D-SD) and Agriculture Committee Chairman Tom Harkin (D-IA) collaborated to bring the Farm Bill together, and negotiations began to generate the necessary support—including that of small dairy farmers affected by the bill. In the end, the largely Democratic set of supporters was complemented by key support from Republicans Susan Collins (R-ME) and Jeff Sessions (R-AL)—again identified by our model as likely members of the power brokers group. This role as brokers is further supported by analyses of the betweenness centrality of Senators who are likely to instantiate this group, which tends to be higher than that of Senators likely to instantiate other groups (see Table S.6 in the Supplementary Material).
The model is also able to identify the types of legislation which these groups of senators are likely to cosponsor. Specifically, the model uncovers three broad classes of bills and resolutions (depicted in the bottom row of circles in Figure 5), and the corresponding probabilities that members of any of the three senator groups will cosponsor them. While ideology plays an important role in defining the latent senator groups that structure cosponsorship (with right-skewed, left-skewed, and bimodal distributions characterizing membership into the three groups at the top of Figure 5), no such differences in the ideology of sponsors can help distinguish across the groups of legislation uncovered by our model (as indicated by the similarly bimodal densities of sponsor ideology across all three groups in the bottom of Figure 5).
We next show that investigating this nuance in bill composition can help us understand how collaborations took place during this nominally partisan Congress.
4.3 Legislation Types That Facilitate Cosponsorship
The largest type of legislation uncovered by our model is also the least likely to be supported by members of any senator group, suggesting that the bulk of legislation introduced in the Senate received little support from Senators other than the original sponsor. This latent class of bills, which we labeled “Contentious Bills” in Figure 5, consists of high-stakes bills on controversial economic issues and social programs, including those that handle the allocation of public funds for such programs. For example, the Senior Self-Sufficiency Act (SN 107 2842), Bioterrorism Awareness Act (SN 107 1548), and the Nationwide Health Tracking Act of 2002 (SN 107 2054) belong to this group. Table S.3 in the Supplementary Material presents details of legislation with the top ten mixed membership probabilities in each of the three latent groups.
The size of the “Contentious Bills” group grew during the last phase of the 107th Senate, after the 9/11 attacks. This is easily seen in Figure 8, which presents radar plots of predicted legislation memberships by phase of the Congress (panels from left to right present bills from the pre-Jeffords’ split phase, post-Jeffords’ split second phase, and post 9/11 phase). Each radar graph positions the six observed substantive topics along spokes of a wheel, and plots the predicted number of bills on that topic as a point along the corresponding spoke: the farther away from the wheel center, the more bills are predicted to be on that topic. Doing this for each of the three latent groups results in the three shaded polygons presented in each panel of the figure.Footnote 10 The dominance of bills in the “Contentious Bills” group in the third phase, depicted in orange, is readily apparent.

Figure 8 Radar graphs of predicted legislation by topic within each phase of Congress, by bill latent group.
Note: Panels are phases 1 (pre-Jeffords split), 2 (post-Jeffords split), and 3 (post 9/11) in the Congress, from left to right. Each radar plot includes bill topics as poles, with the estimated number of bills in the topic plotted against each pole, by latent group. Phase 2 produces the fewest pieces of legislation, while Phase 3 produces the most. Over time, the predicted number of bills in the “Contentious Bills” group (orange polygon) increases, especially in domains related to social public programs and the economy. The number of bills in the “Bipartisan Resolutions” group (green polygon) grew more slowly than that in the “Contentious Bills” group, but has similarly favored social/public programs and the economy. Finally, the number of bills in the “Popular & Uncontroversial” group (yellow polygon) changed the least throughout the session.
The composition of the other two latent bill groups uncovered by our model, which we have labeled “Bipartisan Resolutions” and “Uncontroversial Bills” in both Figures 5 and 8, provides valuable clues for understanding how cross-party collaboration took place. Specifically, the topical composition of the “Bipartisan Resolutions” group almost mirrors that of the “Contentious Bills” group (i.e., it is composed of pieces of legislation that deal with controversial public social programs and economic issues, as indicated by the similarly proportioned shapes of the green and orange polygons in Figure 8), but it is mainly composed of concurrent and simple resolutions, rather than bills. As they do not result in codified law (unlike continuing resolutions), such resolutions offer low-stakes opportunities to build bridges across partisan divides. Table S.3 in the Supplementary Material contains the top pieces of legislation in the group.
In turn, legislation in the comparatively smaller “Uncontroversial” group (shown in yellow in Figures 5 and 8) also draws consistent cosponsorship support from all senator groups and across the aisle, as pieces in it tend to be either uncontroversial resolutions or bills on popular social programs. For instance, the Senate joint resolution over the September 11 attacks (SJ 107 22) has the second-highest mixed membership probability in this group, followed closely by bills, such as the Railroad Retirement and Survivors’ Improvement Act of 2001 (SN 107 697). Such legislation forms a small but steady core that supplements low-stakes efforts (such as those in the “Bipartisan Resolutions” block), and that can nevertheless result in substantial legislation, such as the Family Opportunity Act of 2002 (SN 107 321).
The importance of this meaningful cooperation mechanism revealed by the blockmodel is particularly notable, as the model was able to identify it net of two important drivers of cosponsorship: quid pro quo behaviors, captured by the coefficient on the (log) proportion of “reciprocity” (Log Reciprocity), and the shared committee experience of a given senator–bill dyad (Shared Committee). For the former, our model suggests that a 1% increase in reciprocity (i.e., the proportion of times the sponsor of a piece of legislation acted as a cosponsor for a given senator’s bill in the previous Congress) is associated with a roughly 2% increase in the odds of cosponsorship. For the latter, we find that sharing a committee is significantly and positively associated with collaboration, making cosponsorship about 3.7 times more likely. These results, which are fully reported in Table S.4 in the Supplementary Material, are consistent with previous research on the determinants of legislative collaboration.
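The arithmetic behind these two interpretations can be sketched as follows, assuming that both dyadic coefficients enter on the log-odds scale (as the reference to odds suggests) and that "3.7 times more likely" refers to an odds ratio. The coefficient values below are not taken from Table S.4; they are back-calculated from the effect sizes quoted in the text purely for illustration.

```python
import math

def odds_multiplier_log_predictor(beta, pct_increase):
    """Odds multiplier when the raw value of a log-transformed predictor
    rises by pct_increase percent: (1 + pct/100) ** beta."""
    return (1.0 + pct_increase / 100.0) ** beta

def odds_multiplier_unit_change(beta):
    """Odds multiplier for a one-unit increase in an untransformed predictor."""
    return math.exp(beta)

# Implied (not reported) coefficients consistent with the quoted effects.
beta_reciprocity = math.log(1.02) / math.log(1.01)  # about 1.99
beta_committee = math.log(3.7)                      # about 1.31

reciprocity_effect = odds_multiplier_log_predictor(beta_reciprocity, 1)
committee_effect = odds_multiplier_unit_change(beta_committee)
```

The first function reflects the usual log-predictor rule of thumb: for small percentage changes, a coefficient of roughly 2 translates into roughly a 2% change in the odds per 1% change in the predictor.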
In sum, Senators appear to have leveraged a mix of low-stakes resolutions over potentially contentious issues and a small but important set of bills for which there was bipartisan support. This enabled them to build cross-partisan bridges and keep the 107th term from devolving into stalemate. Our model identified these novel patterns of cooperation after accounting for other, more traditional forces predictive of collaboration and cosponsorship. Our application offers clues not only about which kinds of legislators are likely to collaborate, but also about which kinds of legislation make such collaborations possible. These insights would be lost when analyzing data aggregated over bills and their characteristics.Footnote 11
5 Conclusion
While bipartite networks are common in the social sciences, researchers often choose to project such data onto unipartite networks for analysis. As shown in this article, however, this projection results in a loss of valuable information and, except in the rarest of cases, can lead to misleading conclusions about the community structure that drives tie formation in these types of networks. Moreover, because the information that is lost through standard modes of aggregation cannot be recovered from projected data, bipartite networks should generally not be analyzed in their projected, unipartite form when the goal is to understand the role communities and groups play in network formation.
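A small constructed example illustrates why the lost information cannot be recovered. In the sketch below, two different legislator-by-bill networks (both invented for illustration) produce exactly the same count-weighted projection, so no analysis of the projected network alone could tell them apart. The function name `project` is hypothetical.

```python
from itertools import combinations

def project(incidence):
    """Project a legislator-by-bill 0/1 matrix onto weighted legislator pairs.

    The weight of pair (i, j) is the number of bills both legislators
    cosponsored; pairs with no shared bills are omitted.
    """
    weights = {}
    for i, j in combinations(range(len(incidence)), 2):
        shared = sum(a * b for a, b in zip(incidence[i], incidence[j]))
        if shared:
            weights[(i, j)] = shared
    return weights

# Network A: a single bill cosponsored by all three legislators.
one_big_bill = [[1], [1], [1]]
# Network B: three bills, each cosponsored by a different pair.
three_pair_bills = [[1, 1, 0],
                    [1, 0, 1],
                    [0, 1, 1]]

same_projection = project(one_big_bill) == project(three_pair_bills)
```

In both cases every pair of legislators shares exactly one bill, even though one network reflects a single broad coalition and the other reflects three separate pairwise collaborations.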
To address this problem, we have developed a new approach to modeling bipartite networks that allows researchers to directly study the role played by groups of nodes. As bipartite networks are quite common in the social sciences, we see natural applications in a number of different domains. For example, our model could be used to examine questions relating to country–trade product networks, state memberships in organizations, posts on social media platforms and hashtags, or product recommendation systems, all of which are theorized to be affected by groups (or segments) of actors. Readers interested in using our proposed approach in their own work can do so easily by installing the open-source software NetMix, available at https://CRAN.R-project.org/package=NetMix. Our replication materials offer a good template for how to estimate the model and generate useful tables and figures for interpretation purposes.
While we believe that the proposed model is widely applicable, one drawback is its computational intensity. In particular, fitting the proposed model to a larger network data set may take considerable computational resources.Footnote 12 This makes it difficult for researchers to try different model specifications in a relatively short amount of time.Footnote 13 In such settings, one may prefer to begin with an exploratory analysis based on commonly used descriptive network statistics, which can be computed quickly, even on the original bipartite structure. At the very least, and if the size of the network necessitates projection, careful use of weights that maintain some of the heterogeneity that is typically lost through aggregation should be the norm (e.g., Newman 2001).
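One weighting scheme of this kind, commonly attributed to Newman (2001), discounts each shared bill by the number of its cosponsors, so that a tie formed through a small bill counts for more than a tie formed through a very large one. The sketch below assumes this is the type of weighting the text has in mind; the function name `newman_projection` and both toy networks are illustrative.

```python
from itertools import combinations

def newman_projection(incidence):
    """Weighted projection in which a bill with n cosponsors adds 1 / (n - 1)
    to the tie between each pair of its cosponsors."""
    n_rows = len(incidence)
    n_cols = len(incidence[0]) if incidence else 0
    # Number of cosponsors on each bill (column sums).
    col_sizes = [sum(row[k] for row in incidence) for k in range(n_cols)]
    weights = {}
    for i, j in combinations(range(n_rows), 2):
        # A bill shared by i and j has at least two cosponsors,
        # so the denominator is never zero.
        w = sum(
            1.0 / (col_sizes[k] - 1)
            for k in range(n_cols)
            if incidence[i][k] and incidence[j][k]
        )
        if w:
            weights[(i, j)] = w
    return weights

# One bill with three cosponsors versus three bills with two cosponsors each.
one_big_bill = [[1], [1], [1]]
three_pair_bills = [[1, 1, 0],
                    [1, 0, 1],
                    [0, 1, 1]]

weights_big = newman_projection(one_big_bill)
weights_pairs = newman_projection(three_pair_bills)
```

With simple co-occurrence counts, these two toy networks would project onto the same unipartite graph; under the discounted weights they differ, which is the sense in which such weights retain some of the heterogeneity otherwise lost through aggregation.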
In the future, given the prevalence of bipartite networks observed over time (e.g., Marrs et al. 2020), fruitful extensions of our proposed approach would allow researchers to incorporate dynamics into the generative model of bipartite network formation (for an extension incorporating dynamics in the unipartite case, see Olivella et al. 2022). In addition, we could explore larger multi-mode networks, integrating entities like lobbying firms into cosponsorship networks or examining relationships among NGOs, IGOs, and groups of countries internationally. These networks allow for co-clustering of diverse actors sharing indirect connections, necessitating improved tools for studying relational data beyond traditional single-mode representations and avoiding aggregation bias.
Appendix 1 Projecting Bipartite Networks onto Unipartite Networks Is a Common Practice
Table A1 Applications with naturally bipartite networks in top field journals in the 2000s.

Note: “Projected” indicates unipartite network considered for empirical application.
Data Availability Statement
Replication code for this article has been published in Code Ocean, a computational reproducibility platform that enables users to run the code, and can be viewed interactively at https://doi.org/10.24433/CO.3081370.v1 (Lo et al. 2025). The methods described in this article can be implemented using the open-source software NetMix, available at https://CRAN.R-project.org/package=NetMix.
Ethical Standards
All analyses use publicly available, de-identified data.
Supplementary Material
The supplementary material for this article can be found at https://doi.org/10.1017/pan.2025.10021.