Communities and central nodes in the mobility network of US inventors

This study presents the basic features and community structure of the US inventor mobility network between 1999 and 2010, based on an analysis of patent documents. Since mobile inventors have proved to be among the most effective knowledge mediator entities, this mobility network can be seen as a knowledge diffusion network among innovative companies. During the investigation, we identified the basic features of the network, such as short effective diameter and scale-free degree distributions, and we also demonstrated the central nodes, community structure, and hidden core of the network. Our results indicate that there is a small number of nodes that can effectively absorb knowledge from the network and pool it. We also find that this core mostly consists of IT and semiconductor companies as well as the largest universities in the USA.


Introduction
According to Gassmann and Bader (2006), innovation and technologies are responsible for half of the economic growth in developed and industrialized economies, while leading innovative companies realize more sustainable profits than imitators and trend followers. However, R&D costs increase steeply due to shortening innovation cycles and the growing number of imitators, with empirical evidence indicating a positive correlation between the success of a company and the strength of its intellectual property, R&D, and patent portfolio (Gassmann and Bader, 2006). In innovation-driven economies, an obvious scientific question is how knowledge, as an essential resource of R&D activities, is diffused among business entities. It is difficult to find an ultimate indicator of company innovativeness, but the most frequently used measure is patenting, because patents give their owners a monopoly over a space in the technology arena. Approximately two-thirds of the market value of large companies in the USA can be traced to intellectual property, especially patents and trademarks (Shapiro and Pham, 2007).
Patents grant their owners monopoly rights over novel technologies. However, ideas about research directions or developments can be diffused from firm to firm and from inventor to inventor without infringing patent rights. The platform for knowledge flows can be formal agreements, collaborations, and informal social ties among researchers. Intellectual property is mainly embodied in patents and trademarks (Shapiro and Pham, 2007). According to the EPO, "A patent is a legal title granting the holder the exclusive right to make use of an invention for a limited area and time by stopping others from, among other things, making, using or selling it without authorization". This is an accurate definition; however, it only captures one aspect of patenting. In addition to the important function of protection and commercial exploitation of new technologies, patents provide important information for others on the market. According to Granstrand (1999), the contract between the patenting firm and society grants a temporary monopoly, on the one hand; on the other hand, it works as an information system that provides ideas about recent research directions for others. The patent system thus simultaneously stimulates invention and investment in R&D and public disclosure of technical information. Finally, it fosters technological progress and competition after patent protection has ceased (Granstrand, 1999).
The previous section shows the importance of intellectual property, i.e., a patent portfolio and the knowledge behind it, both for organizations and for the economic system. However, knowledge and skills as sources of innovation and patents cannot be materialized; they are in the heads of the inventors and managers.
From the market participants' point of view, innovative ideas and knowledge can be considered a scarce resource. By pooling knowledge, firms become able to utilize complementary skills which are otherwise inaccessible and create new technologies or overcome resource restrictions (Penrose, 1959). Networks of innovation are the spaces where common knowledge creation and utilization take place among firms. The sharing of firms' resources leads to a decrease in the risk that the development, introduction, or application of a new technology entails (Freeman, 1991). Furthermore, it creates an opportunity to combine and access knowledge which would otherwise be impossible for individual organizations to attain (Freeman, 1991; Knell, 2011). Formal and informal networks play an equal role in the creation, transfer, and absorption of new knowledge and technology (Powell and Grodal, 2005). Formal networks are established based on collaborative innovation, in which case organizations utilize their resources in a cooperative manner to achieve their aims (Freeman, 1991). Today, these formal networks have developed into global cooperative systems and serve common knowledge-creating goals in several ways, e.g., research collaboration, joint ventures, technical assistance programs, and technological licensing agreements (Knell, 2011).
Informal network ties are seen as undeclared platforms for knowledge flow, and they function via various social interactions among the company's employees; in many cases, they lay the groundwork for the development of formal network ties or help to sustain them (Powell and Grodal, 2005). Using the steel mini-mill industry in the United States as an example, von Hippel (1987) demonstrates the way in which the personal network among engineers and the norms of the professional community facilitated a flow of technical knowledge among rival companies.
Although networks can produce a significant knowledge surplus, companies have to face the geographical (Jaffe et al., 1993) and technological (Rosenkopf and Almeida, 2003) limits of the accessibility of external sources of knowledge. The path dependency characteristic of companies derives from the fact that the knowledge search often happens in the local network, and thus it is only the knowledge capital of geographically proximate companies that can be utilized. On the other hand, the absorption capacity of a particular company is basically determined by its technological portfolio; therefore, organizations are not able to utilize just any complementary source of knowledge (Rosenkopf and Almeida, 2003).
In the present study, we consider knowledge flow via researcher mobility a special case of informal (and, to a certain extent, formal) network relationships because with mobility the company utilizes external skills, usually without any formal agreement. However, it is important to note that the flow of researchers among companies is also possible within various strategic alliances, in which case it is the outcome of formal agreements.

2.3 The impact of mobility on organizations and inventors
Rosenkopf and Almeida (2003) found that inventor mobility significantly fosters interorganization knowledge transfer, while such an effect could not be detected in the case of strategic alliances. This result stresses the importance of mobility for knowledge transfer. Other studies have also investigated the impact of mobility on organizations in many ways. Employee mobility and employee enticement from rival companies generate a serious existential risk for the parent company; on the other hand, they create important developmental potential for the progeny company (Phillips, 2002). Newly established and less embedded companies are able to increase their innovative capacity and overcome resource restrictions by attracting talent (Rao and Drazin, 2002). The reason why mobility can have such an impact on knowledge flow is that when employees move they take with them not only their human capital but also their social capital, along with company routines and practices (Pennings and Wezel, 2007). Nevertheless, the utilization of knowledge and routines strongly depends on the structure of the new company since operating practice is not an individual-level task but a group-level one (Phillips, 2002; Rao and Drazin, 2002; Pennings and Wezel, 2007). Breschi and Lissoni (2006) argue that it is social distance (direct and indirect ties among inventors) that is of the greatest importance in knowledge spillover. The greater and more diverse a researcher's social capital is, the more easily he or she can access external or new knowledge through personal contacts. These crucial interpersonal links are mostly established by mobile inventors who move from firm to firm and sell their brainpower to various business entities (Breschi and Lissoni, 2006; Moen, 2000; Lamoreaux and Sokoloff, 2009). This increased social capital among mobile inventors is one of the key factors in knowledge transfer (Breschi and Lissoni, 2009). Moreover, mobility influences researcher productivity as well. The greater the size of the organization, the more productive its inventors are, due to the availability of a significantly greater number of resources for R&D projects and the lower perceived risk of company failure (Hoisl, 2007; Kim et al., 2004). Mobile researchers are more productive than their immobile fellows; nevertheless, increasing productivity decreases willingness for mobility due to effective research practice (Hoisl, 2007). Researcher productivity also increases if mobility happens in the direction of a less path-dependent organization and if the inventor possesses unique knowledge compared to the existing knowledge base of the particular company (Song et al., 2003).

Patent analysis methods in knowledge diffusion studies
Various studies using a variety of methods have relied on patent data to investigate knowledge spillover and the knowledge diffusion process. Numerous studies focus on patent co-citations, which represent the links between older (cited patent) and novel (citing patent) inventions and can therefore trace how existing knowledge affects new knowledge. A pioneering investigation in this field was carried out by Jaffe et al. (1993) in the area of patent co-citation and local knowledge spillover. The authors revealed the importance of localization in knowledge flows by finding evidence that with co-cited patents the cited patent is more likely to come from the same state and from the same metropolitan area than randomly selected patents in the same field of technology. Wang et al. (2011) identified fields of technology among leading organizations by analyzing patent co-citations among Fortune 500 companies. Petruzzelli et al. (2015) also measured the inter- and intra-firm and -industry impact of biotechnology innovations through patent co-citations. Xuefeng et al. (2012) investigated Chinese and American co-patenting companies based on USPTO datasets. Co-patenting in this sense denotes the sharing of the developmental expenditures and rights of a patent between firms from the USA and China. Breschi and Lissoni (2009) analyzed a co-invention network of inventors to investigate knowledge flow processes. Co-invention occurs when inventors jointly file at least one patent application. In that study, mobile inventors are defined as individuals who move across companies and file patent applications with different assignees. The authors found that co-citations of patents occur more frequently when inventors are closer to each other in the co-inventor network. These close ties between inventors are established by mobile inventors; they can therefore be seen as key figures in the knowledge spillover process (Breschi and Lissoni, 2009). This approach is incorporated into the present study. However, we describe the knowledge flow network based only on inventor mobility; we do not measure the co-citation and co-invention properties of the network in this study.

Concept of the mobility network of US inventors
Since inventor mobility has proved to be a key factor in inter-organizational knowledge transfer, tracing movements enables us to draw up the organization-level mobility network among inventors. The present study aims to detect and investigate researcher mobility and the network of inter-organizational knowledge flow through an examination of patent documents. Our approach identifies the mobility of inventors who file patents with changes in assignees over time. Thus, the network that we aim to investigate is that of companies with patent rights, while the knowledge flow among them is demonstrated through the continuous patenting activity of mobile inventors. Movements identified in this way do not necessarily mean that the researcher has physically changed jobs. However, it can be considered mobility in the sense that he or she utilizes his or her mental capacity and technological and organizational knowledge acquired at another company; hence, we see complementary skills and a flow of knowledge among organizations. It is important to emphasize that this network captures only one of the possible channels of knowledge flow among organizations. In this case, manager movements cannot be detected, nor can those cases when ideas flow via interpersonal relations such as through advice from researchers working for different companies. It is also impossible to detect knowledge flow which is generated by the mobility of inventors whose innovation is not manifested in patents. Furthermore, the flow of knowledge based on existing patents also remains invisible to this method. However, as we have seen in the introduction, protecting valuable knowledge with patents bears great economic significance; thus, researchers that create patents are considered key figures in knowledge creation, and the knowledge flow generated by their mobility is seen as a substantial aspect of the total knowledge transfer.
We believe that our study represents the first attempt to describe a network of inventor mobility in the United States and, thereby, a network of knowledge flow among organizations. Therefore, the first aim of our study is to explore the basic properties of this network. Our second aim is to investigate the community structure of the network, seeking answers to the following questions: Which organizations tend to interact with each other and why? What is the organizing principle of the communities? How do path dependency and technological distance influence the flow of knowledge and which organizations are able to overcome them?
3 Dataset and methods

Source data and transformation
Our investigation is based on Harvard University's Patent Network Dataverse (Lai et al., 2011). This is a cleaned and disambiguated dataset based on the NBER database and the US Patent and Trademark Office weekly publications from 1975 to 2010. In their study, the authors propose a disambiguation algorithm which is an application and further development of the existing Author-ity approach developed by Torvik and Smalheiser (2009). The primary input data of the disambiguation algorithm is a cleaned and formatted version of the data sources noted above. The processed dataset consists of the name of the inventor, the patent, the assignee of the patent, the technology class of the patent, and the location of the inventor. This data preparation process defines inventor-patent instances, which form the units of analysis. The algorithm applies a comparison function which returns the similarity vector of inventor-patent instance pairs. The dimensions of the similarity vector, and the scale of the similarity are the following: first name [0. Zero values were assigned when the variable values were completely different and maximum values were assigned when they were identical in the case of inventor-patent instance pairs. To decide whether the given inventor-patent instance pairs match or not, the algorithm uses an iterative blocking scheme. As a result of the classification process in the disambiguated database, the inventors receive unique ID numbers to distinguish them from others with very similar attributes. The primary aim of this dataset is to investigate the coauthorship and collaboration networks among inventors who have registered patents in the USA. The network files are split into three-year intervals, so the ties in particular files only contain collaborations during that specific time interval.
In the original network files, the nodes represent the inventors. Every inventor has the following attributes: inventor ID, assignee, assignee ID, first name, last name, city, state, country, and patents. If some of the attributes of an inventor change, one can trace them as a difference in attributes for the same inventor among the network files covering different time intervals. For example, if an inventor works for IBM and takes out a patent on some technology between 1999 and 2001, his or her assignee attribute in the network file will be "IBM". If this inventor later moves from IBM to Microsoft, his or her attribute in the corresponding network file will change to "Microsoft". We used the 1999 to 2010 time interval, because a longer interval would present us with older, obsolete edges in our network, which may represent knowledge flows which are no longer relevant. Although 1999 to 2002 seems remote from 2010, that period coincided with the dot-com boom, which indicated the growing importance of the IT industry in the economy. We therefore considered this interval an important turning point in technology development which created an industrial environment that is still relevant today.
http://www.open-jim.org
The differences in the inventors' attributes among the network files gave us the idea of transforming the collaboration network into a network of firms, where the edges represent the migration of inventors detected through temporal changes of the assignee attribute in the network data. This network is directed, meaning we can name the source and target of the knowledge flow represented by the migration. The attributes of the nodes in our network are the following: assignee, assignee ID, community, in-degree, out-degree, betweenness centrality, and multi-edge node pairs. The assignee and assignee ID variables signify the name and the unique identifier of organizations that hold the patent rights. Inventors with no assignee have been excluded from our network.
The other attributes of the nodes will be described later. On the edges of the network, we defined one attribute, weight. Weight denotes the number of mobile inventors that move from the source company to the target company. We wanted to focus on US inventors; hence, we viewed mobility as knowledge flow only in the case of US residents. This filtering was based on the country attribute of the inventor nodes from the Patent Network Dataverse. Although international innovation networks play an important role in knowledge creation (Knell, 2011), our assumption is that foreign inventors with patent applications in the USA tend to work for multinational corporations, while smaller foreign firms cannot file their patents abroad. If foreign inventors were present in the network, multinational corporations would be overrepresented. In spite of the fact that US firms hire inventors with foreign residency, we assume that restricting the mobility to US residents gives us a better and less biased picture of knowledge flow patterns in the USA.
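The transformation described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the record layout `(inventor_id, period_start, assignee_id)`, the function name `mobility_edges`, and the sample data are hypothetical, and the sketch assumes one assignee per inventor per network file.

```python
from collections import Counter

def mobility_edges(records):
    """Build weighted, directed firm-to-firm edges from inventor records.

    `records` holds (inventor_id, period_start, assignee_id) tuples.
    This layout is a hypothetical stand-in, not the Dataverse schema.
    """
    by_inventor = {}
    for inventor_id, period_start, assignee_id in records:
        by_inventor.setdefault(inventor_id, []).append((period_start, assignee_id))

    movers = set()
    for inventor_id, history in by_inventor.items():
        history.sort()  # chronological order of the network files
        for (_, source), (_, target) in zip(history, history[1:]):
            if source != target:  # the assignee changed between periods
                movers.add((source, target, inventor_id))

    # Edge weight = number of distinct inventors moving source -> target.
    return Counter((source, target) for source, target, _ in movers)

# Made-up example records.
records = [
    ("inv1", 1999, "IBM"), ("inv1", 2002, "Microsoft"),
    ("inv2", 1999, "IBM"), ("inv2", 2005, "Microsoft"),
    ("inv3", 2002, "Microsoft"), ("inv3", 2005, "IBM"),
    ("inv4", 1999, "IBM"), ("inv4", 2002, "IBM"),
]
edges = mobility_edges(records)
```

In this toy input, two inventors move from IBM to Microsoft and one moves in the opposite direction, so the two directed edges receive weights 2 and 1; the inventor who stays at IBM creates no edge.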

Definitions and variables
In the following part of this section, we explain the basic definitions, which are crucial for an understanding of our results.
A network G consists of a set of nodes, $N = \{n_1, n_2, \dots, n_g\}$, and a set of edges, $E = \{e_1, e_2, \dots, e_l\}$. Nodes represent the entities we examine, while edges indicate the presence or absence of some relation between them (Wasserman and Faust, 2009). In our case, as we have already noted, companies, institutions, and universities are the nodes, and the migrations of mobile inventors among them are the edges that we consider as knowledge flow. Our network is directed; therefore, we can count in-degrees and out-degrees on the nodes. In-degree is how many companies sent inventors to a particular node, and out-degree is how many companies received inventors from it.
Geodesic distance: the length of the shortest path between two nodes in the network. If a company shares an edge with another company, the geodesic distance between them is one. If a company has no edge with the other company, but they have a common neighbor, the geodesic distance is two. Breschi and Lissoni (2009) showed that short geodesic distance between inventors greatly increases the probability of knowledge flows. We also assume that the closer the organizations are in the mobility network, the more chance they have to reach each other's knowledge base.
Network diameter: the shortest path between the two furthest nodes; in other words, this is the longest geodesic distance in the network. A large network diameter may indicate less effective knowledge diffusion since there are nodes which are inaccessibly far away from others. This can happen when there is almost no knowledge exchange among industries or among strategic alliances.
Characteristic path length: the average geodesic distance between any two nodes in the network. This is a very important feature of the mobility network because the smaller the characteristic path length is, the greater the chance for knowledge diffusion among nodes, and the more effective the network is.
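As an illustration of the three distance measures, the sketch below computes geodesic distances by breadth-first search and derives the diameter and characteristic path length from them. It assumes a small connected graph whose edges are treated as undirected; whether direction was respected in the original computation is not stated in the text, and the function names and the toy graph are hypothetical.

```python
from collections import deque

def bfs_distances(adjacency, start):
    """Geodesic distance from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def diameter_and_mean_path(adjacency):
    """Longest and average geodesic distance over all reachable pairs."""
    longest, total, pairs = 0, 0, 0
    for start in adjacency:
        for node, d in bfs_distances(adjacency, start).items():
            if node != start:
                longest = max(longest, d)
                total += d
                pairs += 1
    return longest, total / pairs

# Toy graph: a chain A - B - C - D with an extra node E attached to B.
adjacency = {
    "A": {"B"},
    "B": {"A", "C", "E"},
    "C": {"B", "D"},
    "D": {"C"},
    "E": {"B"},
}
diameter, mean_path = diameter_and_mean_path(adjacency)
```

Running one search per node is adequate for a toy graph; for a network with tens of thousands of nodes, a dedicated graph library would be the more practical choice.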
Betweenness centrality: one of the most often used centrality indices besides the in-degree and out-degree values. Betweenness centrality indicates the proportion of shortest paths passing through the given node. If the betweenness centrality is high, the node serves as a mediator of information in the network (Freeman, 1979).
In the case of the mobility network, nodes with large betweenness centrality values represent the backbone of knowledge diffusion. An enormous amount of knowledge is accessible to and is mediated by such organizations.
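For reference, the standard definition of betweenness centrality (Freeman, 1979) can be written as below. The symbols are introduced here for illustration only: $\sigma_{st}$ is the number of shortest paths from node $s$ to node $t$, and $\sigma_{st}(v)$ is the number of those paths that pass through node $v$. Whether and how the scores were normalized in this study is not stated, so the unnormalized form is shown.

```latex
C_B(v) = \sum_{\substack{s, t \in N \\ s \neq v \neq t}} \frac{\sigma_{st}(v)}{\sigma_{st}}
```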
Multi-edge node pairs: the number of mutually connected neighbors. In our case, this is the number of neighbors that both send knowledge to and receive knowledge from a given node.
Organizations with multi-edge connections use each other's knowledge base. The parts of the network where multi-edge connections frequently occur can be seen as knowledge pools.
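The degree and multi-edge definitions above can be sketched together. This is a hedged illustration: the function name `node_statistics`, the dictionary keys, and the toy edge list are hypothetical, and edge weights are ignored because the definitions count neighboring organizations rather than inventors.

```python
def node_statistics(edges):
    """In-degree, out-degree and mutually connected neighbors per node.

    `edges` is an iterable of (source, target) pairs of a directed graph.
    """
    successors, predecessors = {}, {}
    for source, target in edges:
        successors.setdefault(source, set()).add(target)
        predecessors.setdefault(target, set()).add(source)

    stats = {}
    for node in set(successors) | set(predecessors):
        receives_from = predecessors.get(node, set())
        sends_to = successors.get(node, set())
        stats[node] = {
            "in_degree": len(receives_from),
            "out_degree": len(sends_to),
            # Neighbors connected to the node in both directions.
            "multi_edge_pairs": len(receives_from & sends_to),
        }
    return stats

# Toy graph: A and B exchange inventors, A sends to C, D sends to A.
edges = [("A", "B"), ("B", "A"), ("A", "C"), ("D", "A")]
stats = node_statistics(edges)
```

Here node A has two senders (B and D) and two receivers (B and C), but only B is connected in both directions, so its multi-edge value is 1.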
Communities (or clusters): densely connected parts of the network, in which the nodes probably share some common property or play similar roles (Fortunato, 2010).
Communities are densely connected internally and sparsely connected externally. In our case, we expect that communities will be based on industrial similarities, where nodes from the same industry exchange knowledge with each other more often than with other communities based on another industry. The community attribute of a node indicates the number of the community to which the organization belongs.
Modularity: a measure of how densely communities are connected internally compared with their connections to each other, and therefore of how well suited the network is to community detection. The modularity score represents the strength of the communities. The stronger the community structure of a network, the more edges exist inside those communities and the fewer exist between them.
Strong communities in the case of a mobility network mean that the potential circle of organizations is small for an inventor who would like to move from one to another. It also indicates the overall path dependency of the organizations since it restricts the circle of companies with which they can exchange knowledge. The modularity value varies between 0 and 1. The 0 value indicates a network without communities, and 1 characterizes graphs with perfect communities (where edges exist only inside the communities). The modularity score of real world-based networks often varies between 0.3 and 0.7 (Fortunato, 2010; Newman and Girvan, 2004).

On the time horizon of the investigation, we found 28,695 companies, universities, and institutions involved in inventor mobility. Among these, there exist 50,170 paths, where inventors move from one to the other. This network of inter-firm inventor mobility is an exceedingly sparse network with only a 0.0001 network density value. In other words, only 0.01% of the possible ways exist among the nodes. Network density value would be 1 if all the firms in the network were connected to all the others. On the pathways, we detected 83,640 inventors who changed their position between 1999 and 2010.
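The density figure can be checked with a short calculation. The sketch assumes the usual definition for a directed graph without self-loops, in which each ordered pair of distinct nodes is one possible edge; the text does not spell out the formula, so this is an assumption.

```python
nodes = 28_695   # organizations reported in the text
edges = 50_170   # directed mobility paths reported in the text

# Every ordered pair of distinct nodes is a possible directed edge.
possible = nodes * (nodes - 1)
density = edges / possible

print(f"density = {density:.6f}")  # about 0.00006, i.e. 0.0001 at four decimals
```

Under this assumption, the density is roughly 6 in 100,000, which rounds to the reported value of 0.0001, or about 0.01% of all possible connections.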
The biggest component in the mobility network contains a large number of nodes. 20,998 organizations out of 28,697 are connected to a giant component. In addition to 73% of the nodes, this connected subgraph contains 45,707 edges, or 91% of the total. This interconnected component is the core of the knowledge flow. Most of the firms, and almost all of the mobility paths, belong to it. This is the main platform where the entities compete for inventors' knowledge. If a company is tied to this network by a mobile inventor, it means that it transmitted knowledge to or received knowledge from this common platform for knowledge exchange. Due to this connectedness, knowledge and experience can be accumulated in the network, since it is available to the interconnected nodes. We can consider this knowledge system as the space for knowledge recombination and for the acquisition of social capital.
The network diameter is 17, so seventeen hops separate the furthest companies from each other. Nonetheless, the characteristic path length is just 4.81, which means the average distance between any two firms in the network is slightly less than five steps (Fig. 1). This indicates a small world property, where despite the sparseness and size of the network, the indirect links between firms are quite short (Watts and Strogatz, 1998). Firms competing for the same inventors are similar in their knowledge needs.
As we have noted, the closer firms are in the mobility graph, the more similar the knowledge required for their innovation practice. Therefore, it is striking that the characteristic path length is less than five among patenting firms in the USA, regardless of the location, industry, or size of the companies. This indicates that there must be a set of companies or technology fields which are less path-dependent thanks to their considerable absorption capacity and are responsible for making a network with such diverse nodes so "small".

Degree distributions and correlation of network variables
In our first attempt to unravel the mystery, we investigate centrality indices for the firms. One of the most important traits of a network is the degree distribution or, in other words, the distribution of the edges among the nodes. This is a highly informative property of the graph, since it can ultimately describe the structure of the network. If the degree distribution follows a normal (Gaussian) distribution, the nodes in the network will have a typical number of edges connecting them to others. In the case of such a distribution, extremely large values are rare or not present and the average degree characterizes the biggest proportion of the nodes in the network. However, if the degrees follow a power-law distribution (i.e., the network is scale-free), there is no characteristic number of edges. Many nodes have just a few links, and most have only one link, to other nodes, while a small but considerable group of them has a high or extremely high number of direct paths to others. The power-law distribution can be described with its exponent, which is -1.656 for in-degrees and -1.585 for out-degrees in our case. In classical studies of networks, for example, the World Wide Web, the exponents have been between -2 and -2.5 (e.g., Barabási, 2011). The closer the exponent is to zero, the greater the chance of there being nodes with degree values higher than one.
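One simple way to read an exponent off a degree distribution is a least-squares line through the points on a log-log scale. The sketch below illustrates this on synthetic data; it is an assumption that a fit of this kind resembles what was done here, since the text does not state the estimation method, and simple log-log regression is known to be a rough estimator on real data. The function name and the synthetic counts are hypothetical.

```python
import math

def loglog_slope(degree_counts):
    """Least-squares slope of log(count) against log(degree)."""
    xs = [math.log(k) for k in degree_counts]
    ys = [math.log(c) for c in degree_counts.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic, idealized counts following count(k) = 8000 * k ** -1.5.
degree_counts = {k: 8000 * k ** -1.5 for k in (1, 2, 4, 8, 16, 32, 64)}
slope = loglog_slope(degree_counts)
```

Because the synthetic counts follow an exact power law, the fitted slope recovers the exponent of -1.5; empirical degree counts would scatter around the line, especially in the tail.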

Fig. 2. In-degree Distribution of the Nodes on a Log-log Scale
As we can see in Fig. 2, the in-degree distribution of the mobility network is unbalanced. Instead of a characteristic or average edge value, many nodes (8,153) have managed to obtain only one edge, but a small minority can absorb connections more frequently. Some of the nodes have more than 100 partner organizations (the maximum is 360) that transmit knowledge to them directly. These powerful companies in the network are the greatest beneficiaries of the knowledge flow. Due to the fact that they are the most attractive for inventors, they represent the core of the competences and accumulated experience in the network. It is likely that the knowledge and social capital of the researchers at these centers grow many times faster than those at the peripheries. It seems these nodes serve as "black holes" in the knowledge network and can absorb a great amount of knowledge, and perhaps more diverse knowledge, from it. The next question is whether they put something back into the common pot. The out-degree distribution (Fig. 3) is quite similar to the in-degree distribution. Hence, there are firms, institutions, or universities in the network that do the opposite of the knowledge "black holes". The tail of the distribution, namely, the nodes with the greatest out-degrees, transmit knowledge to the network and continuously lose their researchers and knowledge base. The greatest giver in this case had to let its patenting inventors go to 394 different organizations. How could a firm survive such great losses? Perhaps these great knowledge-providers went bankrupt or downsized.
As we can see in Fig. 4, there is an undoubtedly strong positive linear correlation between the in- and out-degrees of the nodes. In this sense, the great knowledge takers and givers are the same entities. These strong nodes simultaneously serve brain drain and brain gain functions in the network. They function as cores of knowledge creation, accumulation, and distribution. These are junctions, the most frequent platforms where inventors with various corporate histories can meet and increase their professional skills and social capital. However, these organizations also mediate knowledge to the network through departing researchers with their rich human and social capital. This close balance of knowledge absorption and loss is also a feature of the less frequented nodes. Due to this strong linearity in the degree distributions, outliers are few and small in extent. A further interesting property of the network is the fact that the slope of the line is very close to 1. This indicates that in general the number of organizations from which the company can recruit inventors tends to be equal to the number of organizations which can recruit researchers from that same company. Although there are a few cores that frequently absorb and transmit knowledge, it is doubtful how effectively they do so. It is possible that they simply recruit from and hand over inventors and knowledge to a clique of nodes, and therefore the knowledge of this subset of companies circulates within a relatively small space. Table 1 shows the correlation matrix of the main centrality indices and the variable of multi-edge node pairs, which is the measure of mutually connected neighbors of any node. According to the correlation matrix, betweenness centrality and multi-edge node pairs also show a surprisingly strong correlation with in-degree and out-degree scores. The betweenness centrality score indicates the proportion of pathways passing through the particular node. If the betweenness centrality is high, the node serves as a mediator of information in the network. The strong correlation between the degree centrality variables and betweenness shows that the cores of the mobility have far-reaching ties in the network, therefore functioning as the backbone of knowledge diffusion. On the other hand, the correlation between the three centrality indices and the multi-edge node pairs variable means that the higher the number of edges a given company has, the more likely it is to send and receive knowledge from entities of similar size.
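The relationship between in- and out-degrees can be quantified with a Pearson correlation coefficient and a least-squares slope, as in the hedged sketch below. The degree values are made up for illustration and are not taken from the study's data; the function name is hypothetical, and it is an assumption that Pearson's coefficient is the measure reported in Table 1.

```python
import math

def pearson_and_slope(xs, ys):
    """Pearson correlation of xs and ys, and the slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy), sxy / sxx

# Made-up degrees for six hypothetical organizations.
in_degrees = [1, 2, 4, 10, 50, 200]
out_degrees = [1, 3, 4, 12, 48, 205]
r, slope = pearson_and_slope(in_degrees, out_degrees)
```

A coefficient near 1 together with a slope near 1 corresponds to the pattern described above: organizations recruit from roughly as many partners as they lose inventors to.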

The top 30 central nodes in the network
Table 2 contains a list of the top 30 central nodes with rank scores for the in-degree, out-degree, and betweenness centrality variables. Here, rank "1" denotes the highest value of a given variable. The table is sorted by in-degree rank. Although the identity of the companies reflects the fact that the most IP-intensive industries are pharmaceuticals, communications equipment, and semiconductors (Shapiro and Pham, 2007), it is the dominance of IT and IT-related sectors that emerges from the data, while members of the pharmaceutical industry rank toward the bottom of the list. Notably, four universities made their way into the top 30. It is not surprising, however, that they belong to a small elite of the world's leading universities. In the upper right quadrant of Fig. 4, seven dots stand apart from the others. These are the most powerful nodes in the network. With one exception, they are the top-ranked entities in all three categories.

Community structure and the core of the network
In the previous section, we highlighted the basic properties of the inventors' mobility network. However, the main organizing principle of the network ties has remained hidden. From whose knowledge base do companies tend to absorb external knowledge, and why?

Modularity score and community structure of the mobility network
In this section, we investigate whether the network contains communities in which organizations tend to develop more ties with their community members than with outer nodes. Community structure can provide us with a better understanding of the organizing principle of the knowledge flow network. In order to identify these communities, we used the fastgreedy algorithm proposed by Clauset et al. (2004).
Based on this algorithm, we found that the modularity value of the network is 0.72, which suggests an exceedingly strong community structure. Real-world networks very rarely reach such a high modularity score. This indicates that many of the nodes keep in touch only with restricted types of organizations with which they are willing or able to exchange inventors. Presumably, this is the system-level outcome of the path dependency of individual organizations. When organizations absorb knowledge from others, the closer that knowledge is to their existing knowledge base, the easier the absorption is. Therefore, the strong cluster structure and high modularity value suggest that many of the nodes in the network are highly path-dependent, with a very restricted circle of possible knowledge sources. It also points to the importance of industrial differences in the knowledge flow system.
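To make the modularity score concrete, the following minimal sketch evaluates Newman's modularity Q for a given partition. It is not the fastgreedy procedure itself, which searches for a partition that maximizes Q; it only scores a partition that is already known. The sketch treats the graph as undirected and unweighted, which is a simplifying assumption, and the toy graph and labels are hypothetical.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of  e_c / m - (d_c / (2 * m)) ** 2
    where m is the number of edges, e_c the number of edges inside c,
    and d_c the sum of the degrees of the nodes in c.
    """
    m = len(edges)
    internal = defaultdict(int)    # e_c
    degree_sum = defaultdict(int)  # d_c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Hypothetical toy graph: two triangles joined by a single bridge edge.
toy_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
one_group = {n: "all" for n in range(1, 7)}

q_split = modularity(toy_edges, two_groups)
q_single = modularity(toy_edges, one_group)
```

Placing every node in one community gives Q = 0, while separating the two triangles gives a clearly positive Q. Values approaching 1 mean that almost all edges fall inside communities, which is why 0.72 counts as very strong structure.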
Although the fastgreedy algorithm found 268 clusters, the six largest of them contain more than 50% of the nodes. For the sake of clear interpretability, only these six clusters were examined in this study. We analyzed them as independent graphs; hence, only the intra-community edges are present in the statistics. Table 3 contains the number of nodes (Nodes), number of edges (Edges), characteristic path length (Cp), and network diameter (Nd) values of the communities as well as a list of the five most central companies per community.
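The per-community statistics listed for Table 3 can be sketched as follows. The example extracts the subgraph induced by one community, keeps only intra-community edges, and derives the characteristic path length and diameter from breadth-first searches. Treating edges as undirected and averaging over connected pairs only are assumptions of this sketch, and `community_stats` is a hypothetical helper name.

```python
from collections import deque

def community_stats(edges, members):
    """Nodes, Edges, Cp and Nd of the subgraph induced by `members`.

    Assumptions: edges are treated as undirected, each connected pair is
    counted once, and Cp averages over pairs joined by some path.
    """
    members = set(members)
    adj = {n: set() for n in members}
    n_edges = 0
    for u, v in edges:
        # Keep only intra-community edges, as in the analysis above.
        if u in members and v in members and u != v:
            if v not in adj[u]:
                n_edges += 1
            adj[u].add(v)
            adj[v].add(u)

    total, pairs, diameter = 0, 0, 0
    for source in members:
        # Breadth-first search yields shortest path lengths from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        for d in dist.values():
            if d > 0:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    cp = total / pairs if pairs else 0.0
    return {"Nodes": len(members), "Edges": n_edges, "Cp": cp, "Nd": diameter}

# Hypothetical toy data: a chain a-b-c-d inside the community, plus one
# edge to an outside node "x" that must be ignored.
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "x")]
stats = community_stats(toy_edges, ["a", "b", "c", "d"])
```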
All the communities can be seen as significantly separated subgraphs of the whole mobility graph. Surprisingly, their basic properties are very similar to the features of the parent graph. While not presented in the table, these subgraphs are also scale-free networks with power-law degree distributions. On the other hand, these communities can easily be characterized by their central nodes. Community No. 1 is led by pharmaceutical companies, the second consists of firms in the household, fashion, and cosmetics industries, and the third is made up of medical technology-related companies. The IT and communication industries, whether hardware- or software-oriented, form the biggest community, with a very short diameter and characteristic path length. In the fifth cluster, we find the US Navy and the prestigious universities, while the sixth cluster consists of the semiconductor industry.
It seems that the main organizing principle of the network is the industrial structure of the economy. This is not surprising, since every industrial area has specific knowledge needs. It is more striking that, despite the strong impact of industrial knowledge needs on the network structure, the whole mobility network has almost as short a diameter and characteristic path length as the clusters within it. It seems that, despite the strong community structure, there are companies that can overcome path dependency, and that this ability increases the effectiveness of knowledge transfer at the system level. As we can see, the most central nodes in the whole network are also the leaders of the communities. Moreover, according to the results in Table 1 discussed earlier, the more central a particular node is, the more mutual ties it has with other central nodes. Therefore, alongside the strong cluster structure of the network, community leaders must have frequent contacts not just with their own community members but potentially with other community leaders as well. It is possible that, despite the strong modularity of the network, network leaders have formed a sort of elite club with dense and mutual ties.

The core of the mobility network
To examine the core of the mobility network postulated above, we used the k-core algorithm proposed by Batagelj and Zaversnik (2002), which finds the most densely connected and interconnected subgraph in the network. Fig. 5 shows a graphic representation of the result attained with this algorithm. The core of the mobility network consists of 58 organizations with at least 23 ties to other core members. The total number of edges in this subnetwork is 1112, which indicates high density since the maximum number of possible edges is 3306. The circular layout and spatial proximity help to identify common community memberships. In line with the size of the communities, the IT industry, the biggest circle of nodes, is by far the most heavily represented cluster in the core with 31 companies. Besides hardware and software companies, the other dominant community in the core is that of the universities, with 10 institutions of higher education, supplemented by the US Navy and 3M Properties. Semiconductor companies are also frequently represented in the core graph, but the pharmaceutical industry is not included.
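The core extraction and the density figure quoted above can be illustrated with a short sketch. It uses plain iterative peeling, which yields the same k-core for a fixed k as the linear-time method of Batagelj and Zaversnik (2002) but is not their implementation. The adjacency structure is assumed to be undirected and symmetric, and the function names and toy graph are hypothetical.

```python
def k_core(adj, k):
    """Nodes of the k-core: the largest subgraph in which every node
    has at least k neighbours inside the subgraph.

    `adj` maps each node to the set of its neighbours and must be symmetric.
    """
    alive = {n: set(nbrs) for n, nbrs in adj.items()}
    queue = [n for n, nbrs in alive.items() if len(nbrs) < k]
    while queue:
        node = queue.pop()
        if node not in alive:
            continue  # already peeled off earlier
        for nb in alive.pop(node):
            if nb in alive:
                alive[nb].discard(node)
                if len(alive[nb]) < k:
                    queue.append(nb)
    return set(alive)

def directed_density(n_nodes, n_edges):
    """Share of realized edges among the n * (n - 1) possible directed edges."""
    return n_edges / (n_nodes * (n_nodes - 1))

# Hypothetical toy graph: a 4-clique {1, 2, 3, 4} with a tail 4-5-6.
toy_edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
toy_adj = {n: set() for n in range(1, 7)}
for u, v in toy_edges:
    toy_adj[u].add(v)
    toy_adj[v].add(u)

core_3 = k_core(toy_adj, 3)
core_4 = k_core(toy_adj, 4)

# Figures reported in the text: 58 core organizations and 1112 edges.
max_edges = 58 * 57
core_density = directed_density(58, 1112)
```

With 58 nodes, the 3306 possible edges correspond to 58 × 57 ordered pairs, so the 1112 observed edges give a density of about 0.34.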
Our study has aimed to explore the mobility network of US inventors. Based on a transformed version of Harvard University's Patent Network Dataverse, the present paper has highlighted the knowledge flow network among US firms, institutions, and universities as traced by mobile inventors from 1999 to 2010.
Studies conducted so far in this area have investigated specific economic segments, such as the semiconductor and pharmaceutical industries. To the best of our knowledge, this is the first attempt to examine the inventor mobility phenomenon on a nationwide, multi-industrial level. Our findings may therefore provide a wider perspective on the inter-organizational knowledge flow processes via inventor mobility, such as inter-industry knowledge transfer, system-level innovation communities, and identification of community and network leader organizations. Consistent with previous studies, our results underline the importance of knowledge pooling as well as informal innovation networks and researcher mobility in knowledge transfer. We have also pointed to the strong impact of path dependency on inventor mobility, but at the same time we found evidence of a previously hidden coherent network, where some of the nodes that manage to overcome path dependency can achieve significant advantages in knowledge absorption.
We have demonstrated that this system is a scale-free network with a short effective diameter and small characteristic path length, where central nodes simultaneously engage in both brain drain and brain gain functions. Giant communities in the graph reflect the set of the most frequently patenting industries, plus the unique cluster of the Navy and universities. Though the community structure is extremely strong in this network, community leader entities hold important intra- and inter-cluster ties as well.
According to these results, three main lessons can be learned from the core structure of the network. First, the IT and semiconductor industries have emerged as organizers of the mobility network, with a high absorption capacity and low path dependency. The leaders in these industries are by far the most central and interconnected nodes in the network. Their high betweenness enables these companies to search effectively in the network. The literature (Breschi and Lissoni, 2009) has highlighted the fact that mobile inventors' personal networks play a significant role in knowledge spillover and knowledge diffusion. With their far-reaching ties, these central firms are able to absorb knowledge from other communities and expand the search horizon to the social ties of new employees. On the other hand, scientists who leave these organizations bring valuable intellectual and social capital to periphery firms and mediate knowledge to them. The knowledge-emitting function of the core is supported by the destinations of inventors leaving the 58 core organizations noted above: they were received by 1210 of the 8153 nodes with an in-degree of 1, by 705 of the 2833 nodes with an in-degree of 2, and by 436 of the 1139 nodes with an in-degree of 3.
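The counts at the end of the preceding paragraph are easier to compare as shares. The short sketch below converts the reported figures into percentages; the variable and function names are hypothetical, and only the numbers come from the text.

```python
# (in-degree, nodes that received inventors from the core, all nodes with that in-degree)
reported = [(1, 1210, 8153), (2, 705, 2833), (3, 436, 1139)]

def core_reach(rows):
    """Map each in-degree to the percentage of such nodes reached by the core."""
    return {k: round(100 * hit / total, 1) for k, hit, total in rows}

shares = core_reach(reported)
```

Roughly 15%, 25%, and 38% of the nodes with in-degree 1, 2, and 3, respectively, received at least one inventor from a core organization.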
The second remarkable consequence of the results is the central position of universities. It seems that they are no longer "ivory towers" of science but proactive creators and mediators of knowledge, not only in educating people but also in patenting and exchanging inventors with others. Moreover, they are not isolated points in the system but leading and organizing entities within one of the largest communities of knowledge transfer.
Third, the dense mobility ties between rival firms underline the importance of the knowledge transfer represented by inventor mobility. It is very likely that companies must continuously search for and obtain skilled inventors from rivals in order to keep up with rapid technological development and with the expanding knowledge base of their competitors. This frequent inventor exchange gives rise to an effective and coherent, but invisible and informal, knowledge network among innovative organizations in the United States. Furthermore, in the center of the network, the dense core of the biggest companies forms and runs a valuable knowledge pool. It is possible that this pool serves a dual function. First, it levels knowledge among these firms, preventing individual organizations from breaking away and enabling them to use the most advanced technologies and ideas. Second, it preserves the technological advantage of the core members against the periphery.

Limitations
One of the greatest drawbacks of our work stems from the temporal structure of the source files, since the original Dataverse network files are split into three-year intervals. As a consequence, if an inventor changes assignees more than once within three years, only one assignee can be indicated for him or her in the network file. A side effect is that our analysis underestimates the frequency of mobility in the graph. The second limitation of our study lies in the noise in the assignee names. Although inventor names are disambiguated, in some cases the misspelling of assignee names creates false nodes in the network. The third limitation we faced during our research is associated with the size of the network. Since nearly twenty-one thousand organizations are represented in our network, we could not include a further set of control variables in the investigation, only the network-based ones. We were not able to examine such properties as size, market value, profile, or type of organization (e.g., profit-oriented company, governmental institution, university, etc.). In the case of the central nodes, we deduced these variables from the names of the organizations, but overall statistics cannot be provided. We also could not filter out mergers, acquisitions, and parent company and subsidiary relations.

Managerial implications
Our findings represent further evidence of the importance of inventors' mobility in technology development and knowledge transfer. As we have seen, large successful companies maintain a knowledge pool, which is a place for effective knowledge recombination. We also presented the knowledge absorption and emission feature of this core, where IT-related companies are overrepresented compared to other industries. Consistent with recent findings, organizations seeking to catch up with more advanced competitors can foster technology development by acquiring external knowledge through hiring inventors from core organizations in the network, especially if the inventors come from the IT industry.

Avenues for further research
We believe that this approach toward the mobility of inventors and knowledge diffusion processes offers great potential for further research. One option for future studies is to analyze the evolution of the network. This could reveal how and when the channels of knowledge diffusion have changed over time and how successful companies (such as Microsoft and Apple) have risen in the network. It may provide a better understanding of how knowledge flow supports emerging organizations. Another possible direction is to examine the careers of inventors by relating their productivity to the network properties of the companies where they invent technology or to the diversification of the classes of technology to which their inventions belong. A third promising path for future research is to analyze the specific features of individual communities. The remarkable importance of universities in the knowledge flow network, for example, raises the question of how academic inventors move between the academic and the business sector, or what specific features universities show in the network. Fourth, it would also be interesting to compare the mobility network of inventors with the patent co-citation network. Once we identify the similarities and differences between these two networks, we could gain an understanding of the different forms and mechanisms of inter-organizational knowledge transfer.

Fig. 1. The Frequency Distribution of the Shortest Paths in the Network

Fig. 3. Out-degree Distribution of the Nodes on a Log-log Scale

Fig. 4. Linear Regression of the In-degree and Out-degree Variables. The regression is significant at the 0.01 level.

Table 1. Pearson Correlation Matrix. All the correlations are significant at the 0.01 level.

Table 2. The Most Central Nodes in the Network Sorted by In-degree Ranks

Table 3. The Six Biggest Communities in the Mobility Network