Gephi is a powerful data visualization tool with comprehensive features, supporting calculations, formatting, filtering, and more. It also gives users many choices of layout and coloring, with multiple ways of classifying nodes and edges. In this assignment, I tried many combinations of layouts, node rankings, and partitions to see how different dimensions can be combined to show more information about the baptism of Native Americans.
After going through the kinds of information in each column of the provided CSV file, I wanted to see how family members are related in baptism. However, the family relationships are not recorded in a usable format, and the file is sorted chronologically, which makes it hard to inspect the relationships among family members. Therefore, I attempted to encode all family relationships as edges. I classified the edges into four kinds: parent-child, couple, sibling, and relatives.
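As a rough illustration of the edge encoding described above, an edges table could be generated with a short script instead of being typed by hand. This is only a sketch under assumptions: the person identifiers are placeholders rather than rows from the dataset, and "relationship" is a hypothetical custom column name. Source, Target, and Type are, to my knowledge, the column headers Gephi recognizes when importing an edges table.

```python
import csv
import io

# Placeholder relationships (NOT taken from the baptism data).
relationships = [
    ("person_a", "person_b", "couple"),
    ("person_a", "person_c", "parent-child"),
    ("person_c", "person_d", "sibling"),
    ("person_b", "person_e", "relatives"),
]

# The four kinds of edges used in this assignment.
ALLOWED_KINDS = {"parent-child", "couple", "sibling", "relatives"}


def build_edges_csv(rows):
    """Return CSV text with one undirected edge per family relationship."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # "relationship" is a hypothetical extra column holding the edge kind.
    writer.writerow(["Source", "Target", "Type", "relationship"])
    for source, target, kind in rows:
        if kind not in ALLOWED_KINDS:
            raise ValueError(f"unknown relationship kind: {kind}")
        writer.writerow([source, target, "Undirected", kind])
    return buffer.getvalue()


edges_csv = build_edges_csv(relationships)
print(edges_csv)
```

Validating the kind against a fixed set catches typos early, which matters when several hundred edges are entered manually.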
The first image here is the default layout after I imported the CSV files of nodes and edges. Importing data into Gephi is very flexible: it allows users to edit the data tables and export the data. My original Gephi project went wrong and all the data was lost, but thankfully I had exported my edges table right after I completed it. Each node represents a person, and each edge represents a family relationship between two people. The data consists of over 300 nodes and over 500 edges, so the graph looks very messy at first and we can hardly get any useful information from it.
Then I partitioned the nodes by who baptized each person, sized the nodes by their degree, and applied the Force Atlas and Yifan Hu layouts to the graph. From the graphs we can see that both Force Atlas and Yifan Hu place closely related nodes together, and the nodes in each group usually share the same color. This makes sense, because relatives are more likely to be baptized by the same person. Then I used the Fruchterman Reingold layout to get a better view of the relationships among all the people.
When I partitioned the nodes by nation in the Yifan Hu layout, Gephi gave me an interesting graph. We can see that most people are from three nations: Delaware, Wampanoag, and Mahican. The top right corner consists mainly of green, the top left and bottom right mainly of purple, and the bottom left corner of blue. The obvious separation between the three nations is reasonable; it is intuitive that family members usually belong to the same nation. If we look closer at the boundaries between the sub-parts, we can see that the density of edges there is significantly lower than within each nation's partition. So, compared with the Yifan Hu graph above, we can assume that nation contributes more to the grouping of nodes than who baptized each person.
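The visual impression that edges are sparse at the boundaries between nations could be checked by counting how many edges stay within one nation and how many cross between two. The sketch below assumes a node-to-nation mapping and an edge list. The node identifiers and edges are invented placeholders, not the actual data.

```python
# Placeholder node -> nation mapping (NOT taken from the baptism data).
nation_of = {
    "n1": "Delaware",
    "n2": "Delaware",
    "n3": "Mahican",
    "n4": "Mahican",
    "n5": "Mahican",
}

# Placeholder undirected family edges.
edges = [
    ("n1", "n2"),
    ("n3", "n4"),
    ("n4", "n5"),
    ("n3", "n5"),
    ("n2", "n3"),
]


def count_within_and_across(edge_list, nations):
    """Count edges whose endpoints share a nation versus edges that do not."""
    within = 0
    across = 0
    for source, target in edge_list:
        if nations[source] == nations[target]:
            within += 1
        else:
            across += 1
    return within, across


within_count, across_count = count_within_and_across(edges, nation_of)
print(f"within nation: {within_count}, across nations: {across_count}")
```

A large within-to-across ratio on the real data would support the reading that nation drives the grouping of nodes.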
Although the graph looks cleaner, it still provides limited information about the data. One advantage of Gephi is that users can compare different partitions and layouts. Therefore, I tried several different partitions and looked closely at one small group of nodes. The nodes are partitioned by who baptized them, Eigenvector centrality, modularity, and nation. We can see that family members have a lot in common in baptism. In the graphs I list below, this group of nodes consists mainly of two sub-networks, which are connected by “Esther” in the middle. Those two sub-networks are given different colors in both the Eigenvector and the nation partitions, so Gephi works well at separating nodes, and family members are usually closely related in baptism.
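One way to confirm that a single node such as “Esther” is what holds two sub-networks together is to count connected components with and without that node. The sketch below is a plain depth-first search under assumptions. The edge list is an invented toy graph, and only the role of the middle node mirrors the observation above.

```python
from collections import defaultdict

# Toy graph: two small groups joined only through "Esther" (placeholder edges).
edges = [
    ("a1", "a2"),
    ("a1", "Esther"),
    ("a2", "Esther"),
    ("Esther", "b1"),
    ("Esther", "b2"),
    ("b1", "b2"),
]


def count_components(edge_list, removed=None):
    """Count connected components, optionally ignoring one node."""
    nodes = {node for edge in edge_list for node in edge if node != removed}
    adjacency = defaultdict(set)
    for source, target in edge_list:
        if removed in (source, target):
            continue  # drop every edge that touches the removed node
        adjacency[source].add(target)
        adjacency[target].add(source)

    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:
            current = stack.pop()
            for neighbor in adjacency[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
    return components


print(count_components(edges))
print(count_components(edges, removed="Esther"))
```

If the component count rises when the node is removed, that node is the only connection between the groups.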
I also applied a degree ranking to the nodes and looked closely at the four nodes with the highest degree. Interestingly, those four nodes have different features: Augustus has two wives, Ana Benigna and Esther, connected to him by two thick green edges; Salome has a lot of siblings; Nicodemus has a lot of children; Abraham has a larger family tree. Although those four have differently structured family networks, all of them show that family has a big influence on the spread of Christianity.
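The degree ranking that Gephi applies can be reproduced outside the tool by counting how many edges touch each node. The sketch below assumes an undirected edge list with invented placeholder identifiers. It is not Gephi's own implementation, only the same idea.

```python
from collections import Counter

# Placeholder undirected edges (NOT taken from the baptism data).
edges = [
    ("p1", "p2"),
    ("p1", "p3"),
    ("p1", "p4"),
    ("p2", "p3"),
]


def degree_of_nodes(edge_list):
    """Return a Counter mapping each node to its number of incident edges."""
    degree = Counter()
    for source, target in edge_list:
        degree[source] += 1
        degree[target] += 1
    return degree


degrees = degree_of_nodes(edges)
# most_common(k) lists the k nodes with the highest degree.
top_nodes = degrees.most_common(1)
print(top_nodes)
```

Passing 4 instead of 1 to most_common would list the four highest-degree nodes, as in the comparison above.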
Because I manually typed in all the edges for the graph, I found many interesting features in the data that are not revealed in Gephi because time is not taken into account. Because people are sorted by date of baptism, those who were baptized earlier are usually parents. However, there are special cases in which parents were baptized later. Moreover, there are also many couples in which one partner may have been baptized after marriage (obvious for those who have second wives), and many children were baptized at a young age. Therefore, we can assume that family relationships contribute a lot to the spread of Christianity.
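The special cases in which a parent was baptized after a child could be found automatically if each person's baptism date were attached to the nodes. The sketch below assumes parent-child edges are stored as (parent, child) pairs and that dates are available. All names and dates are invented placeholders, not values from the dataset.

```python
from datetime import date

# Placeholder baptism dates (NOT taken from the baptism data).
baptized_on = {
    "parent_1": date(1750, 1, 10),
    "child_1": date(1750, 6, 1),
    "parent_2": date(1752, 3, 5),
    "child_2": date(1751, 9, 20),
}

# Assumption: each parent-child edge is ordered as (parent, child).
parent_child_edges = [
    ("parent_1", "child_1"),
    ("parent_2", "child_2"),
]


def parents_baptized_later(edge_list, dates):
    """Return the (parent, child) pairs where the parent's date is later."""
    return [
        (parent, child)
        for parent, child in edge_list
        if dates[parent] > dates[child]
    ]


late_parents = parents_baptized_later(parent_child_edges, baptized_on)
print(late_parents)
```

This only works if the direction of each parent-child edge is recorded, which an undirected edge table does not guarantee.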
I also found some odd problems when I was using Gephi. Apart from the failure to load my previous project, the percentages calculated when partitioning edges also seem wrong. The proportion of each kind of edge relative to the others is right, but they sum to only 1%.
Overall, compared with all the data visualization tools we have learned this semester, Gephi is the most powerful one: it provides comprehensive tools, flexible manipulation of data, and more aesthetically pleasing features. In this assignment I learned many useful Gephi skills, but there are still many features I haven’t used, so I hope to learn and use more of Gephi's tools and features in future data visualization projects.