Visualizing the Football Dataset: Processing the Data with Gephi


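#1 File format of the football dataset
The file format of the football dataset is shown in the two figures below.
[Figure: part of the node information in the football dataset]
[Figure: part of the edge information in the football dataset]

Based on these two figures, the overall layout of the football dataset file can be summarized as follows: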

Creator "Mark Newman on Sat Jul 22 05:32:16 2006"
graph
[
  node [ id ** label "****" value ** ]
  ...
  node [ id ** label "****" value ** ]
  edge [ source ** target ** ]
  ...
  edge [ source ** target ** ]
]
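
Each field in the layout above sits on its own line as a key followed by a value. As a minimal sketch of how one such line could be read, the helper below extracts an integer field such as `id`, `value`, `source`, or `target`. The name `parseIntField` is hypothetical and is not part of the original conversion program.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: try to read a line of the form "<key> <integer>",
// ignoring leading whitespace. Returns true and stores the integer in
// `value` only when the line's key equals `key`; otherwise `value` is
// left unchanged.
bool parseIntField(const std::string& line, const std::string& key, int& value)
{
    std::istringstream in(line);
    std::string name;
    int parsed = 0;
    if (in >> name >> parsed && name == key) {
        value = parsed;
        return true;
    }
    return false;
}
```

Calling `parseIntField(line, "id", id)` on every line is enough to detect where a node record starts; lines such as `node` or `label "..."` are rejected because they carry no integer after the key.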

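#2 Converting the football dataset file
Starting from the football file described above, we convert the data into two files: one stores the node information of the football dataset and the other stores its edge information. The processing is described below.
##2.1 Node file for the football dataset
To match what Gephi expects when importing from CSV, we convert the node data into the following format: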

Id Label Value
1  Tom   3
2  Bob   4

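Processing football.gml for the football dataset gives the result shown below:
[Figure: node file generated from football.gml]
Where:

Id: uniquely identifies a node
Label: the label or name of the node
Value: the community the node belongs to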

##2.2 Edge file for the football dataset
Gephi's CSV import uses slightly different formats for directed and undirected graphs. For a directed graph, the import format is:

Source Target Weight
1 3 2
2 4 1

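In the format above:
Source: the source node
Target: the target node
Weight: the weight of the edge

For an undirected graph, a Type column must be added, which gives the following format: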

Source Target Weight Type
1 3 2 Undirected
2 4 1 Undirected
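
The two layouts differ only in the extra Type column. The sketch below builds either table as text from a list of edges. The `WeightedEdge` struct and the `buildEdgeTable` function are hypothetical names introduced for illustration, and the space-separated layout simply mirrors the examples in this section.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one edge row.
struct WeightedEdge {
    int source;
    int target;
    int weight;
};

// Build the edge table as text. When `undirected` is true, a Type column
// holding "Undirected" is appended to the header and to every row.
std::string buildEdgeTable(const std::vector<WeightedEdge>& edges, bool undirected)
{
    std::ostringstream out;
    out << "Source Target Weight" << (undirected ? " Type" : "") << "\n";
    for (const WeightedEdge& e : edges) {
        out << e.source << ' ' << e.target << ' ' << e.weight;
        if (undirected) {
            out << " Undirected";
        }
        out << "\n";
    }
    return out.str();
}
```

For example, `buildEdgeTable({{1, 3, 2}, {2, 4, 1}}, true)` reproduces the undirected table shown above.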

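The edge file for the football dataset has the format shown below. The football dataset is an unweighted graph, so there is no Weight column.
[Figure: edge file generated from the football dataset]

Of the 616 edges in the football dataset, three appear more than once: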

28 18
85 4
100 15

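These duplicate edges must be removed before running Gephi's modularity-based community detection on the graph; otherwise the computation will not run.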
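
Removing the duplicates by hand is feasible for three edges, but it can also be automated. The following is a minimal sketch, not part of the original conversion program: `Edge` and `removeDuplicateEdges` are hypothetical names, and the sketch assumes that an edge listed in reverse order, (b, a), should count as a repeat of (a, b) because the graph is undirected.

```cpp
#include <algorithm>
#include <set>
#include <utility>
#include <vector>

// An undirected edge stored as a (source, target) pair of node ids.
using Edge = std::pair<int, int>;

// Return the edges with duplicates removed, keeping the first occurrence
// of each edge in its original orientation and order.
std::vector<Edge> removeDuplicateEdges(const std::vector<Edge>& edges)
{
    std::set<Edge> seen;
    std::vector<Edge> unique;
    for (const Edge& e : edges) {
        // Normalize so the smaller id comes first; (a, b) and (b, a)
        // then map to the same key.
        Edge key(std::min(e.first, e.second), std::max(e.first, e.second));
        if (seen.insert(key).second) {
            unique.push_back(e);
        }
    }
    return unique;
}
```

The function could be applied to the list of (source, target) pairs before the edge file is written, so that the file imported into Gephi contains each edge only once.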
##2.3 Code for processing football.gml

#include <stdio.h>

// Convert football.gml into a node table (nodefile.txt) and an
// edge table (edgefile.txt) that Gephi can import.
int main()
{
    FILE* inputfile = fopen("football.gml", "r");
    FILE* nodefile = fopen("nodefile.txt", "w");
    FILE* edgefile = fopen("edgefile.txt", "w");
    if (inputfile == NULL || nodefile == NULL || edgefile == NULL) {
        fprintf(stderr, "failed to open an input or output file\n");
        return 1;
    }
    fprintf(nodefile, "Id Label Value\n");
    fprintf(edgefile, "Source Target Type\n");

    char strLine[1024];
    // Loop on fgets itself; testing feof() first would process the last line twice.
    while (fgets(strLine, sizeof(strLine), inputfile) != NULL) {
        int id = 0, value = 0, source = 0, target = 0;
        char label[50] = "";
        if (sscanf(strLine, " id %d", &id) == 1) {
            // Node record: the "id" line is followed by a "label" line and a "value" line.
            if (fgets(strLine, sizeof(strLine), inputfile) == NULL) break;
            sscanf(strLine, " label \"%49[^\"]\"", label);
            if (fgets(strLine, sizeof(strLine), inputfile) == NULL) break;
            sscanf(strLine, " value %d", &value);
            // Ids and community values start at 0 in the GML file; shift both to start at 1.
            fprintf(nodefile, "%d %s %d\n", id + 1, label, value + 1);
        } else if (sscanf(strLine, " source %d", &source) == 1) {
            // Edge record: the "source" line is followed by a "target" line.
            if (fgets(strLine, sizeof(strLine), inputfile) == NULL) break;
            sscanf(strLine, " target %d", &target);
            fprintf(edgefile, "%d %d Undirected\n", source + 1, target + 1);
        }
    }

    fclose(inputfile);
    fclose(nodefile);
    fclose(edgefile);
    return 0;
}

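#3 Importing the node and edge tables into Gephi and generating the football graph
(1) Click File -> Import spreadsheet, as shown in the figure below:
[Figure: the File -> Import spreadsheet menu in Gephi]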