
Data Mining Parallelization Method Based on Spark Platform
Abstract: With the continuous evolution of Internet technology and the rise of social software, virtual social networks have gradually replaced traditional forms of social interaction and become the main stage for communication and information dissemination in the new era. But social networks are a double-edged sword: while communication becomes simple and fast, problems such as fake users and online fraud also arise. In the current wave of big data mining, analyzing the community structure of social networks helps in understanding users' communication patterns and supports user classification and behavior analysis. The goal of this paper is to partition users into different groups by means of group identification in social networks. Taking Facebook as an example, this paper first analyzes how groups are constructed in a social network and what their basic characteristics are, and then uses the modularity-based Louvain algorithm to identify user groups.
Key words: data mining; social network analysis; community discovery

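Contents
1 Introduction
1.1 Research Background
1.2 Research Significance
1.3 Research Status at Home and Abroad
2 Social Network Theory
2.1 Six Degrees of Separation Theory
2.2 Complex Network Theory
2.2.1 Concept of Complex Networks
2.2.2 Characteristics of Complex Networks
2.3 Group Discovery Methods
2.3.1 Group Structure
2.3.2 Group Discovery Algorithms
2.4 Social Network Analysis Methods
2.4.1 Overview of Social Network Analysis
2.4.2 Network Density
2.4.3 Network Centrality
2.5 The Analysis Software UCINET
3 Analysis of Group Structure in the Facebook Network
3.1 Sample Data Acquisition and Processing
3.1.1 Characteristics of Groups on Facebook
3.1.2 Data Acquisition and Processing
3.2 Network Characteristics of Facebook Groups
3.2.1 Network Graph
3.2.2 Network Density
3.2.3 Centrality Analysis
3.3 Chapter Summary
4 Application Examples of Group Identification Methods
4.1 Centrality-Based Core User Identification Method
4.1.1 Centrality Analysis
4.1.2 Experimental Analysis
4.2 Group Identification with the Modularity-Based Louvain Algorithm
4.2.1 The Louvain Algorithm
4.2.2 Community Discovery on Facebook User Data Using the Louvain Algorithm
5 Summary and Outlook
References
Acknowledgements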