Community detection is a method for finding groups (clusters) of densely connected nodes in a network represented as a graph.
We use the Louvain algorithm to detect communities in our ingredients network. The algorithm relies on a measure called modularity to extract communities or groups. Modularity measures the strength of the division of a network into modules (groups). A network with high modularity has dense connections between nodes within the same module and sparse connections between nodes in different modules.
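To make the definition of modularity concrete, here is a minimal pure-Python sketch that computes it as Q = sum over communities of (L_c / m - (d_c / 2m)^2), where m is the total edge weight, L_c the edge weight inside community c, and d_c the summed weighted degree of its nodes. The function name `modularity` and the toy graph (two triangles joined by one edge) are illustrative assumptions and are not part of the ingredients data.

```python
from collections import defaultdict

def modularity(edges, partition):
    """Return the modularity Q of `partition` for an undirected weighted graph.

    edges: list of (u, v, weight) tuples, each undirected edge listed once.
    partition: dict mapping node -> community id.
    """
    m = sum(w for _, _, w in edges)   # total edge weight
    intra = defaultdict(float)        # edge weight inside each community
    degree = defaultdict(float)       # summed weighted degree per community
    for u, v, w in edges:
        degree[partition[u]] += w
        degree[partition[v]] += w
        if partition[u] == partition[v]:
            intra[partition[u]] += w
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
toy_edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1),
             (3, 4, 1), (4, 5, 1), (3, 5, 1),
             (2, 3, 1)]
two_groups = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_group = {node: 0 for node in range(6)}

q_split = modularity(toy_edges, two_groups)
q_merged = modularity(toy_edges, one_group)
print(round(q_split, 3))   # 0.357
print(round(q_merged, 3))  # 0.0
```

Splitting the toy graph into its two triangles scores higher than putting every node in one community, which is the behaviour the definition describes.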
The Louvain algorithm is a greedy optimisation method that tries to maximise the modularity of the network's partition into communities.
It evaluates how densely connected the nodes in each community are, then merges each community into a single node and repeats the modularity optimisation on the condensed graph until modularity no longer improves.
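The merging of communities into single nodes can be sketched in a few lines of pure Python. This is only an illustration of the condensing step, not the library's implementation; the helper name `aggregate` and the toy graph are assumptions made for the example. Edges inside a community become a self-loop on the merged node, and edges between communities are summed into one weighted edge.

```python
from collections import defaultdict

def aggregate(edges, partition):
    """Collapse each community into one node and sum the edge weights.

    edges: list of (u, v, weight) tuples, each undirected edge listed once.
    partition: dict mapping node -> community id.
    Returns a dict mapping (community_a, community_b) -> total weight,
    with community_a <= community_b; (c, c) entries are self-loops.
    """
    condensed = defaultdict(float)
    for u, v, w in edges:
        cu, cv = partition[u], partition[v]
        key = (min(cu, cv), max(cu, cv))  # undirected: store each pair once
        condensed[key] += w
    return dict(condensed)

# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
toy_edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1),
             (3, 4, 1), (4, 5, 1), (3, 5, 1),
             (2, 3, 1)]
toy_partition = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

condensed_graph = aggregate(toy_edges, toy_partition)
print(condensed_graph)
```

In the condensed graph each triangle is a single node carrying a self-loop of weight 3, and the two nodes are linked by an edge of weight 1, so the total edge weight is unchanged by the merge.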
For this we use the Python package python-louvain, which is imported as community (aliased here as community_louvain).
import pandas as pd
import community as community_louvain
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import networkx as nx
import csv
#read the graph
G = nx.read_edgelist('output_files/ingredient_weights.csv', delimiter=',' ,encoding='latin1', create_using=nx.Graph(), nodetype=str, data=(('weight',int),))
# compute the best partition
partition = community_louvain.best_partition(G)
# draw the graph
pos = nx.spring_layout(G, k=0.25)
plt.figure(figsize=(10, 10))
plt.axis('off')
plt.title('Community Detection - Louvain Algorithm')
# color the nodes according to their partition
# plt.get_cmap is used because cm.get_cmap was removed in recent matplotlib releases
cmap = plt.get_cmap('Set1', max(partition.values()) + 1)
nx.draw_networkx_nodes(G, pos, partition.keys(), node_size=40, cmap=cmap, node_color=list(partition.values()))
nx.draw_networkx_edges(G, pos, alpha=0.5)
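# save before showing: plt.show() can leave an empty figure behind, so saving afterwards may write a blank image
plt.savefig('plots/community_detection.png')
plt.show()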
The ingredients and their community numbers are stored in the file 'community_lists.csv'.
#create the csv file
arr_cats = []
# the with-block closes the file even if writing fails
with open('output_files/community_lists.csv', 'w', newline='') as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['Name', 'Category'])
    # one row per ingredient: its name and the community it was assigned to
    for x, y in partition.items():
        arr_cats.append(y)
        csv_writer.writerow([x, y])
# sort the ingredients by community number and rewrite the file
data = pd.read_csv('output_files/community_lists.csv')
data = data.sort_values(["Category"])
data.to_csv('output_files/community_lists.csv', index=False)
print('Number of communities detected:', len(set(arr_cats)))
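To inspect which ingredients ended up together, the node-to-community mapping can be inverted into a community-to-members mapping. The sketch below uses a small hypothetical dictionary (`example_partition`, with made-up ingredient names) in place of the real `partition`, so it runs on its own.

```python
from collections import defaultdict

# Hypothetical stand-in for the dict returned by best_partition (node -> community id).
example_partition = {"garlic": 0, "onion": 0, "sugar": 1, "butter": 1, "flour": 1}

# Invert the mapping: community id -> list of member ingredients.
members = defaultdict(list)
for ingredient, community_id in example_partition.items():
    members[community_id].append(ingredient)

# Print each community with its size and its members in alphabetical order.
for community_id, names in sorted(members.items()):
    print(community_id, len(names), sorted(names))
```

The number of keys in `members` equals the number of communities, so it gives the same count as `len(set(arr_cats))`.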