CIShell Manual : Fast Pathfinder Network Scaling
This page last changed on Jan 12, 2011 by dapolley.
DescriptionNetwork scaling algorithms such as the Pathfinder algorithm are used to prune many different kinds of networks, including citation networks, random networks, and social networks. However, this algorithm suffers from run time problems for large networks and online processing due to its O(n4) time complexity. Hence another alternative, Fast Pathfinder algorithm, is used which prunes the original network to get its PFNET(r,n-1) in just O(n3) time. Since it essentially employs different algorithm than MST based PAthfinder scaling its resultsmight differ from it. The underlying idea of this approach is based on a classical algorithm in graph theory for shortest path computation - Floyd-Warshall's Shortest Path Algorithm. Pros & ConsThere is a substantial speed increase due to the changes made in shortest path distance matrix computation. This leads to a reduction in the time complexity from O(n4) to O(n3). Also the space complexity is drastically reduced from (2 * n) - 1 matrices in the original algorithm to 2 matrices. However, the user cannot control the q parameter, which constrains the number of indirect proximities examined in generating the network. The q parameter is an integer value between 2 and n-1, inclusive where n is the number of nodes or items. Value of q is fixed to n-1. Also we have found that networks that have a low standard deviation for edge weights or if many of the edge weights are equal to the minimum edge weight, then the network might not be scaled as much as expected. This causes unweighted networks to be not scaled at all. If the networks being processed are undirected then MST based Pathfinder Network scaling algorithm can be used. This will give results in 30 times faster than Fast pathfinder algorithm. Also it does not face the disadvantages faced by Fast Pathfinder algorithm. Implementation DetailsOnce the input file is validated the algorithm is started. It works as follows,
Usage HintsThe user has to provide 4 inputs; the file storing the information about the network, value of r parameter, the column name that represents the edge weight and how the edge weight should be considered (Dissimilarity or Similarity). The provided network should not contain edge weights of less than equal to 0 else a warning is generated and default edge weight is assumed for that edge. The value of r ranges from 1 to infinity. Since r = infinity is the most commonly used in network scaling, there is a check-box, which when checked indicates that value of r is infinity (as default it is checked). The algorithm assumes the edge weights to be a measure of dissimilarity i.e. More the weight, more is the dissimilarity. If the input edge weights are a measure of Similarity i.e. More the weight, less is the similarity then the user can select Similarity option from the drop-down box. Only in the case of user selecting Similarity the edge weights will be inverted. For example, in a co-authorship network, more number of articles authored by 2 subjects equates to a stronger relationship. In this case the user should select Similarity option. The output will be provided as a .NWB file with a pruned network. LinksAcknowledgmentsThe original Pathfinder Network Scaling algorithm authored by Roger Schvaneveldt et al. was adapted with a Fast based approach by Arnaud Quirin et al. is implemented, integrated and documented by Chintan Tank. Also thanks to Micah Linnemeier and Russell Duhon for providing guidance during implementation. For the description I acknowledge Wikipedia. References
See Also |
![]() |
Document generated by Confluence on May 31, 2011 16:37 |