A set of scripts to visualize header inclusion trees. Quickstart ---------- Install perl, graphviz, and kghostview. Configure your linux kernel and 'make prepare'. Adjust the Gather::Linux line near top of graph.pl. Only tested on x86_64 so far. mkdir tmp ./graph.pl - do_it (this takes 30-120 seconds) - save_it ./graph.pl - load_it (this takes about 10 seconds) - x "linux/types.h" - x "linux/list.h" - x "linux/sched.h" - x "fs/dcache.c" Concepts -------- tsize transitive size, a measure of how many header files are included from a given top-level root file. unique tsize The number of different files included from a given root file. total tsize The number of files included from a given root file, if include guards were not used. this number is useful when evaluating header file partitioning schemes which involve splitting up a header file. Doing that will tend to slightly increase the unique tsize while drastically lowering the total tsize. backbone A contiguous subgraph starting from the root and composed of the nodes with "large" unique tsizes, for some arbitrary value of large. Any child nodes below this size are considered "regular headers" and are not considered to be part of the backbone. "large" is currently hardcoded to a value useful for the Linux kernel headers. Description ----------- Nodes in the graph repesent files, edges represent inclusions. Node Shapes: To help relax the graph and clean up the layout, some child edges may be snipped. The node shapes and labels indicate where and when this has occurred. The house-shaped pentagon is the root of the inclusion tree. Ellipses are normal header files. Boxes are header files which have had one or more links to child nodes snipped in order to relax the graph. Circles are the places where the root node children were snipped, in the case that root had more than 3 children. Node Colors: Different colors are used to flag interesting nodes. Orange represents the root node of the graph. Blue represents the "backbone" of the inclusion hierarchy. This is the set of files with the largest unique tsizes. The theory is that these files represent the "most important" concepts used by the root node. Yellow represents a popular inclusion target with small unique tsize. Red represents a popular inclusion target with a large unique tsize. Pale green represents normal header files. The more saturated a red, yellow or blue colored object is, the larger the unique tsize. Conceptually, colors are applied in the following order: - The entire graph is painted green. - Backbone nodes are then painted blue. - Cutpoint targets are painted yellow. - Yellow nodes with very large tsizes are repainted red. - Finally, the root node is painted orange. Node Labels: All nodes are labelled with the file name. The unique and total tsizes are placed underneath the name and separated by a dash. Nodes which have had some incoming edges snipped have a third number indicating the total number of parents for that file, including any incoming edges that weren't actually snipped. Nodes with outgoing edges snipped also include a list of the children who have been detached. The information for each child (name, unique tsize, total tsize, and parent count) is placed on a single line per child.