7.5. Documentation

Software documentation should cover

  1. its use

  2. its installation

  3. its development

It can take the form of a README file at the root of the project that summarises the main information about the software or, for comprehensive documentation, be built with the dedicated tools that exist for each language. The documentation should also state the copyright terms and the conditions for distributing the code (licence).

The documentation will include the installation procedure, which should be as simple as possible and based on clear documentation of the software's dependencies.

I recommend structuring the documentation in large parts, each dedicated to one kind of audience:

  • the users

  • the developers: you in a few years, your collaborators, or external contributors

  • possibly the administrators (if the software is a server)

  • any other audience

In Python, the most widely used and most powerful documentation framework is Sphinx. It can generate documentation in many formats (HTML, PDF, EPUB, ...) and includes powerful features such as cross-references and automatic extraction of docstrings.

  • The Python documentation is built with Sphinx.

  • This course is built with Sphinx.

Although Sphinx was designed for Python projects, it now supports other languages.
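
As an illustration, a Sphinx project is driven by a `conf.py` file, which is itself a Python module. The sketch below is a minimal configuration, not the one used for this course: the project name, author and release number are placeholders, the two extensions are distributed with Sphinx, and `alabaster` is Sphinx's default HTML theme.

```python
# conf.py: minimal Sphinx configuration (illustrative sketch)

# Project metadata shown in the generated pages (placeholder values).
project = "my_project"
author = "Jane Doe"
release = "0.1.0"

# Extensions distributed with Sphinx itself:
#   autodoc  pulls docstrings from the Python modules into the documentation
#   viewcode adds links from the API pages to the highlighted source code
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.viewcode",
]

# Files and directories ignored when looking for source files.
exclude_patterns = ["_build"]

# Theme used for the HTML output.
html_theme = "alabaster"
```

Assuming this file sits in a `docs/` directory next to an `index.rst`, running `sphinx-build -b html docs docs/_build/html` should produce the HTML version.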

7.5.1. User documentation

The user documentation should cover:

  1. the software licence and copyright

  2. how to install the software

  3. how to use the software, with concrete examples

  4. the algorithm, described from the user's point of view
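
One possible way to keep the concrete examples of item 3 up to date is to write them as doctests: the standard `doctest` module runs the examples found in docstrings and checks their output. The function `gc_content` below is a hypothetical example invented for this sketch; it does not come from any library mentioned in this course.

```python
import doctest


def gc_content(sequence):
    """
    Compute the fraction of G and C bases in a DNA sequence.

    Usage examples, checked by doctest:

    >>> gc_content("ATGC")
    0.5
    >>> gc_content("ggcc")
    1.0

    :param sequence: the DNA sequence to analyse (must not be empty)
    :type sequence: str
    :return: the fraction of G and C bases, between 0 and 1
    :rtype: float
    """
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


if __name__ == "__main__":
    # Run every example found in the docstrings of this module.
    doctest.testmod()
```

If an example in the docstring no longer matches the behaviour of the code, `doctest.testmod()` reports the failure, so the documentation cannot silently drift away from the software.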

7.5.2. Developer documentation

Code documentation

Each function and each class must be documented:

  1. what the function does

  2. what its parameters are

  3. the data type of each parameter

  4. what the function returns

  5. the data type of the return value

  6. the kinds of errors the function raises, and under which conditions

def build_clusters(hits, rep_info, model, hit_weights):
    """
    From a list of filtered hits, and replicon information (topology, length),
    build all lists of hits that satisfied the constraints:

        * max_gene_inter_space
        * loner
        * multi_system

    If Yes create a cluster
    A cluster contains at least two hits separated by less or equal than max_gene_inter_space
    Except for loner genes which are allowed to be alone in a cluster

    :param hits: list of filtered hits
    :type hits: list of :class:`macsypy.hit.Hit` objects
    :param rep_info: the replicon to analyse
    :type rep_info: :class:`macsypy.Indexes.RepliconInfo` object
    :param model: the model to study
    :type model: :class:`macsypy.model.Model` object
    :return: list of clusters
    :rtype: List of :class:`Cluster` objects
    """
    def collocates(h1, h2, model):
        # compute the number of genes between h1 and h2
        dist = h2.get_position() - h1.get_position() - 1
        g1 = model.get_gene(h1.gene.name)
        g2 = model.get_gene(h2.gene.name)
        inter_gene_max_space = max(g1.inter_gene_max_space, g2.inter_gene_max_space)
        if 0 <= dist <= inter_gene_max_space:
            return True
        elif dist <= 0 and rep_info.topology == 'circular':
            # h1 and h2 overlap the ori
            dist = rep_info.max - h1.get_position() + h2.get_position() - rep_info.min
            return dist <= inter_gene_max_space
        return False

    clusters = []
[Figure: the HTML rendering of the example above]
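
The `build_clusters` docstring above does not illustrate item 6 of the list, the errors raised. The sketch below shows how the `:raises` field could be used in the same docstring style. The function `parse_topology` is hypothetical and written only for this illustration; it is not part of `macsypy`.

```python
def parse_topology(value):
    """
    Convert a user-supplied string into a normalised replicon topology.

    :param value: the topology as typed by the user, e.g. ``"Circular"``
    :type value: str
    :return: the topology in lower case, ``"linear"`` or ``"circular"``
    :rtype: str
    :raises TypeError: if *value* is not a string
    :raises ValueError: if *value* is neither ``"linear"`` nor ``"circular"``
    """
    if not isinstance(value, str):
        raise TypeError(f"topology must be a str, not {type(value).__name__}")
    # Ignore surrounding blanks and letter case before validating.
    topology = value.strip().lower()
    if topology not in ("linear", "circular"):
        raise ValueError(f"unknown topology: {value!r}")
    return topology
```

Each `:raises` line names one exception type and the condition that triggers it, so that callers know what to catch without reading the function body.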

The developer documentation is NOT ONLY the API reference. It must also describe the general software architecture and how the software works.

7.5.2.1. Code comments

Comments should not paraphrase the code.

But wherever the code is not obvious, or there is a caveat, you should add a comment.

    if len_scaffold == 1:
        # handle circularity
        # if there are clusters
        # the hit may collocate with the first hit of the first cluster
        if clusters and collocates(cluster_scaffold[0], clusters[0].hits[0], model):
            new_cluster = Cluster(cluster_scaffold, model, hit_weights)
            clusters[0].merge(new_cluster, before=True)
        elif model.get_gene(cluster_scaffold[0].gene.name).loner:
            # the hit does not collocate but it's a loner
            # handle clusters containing only one loner
            new_cluster = Cluster(cluster_scaffold, model, hit_weights)
            clusters.append(new_cluster)
        elif model.min_genes_required == 1:
            # the hit does not collocate but the model requires only one gene
            # handle clusters containing only one gene
            new_cluster = Cluster(cluster_scaffold, model, hit_weights)
            clusters.append(new_cluster)

    elif len_scaffold > 1:
        new_cluster = Cluster(cluster_scaffold, model, hit_weights)
        clusters.append(new_cluster)

    # handle circularity
    if len(clusters) > 1:
        if collocates(clusters[-1].hits[-1], clusters[0].hits[0], model):
            clusters[0].merge(clusters[-1], before=True)
            clusters = clusters[:-1]
    return clusters
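
To make the difference between a paraphrasing comment and a useful one concrete, here is a small sketch with two versions of the same hypothetical helper (the function names and the 1-based position convention are assumptions made for this illustration). The first version repeats what the code already says; the second explains why the code does it.

```python
def next_position_paraphrased(position, replicon_length):
    # add 1 to position
    position = position + 1
    # if position is greater than replicon_length, set position to 1
    if position > replicon_length:
        position = 1
    return position


def next_position(position, replicon_length):
    position += 1
    if position > replicon_length:
        # Circular replicon: positions are 1-based ranks, so the gene
        # following the last one is the first one, not position 0.
        position = 1
    return position
```

Both functions behave identically, but only the second tells the reader about the circular topology and the 1-based convention, which cannot be guessed from the code alone.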