===========
 Internals
===========

The entire darcs change history is imported into the database, using
the output of ``darcs changes --xml-output --summary --reverse``.  A
check for newer patches is performed every time the DarcsRepository
object is created, and any new patches are immediately imported into
the database.

After that, the darcs repository is used only for fetching the contents
of a file: with darcs 2.x we use ``darcs query contents``, while with
darcs 1.x we have to resort to ugly tricks; at the extreme, the output
of ``darcs annotate`` is massaged by ann2ascii.py to fetch the contents
of a file at any given point in time.

Each changeset is assigned a revision number according to its position
in the output of ``darcs changes --xml-output --summary --reverse``.
The first patch gets revision number 1, the second gets revision
number 2, and so on.  This assumes that the patches in a darcs
repository **NEVER** get reordered or deleted.  This condition is
satisfied as long as commands such as ``darcs unpull`` or
``darcs optimize`` are not run.

Cache
=====

For performance reasons, the backend creates and maintains a few other
tables, where it keeps darcs-specific information.

The following tables are automatically created at `upgrade` time and
populated by `sync` (see components.py).

darcs_changesets
----------------

Each row represents a darcs changeset::

  create table darcs_changesets (
    repo_id text,
    rev integer,
    hash text,
    name text,
    primary key (repo_id, rev)
  );

repo_id
  the repository containing this changeset

rev
  the revision number assigned to the changeset

hash
  the unique patch identifier assigned by darcs

name
  the name of the darcs patch

darcs_nodes
-----------

Each row represents a single node: a node is either a file or a
directory whose history is stored in the repository.

.. note:: a node doesn't have a particular name or content but, for a
          given revision, its name and content will be well defined.


::

  create table darcs_nodes (
    repo_id text,
    node_id integer,
    node_type text,
    add_rev integer,
    remove_rev integer,
    primary key (repo_id, node_id)
  );

node_type
  one of (dbutil.NODE_FILE_TYPE, dbutil.NODE_DIR_TYPE)

add_rev
  the revision that added this node

remove_rev
  the revision that removed this node (possibly NULL)

darcs_node_changes
------------------

Each row represents a node change for a particular revision.  Only one
entry can exist for a node in each revision.  Of course, if a revision
does not change the node, no entry will be present for it! :)

::

  create table darcs_node_changes (
    repo_id text,
    node_id integer,
    rev integer,
    path text,
    parent_id integer,
    the_change text,
    primary key (repo_id, node_id, rev)
  );

the_change
  one of the following (defined in dbutil.py): CHANGE_ADDED,
  CHANGE_REMOVED, CHANGE_MOVED, CHANGE_EDITED, CHANGE_MOVED_EDITED

parent_id
  the node id of the node's parent directory

path
  the path of the node at the end of revision 'rev': when the change
  is CHANGE_REMOVED, 'path' is the node's previous path.

darcs_cache
-----------

A cache of file contents: the first time the content of a file at a
particular revision is requested, it is computed and stored here, so
subsequent requests won't require executing darcs at all.

.. warning:: this table may quickly grow in size!  On the other hand,
             you can delete all its rows at any time: the content will
             be recomputed when it is requested again.


::

  create table darcs_cache (
    repo_id text,
    node_id integer,
    rev integer,
    content blob,
    size integer,
    primary key (repo_id, node_id, rev)
  );

Some sample queries
===================

Get all existing nodes as of revision r
---------------------------------------

::

  select dnc.node_id as node_id, max(dnc.rev) as rev
  from darcs_node_changes as dnc, darcs_nodes as dn
  where dnc.node_id = dn.node_id
    and dnc.rev <= r
    and dnc.repo_id = dn.repo_id
    and dnc.repo_id = 'somerepo'
    and (dn.remove_rev is null or dn.remove_rev > r)
  group by dnc.node_id

Get all latest nodes
--------------------

::

  select dnc.node_id as node_id, max(dnc.rev) as rev
  from darcs_node_changes as dnc, darcs_nodes as dn
  where dnc.node_id = dn.node_id
    and dn.remove_rev is null
    and dnc.repo_id = dn.repo_id
    and dnc.repo_id = 'somerepo'
  group by dnc.node_id

Get node_id of /some/path p, as of revision r
---------------------------------------------

.. XXX: here "node_rev(r)" means a subquery, see
.. ``_nodeid_rev_for_revision()`` in dbutil.py

::

  select dnc.node_id as node_id
  from darcs_node_changes as dnc, (node_rev(r)) as nr
  where dnc.node_id = nr.node_id
    and dnc.rev = nr.rev
    and dnc.repo_id = nr.repo_id
    and dnc.repo_id = 'somerepo'
    and dnc.path = p

Get history of node_id nid, till revision r
-------------------------------------------

::

  select *
  from darcs_node_changes as dnc
  where dnc.node_id = nid
    and dnc.rev <= r
    and dnc.repo_id = 'somerepo'

Get children of node_id nid, as of revision r
---------------------------------------------

::

  select dnc.node_id as node_id
  from darcs_node_changes as dnc, (node_rev(r)) as nr
  where dnc.node_id = nr.node_id
    and dnc.rev = nr.rev
    and dnc.parent_id = nid
    and dnc.repo_id = nr.repo_id
    and dnc.repo_id = 'somerepo'
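
A worked example
----------------

The sample queries can be tried against a throwaway SQLite database.
The following is only a minimal sketch, not code from the backend: the
constant values, the sample rows and the helper names
(``build_sample_db``, ``existing_nodes``, ``node_for_path``) are invented
for illustration, and ``node_rev(r)`` is assumed to be the "existing
nodes as of revision r" query extended with ``repo_id``.  The real
subquery is built by ``_nodeid_rev_for_revision()`` in dbutil.py and may
differ.

```python
import sqlite3

# Placeholder values: the real constants are defined in dbutil.py
# and may well be different.
NODE_FILE, NODE_DIR = "file", "dir"
ADDED, REMOVED, MOVED, EDITED = "added", "removed", "moved", "edited"

SCHEMA = """
create table darcs_nodes (
    repo_id text, node_id integer, node_type text,
    add_rev integer, remove_rev integer,
    primary key (repo_id, node_id));
create table darcs_node_changes (
    repo_id text, node_id integer, rev integer,
    path text, parent_id integer, the_change text,
    primary key (repo_id, node_id, rev));
"""

# "Get all existing nodes as of revision r", with repo_id added to the
# result so it can play the role of node_rev(r) (an assumption).
NODE_REV = """
select dnc.repo_id as repo_id, dnc.node_id as node_id, max(dnc.rev) as rev
from darcs_node_changes as dnc, darcs_nodes as dn
where dnc.node_id = dn.node_id
  and dnc.rev <= :r
  and dnc.repo_id = dn.repo_id
  and dnc.repo_id = :repo
  and (dn.remove_rev is null or dn.remove_rev > :r)
group by dnc.repo_id, dnc.node_id
"""

# "Get node_id of /some/path p, as of revision r".
NODE_FOR_PATH = f"""
select dnc.node_id as node_id
from darcs_node_changes as dnc, ({NODE_REV}) as nr
where dnc.node_id = nr.node_id
  and dnc.rev = nr.rev
  and dnc.repo_id = nr.repo_id
  and dnc.repo_id = :repo
  and dnc.path = :p
"""


def build_sample_db():
    """Create an in-memory database holding a tiny made-up history."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "insert into darcs_nodes values (?, ?, ?, ?, ?)",
        [
            ("somerepo", 1, NODE_DIR, 1, None),   # src/
            ("somerepo", 2, NODE_FILE, 1, None),  # src/a.txt, later src/b.txt
            ("somerepo", 3, NODE_FILE, 2, 4),     # src/old.txt, removed in rev 4
        ],
    )
    conn.executemany(
        "insert into darcs_node_changes values (?, ?, ?, ?, ?, ?)",
        [
            ("somerepo", 1, 1, "src", None, ADDED),
            ("somerepo", 2, 1, "src/a.txt", 1, ADDED),
            ("somerepo", 2, 2, "src/a.txt", 1, EDITED),
            ("somerepo", 3, 2, "src/old.txt", 1, ADDED),
            ("somerepo", 2, 3, "src/b.txt", 1, MOVED),
            ("somerepo", 3, 4, "src/old.txt", 1, REMOVED),
        ],
    )
    return conn


def existing_nodes(conn, repo, r):
    """Map each node existing at revision r to the rev of its last change."""
    rows = conn.execute(NODE_REV, {"repo": repo, "r": r})
    return {node_id: rev for _repo, node_id, rev in rows}


def node_for_path(conn, repo, path, r):
    """Return the node_id found at `path` as of revision r, or None."""
    params = {"repo": repo, "r": r, "p": path}
    row = conn.execute(NODE_FOR_PATH, params).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    db = build_sample_db()
    print(existing_nodes(db, "somerepo", 4))
    print(node_for_path(db, "somerepo", "src/b.txt", 3))
```

In this made-up history, node 3 no longer shows up at revision 4 because
its ``remove_rev`` is not greater than 4, and node 2 is found under
``src/b.txt`` (but not under ``src/a.txt``) from revision 3 onwards,
since only the most recent change row of each node is considered.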