GitPubSub is a publisher/subscriber service for git commits, used at the Apache Software Foundation to track commits to its git repositories. Whenever a new commit is made to a git repo, it emits JSON-encoded data containing the repository changed, the hash, the files changed, the author and so on.

To get a feel for what GitPubSub sends, the easiest thing to do is to monitor the traffic for a while:

curl -i 'http://gitpubsub-wip.apache.org:2069/json/*'

This outputs chunks of JSON data, each containing either a keepalive (stillalive) message or an actual git commit notification, for example:

    {
        "commit": {
            "repository": "git",
            "project": "trafficserver",
            "ref": "refs/heads/master",
            "hash": "e69de29",
            "sha": "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391",
            "date": "Mon May 27 14:00 2013 +0100",
            "authored": "Mon May 27 12:30 2037 +0100",
            "author": "John Doe",
            "email": "jdoe@apache.org",
            "committer": "John Doe",
            "commited": "Mon May 27 14:00 2013 +0100",
            "subject": "Reverting everything!",
            "log": "I'm reverting everything because I can",
            "body": "I'm reverting everything because I can\nAnd that's all there is to it",
            "files": ["STATUS", "README", "foo.c"]
        }
    }

Hooking into GitPubSub

Hooking into GitPubSub works just like hooking into SvnPubSub: create a daemon that connects to the GitPubSub address (http://gitpubsub-wip.apache.org:2069/json/*) and reads the chunks it emits. For each chunk, check whether the repository field in the object is set to 'git'. This field can also say JIRA, ReviewBoard or anything else, since GitPubSub accepts any object as long as it is valid JSON. If it is 'git', pick up the JSON object and use it for whatever you need.
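
The steps above could be sketched in Python roughly as follows. This is a minimal illustration, not the official client: the function names extract_git_commit and listen are hypothetical, and the sketch assumes that each JSON object arrives on its own line and that the repository field sits inside the "commit" object, as in the sample notification. Neither assumption is confirmed by this page.

```python
import json
import urllib.request
from typing import Iterator, Optional

# URL taken from the text above; everything else here is illustrative.
GITPUBSUB_URL = "http://gitpubsub-wip.apache.org:2069/json/*"


def extract_git_commit(chunk: str) -> Optional[dict]:
    """Return the commit object if the chunk is a git commit, else None."""
    chunk = chunk.strip()
    if not chunk:
        return None
    try:
        payload = json.loads(chunk)
    except json.JSONDecodeError:
        return None  # partial or malformed chunk; skip it
    if not isinstance(payload, dict):
        return None
    commit = payload.get("commit")
    # Assumption: "repository" lives inside the "commit" object.
    if isinstance(commit, dict) and commit.get("repository") == "git":
        return commit
    return None  # keepalive, or an event from another source such as JIRA


def listen(url: str = GITPUBSUB_URL) -> Iterator[dict]:
    """Yield git commits from the stream (assumes one JSON object per line)."""
    with urllib.request.urlopen(url) as stream:
        for raw_line in stream:
            text = raw_line.decode("utf-8", errors="replace")
            commit = extract_git_commit(text)
            if commit is not None:
                yield commit
```

A daemon would then loop over listen() and act on each commit, for instance by reading commit["project"] and commit["files"]. Keeping the filtering in a separate function makes it possible to check it against sample chunks without a network connection.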

Going back in time

While the Pub/Sub model usually deals with real-time events, it is possible to go back in time and retrieve past events using the X-Fetch-Since request header. Set this header to the UTC UNIX timestamp of the client's last visit to the Pub/Sub service so that it can continue where it left off. For example, one could construct the following request:

  GET /json HTTP/1.1
  X-Fetch-Since: 1366268954

These timestamps can be acquired by parsing the stillalive messages sent by GitPubSub, using the X-Timestamp response header sent back from POST/PUT requests, or by using whatever time function your programming language provides.
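
As a rough sketch of how a client might build such a request in Python, the following uses the standard urllib module. The helper names build_resume_request and now_timestamp are hypothetical, and truncating the timestamp to whole seconds is an assumption based on the example request above; this page does not say whether fractional values are accepted.

```python
import time
import urllib.request

# URL taken from the text above.
GITPUBSUB_URL = "http://gitpubsub-wip.apache.org:2069/json/*"


def build_resume_request(last_seen: float,
                         url: str = GITPUBSUB_URL) -> urllib.request.Request:
    """Build a GET request for events since `last_seen` (UTC UNIX seconds)."""
    # Assumption: the value is sent as whole seconds, as in the example above.
    headers = {"X-Fetch-Since": str(int(last_seen))}
    return urllib.request.Request(url, headers=headers)


def now_timestamp() -> int:
    """Current UNIX timestamp; time.time() does not depend on the time zone."""
    return int(time.time())
```

A client could store now_timestamp() each time it processes an event, then pass the stored value to build_resume_request() and open the result with urllib.request.urlopen() after a restart.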