Git migration


Spurred on by various comments here and elsewhere, I've been thinking about how this project (or some part of it) might be migrated to the Git revision control system. My main realization has been that (correct me if I am wrong) there are actually two separate projects hosted in SVN: Pyrogenesis and 0 A.D.

This made me wonder whether it would be possible to migrate the Pyrogenesis project (that is, all the C++ code) to Git, while letting the 0 A.D. project (that is, binaries plus data) stay on SVN?

SVN might contain one or more build scripts which would fetch the latest version of the Pyrogenesis source code from Git and compile it for the given platform. This way, getting fully up-to-date would still be as easy as "svn up; fetchandbuildpyrogenesis".
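
For illustration, the two-step update described above could be planned along these lines. This is only a sketch under stated assumptions: the `engine_dir` path and the `build_command` are hypothetical placeholders (nothing in this thread specifies them), and the code only assembles the commands rather than running them.

```python
import shlex


def update_plan(engine_dir, build_command):
    """Return the commands for one full update, as a list of argv lists.

    engine_dir and build_command are hypothetical: they stand in for
    wherever the Git checkout of the engine would live and however it
    would be compiled on the given platform.
    """
    return [
        # Update data and prebuilt binaries from SVN, as today.
        ["svn", "update"],
        # Fetch the latest engine source from Git (fast-forward only).
        ["git", "-C", engine_dir, "pull", "--ff-only"],
        # Rebuild the engine with the caller-supplied command.
        list(build_command),
    ]


def as_shell(plan):
    """Render a plan as shell lines, quoting arguments where needed."""
    return "\n".join(" ".join(shlex.quote(arg) for arg in argv) for argv in plan)


# Placeholder values, purely for demonstration.
print(as_shell(update_plan("pyrogenesis", ["./build.sh"])))
```

Keeping the plan as data makes it easy to print for review before anything is executed.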

Additionally, the 0 A.D. repository could be split into a "binaries" and a "data" directory, such that those on non-Windows platforms need only download the "data" directory.

What would the shortcomings of this setup be and can we address them? (And do we want to?)

Link to comment
Share on other sites

I'm one of those wanting to switch to Git, especially now that we have so many developers.

But the problem with your suggestion is that a fair amount of the source code is in binaries/data (all the GUI code and the JS for the simulation, AI, etc.).

Splitting them up will also mean an extra step for developers who modify files in both the data and source directories (i.e. touching JS and C++).

Link to comment
Share on other sites

I can think of a few pretty good reasons not to do all of that :) First, it means programmers would need both Git and SVN installed and set up on their systems just to build the game, and they'd have to remember to keep both updated. Also, the autobuild utility would need to juggle both Git and SVN, which sounds like a mess. We don't want to force everyone to build the game; that's why we have autobuild. And there seem to be issues with Git's handling of line endings in files from different OSes (either it's a problem with Git generally and/or a problem with the 0ad GitHub).

SVN seems to do what we need already; it's only that some people prefer Git, which doesn't do everything we need. So the logical solution is to keep SVN, which is enough for most, make a script that automatically updates our GitHub whenever there's a new commit, and then developers have the choice of using Git locally if they want.

Better division of the repo is worth discussing, but it's hard to find a structure that's clearly better than what we have now. In particular, if the goal is to prevent Linux developers from having to download Windows-only EXEs, DLLs, and LIBs, that means somehow separating both /binaries/system and /libraries, possibly as different root directories from the rest of the game (see Blender SVN for an example), which means a more complicated checkout and update process. That's more feasible than using different VCS for different parts of the game, but less than ideal.

Link to comment
Share on other sites

With respect to line-endings, have you tried the per-repo setting described here?

Though I agree that any changes to the revision control system need to be thought through, I'd be wary of the "we have no problems, so why change it?" type of thinking. The problem with SVN is that it works well for maintainers (those with commit access), but poorly for everyone else. This potentially stifles development in ways that are invisible to the maintainers.

Link to comment
Share on other sites

Zoot, you have to remember that it works well for non-programmers as well ;) For programmers it's one thing to "just" do svn up this or that, but for artists etc. it's really helpful to be able to use a GUI tool, and just one at that ;) Complicating the process is probably the biggest concern. I'm not too concerned with what system we use as long as it isn't more complicated than today. If anything, it ought to be simpler if we move to something else, to outweigh the cost of having to learn a new system.

Link to comment
Share on other sites

One major disadvantage of SVN for developers is that it does not allow you to switch between multiple branches locally. I.e., when you've made some changes and midway through you want to work on something else for a while, you have the following options:

  • commit your half-finished code to the live repo (not an option)
  • revert your changes and lose them (also not an option)
  • (edit for completeness) create a new remote branch, switch to it and commit them there (also not an option, we don't have branches at the moment)
  • extract your changes into a diff and reapply them when you're done with the other stuff
  • check out an additional working copy to work on the other stuff

Neither one is desirable. You either end up with a ton of different working copies (and used disk space), or tending to an entire garden of different patches that you need to version manually as your work progresses.

Git is much more flexible in this regard, in that you can safely commit or stash your changes locally to be picked up again where you left off when you're done with the other stuff. It has other advantages and disadvantages, but this alone is sufficient reason for me to prefer it over SVN.

I am, however, sympathetic to the need for easy-to-update binaries for artists who don't wish to spend their time rebuilding the game all the time, nor have to learn a new, arguably more complicated SCM tool. SVN works well for this, so perhaps we're already doing it in the best way: keep SVN as the official repository, but offer Git as an additional option for those developers who desire it. The only thing that's perhaps missing is automatic synchronization between SVN commits and the Git repo (poke k776).

Link to comment
Share on other sites

Yeah, I can imagine it being tedious with additional steps on every update. As suggested above, non-programmers would update the same way they do now. Only difference would be that they don't get the C++ source code as part of the update (since, as far as I can tell, they don't need it - they rely on the autobuilt binaries which reside in SVN).

Link to comment
Share on other sites

One major disadvantage of SVN for developers is that it does not allow you to switch between multiple branches locally.

Are branches on the server side supported for our repository? Wouldn't that solve the problem?

As far as I know it would, but I haven't used branches much yet, so I'm not sure if it works in practice.

Link to comment
Share on other sites

Keeping the C++ away from non-programmers does not sound like it's worth the trouble of splitting the codebase into two separate repositories running two different SCM systems. Couldn't artists just ignore the code if they so please and keep using SVN happily ever after? :)

Your proposal implicitly assumes that the C++ codebase and the binaries are not correlated and can be developed independently, which is most certainly not the case. JavaScript code routinely calls C++ code and vice versa. In your proposed scheme, changes that need to modify e.g. both JavaScript and C++ components at once would be split across different commits to different repositories, with no link between them. If something breaks, you then have to search and manually cross-reference two code repositories, and manually combine two diffs to get a global overview of the co-dependent changes that were made. This sounds like a nightmare.

Also, we would no longer be able to use Trac in the way we use it now, where a single commit can be flagged as solving a ticket (because you might now need two separate commits to fix one issue). The barrier to entry for new developers would also be heightened considerably, which is most certainly not what we want.

There are more reasons, but I think you can see where I'm going with this :)

Link to comment
Share on other sites

One major disadvantage of SVN for developers is that it does not allow you to switch between multiple branches locally. I.e., when you've made some changes and midway through you want to work on something else for a while, you have the following options

You missed one. Use git svn (needs both installed) and you can work with git locally and commit to the svn repo with git svn dcommit.
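
A minimal sketch of what that git svn cycle might look like, written as a dry-run planner that only prints the commands. The branch name, SVN URL and directory name are hypothetical placeholders, and the exact sequence is one possible workflow rather than an agreed project procedure; `git svn clone`, `git svn rebase` and `git svn dcommit` are the actual subcommands.

```python
import shlex


def clone_command(svn_url, workdir):
    """One-time import of the SVN history into a local Git repository."""
    return ["git", "svn", "clone", svn_url, workdir]


def daily_cycle(topic_branch):
    """Commands to run inside an existing git-svn clone."""
    return [
        # Start a local branch for the work; the SVN server never sees it.
        ["git", "checkout", "-b", topic_branch],
        # Save half-finished work locally instead of juggling patches.
        ["git", "commit", "-a", "-m", "WIP: " + topic_branch],
        # When the work is ready: replay local commits onto the newest SVN revisions.
        ["git", "svn", "rebase"],
        # Send each local commit to the SVN repository as an SVN revision.
        ["git", "svn", "dcommit"],
    ]


def show(commands):
    """Print the commands in shell form without executing anything."""
    for argv in commands:
        print("$ " + " ".join(shlex.quote(arg) for arg in argv))


# Hypothetical values, for demonstration only.
show([clone_command("SVN_URL", "0ad-git-svn")])
show(daily_cycle("my-topic"))
```

In practice you would probably tidy up any WIP commits before the final dcommit, since each local commit becomes a revision in SVN.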

Apart from that, I believe we should just integrate a Git sync with Trac's post-commit hook and be done with it.

Link to comment
Share on other sites

Keeping the C++ away from non-programmers does not sound like it's worth the trouble of splitting the codebase into two separate repositories running two different SCM systems.

Well, it wouldn't be the aim. The aim would be to get as much of the 'codebase' (which is actually code, data and binaries) over on Git. SVN would (reluctantly) be kept around for the bits which non-programmers need to be able to update easily. But the point is moot if the programmers aren't comfortable with it either.

Link to comment
Share on other sites

Are branches on the server side supported for our repository? Wouldn't that solve the problem?

As far as I know it would, but I haven't used branches much yet, so I'm not sure if it works in practice.

We don't currently have branches on the SVN repository (at least not that I'm aware of). It would somewhat solve the problem, in that you could save your unfinished work to a new branch that you first create, 'svn switch' to, and commit. The problem with that approach, however, is that you now face the problem of sanely merging SVN branches that have had intermediate updates from trunk back into the trunk (which in practice is not quite as easy as it should be). Also, although it's a more minor point, we'd end up with a boatload of WIP branches, and you'd be required to have internet connectivity to save your local work in progress.

You missed one. Use git svn (needs both installed) and you can work with git locally and commit to the svn repo with git svn dcommit.

Apart from that, I believe we should just integrate a Git sync with Trac's post-commit hook and be done with it.

Agreed.

Link to comment
Share on other sites

The problem with that approach, however, is that you now face the problem of sanely merging SVN branches that have had intermediate updates from trunk back into the trunk.

What's the important difference compared to committing a local working copy, and compared to how Git does it?

Basically you always have to keep in sync with the changes in trunk.

Link to comment
Share on other sites

What's the important difference compared to committing a local working copy, and compared to how Git does it?

Basically you always have to keep in sync with the changes in trunk.

The way merging branches works in SVN is by taking a diff between a starting and ending revision of your branch X, and applying it to a working copy that has a certain revision of branch T checked out, where T is the branch you want to merge into (for our purposes, the head of the trunk). The tricky thing is that this diff must not include any changes that you have already merged from the trunk into your branch. This effectively means that you need to keep track of the revision where you last did an intermediate merge from the trunk into your branch, or failing that, where you branched off from the trunk. SVN does not really help you select the revision range to diff, and if you select the wrong range, your merge has a non-zero chance of failing spectacularly. You're sort of on your own to come up with some scheme for putting meaningful markers in your commit messages so you can easily find those revisions. This requires discipline, and I suspect forgetfulness and mistakes become more likely as the number of branches and developers grows.

Git explicitly tracks the parents of each commit, and hence doesn't have this problem. Every time you merge two branches in git, the resulting commit has the tips of both branches at that time set as its parents. When you merge a branch that has been intermediately synced with the trunk ("master") back into the trunk, it needs only trace the parent pointers backwards in time to find the "most recent" common ancestor, and take it from there.
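
To make the parent-pointer idea concrete, here is a small self-contained sketch of finding the most recent common ancestor in a commit graph. It is a toy model, not Git's actual implementation: the commit names (`T1`..`T4` on the trunk, `B1`..`B3` on the branch) and the `parents` mapping are made up for illustration.

```python
from collections import deque


def ancestors(parents, tip):
    """Return every commit reachable from tip via parent pointers, including tip."""
    seen = set()
    queue = deque([tip])
    while queue:
        commit = queue.popleft()
        if commit in seen:
            continue
        seen.add(commit)
        queue.extend(parents.get(commit, ()))
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another common ancestor."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c != other and c in ancestors(parents, other) for other in common)
    }


# Toy history: the branch forks at T1 and is synced with the trunk at T3.
parents = {
    "T1": [],
    "T2": ["T1"],
    "T3": ["T2"],
    "T4": ["T3"],
    "B1": ["T1"],        # branch created from T1
    "B2": ["B1", "T3"],  # intermediate merge of the trunk into the branch
    "B3": ["B2"],
}

print(merge_bases(parents, "T4", "B3"))
```

Because the merge commit `B2` records `T3` as a parent, the search arrives at `T3` rather than the original fork point `T1`, so nobody has to remember a revision number by hand.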

Link to comment
Share on other sites

With respect to line-endings, have you tried the per-repo setting described here?

Hmm I don't see a .gitattributes in the 0ad repo, so that's worth investigating. That link says the text settings override the global core.autocrlf, but that setting doesn't do anything for me :) Nor do the "re-normalizing" instructions or various other hacks I've found mentioned.
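
When chasing this kind of problem, it can help to first check which line endings a file really contains before changing any settings. The following is a minimal sketch for that; the function names are made up for illustration and it does not touch Git's configuration at all.

```python
def count_line_endings(data):
    """Count CRLF, bare LF and bare CR line endings in a bytes object."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LFs that are not part of a CRLF
    cr = data.count(b"\r") - crlf   # CRs that are not part of a CRLF
    return {"crlf": crlf, "lf": lf, "cr": cr}


def classify(data):
    """Return 'crlf', 'lf', 'cr', 'mixed', or 'none' for the given bytes."""
    used = [name for name, n in count_line_endings(data).items() if n]
    if not used:
        return "none"
    if len(used) == 1:
        return used[0]
    return "mixed"


# Example: a buffer with one Windows-style and one Unix-style ending.
print(classify(b"first line\r\nsecond line\n"))
```

To inspect a real file, read it in binary mode (`open(path, "rb").read()`) so Python does not translate the line endings itself.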

Link to comment
Share on other sites

The way merging branches works in SVN is by taking a diff between a starting and ending revision of your branch X, and applying it to your working copy that has a certain revision of the branch T checked out that you want to merge it into (for our purposes the head of the trunk). The tricky thing is that this diff must not include any changes that you have already previously merged back from the trunk into your branch. This effectively means that you need to keep track of the revision where you last did an intermediate merge from the trunk into your branch, or failing that, where you branched off from the trunk

Isn't that what's described here under "Keeping a Branch in Sync Without Merge Tracking"?

http://svnbook.red-b...ging.stayinsync

That shouldn't be a problem anymore since Subversion 1.5, because it has merge tracking.

Link to comment
Share on other sites

Well, perhaps your changes aren't sizeable enough (with no disrespect meant) or they are too isolated to conflict with other changes. The risk of merge conflicts goes up with the number of contributors and the number of changes they make, but it is not something everyone experiences all the time. Conversely, outside contributors may be deterred from making significant contributions if the revision control system isn't capable of handling them. That way, a project can avoid merge conflicts for a long time - but only because no one is bothering to make the kind of substantial contributions that might cause such conflicts.

Link to comment
Share on other sites

With respect to line-endings, have you tried the per-repo setting described here?

I tried a different Git client and the problems have seemingly disappeared *fingers crossed* Which is strange, I kept the same local repo. Apparently GitHub for Windows is still experimental and buggy. I'm using TortoiseGit+msysgit and that works much better. I did my first fetch/merge from upstream last night and I'm adding local branches for testing patches :)

Link to comment
Share on other sites

Well, this has been discussed to death, but perhaps a two-repository idea would work with Git submodules: http://git-scm.com/book/en/Git-Tools-Submodules .

  • Everything programmers use is in one repository. A script clones each new revision of the master branch into a mirror branch of precompiled binaries.
  • Artists check out an art repository, which pulls the precompiled branch of the engine repository automatically.

Link to comment
Share on other sites
