Monday, September 8, 2008

Version control with TortoiseSVN (part 1)

Software configuration management is the process of identifying and defining the configuration items in a software system, controlling the release, versioning and change of these items throughout the software system life cycle, recording and reporting the status of configuration items and change requests, and verifying the completeness and correctness of configuration items.

Version Control, also called Revision Control or Source Control, lets you track your files over time. The idea is that when you mess up, you can easily get back to the previous working version.

Whenever you bring up the subject of which version control system to use, there is always a list:
Microsoft Visual SourceSafe
SourceGear Vault
Perforce
VOODOO (Versions Of Outdated Documents Organized Orthogonally)
Borland StarTeam
BitKeeper
Monotone
OpenCM
GNU Arch
Serena PVCS Version Manager
MKS Source
CVS (Concurrent Versions System) and TortoiseCVS
Subversion and TortoiseSVN
Microsoft Team Foundation Server (TFS)
IBM Rational ClearCase

Why Use Subversion?
Subversion is a system designed to control your source code. You may occasionally see the acronym 'SCM' associated with Subversion and its like. 'SCM' stands for 'software configuration management', and Subversion is also very good at managing plain-text configuration files. However, I will be focusing on source control.
There are a number of reasons that you may want to use a piece of software to manage your source code. If you are working collaboratively on a project, letting each developer have their own copy of the code on their local machine is great: it prevents you from overwriting another developer's changes. Of course, it will not stop the two of you from completely changing the API, so it's worth noting that Subversion is not a replacement for communication.
But what if you are working on a project alone? You can still use Subversion. Source control management software also tracks changes to your code. If you break your application, and you cannot figure out why, you will always have the older (and functional) version to compare your changes against.

Subversion is a centralized system for sharing information. At its core is a repository, which is a central store of data. The repository stores information in the form of a filesystem tree—a typical hierarchy of files and directories. Any number of clients connect to the repository, and then read or write to these files. By writing data, a client makes the information available to others; by reading data, the client receives information from others.

Subversion Architecture
In Subversion, each “set of files” is called a “repository”. A centralized “Subversion server” must be used, and it may host any number of repositories. To access these files, any number of “Subversion clients” may be used, typically from different machines. Since Subversion is open-source, a considerable amount of effort has been dedicated to making the system cross-platform. In general a Subversion server may be set up on Linux, Windows, or Mac OS X, and Subversion clients exist similarly for Linux, Windows, and Mac.

subArch

When files are retrieved from the server to the client, it is called an “update”, and when new versions of the files are sent to the server from the client, it is called a “commit”.

A typical repository will go through a continuous cycle of update-edit-commit.
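The update-edit-commit cycle can be pictured with a small in-memory model. This is only a sketch: the class names `Repository` and `WorkingCopy` and the file name `hello.txt` are invented for illustration, and the toy stores a full snapshot per revision, which is a simplification and not how Subversion stores data.

```python
class Repository:
    """Toy central repository: each commit stores a snapshot as a new revision."""

    def __init__(self):
        self.revisions = [{}]  # revision 0 is the empty tree

    @property
    def head(self):
        """Number of the latest revision."""
        return len(self.revisions) - 1

    def commit(self, files):
        """Store a snapshot of `files` and return the new revision number."""
        self.revisions.append(dict(files))
        return self.head

    def read(self, revision=None):
        """Return a copy of the tree at `revision` (latest if omitted)."""
        rev = self.head if revision is None else revision
        return dict(self.revisions[rev])


class WorkingCopy:
    """Toy client-side copy of the repository contents."""

    def __init__(self, repo):
        self.repo = repo
        self.files = {}

    def update(self):
        """Retrieve the latest files from the server (the 'update' step)."""
        self.files = self.repo.read()

    def commit(self):
        """Send the local files to the server (the 'commit' step)."""
        return self.repo.commit(self.files)


repo = Repository()
wc = WorkingCopy(repo)

wc.update()                              # update
wc.files["hello.txt"] = "first draft"    # edit
first = wc.commit()                      # commit

wc.update()
wc.files["hello.txt"] = "broken change"
second = wc.commit()

# The older, working version can still be read back from the repository.
print(first, second, repo.read(first)["hello.txt"])
```

Because every commit produces a new revision instead of overwriting the old one, going back to an earlier version is just a matter of reading an earlier revision.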

subArch1

Features of Subversion

Directory versioning
Subversion implements a “virtual” versioned filesystem that tracks changes to whole directory trees over time. Files and directories are versioned. As a result, there are real client-side move and copy commands that operate on files and directories.
Atomic commits
A commit either goes into the repository completely, or not at all. This allows developers to construct and commit changes as logical chunks.
Versioned metadata
Each file and directory has an invisible set of “properties” attached. You can invent and store any arbitrary key/value pairs you wish. Properties are versioned over time, just like file contents.
Choice of network layers
Subversion has an abstracted notion of repository access, making it easy for people to implement new network mechanisms. Subversion's “advanced” network server is a module for the Apache web server, which speaks a variant of HTTP called WebDAV/DeltaV. This gives Subversion a big advantage in stability and interoperability, and provides various key features for free: authentication, authorization, wire compression, and repository browsing, for example. A smaller, standalone Subversion server process is also available. This server speaks a custom protocol which can be easily tunneled over ssh.
Consistent data handling
Subversion expresses file differences using a binary differencing algorithm, which works identically on both text (human-readable) and binary (human-unreadable) files. Both types of files are stored equally compressed in the repository, and differences are transmitted in both directions across the network.
Efficient branching and tagging
The cost of branching and tagging need not be proportional to the project size. Subversion creates branches and tags by simply copying the project, using a mechanism called cheap copies, similar to hard links in Linux/UNIX. Thus these operations take only a very small, constant amount of time, and very little space in the repository.
Hackability
Subversion has no historical baggage; it is implemented as a collection of shared C libraries with well-defined APIs. This makes Subversion extremely maintainable and usable by other applications and languages.
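Of the features listed above, atomic commits are perhaps the easiest to illustrate in code. The sketch below shows the all-or-nothing idea only; the `AtomicStore` class and the file names are hypothetical, and it says nothing about how Subversion implements transactions internally.

```python
class AtomicStore:
    """Toy store in which a commit is applied completely or not at all."""

    def __init__(self):
        self.tree = {}      # path -> content
        self.revision = 0

    def commit(self, changes):
        """Apply every change in `changes` (path -> content, None deletes) or none."""
        staged = dict(self.tree)  # work on a private copy of the tree
        for path, content in changes.items():
            if content is None:
                if path not in staged:
                    raise KeyError(f"cannot delete missing path: {path}")
                del staged[path]
            else:
                staged[path] = content
        # Only reached when every change succeeded: publish in one step.
        self.tree = staged
        self.revision += 1
        return self.revision


store = AtomicStore()
store.commit({"a.c": "int a;", "a.h": "extern int a;"})

try:
    # The first change is fine, the second fails, so neither is applied.
    store.commit({"a.c": "changed", "missing.c": None})
except KeyError:
    pass

print(store.revision, store.tree["a.c"])
```

The failed commit leaves both the revision number and the file contents untouched, so nobody ever sees half of a change.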


TortoiseSVN
TortoiseSVN is a free open-source client for the Subversion version control system, which was initiated in 2000 by CollabNet Inc. That is, TortoiseSVN manages files and directories over time. Files are stored in a central repository. The repository is much like an ordinary file server, except that it remembers every change ever made to your files and directories. This allows you to recover older versions of your files and examine the history of how and when your data changed, and who changed it. This is why many people think of Subversion and version control systems in general as a sort of “time machine”.
Some version control systems are also software configuration management (SCM) systems. These systems are specifically tailored to manage trees of source code, and have many features that are specific to software development - such as natively understanding programming languages, or supplying tools for building software. Subversion, however, is not one of these systems; it is a general system that can be used to manage any collection of files, including source code.


TortoiseSVN is a Windows shell extension that allows you to access SVN repositories from within Windows Explorer. Basically, any folder on your hard drive can be turned into a working copy of an SVN repository with just a few mouse clicks and some connection information.

Features of TortoiseSVN

1. Shell integration
TortoiseSVN integrates seamlessly into the Windows shell (i.e. the explorer). This means you can keep working with the tools you're already familiar with. And you do not have to change into a different application each time you need functions of the version control!

And you are not even forced to use the Windows Explorer. TortoiseSVN's context menus work in many other file managers, and in the File/Open dialog which is common to most standard Windows applications. You should, however, bear in mind that TortoiseSVN is intentionally developed as an extension for the Windows Explorer. Thus it is possible that in other applications the integration is not as complete and, for example, the icon overlays may not be shown.

2. Icon overlays
The status of every versioned file and folder is indicated by small overlay icons. That way you can see right away what the status of your working copy is.

A freshly checked out working copy has a green checkmark as overlay. That means the Subversion status is normal.
As soon as you start editing a file, the status changes to modified and the icon overlay changes to a red exclamation mark. That way you can easily see which files were changed since you last updated your working copy and need to be committed.
If a conflict occurs during an update, the icon changes to a yellow exclamation mark.

If you have set the svn:needs-lock property on a file, Subversion makes that file read-only until you get a lock on that file. Read-only files have their own overlay to indicate that you have to get a lock first before you can edit that file.

If you hold a lock on a file, and the Subversion status is normal, a lock overlay reminds you that you should release the lock if you are not using it, to allow others to commit their changes to the file.

Another overlay shows you that some files or folders inside the current folder have been scheduled to be deleted from version control, or that a file under version control is missing in a folder.

The plus sign tells you that a file or folder has been scheduled to be added to version control.
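For readers who know the command-line client, the overlays described above line up roughly with the first-column codes printed by `svn status`. The mapping below is a sketch under that assumption; the overlay descriptions are taken from the text, and the names `OVERLAYS` and `overlay_for` are invented for illustration.

```python
# First-column codes of `svn status` mapped to the overlays described above.
# A blank code means "no modifications" (shown with `svn status -v`).
OVERLAYS = {
    " ": "green checkmark (normal)",
    "M": "red exclamation mark (modified)",
    "C": "yellow exclamation mark (conflicted)",
    "A": "plus sign (scheduled for addition)",
    "D": "deleted/missing overlay (scheduled for deletion)",
    "!": "deleted/missing overlay (file is missing)",
}


def overlay_for(status_code):
    """Return the overlay description for one `svn status` code."""
    return OVERLAYS.get(status_code, "no overlay described in this article")


for code in ("M", "C", "A", "?"):
    print(repr(code), "->", overlay_for(code))
```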

3. Easy access to Subversion commands
All Subversion commands are available from the explorer context menu. TortoiseSVN adds its own submenu there.


TortoiseSVN's History
In 2002, Tim Kemp found that Subversion was a very good version control system, but it lacked a good GUI client. The idea for a Subversion client as a Windows shell integration was inspired by the similar client for CVS named TortoiseCVS.
Tim studied the source code of TortoiseCVS and used it as a base for TortoiseSVN. He then started the project, registered the domain tortoisesvn.org and put the source code online. During that time, Stefan Küng was looking for a good and free version control system and found Subversion and the source for TortoiseSVN. Since TortoiseSVN was still not ready for use, he joined the project and started programming. Soon he rewrote most of the existing code and started adding commands and features, up to a point where nothing of the original code remained.
As Subversion became more stable it attracted more and more users who also started using TortoiseSVN as their Subversion client. The user base grew quickly (and is still growing every day). That's when Lübbe Onken offered to help out with some nice icons and a logo for TortoiseSVN. He also takes care of the website and manages the translations.

Microsoft VSS vs TortoiseSVN
Subversion benefits over Visual Source Safe (VSS):
Database integrity
The Subversion developers place their highest emphasis on protecting data. VSS databases have a reputation for frequent corruption.
Locking of database
VSS uses the lock-modify-unlock approach, whilst Subversion uses the copy-modify-merge approach.
Security
Subversion is easy to deploy over an encrypted link. One can use svnserve over a secure shell (ssh) link, or the Apache Subversion module over Apache's SSL (HTTPS) protocol. This eliminates the need to use special VPN software to secure communication with the repository. Different parts of a repository can have different access policies. Multiple repositories can be served from a single Apache web server. For example, one could restrict commit rights to the main trunk to a small group of project leaders, while allowing each developer or team a separate branch to work within. Project leaders would review easily-identified changes in a team's branch and commit them to the trunk. This is in fact how Subversion itself is managed, ensuring high quality in an open source environment with many independent contributors.
Performance over WAN
VSS was designed for a LAN, and requires massive disk activity for even simple operations. Subversion was designed for global clients. It strives to minimize network traffic. Many common operations can be performed without connection to the repository, such as comparing one's working copy to the version that was checked out.
True client/server
VSS is a peer-to-peer system in which every client is really a server, requiring full access to the underlying database. Faults in any peer can damage the database, and this is known to happen frequently. Subversion is normally deployed as a client/server architecture, with a single server having access to the actual database. If a fault happens in a client during a transaction, the server will roll back the transaction, protecting all other clients from corruption.
Cheap copies (i.e. branches and tags)
One can copy large parts of a repository to another path, and only the fact of the copy is stored, not the actual data. This makes it very cheap (almost free) to tag and branch. This in turn makes it cheap for developers to create private version-controlled "sandboxes" where large features can be developed without the need to coordinate with other groups. Only once the entire feature is tested is it merged into the trunk. The developer can merge well-tested trunk developments into her branch.
No downtime for "maintenance"
Normal maintenance is just backup. This is done with an administrative command ("svnadmin dump") that performs a normal database lock, so the repository remains highly-available.
Disconnected development
One doesn't have to be connected to the repository to start work, as long as one has a working copy. If you're on a plane or waiting for your train to the office, you can power up your laptop and immediately start editing. No need to lock files before you begin. Subversion makes this particularly easy because your working copy contains a pristine copy of the original checked-out files, before you started making changes. (This copy is normally kept in the .svn subdirectory under each working directory.) This makes it easy to monitor your own edits without using the network.
Subversion developers "eat their own dog food"
Microsoft does not use VSS internally. It's not an actively-maintained product. The Subversion developers use Subversion to manage the development of Subversion. It's in their own interest to make it the best system it can be.

 

Versioning Models for Version Control Systems:
All version control systems have to solve the same fundamental problem: how will the system allow users to share information, but prevent them from accidentally stepping on each other's feet? It's all too easy for users to accidentally overwrite each other's changes in the repository.
The Problem of File-Sharing
Consider this scenario: suppose we have two co-workers, Harry and Sally. They each decide to edit the same repository file at the same time. If Harry saves his changes to the repository first, then it's possible that (a few moments later) Sally could accidentally overwrite them with her own new version of the file. While Harry's version of the file won't be lost forever (because the system remembers every change), any changes Harry made won't be present in Sally's newer version of the file, because she never saw Harry's changes to begin with. Harry's work is still effectively lost - or at least missing from the latest version of the file - and probably by accident. This is definitely a situation we want to avoid!

The Lock-Modify-Unlock Solution
Many version control systems use a lock-modify-unlock model to address this problem, which is a very simple solution. In such a system, the repository allows only one person to change a file at a time. First Harry must "lock" the file before he can begin making changes to it. Locking a file is a lot like borrowing a book from the library; if Harry has locked a file, then Sally cannot make any changes to it. If she tries to lock the file, the repository will deny the request. All she can do is read the file, and wait for Harry to finish his changes and release his lock. After Harry unlocks the file, his turn is over, and now Sally can take her turn by locking and editing.


The problem with the lock-modify-unlock model is that it's a bit restrictive, and often becomes a roadblock for users:

• Locking may cause administrative problems. Sometimes Harry will lock a file and then forget about it. Meanwhile, because Sally is still waiting to edit the file, her hands are tied. And then Harry goes on vacation. Now Sally has to get an administrator to release Harry's lock. The situation ends up causing a lot of unnecessary delay and wasted time.
• Locking may cause unnecessary serialization. What if Harry is editing the beginning of a text file, and Sally simply wants to edit the end of the same file? These changes don't overlap at all. They could easily edit the file simultaneously, and no great harm would come, assuming the changes were properly merged together. There's no need for them to take turns in this situation.
• Locking may create a false sense of security. Pretend that Harry locks and edits file A, while Sally simultaneously locks and edits file B. But suppose that A and B depend on one another, and the changes made to each are semantically incompatible. Suddenly A and B don't work together anymore. The locking system was powerless to prevent the problem, yet it somehow provided a false sense of security. It's easy for Harry and Sally to imagine that by locking files, each is beginning a safe, insulated task, and this inhibits them from discussing their incompatible changes early on.
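The lock-modify-unlock model described above can be sketched in a few lines. This is a toy model, not the behaviour of any particular product: the `LockingRepository` class and its methods are hypothetical, and the single file is named "A" to match the example.

```python
class LockingRepository:
    """Toy repository that lets only the lock holder change a file."""

    def __init__(self, files):
        self.files = dict(files)
        self.locks = {}  # path -> name of the user holding the lock

    def lock(self, user, path):
        """Grant the lock unless somebody else already holds it."""
        holder = self.locks.get(path)
        if holder is not None and holder != user:
            return False
        self.locks[path] = user
        return True

    def write(self, user, path, content):
        """Change a file; only allowed for the current lock holder."""
        if self.locks.get(path) != user:
            raise PermissionError(f"{user} must lock {path} before editing")
        self.files[path] = content

    def unlock(self, user, path):
        """Release the lock so that others can take their turn."""
        if self.locks.get(path) == user:
            del self.locks[path]


repo = LockingRepository({"A": "original"})

harry_got_lock = repo.lock("Harry", "A")
sally_got_lock = repo.lock("Sally", "A")      # denied while Harry holds the lock
repo.write("Harry", "A", "Harry's edit")
repo.unlock("Harry", "A")
sally_got_lock_later = repo.lock("Sally", "A")

print(harry_got_lock, sally_got_lock, sally_got_lock_later)
```

Notice that if Harry never calls `unlock`, Sally stays blocked indefinitely, which is exactly the administrative problem from the first bullet.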
The Copy-Modify-Merge Solution
Subversion, CVS, and other version control systems use a copy-modify-merge model as an alternative to locking. In this model, each user's client reads the repository and creates a personal working copy of the file or project. Users then work in parallel, modifying their private copies. Finally, the private copies are merged together into a new, final version. The version control system often assists with the merging, but ultimately a human being is responsible for making it happen correctly.
Here's an example. Say that Harry and Sally each create working copies of the same project, copied from the repository. They work concurrently, and make changes to the same file "A" within their copies. Sally saves her changes to the repository first. When Harry attempts to save his changes later, the repository informs him that his file A is out-of-date. In other words, file A in the repository has somehow changed since he last copied it. So Harry asks his client to merge any new changes from the repository into his working copy of file A. Chances are that Sally's changes don't overlap with his own; so once he has both sets of changes integrated, he saves his working copy back to the repository.
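The "out-of-date" check in the example can be sketched as follows. This is a toy with a single file and invented names (`FileRepository`, `OutOfDateError`); it only shows the idea that a commit is refused when it is based on an older revision than the one in the repository.

```python
class OutOfDateError(Exception):
    """Raised when a commit is based on an old revision of the file."""


class FileRepository:
    """Toy repository holding one file, 'A', and its revision number."""

    def __init__(self, content):
        self.content = content
        self.revision = 1

    def checkout(self):
        """Return a private working copy that remembers its base revision."""
        return {"content": self.content, "base": self.revision}

    def commit(self, working_copy):
        """Accept the change only if it is based on the latest revision."""
        if working_copy["base"] != self.revision:
            raise OutOfDateError("file A is out of date; merge the new changes first")
        self.content = working_copy["content"]
        self.revision += 1
        working_copy["base"] = self.revision
        return self.revision


repo = FileRepository("original text")
harry = repo.checkout()
sally = repo.checkout()

# Sally saves first, so the repository moves on to a new revision.
sally["content"] = "original text, plus Sally's ending"
sally_revision = repo.commit(sally)

# Harry's working copy is still based on the old revision.
harry["content"] = "Harry's opening, original text"
try:
    repo.commit(harry)
    harry_rejected = False
except OutOfDateError:
    harry_rejected = True

print(sally_revision, harry_rejected)
```

Sally's work is safe in the repository, and Harry is told to bring her changes into his working copy before he can commit.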

But what if Sally's changes do overlap with Harry's changes? What then? This situation is called a conflict, and it's usually not much of a problem. When Harry asks his client to merge the latest repository changes into his working copy, his copy of file A is somehow flagged as being in a state of conflict: he'll be able to see both sets of conflicting changes, and manually choose between them. Note that software can't automatically resolve conflicts; only humans are capable of understanding and making the necessary intelligent choices. Once Harry has manually resolved the overlapping changes (perhaps by discussing the conflict with Sally!), he can safely save the merged file back to the repository.
The copy-modify-merge model may sound a bit chaotic, but in practice, it runs extremely smoothly. Users can work in parallel, never waiting for one another. When they work on the same files, it turns out that most of their concurrent changes don't overlap at all; conflicts are infrequent. And the amount of time it takes to resolve conflicts is far less than the time lost by a locking system. In the end, it all comes down to one critical factor: user communication. When users communicate poorly, both syntactic and semantic conflicts increase. No system can force users to communicate perfectly, and no system can detect semantic conflicts. So there's no point in being lulled into a false promise that a locking system will somehow prevent conflicts; in practice, locking seems to inhibit productivity more than anything else.
There is one common situation where the lock-modify-unlock model comes out better, and that is where you have un-mergeable files. For example, if your repository contains some graphic images, and two people change the same image at the same time, there is no way for those changes to be merged together. Either Harry or Sally will lose their changes.
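The difference between changes that merge cleanly and changes that conflict can be shown with a deliberately simplified three-way merge. The function `merge3` is a hypothetical sketch, not Subversion's merge algorithm: it assumes all three versions have the same number of lines, so it only handles lines edited in place.

```python
def merge3(base, mine, theirs):
    """Line-by-line three-way merge of equally long lists of lines.

    Returns (merged, conflicts), where `conflicts` holds the indexes of
    lines that both sides changed in different ways.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:            # both sides agree (or neither changed the line)
            merged.append(m)
        elif m == b:          # only 'theirs' changed this line
            merged.append(t)
        elif t == b:          # only 'mine' changed this line
            merged.append(m)
        else:                 # changed differently on both sides
            merged.append(m)  # keep 'mine' as a placeholder for a human to fix
            conflicts.append(i)
    return merged, conflicts


base = ["title", "body", "footer"]
harry = ["Title", "body", "footer"]        # Harry edits the beginning

sally = ["title", "body", "Footer"]        # Sally edits the end: no overlap
clean_merge, clean_conflicts = merge3(base, harry, sally)

sally_overlap = ["TITLE", "body", "footer"]  # Sally edits Harry's line
_, overlap_conflicts = merge3(base, harry, sally_overlap)

print(clean_merge, clean_conflicts, overlap_conflicts)
```

In the first case both edits survive without anyone taking turns. In the second the function can only report where the conflict is; choosing between "Title" and "TITLE" is left to Harry and Sally.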

What does Subversion Do?
Subversion uses the copy-modify-merge solution by default, and in many cases this is all you will ever need. However, as of Version 1.2, Subversion also supports file locking, so if you have unmergeable files, or if you are simply forced into a locking policy by management, Subversion will still provide the features you need.


Checkouts and Commits in SVN
When a developer wishes to work with SVN version-controlled source code, he or she must first 'check out' the current version of the code (or possibly an older version, if necessary). 'Check out' describes the process of the TortoiseSVN client connecting to the SVN server, and downloading a version of the code in a repository. Once the code is checked out, it can be worked with just like un-versioned code. After some milestone has been reached (or the workday has ended), the updated code can then be 'committed' back to the SVN repository as a new version of the source code, and subsequent attempts to check out the latest version of the code will acquire this newer, updated version.
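TortoiseSVN performs these steps through its menus, but the same operations exist in the `svn` command-line client as `svn checkout` and `svn commit`. The sketch below only builds the command lines and prints them, so it runs without a server; the helper names, the repository URL, the folder name and the commit message are placeholders chosen for illustration.

```python
import shlex


def checkout_command(url, path, revision=None):
    """Command line that checks out `url` into the folder `path`.

    Pass `revision` to get an older version instead of the latest one.
    """
    command = ["svn", "checkout"]
    if revision is not None:
        command += ["-r", str(revision)]
    return command + [url, path]


def commit_command(path, message):
    """Command line that commits the changes under `path` with a log message."""
    return ["svn", "commit", path, "-m", message]


# Placeholder URL and folder; substitute your own repository.
latest = checkout_command("http://MyServerName/svn/MyRepos", "MyWorkingCopy")
older = checkout_command("http://MyServerName/svn/MyRepos", "MyWorkingCopy", revision=5)
commit = commit_command("MyWorkingCopy", "Reached first milestone")

for command in (latest, older, commit):
    print(shlex.join(command))
```

To actually execute one of these, the list could be handed to `subprocess.run(command, check=True)` on a machine where the Subversion client is installed.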

 

Branching / Tagging in Subversion
One of the features of version control systems is the ability to isolate changes onto a separate line of development. This line is known as a branch. Branches are often used to try out new features without disturbing the main line of development with compiler errors and bugs. As soon as the new feature is stable enough then the development branch is merged back into the main branch (trunk).
Another feature of version control systems is the ability to mark particular revisions (e.g. a release version), so you can at any time recreate a certain build or environment. This process is known as tagging.
Subversion does not have special commands for branching or tagging, but uses so-called cheap copies instead. Cheap copies are similar to hard links in Unix, which means that instead of making a complete copy in the repository, an internal link is created, pointing to a specific tree/revision. As a result, branches and tags are very quick to create, and take up almost no extra space in the repository.
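The idea behind cheap copies can be modelled with plain object references. This is an illustration of the concept only: the `Tree` class and the path names are hypothetical, and the real repository format is considerably more involved.

```python
class Tree:
    """Immutable snapshot of a directory: file name -> content."""

    def __init__(self, files):
        self._files = dict(files)

    def read(self, name):
        return self._files[name]

    def with_file(self, name, content):
        """Return a new Tree with one file changed; this tree stays untouched."""
        files = dict(self._files)
        files[name] = content
        return Tree(files)


# Repository paths simply point at tree snapshots.
paths = {"trunk": Tree({"main.c": "version 1"})}

# A branch and a tag are both just new paths pointing at the existing tree.
paths["branches/feature"] = paths["trunk"]
paths["tags/release-1.0"] = paths["trunk"]
shared = paths["branches/feature"] is paths["trunk"]

# Work on the branch creates new data for the branch only.
paths["branches/feature"] = paths["branches/feature"].with_file("main.c", "version 2")

print(shared, paths["trunk"].read("main.c"), paths["branches/feature"].read("main.c"))
```

Creating the branch and the tag copied no file contents at all, which is why the cost does not grow with the size of the project.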

 

Creating The Repository With TortoiseSVN
1. Open Windows Explorer
2. Create a new folder and name it e.g. SVNRepository
3. Right-click on the newly created folder and select TortoiseSVN->Create Repository here....
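On the command line, the rough counterpart of these steps is `svnadmin create`, after which the new repository can be reached through a `file://` URL. The sketch below only builds the command and the URL without running anything; the helper names are invented, and the folder name `SVNRepository` is the example name from step 2.

```python
from pathlib import Path


def create_repository_command(folder):
    """Command line that creates an empty repository in `folder`."""
    return ["svnadmin", "create", str(folder)]


def local_repository_url(folder):
    """file:// URL under which a repository on the local disk can be accessed."""
    return Path(folder).resolve().as_uri()


command = create_repository_command("SVNRepository")
url = local_repository_url("SVNRepository")

print(" ".join(command))
print(url)
```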


 

Accessing the Repository

Right click on the desktop and from the menu select TortoiseSVN->Repo Browser as shown in the figure

Repo1

On the next screen, type the URL of the repository, for example [http://MyServerName/svn/MyRepos or svn://MyServerName/MyRepos], and click OK.

repoURL
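The first part of the repository URL tells the client how to reach the server. As a small sketch, the snippet below splits the two example URLs with Python's standard `urllib.parse` and looks up the access method; the `ACCESS_METHODS` table and the `describe` helper are written for this illustration.

```python
from urllib.parse import urlsplit

# URL scheme -> how the repository is reached.
ACCESS_METHODS = {
    "http": "Apache web server (WebDAV)",
    "https": "Apache web server (WebDAV over SSL)",
    "svn": "standalone svnserve server",
    "svn+ssh": "svnserve tunneled over ssh",
    "file": "repository on the local disk",
}


def describe(url):
    """Split a repository URL into scheme, server, path and access method."""
    parts = urlsplit(url)
    method = ACCESS_METHODS.get(parts.scheme, "unknown access method")
    return parts.scheme, parts.netloc, parts.path, method


for url in ("http://MyServerName/svn/MyRepos", "svn://MyServerName/MyRepos"):
    print(describe(url))
```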

An authentication screen is then displayed, as shown below, where you enter the user id and password to log in to the repository. The Subversion administrator will provide a user id and password for your repository access.

repoAuthentication

Check "Save authentication" to save the user name and password, then click [OK]. The following screen will then appear, displaying the repository contents.

RepoBrowser

 

(cont...)
