This article was published as a part of the Data Science Blogathon
Welcome to the world of collaborative coding! Ever wondered how multiple people can work on the same project without chaos? Enter Git and GitHub, the dynamic duo of version control. Learn the basics of Git, a Version Control System (VCS) that tracks changes in code. Discover GitHub, a platform that takes collaboration to the next level. This article explores the differences between Git and GitHub, walks you through Git installation, and introduces key operations and commands to empower your coding journey. Let’s dive in!
Git is a tool that helps software developers work together on projects. It keeps track of changes made to the code, allowing multiple people to collaborate without messing up each other’s work. Git also lets developers create separate branches to work on specific tasks and then merge their changes back together. It’s like a smart way to save and organize different versions of a project. Popular platforms like GitHub make it easier for developers to share their code and collaborate using Git.
GitHub is a web-based platform that uses Git for version control. In simpler terms, it’s like a place on the internet where people can store and share their code with others.
In essence, GitHub is a hub for hosting and collaborating on software projects. It makes it easier for individuals and teams to work together, share code, and contribute to open-source projects.
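Git: a version control tool that runs on your local machine and tracks changes made to the code.
GitHub: a web-based platform that hosts Git repositories so that people can share code and collaborate.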
A Version Control System (VCS) is a software tool for tracking and managing the changes made to the source code during project development. It keeps a record of every change made to the code, and it lets us go back to a previous version if a mistake is made in the current one. Without a VCS in place, it would be very hard to monitor how a project develops.
The three types of VCS are local, centralized, and distributed.
A Local Version Control System lives entirely on your local machine. If that machine crashes, the files cannot be retrieved and all the information is lost. If anything happens to a single version, all the versions made after it are lost as well.
A Local Version Control System also offers no way to collaborate with other developers.
Centralized Version Control Systems were developed so that developers on different systems could collaborate.
In a Centralized Version Control System, a single central server contains all the files related to the project, and many collaborators check out files from this server (you only have a working copy). The problem with a Centralized Version Control System is that if the central server crashes, almost everything related to the project is lost.
Distributed Version Control Systems were developed to overcome the above problems.
In a Distributed Version Control System, there are one or more servers and many collaborators, as in the centralized system. The difference is that collaborators do not just check out the latest version: each of them has an exact copy (a mirror) of the main repository, including its entire history, on their local machine.
Each user has their own repository and a working copy. This is very useful because even if the server crashes, we do not lose everything, as several copies reside on several other computers.
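To see what "an exact copy including its entire history" means in practice, here is a minimal sketch. It assumes Git is installed and works in a scratch directory; the repository names server and alice, the file notes.txt, and the Demo User identity are made up for illustration.

```shell
set -e
work=$(mktemp -d)                       # scratch directory (hypothetical location)

# "server" stands in for the shared repository.
git init --quiet "$work/server"
cd "$work/server"
git config user.name "Demo User"        # placeholder identity for the demo commits
git config user.email "demo@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit --quiet -m "First version"
echo "v2" >> notes.txt
git commit --quiet -am "Second version"

# A collaborator clones the repository and receives the history too.
git clone --quiet "$work/server" "$work/alice"

server_commits=$(git -C "$work/server" rev-list --count HEAD)
alice_commits=$(git -C "$work/alice" rev-list --count HEAD)
echo "server: $server_commits commits, alice: $alice_commits commits"
```

Both counts match, so if the server copy were lost, the clone would still hold every commit.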
Before deep-diving into Git operations and commands, create an account for yourself on GitHub if you don’t have one already.
Create a remote central repository on GitHub.
https://docs.github.com/en/get-started/quickstart/create-a-repo
Create a local repository using Git (I am using Git on Windows 10).
Open your file explorer, navigate to the working directory, right-click and select “Git Bash Here”. This opens the Git terminal. To create a new local repository, use the command git init, which creates a hidden folder named .git.
git init to create a new Git repository
$ git init
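(master) is the default branch of the local repository here. Depending on your Git configuration, the default branch may have a different name, such as main.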
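As a quick check of what git init does, the sketch below runs it in a scratch directory (an assumption, so that it does not touch a real project) and then looks for the .git folder and the name of the current branch.

```shell
set -e
repo=$(mktemp -d)          # scratch directory standing in for your project folder
cd "$repo"
git init --quiet

# git init creates the hidden .git folder that holds the repository data.
ls -d .git

# Print the branch that HEAD points to. It is "master" in this article's setup,
# but it can differ if init.defaultBranch is configured on your machine.
branch=$(git symbolic-ref --short HEAD)
echo "current branch: $branch"
```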
Next, we need to sync the local and the central repositories.
git remote add to add a new remote repository.
To get the URL of the central repo, open your repository in GitHub and copy the link.
Run the command below.
$ git remote add origin "https://github.com/harika-bonthu/git-github-tutorial.git"
By convention, origin is the shorthand name given to the remote repository we are working with.
After adding, we need to pull the files from the remote repo.
git pull to download all the content from the remote repo
$ git pull origin main
(main is the branch in our central/remote repository. Kindly check the branch name before pulling.)
Just adding the origin does not give us any files. After pulling from the main branch, we now have a README.md file in the local repository.
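The same sequence can be tried offline. In this sketch, a local bare repository named central.git stands in for the GitHub repository (an assumption, so that no network or account is needed), and a throwaway seed repository puts a README.md on its main branch. The directory names and the Demo User identity are hypothetical.

```shell
set -e
work=$(mktemp -d)

# A bare repository plays the role of the central repository on GitHub.
git init --quiet --bare "$work/central.git"

# Seed it with a README.md on a branch named main.
git init --quiet "$work/seed"
cd "$work/seed"
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"
echo "# git-github-tutorial" > README.md
git add README.md
git commit --quiet -m "Add README"
git push --quiet "$work/central.git" HEAD:refs/heads/main

# The local repository: add the remote under the name origin, then pull.
git init --quiet "$work/local"
cd "$work/local"
git remote add origin "$work/central.git"
before=$(ls | wc -l)                    # adding a remote downloads nothing
git pull --quiet origin main
ls                                      # README.md has now arrived
```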
Now, if you again try to pull, it says “Already up to date.”
$ git pull origin main
From https://github.com/harika-bonthu/git-github-tutorial
* branch main -> FETCH_HEAD
Already up to date.
Next, if you want to check if any files are modified or to be committed, use the below command.
git status to check the status of the working directory and the staging area.
Working directory – It is the place where we make changes to the existing files or create new files.
Staging area – It is the place where the files are ready to be committed.
$ git status
On branch master
nothing to commit, working tree clean
Since the last pull, we haven’t made any changes in the working directory, so it says “nothing to commit, working tree clean”.
Now the question is, how do we add files to the staging area?
git add to add files to the index or the staging area.
To demonstrate it with an example, I am modifying the README.md file and creating two more text files, “file1.txt” and “file2.txt”.
If you wish to use the command line for creating or modifying files, please refer to the video: https://www.youtube.com/watch?v=UeF4ZhnPzZQ
After making changes in the working directory, once again check the status using the command git status.
It shows that the files file1.txt, file2.txt are untracked and README.md is modified.
Next, we will see how to add the README.md file to the staging area.
$ git add README.md
After adding it to the staging area, git status lists README.md under “Changes to be committed”, while file1.txt and file2.txt are still untracked.
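The status changes described above can be reproduced with the sketch below. It uses git status --short, a compact form of the same report, and a scratch repository with a placeholder identity (both assumptions made for the demo).

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"
echo "# demo" > README.md
git add README.md
git commit --quiet -m "Add README"

# Changes in the working directory: one modified file, two new files.
echo "more text" >> README.md
touch file1.txt file2.txt
git status --short      # " M" marks a modified file, "??" an untracked one

# Stage only README.md; the text files stay untracked.
git add README.md
after=$(git status --short)
echo "$after"           # README.md now shows "M " in the first (staged) column
```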
The next step is to commit these changes to the local repository.
git commit to save the changes to the local repository.
$ git commit -m "Initial commit"
-m in the above command stands for the message. The message lets other developers know what changes have been made.
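A small sketch of the commit step, again in a scratch repository with a made-up identity: only what is in the staging area goes into the commit, so the two text files are left out.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"
echo "# demo" > README.md
touch file1.txt file2.txt

git add README.md                       # stage only README.md
git commit --quiet -m "Initial commit"

# List the files recorded in the commit: README.md only.
committed=$(git ls-tree --name-only HEAD)
echo "in commit: $committed"

# Show the message that was passed with -m.
message=$(git log -1 --format=%s)
echo "message: $message"
```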
Don’t forget we still have two files in the working directory that are to be committed.
Now, I am going to modify file1.txt and file2.txt using the “nano” editor.
To add all the changed and new files to the staging area at once, we can simply use the -A flag in the git add command.
$ git add -A
Then check the status and commit them.
$ git status
$ git commit -m "Committed txt files"
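Here is a hedged sketch of the same flow in a scratch repository (the identity and file contents are placeholders). It shows that git add -A picks up both the new text files and the edit to README.md in one step.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"
echo "# demo" > README.md
git add README.md
git commit --quiet -m "Initial commit"

# Two new files plus an edit to an existing one.
echo "one" > file1.txt
echo "two" > file2.txt
echo "edit" >> README.md

git add -A                              # stage every change in the working tree
staged=$(git status --short)
echo "$staged"                          # "A " for new files, "M " for the edit

git commit --quiet -m "Committed txt files"
remaining=$(git status --short)         # empty: nothing left to commit
```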
Now, what if you want to undo staging? Let’s see how it is done.
For that, I am creating another file named “file3.txt”, adding it to the staging area, and checking the status.
$ touch file3.txt
$ git add file3.txt
$ git status
To undo it, use the below command.
$ git restore --staged file3.txt
To see all the commits made so far, check the log.
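The sketch below shows the effect of undoing staging in a scratch repository (an assumption for the demo). Note that git restore is available in Git 2.23 and newer, and that the repository needs at least one commit for this to work.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"
echo "# demo" > README.md
git add README.md
git commit --quiet -m "Add README"

touch file3.txt
git add file3.txt
git status --short                      # "A  file3.txt": staged

git restore --staged file3.txt          # take it back out of the staging area
state=$(git status --short)
echo "$state"                           # "?? file3.txt": untracked again
```

The file itself stays on disk; only its place in the staging area is undone.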
git log to see all the commits
$ git log
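For a shorter view of the history, git log also accepts formatting options. This sketch builds two demo commits in a scratch repository (placeholder identity and file names) and prints them newest first.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"
echo "# demo" > README.md
git add README.md
git commit --quiet -m "Initial commit"
echo "one" > file1.txt
git add file1.txt
git commit --quiet -m "Committed txt files"

# Full entries: hash, author, date, and message for each commit.
git --no-pager log

# One line per commit: abbreviated hash and message, newest first.
git --no-pager log --oneline

# Only the messages, captured for later use.
subjects=$(git log --format=%s)
```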
Once you are familiar with the concepts discussed so far, we can move on to branches.
A branch in Git is an independent line of work (technically, a pointer to a specific commit). Branching off from the original code (the master branch) lets users isolate their work from it.
git branch to create a new branch
$ git branch branch1
To see all the branches, use git branch -a
master is highlighted as we are currently working in the master branch. To switch to another branch, we need to check it out.
git checkout to switch to another branch
$ git checkout branch1
$ git branch -a
branch1 has all the files of the master branch, as it originated from master.
$ ls
README.md file1.txt file2.txt file3.txt
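The branch commands above can be tried in isolation with this sketch. The scratch repository, the placeholder identity, and the single README.md file are assumptions made to keep it short.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"
echo "# demo" > README.md
git add README.md
git commit --quiet -m "Add README"

git branch branch1                      # create the branch (does not switch to it)
git branch -a                           # list branches; * marks the current one

git checkout --quiet branch1            # switch to the new branch
current=$(git symbolic-ref --short HEAD)
echo "now on: $current"
ls                                      # same files as the branch it came from
```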
In branch1, I would like to make changes to file1.txt and create another text file named file4.txt.
Now add these files to the staging area and commit. If you now check the master branch, these changes have not been made there yet.
To bring these changes into the master branch, we need to merge branch1 into master.
$ git checkout master
$ git merge branch1
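Here is a minimal sketch of the whole branch-and-merge cycle in a scratch repository. The file contents and identity are placeholders, and the branch is explicitly named master so that the commands match the article regardless of local configuration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "one" > file1.txt
git add file1.txt
git commit --quiet -m "Add file1"

# Work on branch1: change file1.txt and add file4.txt.
git branch branch1
git checkout --quiet branch1
echo "changed in branch1" >> file1.txt
echo "four" > file4.txt
git add -A
git commit --quiet -m "Update file1 and add file4"

# Back on master, the new file is not there yet.
git checkout --quiet master
before=$([ -e file4.txt ] && echo present || echo absent)
echo "file4.txt before merge: $before"

# Merge branch1 into master; the changes now appear on master.
git merge --quiet branch1
ls
```

Because master has no new commits of its own here, Git simply moves master forward to the tip of branch1 (a fast-forward merge).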
To restore a file to the state it had in a particular commit, we can use an abbreviated commit hash, here the first 8 characters of the commit's hexadecimal code.
$ git checkout 8digitcode file1.txt
$ git checkout f3c0884b file1.txt
Updated 1 path from 32610ca
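A sketch of the same restore in a scratch repository follows. The hash is looked up with git rev-parse instead of being typed in, since the real value differs in every repository; the file contents and identity are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"
echo "original" > file1.txt
git add file1.txt
git commit --quiet -m "Add file1"

# Remember the abbreviated hash of this commit (8 characters).
old=$(git rev-parse --short=8 HEAD)

echo "changed" > file1.txt
git commit --quiet -am "Change file1"

# Bring back file1.txt as it was in the older commit.
git checkout "$old" -- file1.txt
restored=$(cat file1.txt)
echo "file1.txt now contains: $restored"
git status --short                      # the restored version is already staged
```

Only file1.txt is affected; the branch still points at the latest commit, and the restored content is a staged change that can be committed.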
Once we are done working, we need to push all these code files to the central/remote repository.
git push to send the local commits to the remote repository.
$ git push origin main
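The push can fail because the local branch is named master while the remote branch is main; Git then reports “error: src refspec main does not match any”. If you encounter such a problem, use the command below, which pushes the current branch (HEAD) to the remote main branch.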
$ git push origin HEAD:main
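The difference between the two push commands can be checked offline. In this sketch, a local bare repository named central.git stands in for GitHub and starts out empty, which differs from the article's remote that already has a README.md; the names and identity are hypothetical.

```shell
set -e
work=$(mktemp -d)
git init --quiet --bare "$work/central.git"   # stand-in for the GitHub repo

git init --quiet "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master       # local branch is named master
git config user.name "Demo User"              # placeholder identity
git config user.email "demo@example.com"
echo "one" > file1.txt
git add file1.txt
git commit --quiet -m "Add file1"
git remote add origin "$work/central.git"

# There is no local branch called main, so this push is rejected.
if git push origin main 2>/dev/null; then
  first_push=ok
else
  first_push=failed
fi
echo "git push origin main: $first_push"

# Push whatever HEAD points to (master) into the remote branch main.
git push --quiet origin HEAD:main
remote_main=$(git -C "$work/central.git" rev-parse refs/heads/main)
echo "remote main is at: $remote_main"
```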
Now go to your GitHub repository and, TADA, your files are hosted on the central repository.
In conclusion, Git is a version control system that tracks changes in code, while GitHub is a platform for hosting and collaborating on Git repositories. A Version Control System (VCS) manages code versions; Git does this in a repository on your local machine, and GitHub hosts the remote repository that collaborators share. Git installation is straightforward, and key commands enable efficient version control.
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.