Command Line Basics

1. How is Debian GNU/Linux different from Ubuntu? Name two aspects.

Ubuntu is based on a snapshot of Debian, therefore there are many similarities between them.

However, there are still significant differences between them. The first is their suitability for beginners: Ubuntu is recommended for beginners because of its ease of use, whereas Debian is recommended for more advanced users. The major difference is the complexity of the user configuration, which Ubuntu doesn’t require during the installation process.

Another difference would be the stability of each distribution. Debian is considered to be more

stable compared to Ubuntu. This is because Debian receives fewer updates that are tested in

detail and the entire operating system is more stable. On the other hand, Ubuntu enables the

user to use the latest releases of software and all the new technologies.

2. What are the most common environments/platforms Linux is used for? Name three different

environments/platforms and name one distribution you can use for each.

A few of the common environments/platforms would be smartphone, desktop and server. On smartphones, Linux is used through distributions such as Android. On desktops and servers, any distribution suited to the function of the machine can be used, from Debian and Ubuntu to CentOS and Red Hat Enterprise Linux.

3. You are planning to install a Linux distribution in a new environment. Name four things that

you should consider when choosing a distribution.

When choosing a distribution, a few of the main things that should be considered are cost, performance, scalability, stability and the hardware demands of the system.

4. Name three devices that the Android OS runs on, other than smartphones.

Other devices that Android runs on are smart TVs, tablet computers, Android Auto and

smartwatches.

5. Explain three major advantages of cloud computing.

The major advantages of cloud computing are flexibility, ease of recovery and low cost of use. Cloud-based services are easy to implement and scale, depending on the business requirements. Cloud computing has a major advantage in backup and recovery solutions, as it enables businesses to recover from incidents faster and with fewer repercussions. Furthermore, it reduces operating costs, as a business pays just for the resources it uses, on a subscription-based model.

Answers to Explorational Exercises

1. Considering cost and performance, which distributions are most suitable for a business that

aims to reduce licensing costs, while keeping performance at its highest? Explain why.

One of the most suitable distributions for business use is CentOS. This is because it is built from the same sources as Red Hat’s commercial operating system, while being free to use. Similarly, Ubuntu LTS releases guarantee support for a longer period of time. The stable versions of Debian GNU/Linux are also often used in enterprise environments.

2. What are the major advantages of the Raspberry Pi and which functions can they take in

business?

Raspberry Pi is small in size while working as a normal computer. Furthermore, it is low cost

and can handle web traffic and many other functions. It can be used as a server or a firewall, and as the main board for robots and many other small devices.

3. What range of distributions do Amazon Cloud Services and Google Cloud offer? Name at least three common ones and two different ones.

The common distributions between Amazon and Google Cloud Services are Ubuntu, CentOS and

Red Hat Enterprise Linux. Each cloud provider also offers specific distributions that the other

one doesn’t. Amazon has Amazon Linux and Kali Linux, while Google offers the use of FreeBSD

and Windows Servers.

Debian, Ubuntu and Linux Mint use the dpkg, apt-get and apt tools to install software packages, generally referred to as DEB packages. Distributions such as Red Hat, Fedora and CentOS use the rpm, yum and dnf commands instead, which in turn install RPM packages. As the application packaging is different for each distribution family, packages have to be installed with the tools that match the distribution.

If your distribution works with DEB packages, you can search its repositories using apt-cache search package_name or apt search package_name. The apt-cache command is used to search for packages and to list information about available packages. The following command looks for any occurrences of the term “figlet” in package names and descriptions:

$ apt-cache search figlet

The installation and removal of a package require special permissions granted only to the system’s administrator: the user named root. On desktop systems, ordinary users can install or remove packages by prepending the command sudo to the installation/removal commands. That will require you to type your password to proceed. For DEB packages, the installation is performed with the command apt-get install package_name or apt install package_name.

In distributions based on RPM packages, searches are performed using yum search package_name or

dnf search package_name. Let’s say you want to display some text in a more irreverent way,

followed by a cartoonish cow, but you are not sure about the package that can perform that task. As

with the DEB packages, the RPM search commands accept descriptive terms:

$ yum search speaking cow

$ cowsay "Brought to you by yum"

The same commands used to install packages are used to remove them. All the commands accept the

remove keyword to uninstall an installed package: apt-get remove package_name or apt remove

package_name for DEB packages and yum remove package_name or dnf remove package_name for

RPM packages. The sudo command is also needed to perform the removal. For example, to remove

the previously installed package figlet from a DEB-based distribution:

$ sudo apt-get remove figlet

A similar procedure is performed on an RPM-based system. For example, to remove the previously

installed package cowsay from an RPM-based distribution:

$ sudo yum remove cowsay
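
The DEB and RPM commands above all follow the same pattern: a tool, an action and a package name. As a minimal sketch, the helper below only prints the command line that would be used, without running it. The function name pkg_command is made up for this illustration, and the list of tools is limited to the ones mentioned in this lesson.

```sh
# pkg_command is a hypothetical helper for illustration. It prints a
# command line and does NOT install or remove anything.
pkg_command() {
    tool=$1 action=$2 package=$3
    case $tool in
        apt|apt-get|yum|dnf)
            # Installing and removing packages needs administrator rights,
            # hence the sudo prefix.
            printf 'sudo %s %s %s\n' "$tool" "$action" "$package" ;;
        *)
            printf 'unknown package tool: %s\n' "$tool" >&2
            return 1 ;;
    esac
}

# Show the commands for the first tool that exists on this system.
for candidate in apt-get dnf yum; do
    if command -v "$candidate" >/dev/null 2>&1; then
        pkg_command "$candidate" install figlet
        pkg_command "$candidate" remove figlet
        break
    fi
done
```

On a DEB-based system the loop stops at apt-get, on a recent RPM-based system at dnf. If none of the three tools is found, nothing is printed.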

LibreOffice and Apache OpenOffice have the same basic features and are compatible with the document formats from

Microsoft Office. However, the preferred document format is the Open Document Format, a fully open

and ISO standardized file format. The use of ODF files ensures that documents can be transferred

between operating systems and applications from different vendors, such as Microsoft Office. The

main applications offered by OpenOffice/LibreOffice are:

Writer

Text editor

Calc

Spreadsheets

Impress

Presentations

Draw

Vector drawing

Math

Math formulas

Base

Database

Both LibreOffice and Apache OpenOffice are open source software, but LibreOffice is licensed under

LGPLv3 and Apache OpenOffice is licensed under Apache License 2.0. The licensing distinction

implies that LibreOffice can incorporate improvements made by Apache OpenOffice, but Apache

OpenOffice cannot incorporate improvements made by LibreOffice. That, and a more active

community of developers, are the reason most distributions adopt LibreOffice as their default office

suite.

One of the main differences between a GNU General Public License (GNU GPL or simply GPL), the most common type of license for free software, and a BSD license is that GPL licenses are copyleft, meaning any derivative code produced from the original free open source code must remain free. BSD licenses, by contrast, allow derivative works to become proprietary or paid software.

Linux has a modular approach where different parts of the system are developed by different

projects and developers, each one filling a specific need or objective. Because of that, there are

several options of desktop environments to choose from and together with package managers, the

default desktop environment is one of the main differences among the many distributions out there.

Unlike proprietary operating systems like Windows and macOS, where the users are restricted to the desktop environment that comes with their OS, on Linux it is possible to install multiple environments and pick the one that best suits you and your needs.

Basically, there are two major desktop environments in the Linux world: Gnome and KDE. They are

both very complete, with a large community behind them and aim for the same purpose but with

slightly divergent approaches. In a nutshell, Gnome tries to follow the KISS (“keep it simple stupid”)

principle, with very streamlined and clean applications. On the other hand, KDE has another

perspective with a larger selection of applications and giving the user the opportunity to change

every configuration setting in the environment.

While Gnome applications are based on the GTK toolkit (written in the C language), KDE

applications make use of the Qt library (written in C++). One of the most practical aspects of writing

applications with the same graphical toolkit is that applications will tend to share a similar look and

feel, which is responsible for giving the user a sense of unity during their experience. Another

important characteristic is that having the same shared graphical library for many frequently used

applications may save some memory space at the same time that it will speed up loading time once

the library has been loaded for the first time.

Industry Uses of Linux

Linux is heavily used among the software and Internet industries. Sites like W3Techs report that

about 68% of the website servers on the Internet are powered by Unix and the biggest portion of

those are known to be Linux.

This wide adoption is due not only to the free nature of Linux (free as in free beer and as in freedom of speech) but also to its stability, flexibility and performance. These characteristics allow vendors to offer their services at a lower cost and with greater scalability. A significant portion of Linux systems nowadays run in the cloud, either on an IaaS (Infrastructure as a Service), PaaS (Platform as a Service) or SaaS (Software as a Service) model.

IaaS is a way to share the resources of a large server by offering customers access to virtual machines that are, in fact, multiple operating systems running as guests on a host machine on top of an important piece of software called a hypervisor. The hypervisor is responsible for making it possible for these guest OSs to run by segregating and managing the resources available on the host machine for those guests. That’s what we call virtualization. In the IaaS model, you pay only for the fraction of resources your infrastructure uses.

Linux has three well-known open source hypervisors: Xen, KVM and VirtualBox. Xen is probably the oldest of them. KVM has overtaken Xen as the most prominent Linux hypervisor. Its development is sponsored by Red Hat and it is used by them and other players, both in public cloud services and in private cloud setups. VirtualBox has belonged to Oracle since its acquisition of Sun Microsystems and is usually used by end users because of its ease of use and administration.
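
Hypervisors such as KVM rely on hardware virtualization support in the processor. As a rough sketch, and assuming an x86 system, the helper below counts the logical CPUs whose flags line in /proc/cpuinfo mentions vmx (Intel VT-x) or svm (AMD-V). The function name count_virt_cpus is invented for this example, and the result is only an indicator, not a full check of whether a hypervisor will work.

```sh
# count_virt_cpus is a hypothetical helper name used for this example.
# It reads a cpuinfo-style file (default: /proc/cpuinfo) and prints how
# many "flags" lines mention vmx (Intel VT-x) or svm (AMD-V).
count_virt_cpus() {
    cpuinfo=${1:-/proc/cpuinfo}
    # Keep only the per-CPU "flags" lines, then count those that contain
    # vmx or svm as a whole word. "|| true" keeps the exit status at 0
    # when the count is 0.
    grep -E '^flags[[:space:]]*:' "$cpuinfo" 2>/dev/null | grep -Ecw 'vmx|svm' || true
}

echo "CPUs with virtualization extensions: $(count_virt_cpus)"
```

A result of 0 is expected on machines that are not x86 or where /proc/cpuinfo does not exist.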

PaaS and SaaS, on the other hand, build on the IaaS model, both technically and conceptually. In PaaS, instead of a virtual machine, the users have access to a platform where they can deploy and run their application. The goal here is to ease the burden of dealing with system administration tasks and operating system updates. Heroku is a common PaaS example where program code can just be run without taking care of the underlying containers and virtual machines. Lastly, SaaS is the model where you usually pay for a subscription in order to just use a piece of software without worrying about anything else. Dropbox and Salesforce are two good examples of SaaS. Most of these services are accessed through a web browser.

A project like OpenStack is a collection of open source software that can make use of different hypervisors and other tools in order to offer a complete IaaS cloud environment on premises, by leveraging the power of a computer cluster in your own data center. However, the setup of such an infrastructure is not trivial.

Privacy Issues when using the Internet

The web browser is a fundamental piece of software on any desktop these days, but some people still

lack the knowledge to use it securely. While more and more services are accessed through a web

browser, almost all actions done through a browser are tracked and analyzed by various parties.

Securing access to internet services and preventing tracking is an important aspect of using the

internet in a safe manner.

Cookie Tracking

Let’s assume you have browsed an e-commerce website, selected a product you wanted and placed

that in the shopping cart. But at the last second, you have decided to give it a second thought and

think a little longer if you really needed that. After a while, you start seeing ads of that same product

following you around the web. When clicking on the ads, you are immediately sent to the product

page of that store again. It’s not uncommon that the products you placed in the shopping cart are

still there, just waiting for you to decide to check them out. Have you ever wondered how they do

that? How do they show you the right ad on another web page? The answer to these questions is called

cookie tracking.

Cookies are small files a website can save on your computer in order to store and retrieve some kind of information that can be useful for your navigation. They have been in use for many years and are one of the oldest ways to store data on the client side. One good example of their use is unique shopping cart IDs. That way, if you ever come back to the same website in a few days, the store can remind you of the products you placed in your cart during your last visit and save you the time of finding them again.

That’s usually okay, since the website is offering you a useful feature and not sharing any data with

third parties. But what about the ads that are shown to you while you surf on other web pages?

That’s where the ad networks come in. Ad networks are companies that offer ads for e-commerce

sites like the one in our example on one side, and monetization for websites, on the other side.

Content creators like bloggers, for example, can make some space available for those ad networks on

their blog, in exchange for a commission related to the sales generated by that ad.

But how do they know what product to show you? They usually do that by also saving a cookie from the ad network at the moment you visit or search for a certain product on the e-commerce website. By doing that, the network is able to retrieve the information in that cookie wherever the network has ads, making the correlation with the products you were interested in. This is one of the most common ways to track someone over the Internet. The example we gave above makes use of an e-commerce site to make things more tangible, but social media platforms do the same with their “Like” or “Share” buttons and their social login.

One way you can get rid of that is by not allowing third party websites to store cookies on your

browser. This way, only the website you visit can store their cookies. But be aware that some

“legitimate” features may not work well if you do that, because many sites today rely on third party

services to work. So, you can look for a cookie manager at your browser’s add-on repository in order

to have a fine-grained control of which cookies are being stored on your machine.

Do Not Track (DNT)

Another common misconception is related to a certain browser configuration better known as DNT.

That’s an acronym for “Do Not Track” and it can be turned on in basically any current browser.

Similarly to the private mode, it’s not hard to find people that believe they will not be tracked if they

have this configuration on. Unfortunately, that’s not always true. Currently, DNT is just a way for

you to tell the websites you visit that you do not want them to track you. But, in fact, they are the

ones who will decide if they will respect your choice or not. In other words, DNT is a way to opt-out

from website tracking, but there is no guarantee on that choice.

Technically, this is done by simply sending an extra field in the header of the HTTP request (DNT: 1) when requesting data from a web server. If you want to know more about this topic, the website https://allaboutdnt.com is a good starting point.
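
To make the header more concrete, the sketch below prints the text of a simple HTTP/1.1 request that carries the DNT field. The host example.org and the function name build_request are placeholders chosen for this illustration, and nothing is sent over the network.

```sh
# build_request is a hypothetical helper: it only prints the request text.
build_request() {
    host=$1 path=$2
    printf 'GET %s HTTP/1.1\r\n' "$path"
    printf 'Host: %s\r\n' "$host"
    printf 'DNT: 1\r\n'          # the "Do Not Track" preference
    printf '\r\n'                # an empty line ends the header section
}

build_request example.org /
```

With a command line client such as curl, the same field can be added to a real request using its -H option, for example curl -H 'DNT: 1' followed by the URL.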

“Private” Windows

You might have noticed the quotes in the heading above. This is because those windows are not as

private as most people think. The names may vary but they can be called “private mode”, “incognito”

or “anonymous” tab, depending on which browser you are using.

In Firefox, you can easily use it by pressing the Ctrl + Shift + P keys. In Chrome, just press Ctrl + Shift + N. What it actually does is open a brand new session, which usually doesn’t share any configuration

or data from your standard profile. When you close the private window, the browser will

automatically delete all the data generated by that session, leaving no trace on the computer used.

This means that no personal data, like history, passwords or cookies are stored on that computer.

Because of this, many people misunderstand the concept and believe that they can browse anonymously on the Internet, which is not completely true. One thing that the private or incognito mode does is avoid

what we call cookie tracking. When you visit a website, it can store a small file on your computer

which may contain an ID that can be used to track you. Unless you configure your browser to not

accept third-party cookies, ad networks or other companies can store and retrieve that ID and

actually track your browsing across websites. But since the cookies stored on a private mode session

are deleted right after you close that session, that information is forever lost.

Besides that, websites and other peers on the Internet can still use plenty of other techniques in order

to track you. So, private mode brings you some level of anonymity but it’s completely private only

on the computer you are using. If you are accessing your email account or banking website from a

public computer, like in an airport or a hotel, you should definitely access those using your

browser’s private mode. In other situations, there can be benefits but you should know exactly what

risks you are avoiding and which ones remain unaffected. Whenever you use a publicly accessible

computer, be aware that other security threats such as malware or key loggers might exist. Be

careful whenever you enter personal information, including usernames and passwords, on such

computers or when you download or copy confidential data.

Choosing the Right Password

One of the most difficult situations any user faces is choosing a secure password for the services

they make use of. You have certainly heard before that you should not use common combinations

like qwerty, 123456 or 654321, nor easily guessable numbers like your (or a relative’s) birthday or zip

code. The reason is that those are all very obvious combinations and the first ones an attacker will try in order to gain access to your account.

There are known techniques for creating a safe password. One of the most famous is making up a

sentence which reminds you of that service and picking the first letters of each word. Let’s assume I

want to create a good password for Facebook, for example. In this case, I could come up with a

sentence like “I would be happy if I had a 1000 friends like Mike”. Pick the first letter of each word

and the final password would be IwbhiIha1000flM. This would result in a 15-character password

which is long enough to be hard to guess and easy to remember at the same time (as long as I can

remember the sentence and the “algorithm” for retrieving the password).
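
The “algorithm” described above can be sketched in a few lines of shell. The function name sentence_to_password is made up for this illustration, and the rule that numbers are kept whole is an assumption taken from the 1000 in the example sentence. Treat it as a demonstration of the idea only, since anything typed at the command line may end up in the shell history.

```sh
# sentence_to_password is a hypothetical helper name. It keeps numbers
# whole and takes the first character of every other word.
# The body runs in a subshell, so "set -f" and the variables stay local.
sentence_to_password() (
    set -f                      # no filename expansion while splitting words
    result=""
    for word in $1; do
        case $word in
            *[!0-9]*) result="$result$(printf '%s' "$word" | cut -c1)" ;;
            *)        result="$result$word" ;;
        esac
    done
    printf '%s\n' "$result"
)

sentence_to_password "I would be happy if I had a 1000 friends like Mike"
# prints: IwbhiIha1000flM
```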

Sentences are usually easier to remember than the passwords but even this method has its

limitations. We have to create passwords for so many services nowadays and as we use them with

different frequencies, it will eventually be very difficult to remember all the sentences at the time we

need them. So what can we do? You may answer that the wisest thing to do in this case is creating a couple of good passwords and reusing them on similar services, right?

Unfortunately, that’s also not a good idea. You have probably also heard that you should not reuse the same password among different services. The problem with doing so is that a specific service may leak your password (yes, it happens all the time) and any person who has access to it will try to use the same email and password combination on other popular services on the Internet in the hope that you have done exactly that: recycled passwords. And guess what? If they are right, you will end up having a problem not just on one service but on several of them. And believe me, we tend to think it’s not going to happen to us until it’s too late.

So, what can we do in order to protect ourselves? One of the most secure approaches available today

is using what is called a password manager. A password manager is a piece of software that will

essentially store all your passwords and usernames in an encrypted format which can be decrypted

by a master password. This way you only need to remember one good password since the manager

will keep all the others safe for you.

KeePass is one of the most famous and feature-rich open source password managers available. It will store your passwords in an encrypted file within your file system. The fact that it’s open source is an important point for this kind of software, since any developer can audit the code and know exactly how it works, which makes it much harder for the software to misuse your data unnoticed. This brings a level of transparency that’s impossible to reach with proprietary code. KeePass has ports for most operating systems, including Windows, Linux and macOS, as well as mobile ones like iOS and Android. It also includes a plugin system that is able to extend its functionality far beyond the defaults.

Bitwarden is another open source solution that has a similar approach but instead of storing your data in a file, it will make use of a cloud server. This way, it’s easier to keep all your devices synchronized and your passwords easily accessible through the web. Bitwarden is one of the few projects that make not only the clients, but also the cloud server available as open source software. This means you can host your own version of Bitwarden and make it available to anyone, like your family or your company employees. This will give you flexibility but also total control over how their passwords are stored and used.

One of the most important things to keep in mind when using a password manager is creating a random password for each different service, since you will not need to remember them anyway. It would be pointless to use a password manager to store recycled or easily guessable passwords. For that reason, most of them offer a random password generator you can use to create those for you.
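
To give an idea of what such a generator does, here is a minimal sketch that builds a random password from letters and digits using the kernel’s random source. The function name random_password and the default length of 20 are arbitrary choices for this example. It is not how KeePass or Bitwarden implement their generators, and the generator built into your password manager remains the better choice in practice.

```sh
# random_password is a hypothetical helper; the default length of 20 is
# an arbitrary choice for this example.
random_password() {
    length=${1:-20}
    # Read random bytes, keep only letters and digits, stop after $length
    # characters, then finish the line.
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$length"
    echo
}

random_password 24
```

Each call prints a different string, so the result has to be stored somewhere safe right away, which is exactly what a password manager does.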

Encryption

Whenever data is transferred or stored, precautions need to be taken to ensure that third parties may

not access the data. Data transferred over the internet passes through a series of routers and networks

where third parties might be able to access the network traffic. Likewise, data stored on physical

media might be read by anyone who comes into possession of that media. To avoid this kind of

access, confidential information should be encrypted before it leaves a computing device.

TLS

Transport Layer Security (TLS) is a protocol to offer security over network connections by making use of cryptography. TLS is the successor of the Secure Sockets Layer (SSL), which has been deprecated because of serious flaws. TLS has also evolved a couple of times in order to adapt itself and become more secure; its current version is 1.3. It can provide both privacy and authenticity by making use of what is called symmetric and public-key cryptography. By saying that, we mean that once it is in use, you can be sure that nobody will be able to eavesdrop on or alter your communication with that server during that session.

The most important lesson here is recognizing whether the connection to a website is protected. You should look for the “lock” symbol in the browser’s address bar. If you wish, you can click on it to inspect the certificate that plays an important role in the HTTPS protocol.

TLS is what is used in the HTTPS protocol (HTTP over TLS) in order to make it possible to send sensitive data (like your credit card number) through the web. Explaining how TLS works goes way beyond the purpose of this article, but you can find more information on Wikipedia and at the Mozilla wiki.

File and E-mail Encryption With GnuPG

There are plenty of tools for securing emails but one of the most important of them is certainly

GnuPG. GnuPG stands for GNU Privacy Guard and it is an open source implementation of OpenPGP

which is an international standard codified within RFC 4880.

GnuPG can be used to sign, encrypt, and decrypt texts, e-mails, files, directories, and even whole

disk partitions. It works with public-key cryptography and is widely available. In a nutshell, GnuPG creates a pair of files which contain your public and private keys. As the name implies, the public key can be available to anyone and the private key needs to be kept secret. People will use your public key to encrypt data which only your private key will be able to decrypt.

You can also use your private key to sign any file or e-mail, and the signature can be validated against the corresponding public key. This digital signature works analogously to a real-world signature. As long as you are the only one who possesses your private key, the receiver can be sure that it was you who authored it. By making use of cryptographic hash functions, GnuPG will also guarantee that no changes have been made after the signature, because any changes to the content would invalidate the signature.
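
The role of the hash can be illustrated without GnuPG itself. The sketch below uses sha256sum (part of GNU coreutils, assumed to be installed) on a temporary file: the digest changes as soon as the content changes, which is why a signature computed over the original digest no longer validates. It only demonstrates the hashing step, not signing, and the helper name file_digest is made up.

```sh
# file_digest is a hypothetical helper: print only the SHA-256 digest
# of the file given as the first argument.
file_digest() {
    sha256sum "$1" | cut -d ' ' -f 1
}

message=$(mktemp)
printf 'Pay 10 euros to Carol\n' > "$message"
before=$(file_digest "$message")

printf 'Pay 90 euros to Carol\n' > "$message"   # tamper with the content
after=$(file_digest "$message")

if [ "$before" = "$after" ]; then
    echo "content unchanged"
else
    echo "content was modified"
fi
rm -f "$message"
```

Changing a single character is enough to produce a completely different digest.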

GnuPG is a very powerful tool and, to a certain extent, also a complex one. You can find more information on its website and on the Arch Linux wiki (the Arch Linux wiki is a very good source of information, even if you don’t use Arch Linux).

Disk Encryption

A good way to secure your data is to encrypt your whole disk or partition. There are many open source programs you can use to achieve such a purpose. How they work and what level of encryption they offer also vary significantly. There are basically two methods available: stacked and block device encryption.

Stacked filesystem solutions are implemented on top of an existing filesystem. When using this method,

the files and directories are encrypted before being stored on the filesystem and decrypted after

reading them. This means the files are stored on the host filesystem in an encrypted form (meaning

that their contents, and usually also their file/folder names, are replaced by random-looking data),

but other than that, they still exist in that filesystem as they would without encryption, as normal

files, symlinks, hardlinks, etc.

On the other hand, block device encryption happens below the filesystem layer, making sure

everything that is written to a block device is encrypted. If you look at the block device while it’s offline, it

will look like a large section of random data and you won’t even be able to tell what type of

filesystem is there without decrypting it first. This means you can’t tell what is a file or directory;

how big they are and what kind of data it is, because metadata, directory structure and permissions

are also encrypted.

Both methods have their own pros and cons. Among all the options available, you should take a look

at dm-crypt, which is the de facto standard for block encryption on Linux systems, since it’s native in the kernel. It can be used with the LUKS (Linux Unified Key Setup) extension, which is a specification

that implements a platform-independent standard for use with various tools.

If you want to try a stackable method, you should take a look at EncFS, which is probably the easiest

way to secure data on Linux because it does not require root privileges to implement and it can work

on an existing filesystem without modifications.

Finally, if you need to access data on various platforms, check out VeraCrypt. It is the successor of TrueCrypt and allows the creation of encrypted media and files, which can be used on Linux as well

as on macOS and Windows.

Use your web browser to navigate to https://haveibeenpwned.com/. Find out the purpose of the

website and check if your email address was included in some data leaks.

The website maintains a database of login information whose passwords were affected by a

password leak. It allows searching for an email address and shows if that email address was

included in a public database of stolen credentials. Chances are that your email address is affected by one leak or another, too. If that is the case, make sure you have updated your

passwords recently. If you don’t already use a password manager, take a look at the ones

recommended in this lesson.

Command Line Basics

The shell is a program that enables text-based communication between the operating system and the user. It is usually a text mode program that reads the user’s input and interprets it as commands to the system.

There are several different shells on Linux; these are just a few:

• Bourne-again shell (Bash)

• C shell (csh or tcsh, the enhanced csh)

• Korn shell (ksh)

• Z shell (zsh)

On Linux the most common one is the Bash shell.

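Most commands follow the same basic structure, as in the following example:

$ ls -l /home

Command

The program that the user runs, ls in the above example.

Option(s)/Parameter(s)

A “switch” that modifies the behavior of the command, such as -l in the above example.

Argument(s)

Additional data that is required by the program, like a filename or path, such as /home in the above example.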

The shell supports two types of commands:

Internal

These commands are part of the shell itself and are not separate programs. There are around 30

such commands. Their main purpose is executing tasks inside the shell (e.g. cd, set, export).

command    compgen    complete   compopt    continue   declare
dirs       disown     echo       enable     eval       exec
exit       export     false      getopts    hash       help
history    jobs       kill       let        local      logout
mapfile    popd       printf     pushd      pwd        read
readarray  readonly   return     set        shift      shopt

External

These commands reside in individual files. These files are usually binary programs or scripts.

When a command which is not a shell builtin is run, the shell uses the PATH variable to search for an executable file with the same name as the command. In addition to programs which are installed with the distribution’s package manager, users can create their own external commands as well.

The command type shows what type a specific command is:

$ type echo

echo is a shell builtin

$ type man

man is /usr/bin/man
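
The same distinction can be checked from within a script. The sketch below uses command -v, a POSIX relative of type, which prints a path for external commands and just the name for things the shell handles itself. The function name classify_command is made up for this example, and the path reported for ls depends on your system.

```sh
# classify_command is a hypothetical helper name. "command -v" prints a
# path for external commands; anything else (builtin, keyword, function
# or alias) is handled by the shell itself.
classify_command() {
    location=$(command -v "$1") || { echo "$1: not found"; return 1; }
    case $location in
        /*) echo "$1: external ($location)" ;;
        *)  echo "$1: internal to the shell" ;;
    esac
}

classify_command cd
classify_command ls
```

External commands are the ones found through the directories listed in the PATH variable, which is why their result starts with a slash.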

Quoting

As a Linux user, you will have to create or manipulate files or variables in various ways. This is easy

when working with short filenames and single values, but it becomes more complicated when, for

example, spaces, special characters and variables are involved. Shells provide a feature called quoting

which encapsulates such data using various kinds of quotes (" ", ' '). In Bash, there are three types of quoting:

• Double quotes

• Single quotes

• Escape characters

For example, the following commands do not act in the same way due to quoting:

$ TWOWORDS="two words"

$ touch $TWOWORDS

$ ls -l

-rw-r--r-- 1 carol carol 0 Mar 10 14:56 two

-rw-r--r-- 1 carol carol 0 Mar 10 14:56 words

$ touch "$TWOWORDS"

$ ls -l

-rw-r--r-- 1 carol carol 0 Mar 10 14:56 two

-rw-r--r-- 1 carol carol 0 Mar 10 14:58 'two words'

-rw-r--r-- 1 carol carol 0 Mar 10 14:56 words

$ touch '$TWOWORDS'

$ ls -l

-rw-r--r-- 1 carol carol 0 Mar 10 15:00 '$TWOWORDS'

-rw-r--r-- 1 carol carol 0 Mar 10 14:56 two

-rw-r--r-- 1 carol carol 0 Mar 10 14:58 'two words'

-rw-r--r-- 1 carol carol 0 Mar 10 14:56 words

Double Quotes

Double quotes tell the shell to take the text in between the quote marks ("...") as regular characters.

All special characters lose their meaning, except the $ (dollar sign), \ (backslash) and ` (backquote).

This means that variables, command substitution and arithmetic expansion can still be used.

For example, the substitution of the $USER variable is not affected by the double quotes:

$ echo I am $USER

I am tom

$ echo "I am $USER"

I am tom

A space character, on the other hand, loses its meaning as an argument separator:

$ touch new file

$ ls -l

-rw-rw-r-- 1 tom students 0 Oct 8 15:18 file

-rw-rw-r-- 1 tom students 0 Oct 8 15:18 new

$ touch "new file"

$ ls -l

-rw-rw-r-- 1 tom students 0 Oct 8 15:19 new file

As you can see, in the first example the touch command creates two individual files because it interprets the two strings as individual arguments. In the second example, the command interprets both strings as one argument and therefore creates only one file. It is, however, best practice to avoid the space character in filenames. Instead, an underscore (_) or a dot (.) could be used.

Single Quotes

Single quotes don’t have the exceptions of the double quotes. They revoke any special meaning from

each character. Let’s take one of the first examples from above:

$ echo I am $USER

I am tom

When applying the single quotes you see a different result:

$ echo 'I am $USER'

I am $USER

The command now displays the exact string without substituting the variable.

Escape Characters

We can use escape characters to remove special meanings of characters from Bash. Going back to the

$USER environment variable:

$ echo $USER

carol

We see that by default, the contents of the variable are displayed in the terminal. However, if we

were to precede the dollar sign with a backslash character (\) then the special meaning of the dollar

sign will be negated. This in turn will not let Bash expand the variable’s value to the username of the

person running the command, but will instead interpret the variable name literally:

$ echo \$USER

$USER

If you recall, we can get similar results to this using the single quote, which prints the literal

contents of whatever is between the single quotes. However the escape character works differently

by instructing Bash to ignore whatever special meaning the character it precedes may possess.
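The three quoting styles can be compared side by side. This is a small sketch, not taken from the original material; the variable name and its value are hypothetical and serve only as a demonstration.

```shell
name=carol   # hypothetical value used only for this demonstration

unquoted=$(echo Hello   $name)      # word splitting collapses the extra spaces
double=$(echo "Hello   $name")      # spaces kept, variable expanded
single=$(echo 'Hello   $name')      # spaces kept, nothing expanded
escaped=$(echo "Hello   \$name")    # the backslash protects only the dollar sign

printf '%s\n' "$unquoted" "$double" "$single" "$escaped"
```

The last two lines of output are the same, which shows that a backslash inside double quotes gives the same result as single quotes for that one character.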

In most Linux shells, there are two types of variables:

Local variables

These variables are available to the current shell process only.

Environment variables

These variables are available both in a specific shell session and in subprocesses spawned from that shell session. They can be used to pass configuration data to the commands that are run. Because these programs can access these variables, they are called environment variables.

Working with Local Variables

You can set up a local variable by using the = (equal) operator. A simple assignment will create a

local variable:

$ greeting=hello

NOTE -> Don’t put any space before or after the = operator.

To remove a variable, use the unset command.
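A minimal sketch of removing a variable, reusing the greeting variable from above. The `${greeting:-<empty>}` form is standard shell syntax that prints a placeholder when the variable is unset or empty; it is used here only to make the difference visible.

```shell
greeting=hello
echo "before: ${greeting:-<empty>}"

unset greeting               # the name is given without a leading $
echo "after:  ${greeting:-<empty>}"
```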

Working with Global Variables

To make a variable available to subprocesses, turn it from a local into an environment variable. This

is done by the command export. When it is invoked with the variable name, this variable is added to

the shell’s environment:

$ greeting=hello

$ export greeting

An easier way to create the environment variable is to combine both of the above methods, by

assigning the variable value in the argument part of the command.

$ export greeting=hey
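The effect of export can be observed by starting a child shell before and after exporting. The sketch below assumes that sh is available as a child shell; the placeholder text "not set" is an illustrative choice.

```shell
unset greeting                                   # start from a clean state
greeting=hello                                   # local variable: current shell only
before=$(sh -c 'echo "${greeting:-not set}"')    # the child process cannot see it

export greeting                                  # now part of the environment
after=$(sh -c 'echo "${greeting:-not set}"')     # the child process inherits it

echo "before export: $before"
echo "after export:  $after"
```

The single quotes around the child command matter: they stop the parent shell from expanding the variable, so the value really comes from the child.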

The PATH Variable

The PATH variable is one of the most important environment variables in a Linux system. It stores a

list of directories, separated by a colon, that contain executable programs eligible as commands from

the Linux shell.

$ echo $PATH

/home/user/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games

To append a new directory to the variable, you will need to use the colon sign (:).

$ PATH=$PATH:new_directory

Here is an example:

$ echo $PATH

/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

$ PATH=$PATH:/home/user/bin

$ echo $PATH

/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/user/bin

As you see, $PATH is used in the new value assigned to PATH. This variable is resolved during the

command execution and makes sure that the original content of the variable is preserved. Of course,

you can use other variables in the assignment as well:

$ mybin=/opt/bin

$ PATH=$PATH:$mybin

$ echo $PATH

/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/user/bin:/opt/bin
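Appending a directory is what makes a user's own external commands available by name. The following sketch assumes a temporary directory standing in for something like /home/user/bin; the script name hello_tool is hypothetical.

```shell
tooldir=$(mktemp -d)                       # temporary directory standing in for ~/bin
printf '#!/bin/sh\necho "hello from my tool"\n' > "$tooldir/hello_tool"
chmod +x "$tooldir/hello_tool"

# Before the directory is on PATH, the shell cannot find the command
command -v hello_tool || echo "hello_tool: not found yet"

PATH=$PATH:$tooldir                        # append, preserving the existing entries
found=$(command -v hello_tool)             # now resolves to the script's full path
result=$(hello_tool)
echo "$found"
echo "$result"

rm -r "$tooldir"                           # clean up the temporary directory
```

The change to PATH lasts only for the current shell session unless it is added to a startup file.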

The PATH variable needs to be handled with caution, as it is crucial for working on the command line.

Let’s consider the following PATH variable:

$ echo $PATH

/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

To find out how the shell invokes a specific command, use the which command, which takes the command’s name as its argument. We can, for example, try to find out where nano is stored:

$ which nano

/usr/bin/nano

Command substitution, written $(command), captures the output of a command so that it can be stored in a variable:

$ today1=$(date)

$ echo $today1

Thu 31 Jan 10:07:35 EST 2019

Getting Help on the Command Line

Built-in Help

When started with the --help parameter, most commands display some brief instructions about

their usage. Although not all commands provide this switch, it is still a good first try to learn more

about the parameters of a command. Be aware that the instructions from --help often are rather

brief compared to the other sources of documentation which we will discuss in the rest of this

lesson.
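As a quick sketch, the first few lines of a command's built-in help can be captured like this. mkdir is used only as an example, and the amount and wording of the text differ between implementations.

```shell
# 2>&1 also captures help text that some implementations print on standard error.
usage=$(mkdir --help 2>&1 | head -n 5)
echo "$usage"
```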

Man Pages

Most commands provide a manual page or “man” page. This documentation is usually installed with

the software and can be accessed with the man command. The command whose man page should be

displayed is added to man as an argument:

$ man mkdir

Info Pages

Another source of help while working with the Linux system is the info pages. They are usually more detailed than the man pages and are formatted in hypertext, similar to web pages on the Internet.

$ info mkdir

Locating files

The locate command

A Linux system is built from numerous directories and files. Linux has many tools to locate a

particular file within a system. The quickest one is the command locate.

locate searches within a database and then outputs every name that matches the given string. Because the database is only refreshed periodically, a newly created file has no record in it until the database is updated:

$ sudo updatedb

The find command

find is another very popular tool that is used to search for files. It takes a different approach from the locate command: find searches a directory tree recursively, including its subdirectories. It performs this search at each invocation and does not maintain a database like locate. Similar to locate, find also supports wildcards and regular expressions.
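A minimal sketch of a recursive search with a wildcard, run against a scratch directory so that the result is predictable. The file and directory names are hypothetical.

```shell
workdir=$(mktemp -d)                       # scratch tree so the search is predictable
mkdir -p "$workdir/docs/old"
touch "$workdir/notes.txt" "$workdir/docs/report.txt" "$workdir/docs/old/draft.md"

# Quote the pattern so that find, not the shell, expands the wildcard
txt_files=$(find "$workdir" -type f -name '*.txt' | sort)
echo "$txt_files"

rm -r "$workdir"
```

Both .txt files are reported even though they sit at different depths, while draft.md is skipped because it does not match the pattern.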

File and Directory Names

Filenames can contain a suffix which comes after the period (.). Unlike Windows, this suffix has no special meaning in Linux; it is there for human understanding. For example, .txt indicates to us that a file is a plaintext file, although it could technically contain any kind of data.

2.3 Lesson 2

One last thing to note: we can specify the home directories of other users by specifying the username

after the tilde.

root@Quantiphi-2550:~# cd ~vicky

root@Quantiphi-2550:/home/vicky# ll

With the recursive option, we get a far longer list of files:

$ ls -R ~

The find program is usually used to search for files and directories, but without any options, it will

show you a listing of all the files, directories, and sub-directories of your current directory.

Since touch creates empty files, you should get no output. You can use echo with > to create simple

text files. Try it:

$ echo hello > question15

$ cat question15

hello

Be careful when using >! If the named file already exists, it will be overwritten!

Use the -i option to make mv prompt you if you are about to overwrite an existing file.

The rm command can delete files and, with the -r option, directories, while the rmdir command can only delete directories. By default, rmdir can only delete empty directories, so a regular file must be deleted with rm.
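The difference can be tried safely in a scratch directory. This is a sketch with hypothetical file names.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/project"
touch "$workdir/project/file.txt"

# rmdir refuses because the directory still contains a file
rmdir "$workdir/project" 2>/dev/null || echo "rmdir failed: directory not empty"

rm "$workdir/project/file.txt"     # remove the regular file first
rmdir "$workdir/project"           # the now-empty directory can be removed
[ -d "$workdir/project" ] || echo "project directory removed"

rmdir "$workdir"
```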

Globbing

*

Matches any number of any character, including no characters

?

Matches any one character

[]

Matches a class of characters

The [] brackets are used to match ranges or classes of characters. They work like they do in POSIX regular expressions, except that globs use ! to negate a class where regular expressions use ^. Bash accepts ^ as well, which is why it appears in the examples below.

The ? expands to any single character. Try the following commands to see for yourself:

$ ls

question1 question14 question2012 star10 star2002

question13 question15 question23 star1100 star2013

$ ls question?

question1

$ ls question1?

question13 question14 question15

$ ls question?3

question13 question23

$ ls question13?

ls: cannot access question13?: No such file or directory

Ranges within [] brackets are expressed using a -:

$ ls

file1 file2 file3 file4 file5 file6 file7 filea fileb filec

$ ls file[1-2]

file1 file2

$ ls file[1-3]

file1 file2 file3

$ ls file[1-25-7]

file1 file2 file5 file6 file7

$ ls file[1-35-6a-c]

file1 file2 file3 file5 file6 filea fileb filec

You can also use the ^ character as the first character to match everything except certain characters.

$ ls file[^a]

file1 file2 file3 file4 file5 file6 file7 fileb filec

$ ls file[[:digit:]a]

file1 file2 file3 file4 file5 file6 file7 filea

$ ls file[[:digit:]]a

file1a
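The patterns above can be reproduced in a scratch directory. In this sketch the file names are hypothetical, and echo is used instead of ls so that each expansion is captured on a single line.

```shell
startdir=$PWD                # remember where we started
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch file1 file2 file3 filea fileb file10

star=$(echo file*)          # any number of characters, including none
one=$(echo file?)           # exactly one character, so file10 is excluded
range=$(echo file[1-2])     # one character from the range 1 to 2
negated=$(echo file[!a])    # one character that is not "a"

printf '%s\n' "$star" "$one" "$range" "$negated"

cd "$startdir" && rm -r "$workdir"
```

The negated pattern uses !, the portable form; in Bash, file[^a] gives the same result.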

Archiving Files on the Command Line

Compression reduces the amount of space a specific set of data consumes. It is commonly used to reduce the space needed to store a file and to reduce the amount of data sent over a network connection.

Compression works by replacing repetitive patterns in data. Suppose you have a novel. Some words are extremely common but have multiple characters, such as the word “the”. Replacing every occurrence of such a word with a single, otherwise unused character would make the text noticeably smaller.

Compression comes in two varieties, lossless and lossy. Data compressed with a lossless algorithm can be decompressed back into its original form. Data compressed with a lossy algorithm cannot be restored exactly to its original form. Lossy algorithms are often used for images, video, and audio where the quality loss is imperceptible to humans, irrelevant to the context, or worth the saved space or network throughput.
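The lossless property can be verified with a round trip. The sketch below assumes gzip, one of the lossless tools covered later in this section, and uses hypothetical file names.

```shell
workdir=$(mktemp -d)
printf 'the cat and the dog and the bird\n' > "$workdir/original.txt"

gzip -c "$workdir/original.txt" > "$workdir/packed.gz"      # compress, keep the original
gunzip -c "$workdir/packed.gz" > "$workdir/restored.txt"    # decompress to a new file

# cmp -s is silent and succeeds only if both files are byte-for-byte identical
if cmp -s "$workdir/original.txt" "$workdir/restored.txt"; then
    verdict="identical"
else
    verdict="different"
fi
echo "restored file is $verdict to the original"
rm -r "$workdir"
```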

Archiving and compression are commonly used together. Some archiving tools even compress their

contents by default. Others can optionally compress their contents. A few archive tools must be used

in conjunction with stand-alone compression tools if you wish to compress the contents.

The most common tool for archiving files on Linux systems is tar. Most Linux distributions ship

with the GNU version of tar, so it is the one that will be covered in this lesson. tar on its own only

manages the archiving of files but does not compress them.

There are lots of compression tools available on Linux. Some common lossless ones are bzip2, gzip, and xz. You will find all three on most systems, although you may encounter an old or very minimal system where xz or bzip2 is not installed. If you become a regular Linux user, you will likely encounter files compressed with all three of these. They use different algorithms, so a file compressed with one tool can’t be decompressed by another. Compression tools involve a trade-off: if you want a high compression ratio, it will take longer to compress and decompress the file, because higher compression requires more work to find more complex patterns. All of these tools compress data but cannot create archives containing multiple files.

Stand-alone compression tools aren’t typically available on Windows systems. Windows archiving

and compression tools are usually bundled together. Keep this in mind if you have Linux and

Windows systems that need to share files.

Linux systems also have tools for handling the .zip files commonly used on Windows systems. They are called zip and unzip. These tools are not installed by default on all systems, so if you need to use them you may have to install them. Fortunately, they are typically found in distributions' package repositories.

Compression Tools

How much disk space is saved by compressing files depends on a few factors: the nature of the data you are compressing, the algorithm used to compress the data, and the compression level.

Not all algorithms support different compression levels, but some compression tools do. A higher compression level usually requires more memory and CPU cycles, but results in a smaller compressed file. The opposite is true of a lower level.
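The effect of the compression level can be observed by compressing the same file twice. This sketch assumes gzip, whose -1 and -9 options select the fastest and the strongest level; the sample data is generated only for illustration, and the actual sizes depend on the input.

```shell
workdir=$(mktemp -d)

# Build a repetitive sample file; repetitive data compresses well
i=0
while [ "$i" -lt 5000 ]; do
    echo "the quick brown fox jumps over the lazy dog $i"
    i=$((i + 1))
done > "$workdir/sample.txt"

gzip -1 -c "$workdir/sample.txt" > "$workdir/fast.gz"   # fastest, least compression
gzip -9 -c "$workdir/sample.txt" > "$workdir/best.gz"   # slowest, strongest compression

orig_size=$(wc -c < "$workdir/sample.txt")
fast_size=$(wc -c < "$workdir/fast.gz")
best_size=$(wc -c < "$workdir/best.gz")
echo "original: $orig_size bytes, gzip -1: $fast_size bytes, gzip -9: $best_size bytes"
rm -r "$workdir"
```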

It is not necessary to decompress a file every time you use it. Compression tools typically come with special versions of common tools used to read text files. For example, gzip has a version of cat, grep, diff, less, more, and a few others. For gzip, the tools are prefixed with a z, while the prefix bz exists for bzip2 and the prefix xz exists for xz. Below is an example of using zcat to display a file compressed with gzip:

$ cp /etc/hosts ./

$ gzip hosts

$ zcat hosts.gz

127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts

::1 localhost ip6-localhost ip6-loopback

ff02::1 ip6-allnodes

ff02::2 ip6-allrouters

Archiving Tools

The tar program is probably the most widely used archiving tool on Linux systems. In case you are wondering why it is named how it is, it is an abbreviation for “tape archive”. Files created with tar are often called tarballs. It is very common for applications distributed as source code to be in tarballs.

The tar program can also manage compression and decompression of archives on the fly. tar does

so by calling one of the compression tools discussed earlier in this section. It is as simple as adding

the option appropriate to the compression algorithm. The most commonly used ones are j, J, and z

for bzip2, xz, and gzip, respectively. Below are examples using the aforementioned algorithms:

$ cd ~/linux_essentials-3.1/compression

$ ls

bigfile bigfile3 bigfile-gz1.gz bigfile-xz1.xz hosts.gz

bigfile2 bigfile4 bigfile-gz9.gz bigfile-xz9.xz

$ tar -czf gzip.tar.gz bigfile bigfile2 bigfile3

$ tar -cjf bzip2.tar.bz2 bigfile bigfile2 bigfile3

$ tar -cJf xz.tar.xz bigfile bigfile2 bigfile3

$ ls -l | grep tar

-rw-r--r-- 1 emma emma 450202 Jun 27 05:56 bzip2.tar.bz2

-rw-r--r-- 1 emma emma 548656 Jun 27 05:55 gzip.tar.gz

-rw-r--r-- 1 emma emma 147068 Jun 27 05:56 xz.tar.xz
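Once an archive exists, it can be inspected and unpacked with the same tool. The following sketch assumes GNU tar or a compatible implementation; the t (list), x (extract) and -C (target directory) options are not shown in the examples above, and the file names are hypothetical.

```shell
startdir=$PWD                          # remember where we started
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir project
echo "alpha" > project/a.txt
echo "beta"  > project/b.txt

tar -czf project.tar.gz project        # c = create, z = gzip, f = archive file name
listing=$(tar -tzf project.tar.gz)     # t = list contents without extracting
echo "$listing"

mkdir restore
tar -xzf project.tar.gz -C restore     # x = extract, -C = target directory
restored=$(cat restore/project/a.txt)
echo "$restored"

cd "$startdir" && rm -r "$workdir"
```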

Compression takes some input data and uses an algorithm to transform its bits so that the same content occupies less space.

This is useful if you want to keep more data in less space (space is always a limited resource), or if you want faster file transfers across networks.

Popular compression utilities on Linux distributions are:

gzip (frequently used);

bzip2 (less frequently used, yet typically produces a smaller output file than gzip);

xz (typically the most space-efficient of these tools);

zip (often used to decompress data that was compressed with zip on other systems, such as Windows).

Note that, in general, the more efficient a compression method is, the more time it takes.

Archiving, on the other hand, can be thought of as putting several files into one box. If you have 5 files of 10 KB each, archiving them gives you roughly 5 x 10 = 50 KB, plus a small amount of archive metadata.

Note that on Linux the tar program can do both steps when it is given a compression option such as z, j, or J:

it archives the input (first step);

and then it compresses that archive.

The zip and unzip programs can be used to work with ZIP files on Linux systems.

zip -r zipfile.zip dir

unzip zipfile.zip

Searching and Extracting Data from Files

I/O Redirection

I/O redirection enables the user to redirect information from or to a command by using a text file. As

described earlier, the standard input, output and error output can be redirected, and the information

can be taken from text files.

Redirecting Standard Output

To redirect standard output to a file, instead of the screen, we need to use the > operator followed by

the name of the file. If the file doesn’t exist, a new one will be created, otherwise, the information

will overwrite the existing file.

Redirecting Standard Error

In order to redirect just the error messages, a user will need to employ the 2> operator followed by

the name of the file in which the errors will be written. If the file doesn’t exist, a new one will be

created, otherwise the file will be overwritten.
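Both operators can be combined on one command line to separate regular output from error messages. This is a minimal sketch; the file names out.txt and err.txt and the deliberately missing path are hypothetical.

```shell
workdir=$(mktemp -d)

# One valid path and one missing path: ls writes the listing to standard
# output and the complaint about the missing path to standard error.
ls -d "$workdir" "$workdir/missing" > "$workdir/out.txt" 2> "$workdir/err.txt"

out=$(cat "$workdir/out.txt")
err=$(cat "$workdir/err.txt")
echo "stdout captured: $out"
echo "stderr captured: $err"
rm -r "$workdir"
```

Nothing appears on the screen while ls runs, because each stream has been sent to its own file.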

Redirecting Standard Input

This type of redirection is used to input data to a command, from a specified file instead of a

keyboard. In this case the < operator is used as shown in the example:

$ cat < text

Hello!

Hello to you too!

References:

https://linkedin.github.io/school-of-sre/level101/linux_basics/command_line_basics/
