cybersec-k

View on GitHub

Regular Expressions

Learn and practise using regular expressions

[TASK 2] Charsets

Match all of the following characters: c, o, g [cog]

An characters from the character set enclosed with [ ].

Match all of the following words: cat, fat, hat [cfh]at

An character from the character set and then the word should end with at.

Match all of the following words: Cat, cat, Hat, hat [CcHh]at

It took me a while to understand the format accepted for this. It is the same as above.

Match all of the following filenames: File1, File2, file3, file4, file5, File7, file9 [Ff]ile[1-9]

Start with letter f or F, then it should have ile and ends with a digit between 1 and 9.

Match all of the filenames of question 4, except “File7” (use the hat symbol) [Ff]ile[^7]

Same as above but not digit 7, so we use ^.

[TASK 3] Wildcards and optional characters

Match all of the following words: Cat, fat, hat, rat. .at

This can solved using [Cfhr]at. But the solution requires three letter word and we learned something new. Why not use them!

Match all of the following words: Cat, cats [Cc]ats?

So a lower or upper case C, then at and s with an ? which is it’s okay if s is not there.

Match the following domain name: cat.xyz cat\.xyz

Nothing special. Just a tricky one. Author wants us to try ** i.e., escape character.

Match all of the following domain names: cat.xyz, cats.xyz, hats.xyz [ch]ats?\.xyz

It a combination of last two, using an escape character with a ? for s saying its okay if its there or not.

Match every 4-letter string that doesn’t end in any letter from n to z ..[n-z]

So any 4 letter word, two dots and anything between n and z.

Match bat, bats, hat, hats, but not rat or rats (use the hat symbol) [^r]ats?

Author explicitly said to use ^. So this regex will capture all the words at in between except when there is r is in the beginning.

[TASK 4] Metacharacters and repetitions

Match the following word: catssss cats{4}

So ends with 4-s, use curly braces and problem is solved.

Match all of the following words (use the * sign): Cat, cats, catsss [Cc]ats*

Match all of the following sentences (use the + sign): regex go br, regex go brrrrrr regex go br+

Match all of the following filenames: ab0001, bb0000, abc1000, cba0110, c0000 (don’t use a metacharacter) [abc]{1,3}[01]{4}

So 1 to 3 alphabet from abc. Then 0s and 1s, 4 times.

Match all of the following filenames: File01, File2, file12, File20, File99 [Ff]ile\d{1,2}

Lower or upper case F with ile and then a digit (occurence can be 1 or 2).

Match all of the following folder names: kali tools, kali   tools kali\s+tools

Spaces in between, can be 1 or more. Two words can be together therefore +.

Match all of the following filenames: notes~, stuff@, gtfob#, lmaoo! \w{5}\W

There are five letter in every word with a special character at the end.

Match the string in quotes (use the * sign and the \s, \S metacharacters): “2f0h@f0j0%!    a)K!F49h!FFOK” \S*\s*\S*

Anything but no spaces, then any number of spaces, and again anything except spaces.

Match every 9-character string (with letters, numbers, and symbols) that doesn’t end in a “!” sign \S{8}[^!]

Nine character string, so for first 8 characters anything i.e., ! can be used but not ends with ! so the ^ symbol. Trick part here was using 8 instead of 9.

Match all of these filenames (use the + symbol): .bash_rc, .unnecessarily_long_filename, and note1 \.?\w+

File name can start with . or not i.e., file can be hidden or not. And file name can not have special characters.

[TASK 5] Starts with/ ends with, groups, and either/ or

Match every string that starts with “Password:” followed by any 10 characters excluding “0” Password:[^0]{10}

Match “username: “ in the beginning of a line (note the space!) ^username:\s

NOTE the space

Match every line that doesn’t start with a digit (use a metacharacter) ^\D

Match this string at the end of a line: EOF$ EOF\$$

Only trick here was to use escape character i.e. backward slash.

Match all of the following sentences: I use nano, I use vim I use (nano|vim)

Match all lines that start with $, followed by any single digit, followed by $, followed by one or more non-whitespace characters \$\d\$\S+

Match every possible IPv4 IP address (use metacharacters and groups) (\d{1,3}\.){3}\d{1,3}

So 1-3 digit and . needs to appear three times. i.e., XXX.X.XX. or any other form. Then a set of digit which can be 1-3.

Match all of these emails while also adding the username and the domain name (not the TLD) in separate groups (use \w): hello@tryhackme.com, username@domain.com, dummy_email@xyz.com (\w+)@(\w+)\.com

Using this can give you the email_id and the domain in two different variable, which can be useful in many scenarios.