Sri's World | Square Ctf 2018

I love the Square CTFs and the way Square approaches security. This was one of the highlight CTFs of last year and kept me excited for the whole week.

1. Misc – MATH category: Captcha

Question

Concept

Solve math, Identify Characters, Work with Fonts Style!

Given

A web page that shows a mathematical expression; solve it and submit the answer to get the flag (sounds easy!)

Thoughts –> Think, Think, Think

When the expression or its characters are copied, they appear as garbage, and the page source shows a different mapping –> This is because of the custom font style used, which is base64 encoded within the page
Old school: I tried solving the challenge by typing the answer manually as fast as I could, but the page was clever enough to switch to a different mapping and captcha within a few seconds. The page is linked with a token that keeps track of the current captcha… (it probably has to be solved in under 4 seconds, else everything resets!)
Obtaining the captcha web page programmatically –> make sure you set a proper HTTP user agent, else the page doesn't respond
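
To illustrate the user agent point, here is a minimal sketch using only the standard library. The URL and the user agent string are placeholders I made up for the example, not the values from the challenge, and `build_request` / `fetch_page` are hypothetical helper names.

```python
import urllib.request

# Hypothetical values for illustration only; the real challenge URL and a
# browser-like User-Agent string would go here.
CAPTCHA_URL = "https://example.com/captcha"
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/63.0"


def build_request(url, user_agent=USER_AGENT):
    """Return a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_page(url):
    """Download the page body as text (this performs a real network call)."""
    with urllib.request.urlopen(build_request(url), timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


# Building the request does not touch the network, so it is safe to inspect.
req = build_request(CAPTCHA_URL)
print(req.get_header("User-agent"))
```

Without the explicit header, `urllib` announces itself with its default Python user agent, which is presumably what a page like this refuses to answer.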

Tried studying font styles and cmap tables

**- IDEA1:** A proper mapping from the font glyphs to the visible characters would help to reconstruct the expression (fontTools in Python was helpful, but my study of the different fonts was not exhaustive; it involves how the glyphs are drawn and the different tables that define a font)

**- IDEA2:** Use character recognition (OCR) to read the expression, then construct it and solve it

**- IDEA3:** Try extracting the base64 encoded font into a ttf file and processing it on the current operating system to recognize the characters; the reverse step, rebuilding the expression from the web page, becomes tough again
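
As a rough sketch of the first half of IDEA3, the embedded font can be pulled out of the page with a regular expression and decoded. I am assuming the font sits in a CSS `url(data:...;base64,...)` value; the real page may be laid out differently, and the sample page and `extract_font_bytes` name below are made up for the example.

```python
import base64
import re

# Assumed shape of the embedded font; the real page may use a different
# MIME type or quoting, so the pattern is kept deliberately loose.
FONT_DATA_URI = re.compile(r"url\(\s*['\"]?data:[^,]*?;base64,([A-Za-z0-9+/=]+)")


def extract_font_bytes(html):
    """Return the decoded bytes of the first base64 font found in the page."""
    match = FONT_DATA_URI.search(html)
    if match is None:
        raise ValueError("no base64-encoded font found in the page")
    return base64.b64decode(match.group(1))


# Tiny stand-in page: the payload is fake bytes, not a real font file.
fake_font = b"\x00\x01\x00\x00fake-font-bytes"
sample_html = (
    "<style>@font-face { font-family: c; src: url('data:font/ttf;base64,"
    + base64.b64encode(fake_font).decode("ascii")
    + "'); }</style>"
)

font_bytes = extract_font_bytes(sample_html)
print(len(font_bytes))
```

The decoded bytes could then be written to a `.ttf` file and inspected with a font library to look at the cmap table.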

The expression can be solved easily as a string with Python eval
Moving forward with IDEA2 after a number of failures
- Take a screenshot of the browser loading the expression (with the Python selenium library)
- Testing with online OCR tools translated the expression with 100% accuracy
  - Approach1: Take a screenshot of the browser loading the expression –> send it to an online OCR website (API) –> fetch the expression text –> solve
    - Some websites work for sure (manual upload), while the ones that offer API key access did not work
  - Approach2: Use the Python OCR library pytesseract to perform OCR and obtain the text (this is not accurate on its own; some characters are not recognized properly, so it can't be relied upon)
    - Refactoring this logic helped me solve the problem, see the Steps section
      - Fails: Using pillow to increase resolution and contrast, cropping and enhancing the image did not help. The OCR was still inaccurate!
      - Success: Using a hybrid method (see Steps), making OCR a separate process to map the characters to text

Steps

1. OBTAIN DATA (SCREENSHOT + HTML) - Use Python Selenium to do this

Go to the captcha web page
Take a screenshot of the captcha page and save it as a png
Also store the page source HTML

2. OCR (Create a mapping for fonts)

We already know that pytesseract OCR on the screenshot is not accurate
We know that the expression can contain only ‘1234567890-+xX()’
- xX stands for multiply and should be replaced by ‘*’ once the expression is obtained
Replace the content of the HTML source (i.e. the expression) with a string of the unique characters in the expression, which map to some of 1234567890-+xX(), separated by spaces for the OCR to work (REASON: the spacing and the way the characters are displayed affect the OCR process a lot!)
- This process could be made more accurate by mapping fewer characters at a time over repeated passes (this was not needed for this challenge though)
Now repeat step 1 and OBTAIN DATA with the new HTML, this time taking a new screenshot of the known characters in the order you placed them.
The OCR is now more accurate, and the mapping between the font and the characters can be built easily.
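
The mapping step can be sketched without any OCR at all. In the example below the letters `Q`, `A`, `Z` stand in for the scrambled glyphs and `ocr_output` is a made-up OCR result; the helper names are hypothetical and only illustrate the idea.

```python
def unique_glyphs(expression):
    """Distinct non-space characters of the page text, in first-seen order."""
    seen = []
    for ch in expression:
        if not ch.isspace() and ch not in seen:
            seen.append(ch)
    return seen


def build_mapping(glyphs, recognized):
    """Pair each glyph with the character OCR read at the same position."""
    if len(glyphs) != len(recognized):
        raise ValueError("OCR returned a different number of characters")
    return dict(zip(glyphs, recognized))


def translate(expression, mapping):
    """Rewrite the scrambled expression using the glyph-to-character map."""
    return "".join(mapping.get(ch, ch) for ch in expression)


# Made-up example: letters stand in for the scrambled glyphs, and
# ocr_output is what an OCR pass over the spaced-out glyphs might return.
scrambled = "(QA x Z) + QZ"
glyphs = unique_glyphs(scrambled)
ocr_output = "( 1 7 x 3 ) +".split()
decoded = translate(scrambled, build_mapping(glyphs, ocr_output))
print(decoded)  # (17 x 3) + 13
```

Two choices here differ from a plain `set()` plus chained `replace` calls: the first-seen order keeps the glyph order deterministic between runs, and translating every character in a single pass means a later replacement can never touch a character that an earlier replacement produced.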

3. CONSTRUCT EXPRESSION

Construct the expression now, replace ‘x’ and ‘X’ with ‘*’ and evaluate it with eval
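
Since the text comes from a remote page, a more cautious alternative to a bare `eval` is to walk the parsed expression and allow only integers, `+`, `-`, `*` and parentheses. This is a sketch of that alternative, not what my script does; `solve` is a hypothetical name and it needs Python 3.8 or newer for `ast.Constant`.

```python
import ast
import operator

# Only the operators the captcha is described as using are allowed.
_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    """Recursively evaluate a whitelisted arithmetic syntax tree."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_evaluate(node.operand))
    raise ValueError("unsupported syntax in expression")


def solve(expression):
    """Replace x/X with * and evaluate the integer expression."""
    expr = expression.replace("x", "*").replace("X", "*").strip()
    return _evaluate(ast.parse(expr, mode="eval"))


print(solve("(17 x 3) + 13 - (2 X 4)"))  # 56
```

Anything outside the whitelist, such as a function call or a name, raises `ValueError` instead of being executed.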

4. SUBMIT CAPTCHA RESPONSE

Obtain the token and the answer, put them in a form payload and make a POST request to submit the answer

– Note that I have added one extra Selenium round trip for the new HTML with the added characters, to get the mapping with better OCR. Hope this doesn't cost me too much time!!

Program:

crack_structured.py
crack.py - initially worked on code with different pocs

import os, re, sys
import requests
import pytesseract
from PIL import Image  # Image.open is used in STEP2
from selenium import webdriver

### STEP1 - Obtain Data

# Use selenium to grab a screen shot of the webpage
driver = webdriver.Firefox()
driver.get('https://hidden-island-93990.squarectf.com/ea6c95c6d0ff24545cad')
element = driver.find_elements_by_tag_name('p')

# Html source, token and expression
htmls = driver.page_source
text = element[0].text
t = "".join(list(text))
tok = driver.find_element_by_name("token")
token = tok.get_attribute("value")
var = list(set(t))
vars = []
for ch in var:
    if ch.strip():
       vars.append(ch)
print vars
print htmls

### STEP2 - OCR - Recognize and Map

html = htmls.replace(text, " ".join(vars))
#print html
new_html = open("new.html","w")
new_html.write(str(html))
new_html.close()
alt_html = "file://"+os.path.abspath("new.html")
driver.get(alt_html)
screenshot = driver.save_screenshot('expression.png')
driver.quit()
expression = pytesseract.image_to_string(Image.open("expression.png"))
expression =  expression.split()[1]
expression = list(expression)
print vars, expression


### STEP3 - Construct expression

for k,v in zip(vars, expression):
    text = text.replace(k, v)
print text
# Replace x or X and solve
#expr = expression.split("\n")[1]
expr = text.replace("x","*")
expr = expr.replace("X","*")
print expr
ans =  eval(expr)
print ans


### STEP4 - Submit the answer

url = "https://hidden-island-93990.squarectf.com/ea6c95c6d0ff24545cad"
data = dict(token=token, answer=str(ans))
r = requests.post(url, data=data, allow_redirects=True)
print r.content

Terminal Output showing the work

Flag Obtained

RESULT

The first try failed, and the second try failed too with an incorrect answer response
The third try was successful!

2. GDPR category: deAnonymization

Concept

In GDPR, anonymization means protecting the privacy of users by anonymizing the data so that nothing can be derived about any individual person.

Given

Five CSV data sets containing different parameters such as first name, email, 4 digits of SSN, role, pay, state and street address.
A web portal which has a login page and a reset password page
The task says you have to find details about the user Yakubovics, who is the captain, to log in to the system

Think, Think, Think

Looking at the portal, it is very tempting to try a SQL injection or an admin login, BUT we have the datasets and a name as a hint.
From the details in the datasets and the reset password form, it is obvious that we should find the data in the datasets, fill in the reset password form to reset the password, and then log in

Steps

Start from the name we have, Yakubovics, and boil the data down to get the first name, SSN, street address and state
Use all the possible sets obtained from the above filter in the reset password form
Get or change the password (in the end it was enough to view the previous password in the reset password page)
Login

Details:

Start with the given name, Yakubovics
Check dataset1 –> We obtain the email from the last name
Check dataset2 –> Use the email and last name obtained from dataset1 to obtain the STATE
Check dataset3 –> With the state, obtain the candidate SSNs and street addresses
Check dataset4 –> Get the income and postal code with the state obtained
Check dataset5 –> From the email we know the first character of the first name is e; use this to filter the first name in the fifth dataset
As we progress, delete the non-matching sets
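
The chain of lookups above is essentially a series of joins. Here is a compact sketch of the idea with the `csv` module on tiny invented datasets; the column names, the surname `zed` and every value are placeholders I made up, not the challenge data, and the real files have no such tidy headers.

```python
import csv
import io

# Invented miniature datasets; column names and values are placeholders.
EMAILS = "email,role\nezed@example.com,Captain\nabee@example.com,Cook\n"
STATES = "email,state\nezed@example.com,Florida\nabee@example.com,Ohio\n"
ADDRESSES = (
    "ssn,state,street\n"
    "1111,Florida,1 First St\n"
    "2222,Florida,2 Second St\n"
    "3333,Ohio,3 Third St\n"
)
NAMES = "first_name,street\nEve,1 First St\nBob,2 Second St\nEmma,3 Third St\n"


def rows(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def candidates(last_name):
    """Narrow the datasets down to the records that can belong to last_name."""
    # 1. the surname appears inside the email address
    email = next(r["email"] for r in rows(EMAILS)
                 if last_name.lower() in r["email"].lower())
    # 2. email -> state
    state = next(r["state"] for r in rows(STATES) if r["email"] == email)
    # 3. every address (with its SSN digits) in that state
    by_street = {r["street"]: {"ssn": r["ssn"]}
                 for r in rows(ADDRESSES) if r["state"] == state}
    # 4. keep streets whose resident's first name starts like the email
    result = {}
    for r in rows(NAMES):
        same_initial = r["first_name"][0].lower() == email[0].lower()
        if r["street"] in by_street and same_initial:
            result[r["street"]] = dict(by_street[r["street"]], name=r["first_name"])
    return result


print(candidates("zed"))
```

Each step only keeps records that are still consistent with everything learned so far, which is the "delete the non-matching sets" part of the list above.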

Program

reader.py
- This program prints the final set of data after filtering through the dataset CSVs 1-5. From this set, using the details of Elyssa gives the answer

import json
names = []
yaku = {}

# File 1: Fetch all the existence of the names of the Captain
for i in range(1,2):
    name = str(i)+".csv"
    file = open(name,"r")
    for line in file.readlines():
        if "Yakubovics" in line.strip() or "Yakubovics".upper() in line.strip() or "Yakubovics".lower() in line.strip():
           l = line.strip().split(",")
           yaku["email"] = l[0]
           yaku["role"] = l[1]
           yaku["income"] = l[2]
    file.close()


# File 2
for i in range(2,3):
    name = str(i)+".csv"
    file = open(name,"r")
    for line in file.readlines():
        if "Yakubovics" in line.strip() or "Yakubovics".upper() in line.strip() or "Yakubovics".lower() in line.strip():
           l = line.strip().split(",")
           yaku["state"] = l[1]
    file.close()

# doc has the ssn, address --> Fetch all florida addresses and ssn
name = str(3)+".csv"
file = open(name,"r")
for line in file.readlines():
    if "Florida" in line.strip() or "Florida".lower() in line.strip() or "Florida".upper() in line.strip():
       l = line.strip().split(",")
       # key each candidate by street address
       yaku[l[2]] = {}
       # ssn
       yaku[l[2]]["ssn"] = l[0]
file.close()

# Fourth
name = str(4)+".csv"
file = open(name,"r")
for line in file.readlines():
    l = line.strip().split(",")
    if " ".join(l[2:-1]) in yaku and ("Florida" in l[1] or "Florida".upper() in l[1] or "Florida".lower() in l[1]):
       yaku[" ".join(l[2:-1])]["income"] = l[0]
       yaku[" ".join(l[2:-1])]["postal"] = l[-1]
file.close()

# Fifth
name = str(5)+".csv"
file = open(name,"r")
for line in file.readlines():
    l = line.strip().split(",")
    if " ".join(l[1:]) in yaku:
       yaku[" ".join(l[1:])]["name"] = l[0]
       # This assumption doesnt work?
       if l[0][0] != "e".upper() and l[0][0] != "e".lower():
          del yaku[" ".join(l[1:])]
file.close()

print json.dumps(yaku, sort_keys=True, indent=4)

Output Flag - The hidden password can be obtained by looking at the page source for the masked/hidden field

Program output - Based on the relations, the data is boiled down to the possible sets

Result

On the reset password page, using the details of Elyssa gives the answer

`
"4 Magdeline": {
    "income": "96605",
    "name": "Elyssa",
    "postal": "33421",
    "ssn": "4484"
},
`

The previous password appears masked with ‘*’; viewing the page source reveals the password

3. Programming category: dot-n-dash

A puzzle by Alok himself and the first puzzle. Proved to be challenging!

Concept

Encode/Decode. Program by Reversing Logic.

Given

An encoder/decoder written in JavaScript, with the decoder code missing. Instructions are provided (with the flag, obviously) in encoded format. Complete the decoder code! (Reverse the encoder code.)

Think, Think, Think

With a bunch of debug statements, analyze the encoder code.
Try to literally follow the encoder program step by step to write the decoder functions
Is there a pattern that can be recognized and leveraged to reverse it?

Steps

This problem had me in a confused state for a long time, trying to dig through the JS, analyzing and reversing the code.
After a long haul on this, while I was just reading the console, a pattern struck!… I implemented it in the code below to decode.
- Convert the dots and dashes back to their respective integers
- Reverse the math
- Convert back to ASCII
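
The three steps can also be written as direct bit arithmetic. The sketch below is a Python port I wrote for illustration, not the lookup-table approach used in my JavaScript: each run of `p` dashes encodes one set bit, with `(p - 1) // 8` giving the character index counted from the end and `(p - 1) % 8` giving the bit. The `encode` function mirrors the encoder from the challenge page so the round trip can be checked.

```python
import random


def encode(text):
    """Python port of the encoder logic from the challenge page."""
    positions = []
    for i, ch in enumerate(text):
        code = ord(ch)
        for j in range(8):
            if (code >> j) & 1:
                positions.append(1 + j + (len(text) - 1 - i) * 8)
    random.shuffle(positions)  # the original shuffles the positions as well
    return "".join("-" * p + "." for p in positions)


def decode(encoded):
    """Rebuild the text from the dash-run lengths; their order is irrelevant."""
    positions = [len(run) for run in encoded.split(".") if run]
    if not positions:
        return ""
    length = (max(positions) - 1) // 8 + 1
    codes = [0] * length
    for p in positions:
        index_from_end, bit = divmod(p - 1, 8)
        codes[length - 1 - index_from_end] |= 1 << bit
    return "".join(chr(c) for c in codes)


message = "flag-123"  # made-up sample text, not the real flag
print(decode(encode(message)))
```

Because every position carries both the character index and the bit index, the shuffle in the encoder loses no information, which is why no sorting or lookup table is needed in this version.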

Code

</head>
<body>
<p>It is a known fact that space travelers love to devis unique encoding and decoding methods...</p>
  <textarea id="input" placeholder="type something here..."></textarea>
  <div>
    <button onclick="return encode();">Encode</button>
    <button onclick="return decode();">Decode</button>
  </div>
<script>
function encode() {
  var t = input.value;
  if (/^[-.]+$/.test(t)) {
    alert("Your text is already e'coded!");
  } else {
    input.value = _encode(t);
  }
  return false;
}

function decode() {
  var t = input.value;
  if (/^[-.]*$/.test(t)) {
    input.value = _decode(t);
  } else {
    alert("Your text is not e'coded!");
  }
  return false;
}

function _encode(input) {
  var a=[];
  for (var i=0; i<input.length; i++) {
    var t = input.charCodeAt(i);
    console.log(t);
    for (var j=0; j<8; j++) {
      //console.log(t >> j);
      //console.log((t >> j) & 1)
      //console.log(1 + j + (input.length - 1 - i) * 8)
      if ((t >> j) & 1) {
        console.log(t >> j);
        console.log((t >> j) & 1)
        console.log(1 + j + (input.length - 1 - i) * 8)
        a.push(1 + j + (input.length - 1 - i) * 8);
      }
    }
  }
 
  console.log(a);

  var b = [];
  while (a.length) {
    var t = (Math.random() * a.length)|0;
    b.push(a[t]);
    a = a.slice(0, t).concat(a.slice(t+1));
  }

  console.log(b);

  var r = '';
  while (b.length) {
    var t = b.pop();
    r = r + "-".repeat(t) + ".";
  }
  return r;
}
 
// Everything below this line was lost due to cosmis radiation. The engineer who knows
// where the backups are stored already left.
function _decode(input) {
  var b = [];
  
  // Reverse r logic
  var dot_split = input.split(".");
  console.log(dot_split);
  for (var i=0; i<dot_split.length; i++) {
      if (dot_split[i].length) {
         b.push(dot_split[i].match(/-/g).length);
      }
  }

  input = ["0","1","2","3","4","5","6","7","8","9","a","b","c","d","e","f","g","h","i","j","k","l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "-"];
  input = input.join('');
  var dick={};
  for (var i=0; i<input.length; i++) {
    var t = input.charCodeAt(i);
    var a=[];
    console.log(input[i],t);
    for (var j=0; j<8; j++) {
      //console.log(t >> j);
      //console.log((t >> j) & 1)
      //console.log(1 + j + (input.length - 1 - i) * 8)
      if ((t >> j) & 1) {
        //console.log(t >> j);
        //console.log((t >> j) & 1)
        //console.log(1 + j + (input.length - 1 - i) * 8)
        a.push(1 + j + (1 - 1 - 0) * 8);
      }
    }
    dick[a.join('')] = input.charAt(i);
    //console.log(a);
  }
  console.log(dick);

  b = b.sort(function(a, b){return a-b});
  console.log(b);
  var output = [];
  while (b.length) {
     var less_than_8 = [];
     var stop = 0;
     for (var p=0; p<b.length; p++) {
         if (b[p] > 8) {
             b[p] = b[p] - 8;
         } else {
             less_than_8.push(b[p]);
             stop = p;
         }
     }
     b = b.slice(stop+1);
     console.log(less_than_8);
     console.log(b);
     output.push(dick[less_than_8.sort(function(a, b){return a-b}).join("")]);
  }
  console.log(output);

  return output.reverse().join("");

}
</script>

Result
