Working with dynamic bypasses. Parsing. Regular expressions

Server administrators keep inventing new ways to counteract bots. In this article, we will look at the protection methods used in dialogs to make a bot's life harder.

While developing and testing an Adrenalin script that completes the quests for TT recipes, I came across several such methods, which are described below. For convenience, I will use the dialog with Jeremy, who handles these quests, as the example. To view dialogs conveniently while you work, you can use the Script Recorder.

1. Dynamically changing bypass. In the simplest case, the phrase we want to select does not change, but the bypass that must be sent changes every time the dialog is opened. This method is usually combined with one of the others:

<a action="bypass -h  0-739772929">[Delivery of Special Liquor(In progress)]</a><br>  // 0-739772929
<a action="bypass -h  01784106480">[Egg Delivery (In progress)]</a><br>   // 01784106480

<a action="bypass -h  0-739859206">[Delivery of Special Liquor (In progress)]</a><br>   // 0-739859206
<a action="bypass -h  01784133759">[Egg Delivery (In progress)]</a><br>    // 01784133759

2. Replacing one or more letters in the line. The length of the string does not change. As a rule, only a couple of letters in the original line are replaced, so at a glance you may not even notice. For example, instead of "Eggs delivery" there will be "Eggs dilivery". This method is used in addition to the previous one, i.e. each time you open the dialog, both the answer line and the bypass change. It prevents us from finding the bypass by searching for a previously known substring:

<a action="bypass -h  01736714927">[Delivery of Speciel Liquor]</a><br>   // 01736714927
<a action="bypass -h  0-1931249600">[Egh Delivery]</a><br>   // 0-1931249600

<a action="bypass -h 01736767322">[Deliveri of Special Liquor]</a><br>   // 01736767322
<a action="bypass -h 0-1931186782">[Egg Dellvery]</a><br>   // 0-1931186782
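
Since the length of the label stays the same and only a couple of letters change, one possible way around this protection is to compare the expected label with each candidate position by position and accept a small number of mismatches. The sketch below is only an illustration and not part of the original script: the names MismatchCount and LooksLike and the threshold of 2 are hypothetical, and it is written as a standalone Free Pascal program (inside Adrenalin you would use Print instead of WriteLn).

```pascal
program FuzzyLabel;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

// Count the positions at which two strings of equal length differ.
// Returns -1 when the lengths differ, because this protection keeps the length unchanged.
function MismatchCount(const a, b: string): integer;
var
  i: integer;
begin
  if Length(a) <> Length(b) then
  begin
    Result := -1;
    Exit;
  end;
  Result := 0;
  for i := 1 to Length(a) do
    if a[i] <> b[i] then
      Inc(Result);
end;

// True when candidate equals expected except for at most maxDiff replaced letters.
function LooksLike(const expected, candidate: string; maxDiff: integer): boolean;
var
  d: integer;
begin
  d := MismatchCount(expected, candidate);
  Result := (d >= 0) and (d <= maxDiff);
end;

begin
  // Labels taken from the dialog samples above; the threshold 2 is an assumption.
  WriteLn(LooksLike('Egg Delivery', 'Egh Delivery', 2));
  WriteLn(LooksLike('Egg Delivery', 'Egg Dellvery', 2));
  WriteLn(LooksLike('Egg Delivery', 'Delivery of Speciel Liquor', 2));
end.
```

The first two calls should report a match and the third should not, because the lengths differ. The comparison is made against the label only (the text between the square brackets), so the label has to be cut out of the tag first.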

3. Use of special characters that are "invisible" to us, both inside the dialog text and inside the bypass. They are actually present in the dialog, but they are not displayed when the dialog is rendered (that is, when it is shown to us). This is due to non-standard encodings; anyone interested can google the details. For the same reason I cannot post the code of these lines on the site (the special characters would be stripped automatically), so here are screenshots instead:

In regular Notepad:

In Sublime Text:

Thus, the length of the desired substring changes every time, but the sequence of characters in this substring is fully preserved.
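
Because the visible characters keep their order, one way to compare such a label with a known phrase is to drop everything that is not a printable character before searching. The sketch below assumes that the inserted characters fall outside the printable ASCII range (space through tilde); which characters a given server actually inserts is not known here, so the filter and the name KeepPrintableAscii are hypothetical. It is a standalone Free Pascal program, with #1 and #2 standing in for the invisible characters.

```pascal
program StripInvisible;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

// Keep only printable ASCII characters (space through tilde).
// Assumption: the "invisible" characters lie outside this range.
function KeepPrintableAscii(const s: string): string;
var
  i: integer;
begin
  Result := '';
  for i := 1 to Length(s) do
    if (s[i] >= #32) and (s[i] <= #126) then
      Result := Result + s[i];
end;

var
  noisy: string;
begin
  // #1 and #2 are placeholders for whatever the server really inserts.
  noisy := 'E'#1'gg D'#2'eli'#1'very';
  WriteLn(KeepPrintableAscii(noisy));   // prints: Egg Delivery
end.
```

Note that this filter also removes non-ASCII letters, so it only suits dialogs written in Latin characters. It is safest to apply it to a copy used for comparison: whether the bypass itself has to be sent with the special characters intact is not covered here.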

There may be other types of protection based on modifying the dialogs (a captcha that pops up when a dialog opens is a separate topic), but I have not encountered any others yet.

So, now that the task is clear, we can think about how to find the bypass we need to send. Again, there are several options. You can solve this using nothing beyond Pos, Copy, Delete, Trim and the like. However, it is easier and more elegant to use regular expressions. What they are and how to write them is easy to google. In short, they are a pattern-based search, which makes parsing dialogs many times simpler.
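
For comparison, here is a rough sketch of the approach that uses only Pos, Copy and Trim. It is a standalone Free Pascal program working on one of the sample lines from above; the function name FirstBypass is hypothetical, and it returns only the first bypass in the text rather than the one for a particular phrase.

```pascal
program PosCopyBypass;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
uses SysUtils;

// Return the bypass of the first action="bypass -h ..." attribute in html, or '' if none.
function FirstBypass(const html: string): string;
const
  Marker = '"bypass -h';
var
  p, q: integer;
  rest: string;
begin
  Result := '';
  p := Pos(Marker, html);
  if p = 0 then Exit;
  // Everything that follows the marker
  rest := Copy(html, p + Length(Marker), Length(html));
  // The bypass ends at the closing quote
  q := Pos('"', rest);
  if q = 0 then Exit;
  Result := Trim(Copy(rest, 1, q - 1));
end;

begin
  WriteLn(FirstBypass('<a action="bypass -h  0-739772929">[Delivery of Special Liquor(In progress)]</a><br>'));
  // prints: 0-739772929
end.
```

Even this small case needs manual index arithmetic, which is why the pattern-based search is more comfortable once several tags have to be processed.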

How to use regular expressions?
It is simple: add the RegExpr unit to the uses section and use it.

Example:

uses SysUtils, Classes, RegExpr;  // include the RegExpr unit

procedure PrintAllTags();   // print every tag that carries a bypass
var RegExp: TRegExpr;
begin
  RegExp:= TRegExpr.Create;
  try
    RegExp.Expression:= '(<a *(.+?)</a>)|(<button *(.+?)>)';   // <a ...>...</a> links and <button ...> tags
    if RegExp.Exec(Engine.DlgText) then
      repeat Print(RegExp.Match[0]);
      until (not RegExp.ExecNext);
  finally
    RegExp.Free;   // release the object even if an exception occurs
  end;
end;

begin
  PrintAllTags();
end.

This simple code finds and prints all the bypasses of interest to us.

And with the following function you can select dialog options, bypassing the first type of protection:

uses SysUtils, Classes, RegExpr;  // include the RegExpr unit

function Bypass(dlg: string): boolean;
var
  RegExp: TRegExpr;
  SL: TStringList;
  i: integer;
  bps: string;
begin
  Result:= false;                                             // becomes true only if a bypass is sent
  bps:= '';
  RegExp:= TRegExpr.Create;                                   // create the helper objects
  SL:= TStringList.Create;
  try
    RegExp.Expression:= '(<a *(.+?)</a>)|(<button *(.+?)>)';  // pattern for every tag that may carry a bypass
    if RegExp.Exec(Engine.DlgText) then                       // if at least one match is found, then
      repeat SL.Add(RegExp.Match[0]);                         // add each match to our list
      until (not RegExp.ExecNext);                            // until there are no more matches

    RegExp.Expression:= '"bypass -h *(.+?)"';                 // pattern for the bypass inside a tag
    for i:= 0 to SL.Count-1 do begin                          // now go over our list
      if (Pos(dlg, SL[i]) > 0) then begin                     // if the i-th tag contains the required text, then
        if RegExp.Exec(SL[i]) then                            // look for the bypass in it and, if found,
          bps:= RegExp.Match[1];                              // take the captured group: the bypass without quotes or leading spaces
      end;
    end;

    Print(bps);                                               // print the final bypass
    if (Length(bps) > 0) then begin                           // if it is not empty, send it to the server
      Engine.BypassToServer(bps);
      Result:= true;
    end;
  finally
    RegExp.Free;                                              // do not forget to free the memory
    SL.Free;
  end;
end;

begin
  Bypass('Fighter Scheme');   // the function call itself: it finds the corresponding bypass and sends it
end.
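
The same lookup can also be sketched with a single pattern that captures the bypass and the label in separate groups, which makes it easy to try out on the sample dialog lines outside the game. This is only a sketch under assumptions: it is a standalone Free Pascal program using the RegExpr unit shipped with Free Pascal, the function name FindBypass is hypothetical, and the pattern expects links of exactly the form shown in the samples above (an a tag with the label in square brackets).

```pascal
program CaptureGroups;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
uses RegExpr;

// Return the bypass of the first link whose label contains dlg, or '' if none.
function FindBypass(const html, dlg: string): string;
var
  re: TRegExpr;
begin
  Result := '';
  re := TRegExpr.Create;
  try
    // Group 1: bypass value, group 2: text between the square brackets
    re.Expression := '<a action="bypass -h\s+([^"]+)">\[(.+?)\]</a>';
    if re.Exec(html) then
      repeat
        if Pos(dlg, re.Match[2]) > 0 then
        begin
          Result := re.Match[1];
          Exit;   // the finally section still frees the object
        end;
      until not re.ExecNext;
  finally
    re.Free;
  end;
end;

const
  Sample =
    '<a action="bypass -h  0-739772929">[Delivery of Special Liquor(In progress)]</a><br>' +
    '<a action="bypass -h  01784106480">[Egg Delivery (In progress)]</a><br>';

begin
  WriteLn(FindBypass(Sample, 'Egg Delivery'));   // prints: 01784106480
end.
```

Having the label in its own group is convenient when the plain Pos check has to be replaced by a looser comparison for the second or third type of protection.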

There is plenty of information on the internet about the rules for writing regular expressions, for example here

The module offers many more features; you can explore them by reading its source code

I hope that parsing the resulting strings will not cause you any difficulty.