Weaknesses in the use of SQL

The root cause of code injection, and therefore of SQL injection, lies in the way programming and query languages inherently work.

Since commands are just strings of characters that are interpreted as code, and user input is also text, code syntax can, in principle, be inserted into user input. If that input is accepted without validation or any other control, the injected code can lead to the execution of arbitrary commands chosen by a malicious user.

This is because a naïve string reader makes no distinction between text and code: to a computer program or an application, both are simply binary data encoded as text. To inject specific instructions or code objects, an attacker typically uses specific characters to trick the parser (the software component in charge of reading the text input) into interpreting part of the input as commands the developer never intended. Traditionally, the most trivial way to inject code is to insert the statement terminator (the semicolon in many programming languages, and in SQL) so that whatever follows the intended operation is treated as an entirely separate instruction. Other characters can also be used to manipulate the application's behavior, such as the comment marker, which causes the rest of the original code after the injected text to be ignored.
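
The mechanism described above can be illustrated with a minimal sketch. It assumes Python's standard sqlite3 module and an in-memory database; the users table, the sample credentials, and the function names are hypothetical and exist only for this illustration. The query is deliberately built by string concatenation, so the comment marker (or an extra quote) supplied as input changes what the database executes. The sketch does not cover the semicolon case.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory database with one hypothetical user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, password TEXT)"
    )
    # Plaintext password for illustration only; never store passwords this way.
    conn.execute("INSERT INTO users (username, password) VALUES ('alice', 's3cret')")
    return conn


def build_login_query(username, password):
    """Deliberately vulnerable: user input is concatenated into the SQL text."""
    return (
        "SELECT id FROM users WHERE username = '" + username
        + "' AND password = '" + password + "'"
    )


def vulnerable_login(conn, username, password):
    """Return True if the concatenated query matches at least one row."""
    query = build_login_query(username, password)
    return conn.execute(query).fetchone() is not None


if __name__ == "__main__":
    conn = make_demo_db()
    # The injected quote closes the string early, and the comment marker (--)
    # makes the database ignore the password check that follows it.
    print(build_login_query("alice' --", "wrong"))
    print(vulnerable_login(conn, "alice' --", "wrong"))
    # An injected OR condition has a similar effect without any comment marker.
    print(vulnerable_login(conn, "nobody", "' OR '1'='1"))
```

In both calls the parser has no way of knowing that the quote, the comment marker, or the OR keyword came from the user rather than from the developer, which is exactly the lack of distinction between text and code discussed above.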

SQL is no exception: many techniques used in code injection also apply to SQL. In fact, this vulnerability was discovered over 20 years ago, when it was found that commands injected into SQL queries could result in unintended operations. We will see specific forms of this exploitation in later chapters, all of which can be used to damage applications or to give the attacker a strategic advantage, whether by exposing data or, in some cases, by providing access to otherwise restricted systems.

Luckily, SQL injection only affects applications that are poorly coded. Adding specific controls on user-provided input, and on the data that flows between internal application components, can prevent the problem altogether. Besides improving the security controls on the input, dropping suspicious web traffic can also help prevent exploitation of the vulnerability. Ideally, since this is a coding error, you should write secure code in accordance with the best practices available. Here are some general suggestions that will be explored further later in this book:

  • Do not allow unnecessary special characters in queries: SQL injection usually relies on special characters. Where special characters must be allowed, they can be encoded or escaped so that SQL treats them as literal data rather than as syntax, thus foiling injection attempts based on string delimiters (single or double quotes), statement separators (semicolons), and comment markers.
  • Do not allow specific suspicious commands: Some commands are often used in SQL injection attacks. Permitting only specific authorized commands through a whitelist, defined according to the expected behavior of the software component, helps prevent arbitrary commands from being inserted into an application.
  • Do not give carte blanche to the user: While we would love users to be respectful and responsible, they could be anybody, including malicious users. It's a good idea to limit their actions as much as possible and never to trust user input. Query input should always be passed as parameters and serialized accordingly, never concatenated directly into the query text.
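
The suggestions above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes Python's standard sqlite3 module, and the users table, the sample data, the ALLOWED_SORT_COLUMNS whitelist, and the function names are all hypothetical. User-supplied values are passed as bound parameters (the ? placeholders), so quotes and comment markers stay literal data. Identifiers such as column names cannot be bound as parameters, so they are checked against the whitelist instead.

```python
import sqlite3

# Hypothetical whitelist: the only column names a caller may sort by.
ALLOWED_SORT_COLUMNS = {"id", "username"}


def make_demo_db():
    """Create a throwaway in-memory database with two hypothetical users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, password TEXT)"
    )
    # Plaintext passwords for illustration only; never store passwords this way.
    conn.executemany(
        "INSERT INTO users (username, password) VALUES (?, ?)",
        [("alice", "s3cret"), ("bob", "hunter2")],
    )
    return conn


def safe_login(conn, username, password):
    """Bind user input as parameters; it is never parsed as SQL syntax."""
    row = conn.execute(
        "SELECT id FROM users WHERE username = ? AND password = ?",
        (username, password),
    ).fetchone()
    return row is not None


def list_usernames(conn, sort_by="username"):
    """Accept a sort column only if it appears in the whitelist."""
    if sort_by not in ALLOWED_SORT_COLUMNS:
        raise ValueError("unsupported sort column: %r" % (sort_by,))
    # Safe to interpolate here only because sort_by was checked above.
    rows = conn.execute("SELECT username FROM users ORDER BY " + sort_by).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = make_demo_db()
    print(safe_login(conn, "alice", "s3cret"))
    # The same inputs that subvert a concatenated query now simply fail to match.
    print(safe_login(conn, "alice' --", "wrong"))
    print(safe_login(conn, "nobody", "' OR '1'='1"))
    print(list_usernames(conn, "id"))
```

With bound parameters, the query text is fixed before any user data is seen, which is why no special handling of individual characters is needed in safe_login. The whitelist covers the remaining case where part of the query structure itself depends on input.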

These points offer at least a general guideline for protecting against SQL injection. More specific, lower-level defenses will be examined thoroughly in later chapters and sections of this book. In general, adopting a security-driven approach to application coding can resolve most vulnerabilities and security issues. Including security controls during development also saves time and effort, as reworking existing code can be much harder than writing it from scratch with those controls included by design.