Malware Analysis

Flawfinder v2.0.7 – Searches through C/C++ source code looking for potential security flaws

To run flawfinder, simply give flawfinder a list of directories or files. For each directory given, all files that have C/C++ filename extensions in that directory (and its subdirectories, recursively) will be examined.

Thus, for most projects, simply give flawfinder the name of the source code’s topmost directory (use ‘‘.’’ for the current directory), and flawfinder will examine all of the project’s C/C++ source code. Flawfinder does not require that you be able to build your software, so it can be used even with incomplete source code. If you only want to have changes reviewed, save a unified diff of those changes (created by GNU “diff -u” or “svn diff” or “git diff”) in a patch file and use the −−patch (−P) option.

Flawfinder will produce a list of ‘‘hits’’ (potential security flaws, also called findings), sorted by risk; the riskiest hits are shown first. The risk level is shown inside square brackets and varies from 0, very little risk, to 5, great risk. This risk level depends not only on the function but on the values of the parameters of the function. For example, constant strings are often less risky than fully variable strings in many contexts, and in those contexts, the hit will have a lower risk level. It knows about gettext (a common library for internationalized programs) and will treat constant strings passed through gettext as though they were constant strings; this reduces the number of false hits in internationalized programs. Flawfinder will do the same sort of thing with _T() and _TEXT(), common Microsoft macros for handling internationalized programs. It correctly ignores text inside comments and strings. Normally flawfinder shows all hits with a risk level of at least 1, but you can use the −−minlevel option to show only hits with higher risk levels if you wish. Hit descriptions also note the relevant Common Weakness Enumeration (CWE) identifier(s) in parentheses, as discussed below. Flawfinder is officially CWE-Compatible. Hit descriptions with “[MS-banned]” indicate functions that are on the banned list of functions released by Microsoft; see http://msdn.microsoft.com/en-us/library/bb288454.aspx for more information about banned functions.

How does Flawfinder Work?

Flawfinder works by using a built-in database of C/C++ functions with well-known problems, such as buffer overflow risks (e.g., strcpy(), strcat(), gets(), sprintf(), and the scanf() family), format string problems ([v][f]printf(), [v]snprintf(), and syslog()), race conditions (such as access(), chown(), chgrp(), chmod(), tmpfile(), tmpnam(), tempnam(), and mktemp()), potential shell metacharacter dangers (most of the exec() family, system(), popen()), and poor random number acquisition (such as random()). The good thing is that you don’t have to create this database – it comes with the tool.

It then takes the source code text, and matches the source code text against those names, while ignoring text inside comments and strings (except for flawfinder directives). Flawfinder also knows about gettext (a common library for internationalized programs), and will treat constant strings passed through gettext as though they were constant strings; this reduces the number of false hits in internationalized programs.

Flawfinder produces a list of “hits” (potential security flaws), sorted by risk; by default, the riskiest hits are shown first. This risk level depends not only on the function but on the values of the parameters of the function. For example, constant strings are often less risky than fully variable strings in many contexts. In some cases, flawfinder may be able to determine that the construct isn’t risky at all, reducing false positives.

It gives better information – and better prioritization – than simply running “grep” on the source code. After all, it knows to ignore comments and the insides of strings, and it will also examine parameters to estimate risk levels. Nevertheless, flawfinder is fundamentally a naive program; it doesn’t even know about the data types of function parameters, and it certainly doesn’t do control flow or data flow analysis (see the references below to other tools, like SPLINT, which do deeper analysis). I know how to do that, but doing that is far more work; sometimes all you need is a simple tool. Also, because it’s simple, it doesn’t get as confused by macro definitions and other oddities that more sophisticated tools have trouble with. It can analyze software that you can’t build; in some cases, it can analyze files you can’t even locally compile.

Installation

pip install flawfinder

Usage

flawfinder [--help | -h] [--version] [--listrules]
  [--allowlink] [--followdotdir] [--nolink]
           [--patch filename | -P filename]
  [--inputs | -I] [--minlevel X | -m X]
           [--falsepositive | -F] [--neverignore | -n]
  [--context | -c] [--columns | -C] [--dataonly | -D]
           [--html | -H] [--immediate | -i] [--singleline | -S]
           [--omittime] [--quiet | -Q]
  [--loadhitlist F] [--savehitlist F] [--diffhitlist F]
  [--] [source code file or source root directory]+

  The options cover various aspects of flawfinder as follows.

  Documentation:
  --help | -h Show this usage help.
  --version   Show version number.
  --listrules List the rules in the ruleset (rule database).

  Selecting Input Data:
  --allowlink Allow symbolic links.
  --followdotdir
              Follow directories whose names begin with ".".
              Normally they are ignored.
  --nolink    Skip symbolic links (ignored).
  --patch F | -P F
              Display information related to the patch F
              (patch must be already applied).

  Selecting Hits to Display:
  --inputs | -I
              Show only functions that obtain data from outside the program;
              this also sets minlevel to 0.
  -m X | --minlevel=X
              Set minimum risk level to X for inclusion in hitlist.  This
              can be from 0 (``no risk'')  to  5  (``maximum  risk'');  the
              default is 1.
  --falsepositive | -F
              Do not include hits that are likely to be false  positives.
              Currently,  this  means  that function names are ignored if
              they're not followed by "(", and that declarations of char-
              acter  arrays  aren't noted.  Thus, if you have use a vari-
              able named "access" everywhere, this will eliminate  refer-
              ences  to  this ordinary variable.  This isn't the default,
              because this  also  increases  the  likelihood  of  missing
              important  hits;  in  particular, function names in #define
              clauses and calls through function pointers will be missed.
  --neverignore | -n
              Never ignore security issues, even if they have an ``ignore''
              directive in a comment.
  --regex PATTERN | -e PATTERN
              Only report hits that match the regular expression PATTERN.

  Selecting Output Format:
  --columns | -C
              Show  the  column  number  (as well as the file name and
              line number) of each hit; this is shown after the line number
              by adding a colon and the column number in the line (the first
              character in a line is column number 1).
  --context | -c
              Show context (the line having the "hit"/potential flaw)
  --dataonly | -D
              Don't display the headers and footers of the analysis;
              use this along with --quiet to get just the results.
  --html | -H
              Display as HTML output.
  --immediate | -i
              Immediately display hits (don't just wait until the end).
  --singleline | -S
              Single-line output.
  --omittime  Omit time to run.
  --quiet | -Q
              Don't display status information (i.e., which files are being
              examined) while the analysis is going on.
  --error-level=LEVEL
              Return a nonzero (false) error code if there is at least one
              hit of LEVEL or higher.  If a diffhitlist is provided,
              hits noted in it are ignored.
              This option can be useful within a continuous integration script,
              especially if you mark known-okay lines as "flawfinder: ignore".
              Usually you want level to be fairly high, such as 4 or 5.
              By default, flawfinder returns 0 (true) on a successful run.

  Hitlist Management:
  --savehitlist=F
              Save all hits (the "hitlist") to F.
  --loadhitlist=F
              Load hits from F instead of analyzing source programs.
  --diffhitlist=F
              Show only hits (loaded or analyzed) not in F.


  For more information, please consult the manpage or available
  documentation.

To Top

Pin It on Pinterest

Share This